Policy

Policy is a Resource in DataOS that defines a set of rules or guardrails governing the behavior of users, be it individuals or applications/services. Within DataOS, Policies are enforced using Attribute Based Access Control (ABAC) and define what predicates a user (a subject) can perform on a dataset, API Path, or a Resource (an object), thus defining the constraints of the relationship between the subject and object.

The Policy Resource operates on a "never trust, always verify" ethos. It enforces a default-deny stance, requiring users to explicitly request access to perform any action within the system. It establishes a continuous authorization mechanism, where access permissions are dynamically evaluated each time a user attempts an action. Access is granted only if the user has the requisite permissions at that precise moment.

In DataOS, a Policy is expressed as a distinct Resource in a declarative YAML format. This approach allows for the separation of policies from application code, promoting modularity, easy maintenance, and reducing the need for extensive redeployment. Additionally, DataOS distinguishes between the Policy Decision Point (PDP) and the Policy Enforcement Point (PEP). The PDP makes policy decisions based on predefined rules and attributes, while the PEP enforces these decisions, ensuring access control and compliance. This clear separation of concerns simplifies policy management, fostering scalability, extensibility, and maintainability in the DataOS ecosystem.
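The separation between deciding and enforcing can be sketched in a few lines of Python. This is only an illustration of the PDP/PEP pattern, not DataOS code: the class names, the rule format, and the `resource_reader` callable are all hypothetical.

```python
class PolicyDecisionPoint:
    """Decides: knows the rules but never touches the protected resource."""

    def __init__(self, rules):
        # Hypothetical rule format: one subject tag, one predicate, one object.
        self.rules = rules

    def decide(self, subject_tags, predicate, obj):
        for rule in self.rules:
            if (rule["tag"] in subject_tags
                    and rule["predicate"] == predicate
                    and rule["object"] == obj):
                return rule["allow"]
        return False  # nothing matched: default deny


class PolicyEnforcementPoint:
    """Enforces: guards the resource and defers every decision to the PDP."""

    def __init__(self, pdp, resource_reader):
        self.pdp = pdp
        self.resource_reader = resource_reader

    def read(self, subject_tags, obj):
        # The decision is requested on every call, never cached.
        if not self.pdp.decide(subject_tags, "read", obj):
            raise PermissionError(f"read denied on {obj}")
        return self.resource_reader(obj)


pdp = PolicyDecisionPoint([
    {"tag": "roles:id:pii-reader", "predicate": "read",
     "object": "dataos://icebase:retail/city", "allow": True},
])
pep = PolicyEnforcementPoint(pdp, lambda obj: f"rows from {obj}")
```

Because the enforcement point holds no rules of its own, the rule list can change without touching the code that guards the resource.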

  • Types of Policy in DataOS


    Explore various policy types within the DataOS platform comprehensively.

    Types of Policies

  • How to create and manage a Policy?


    Learn how to create and manage a Policy in DataOS.

    Create and manage Policy

  • Policy Configuration Templates


    Enhance Policy management with Configuration Templates

    Policy Templates

  • Policy Implementation


    Explore a detailed walkthrough to implement Policy

    Hands on Guide

Understanding ABAC, PDP and PEP

Types of Policies

DataOS offers two primary categories of policies: Access Policy and Data Policy. These policies play essential roles in governing user access, actions, and data management within the DataOS system.

Access Policies serve as the initial layer of defense, overseeing user access and actions within the system. They establish a set of well-defined rules that determine whether a user, known as the subject, is authorized to perform a specific action, referred to as a predicate, on a given dataset, API path, or other resources, known as objects. These policies serve as regulatory mechanisms, effectively governing user interactions and ensuring that access to specific actions is either granted or denied. This decision is based on the evaluation of attributes associated with the subjects and objects involved in the access request.

Configuration of Access Policy

Access Policy YAML configuration

In contrast, Data Policy operates as a secondary layer of control, regulating the visibility and interaction with specific data once access has been granted. It involves the implementation of techniques such as data masking or filtering to obscure or restrict the visibility of sensitive or restricted data based on predefined rules or conditions.

For example, when working with a dataset that includes a column labeled credit_card_number, it is crucial to protect the sensitive information it contains from unintended exposure. Employing data masking policies or applying data anonymization methods becomes essential to secure the contents of this specific column.

Data Policy YAML configuration

Data Policy has two types: the Data Masking Policy and the Data Filtering Policy.

Data masking policies are designed to protect sensitive information by replacing original data with fictitious yet structurally similar data. This ensures that the privacy of sensitive data is maintained while keeping the data useful for purposes such as testing and development.

Data masking is particularly beneficial for Personally Identifiable Information (PII), where original data can be represented through masking, replacement with a placeholder (such as "####"), or obfuscation through a hash function.

Upon the application of a data masking policy, the original data is transformed, as illustrated in the sample example table below:

| Data Category | Original Value | Masked Value |
|---|---|---|
| Email ID | john.smith@gmail.com | bkfgohrnrtseqq85@bkgiplpsrhsll16.com |
| Social Security Number (SSN) | 987654321 | 867-92-3415 |
| Credit Card Number | 8671 9211 3415 4546 | #### #### #### #### |

Such a policy ensures the protection of sensitive details, including but not limited to names, titles, addresses, etc.
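The placeholder style of masking shown for the credit card number can be sketched as follows. This is a minimal Python illustration, not the DataOS implementation; the function name `mask_with_placeholder` is hypothetical.

```python
import re


def mask_with_placeholder(value, placeholder="#"):
    """Replace every letter or digit with the placeholder, keeping separators."""
    return re.sub(r"[A-Za-z0-9]", placeholder, value)


print(mask_with_placeholder("8671 9211 3415 4546"))  # prints "#### #### #### ####"
```

Keeping the separators preserves the shape of the value, so downstream consumers that expect a certain layout still receive it.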

You can apply the Mask policy, say to mask the column with ‘customer name’ in it, directly from the Metis UI via policy tags or via DataOS CLI by applying a YAML file.

Data filtering policies establish parameters for determining which data elements should be accessible to various users or systems. Proper implementation of data filtering (or simply row filtering) policies ensures that only authorized individuals can access specific data segments.

For example, in an e-commerce system, customer support agents may need access to customer order details. By applying a data filtering policy:

  • Agent A can only view order records associated with customers they are assigned to.
  • Agent B can only access order records for a specific geographic region.
  • Agent C, as a supervisor, has unrestricted access to all order records.

With the data filtering policy in place, each agent can efficiently access the necessary information while maintaining data security and confidentiality.
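The agent scenario above can be sketched in Python as a per-user row predicate. The data, the agent identifiers, and the `row_filters` mapping are hypothetical and only illustrate the idea of row filtering.

```python
orders = [
    {"order_id": 1, "customer": "c1", "region": "north"},
    {"order_id": 2, "customer": "c2", "region": "south"},
    {"order_id": 3, "customer": "c3", "region": "north"},
]

# Hypothetical per-agent row predicates; None means no restriction.
row_filters = {
    "agent_a": lambda row: row["customer"] in {"c1"},  # assigned customers only
    "agent_b": lambda row: row["region"] == "north",   # one geographic region
    "agent_c": None,                                   # supervisor sees all rows
}


def visible_orders(agent):
    """Return only the rows the given agent is permitted to see."""
    keep = row_filters[agent]
    return [row for row in orders if keep is None or keep(row)]


print([row["order_id"] for row in visible_orders("agent_b")])  # prints [1, 3]
```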

How to create and manage a Policy

In DataOS, both access and data policies are configured via the singular Policy Resource. However, the two policy-types have their own YAML configuration and different underlying implementation.

Create a Policy manifest

To create a Policy, the first step is to create a Policy manifest file. A sample Policy manifest is given below:

Example Policy manifest
# Resource meta section (1)
name: myaccesspolicy
version: v1 
type: policy 
tags: 
  - access_policy
description: This is a sample policy YAML configuration 
owner: iamgroot
layer: user

# Policy specific section (2)
policy:
  access:
    subjects:
      tags:
        - roles:id:user
        - roles:id:pii-reader
    predicates:
      - "read"
    objects:
      paths:
        - "dataos://icebase:retail/city"
    allow: true

  1. The Resource meta section of a manifest file comprises metadata attributes universally applicable to all Resource-types. To learn more about how to configure attributes within this section, refer to the link: Attributes of Resource meta section.

  2. The Policy-specific section of a manifest file comprises attributes specific to the Policy Resource. This section is different for Access and Data Policy. To learn more about how to configure attributes of the Policy-specific section, refer to the link: Attributes of Policy manifest.

Example Data Policy manifest (filter)
name: mydatapolicy
version: v1 
type: policy 
tags: 
  - policy
description: This is a sample policy YAML configuration
owner: iamgroot
layer: user
policy:
  data:
    type: filter
    name: "filtericebasecity"
    description: 'data policy to filter data on city name'
    dataset_id: "icebase.retail.city"
    priority: 1
    selector:
      user:
        match: all
        tags:
          - "users:id:aayushisolanki"
    filters:
      - column: city_name
        operator: equals
        value: "Verbena"

Example Data Policy manifest (mask)
name: bucketage
version: v1
type: policy
layer: user
description: "data policy to mask age data into buckets"
policy:
  data:
    priority: 1
    type: mask
    depot: icebase
    collection: retail
    dataset: customer
    selector:
      column:
        tags:
          - PII.Age
      user:
        match: any
        tags:
          - "users:id:iamgroot"
    mask:
      operator: bucket_number
      bucket_number:
        buckets:
          - 5
          - 12
          - 18
          - 25
          - 45
          - 60
          - 70
    name: age_masking_policy
    description: An age bucket is formed by grouping the ages together. Based on defined
      age buckets, the age of individuals is redacted and anonymized. If an
      individual’s age falls under a defined bucket, it is replaced with the
      lowest value of the bucket.
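
The bucketing behaviour described in the mask manifest above can be sketched in Python. This is an illustration under stated assumptions, not the DataOS implementation: the function name `bucket_age` is hypothetical, boundaries are treated as inclusive lower bounds, and ages below the first boundary are assumed to map to 0.

```python
from bisect import bisect_right

BUCKETS = [5, 12, 18, 25, 45, 60, 70]  # boundaries taken from the manifest


def bucket_age(age, buckets=BUCKETS):
    """Replace an age with the lowest value of the bucket it falls into."""
    i = bisect_right(buckets, age)  # number of boundaries <= age
    # Assumption: ages below the first boundary fall into a bucket starting at 0.
    return buckets[i - 1] if i > 0 else 0


print([bucket_age(a) for a in (3, 12, 30, 71)])  # prints [0, 12, 25, 70]
```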

The Policy manifest file is structurally comprised of the following sections:

Resource meta Section

To create a Policy YAML in DataOS, the initial step involves configuring the Resource Section in a YAML file. This section defines various properties of the Policy Resource. The following is an example YAML configuration for the Resource Section:

name: ${{my-policy}}
version: v1 
type: policy 
tags: 
  - ${{dataos:type:resource}}
  - ${{dataos:type:cluster-resource}}
description: ${{This is a sample policy YAML configuration}} 
owner: ${{iamgroot}}
layer: ${{user}}

Sample configuration:
name: my_policy
version: v1 
type: policy 
tags: 
  - policy
  - access
description: Policy manifest
owner: iamgroot
layer: user

Info

For a Policy, the layer field accepts either user or system.

Use system for policies that govern authorization for system-level resources such as API paths, and user for user-layer authorization such as access to UDL addresses.

Additionally, the Resource section offers various configurable attributes, which can be explored further on the link: Attributes of Resource section.

Policy-specific section

The Policy-specific Section focuses on the configurations specific to the Policy Resource. Each Policy-type has its own YAML syntax.

Access Policies are defined using a subject-predicate-object triad. The YAML syntax for an Access Policy is as follows:

policy:
  access:
    subjects:
      tags:
        - ${{roles:id:user}}
        - ${{roles:id:pii-reader}}
    predicates:
      - ${{read}}
    objects:
      <tags/paths>:
        - ${{tag/path}}
    allow: ${{true}}

Sample configuration:
policy:
  access:
    subjects:
      tags:
        - roles:id:user
        - roles:id:pii-reader
    predicates:
      - "read"
    objects:
      paths:
        - "dataos://icebase:retail/city"
    allow: true
Policy-specific Section YAML configuration (Access Policy Syntax)

The table below summarizes the various attributes within the Access Policy YAML.

| Field | Data Type | Default Value | Possible Value | Requirement |
|---|---|---|---|---|
| policy | object | none | none | mandatory |
| access | object | none | none | mandatory |
| subjects | object | none | none | mandatory |
| tags | list of strings | none | a valid DataOS tag | mandatory |
| predicates | list of strings | none | HTTP or CRUD operations | mandatory |
| objects | object | none | none | mandatory |
| paths | list of strings | none | API paths, UDL paths | mandatory |
| allow | boolean | false | true/false | optional |

Here, subjects represents the user, objects denotes the target (such as an API path or resource) that the user interacts with, and predicates represents the action performed. The allow field determines whether the policy grants or denies the user permission to perform the specified action on the designated object. Refer to the Attributes of Policy-specific section for more details on configuring subjects, predicates, and objects.
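How the subject-predicate-object triad and the allow field combine can be sketched in Python. This is an illustration only: the dictionary mirrors the sample Access Policy, the function names are hypothetical, and the matching rule (the user must carry every listed subject tag, and paths are compared by exact string match) is an assumption rather than documented DataOS behaviour.

```python
policy = {
    "subjects": {"tags": ["roles:id:user", "roles:id:pii-reader"]},
    "predicates": ["read"],
    "objects": {"paths": ["dataos://icebase:retail/city"]},
    "allow": True,
}


def policy_matches(policy, user_tags, predicate, path):
    # Assumption: the subject must carry every tag listed under subjects.tags.
    has_tags = set(policy["subjects"]["tags"]) <= set(user_tags)
    return (has_tags
            and predicate in policy["predicates"]
            and path in policy["objects"]["paths"])


def evaluate(policy, user_tags, predicate, path):
    """Return the policy's allow value on a match; otherwise deny by default."""
    if policy_matches(policy, user_tags, predicate, path):
        return policy["allow"]
    return False


tags = ["roles:id:user", "roles:id:pii-reader"]
print(evaluate(policy, tags, "read", "dataos://icebase:retail/city"))   # prints True
print(evaluate(policy, tags, "write", "dataos://icebase:retail/city"))  # prints False
```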

The YAML syntax for a Data Policy of type filter is as follows:

policy:
  data:
    type: filter
    name: ${{filterpolicyname}}
    description: ${{sample data policy to filter data}}
    dataset_id: ${{depot.collection.dataset_name}}
    priority: ${{100}}
    selector:
      user:
        match: ${{all|any}}
        tags:
          - ${{roles:id:user}}
          - ${{roles:id:pii_reader}}
    filters:
      - column: ${{column_name}}
        operator: ${{equals}}
        value: ${{"value"}}

Sample configuration:
policy:
  data:
    type: filter
    name: "filtericebasecity"
    description: 'data policy to filter data on zip code'
    dataset_id: "icebase.retail.city"
    priority: 100
    selector:
      user:
        match: any
        tags:
          - "roles:id:user"
          - "roles:id:pii_reader"
    filters:
      - column: zip_code
        operator: not_equals
        value: "452001"
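
The interplay between the user selector and the filters list in the sample above can be sketched in Python. This is an illustration under stated assumptions, not DataOS code: `match: any` is assumed to require at least one listed tag and `match: all` every listed tag, a row is assumed to be kept only when it satisfies every filter condition, and the `OPERATORS` mapping and function names are hypothetical.

```python
import operator

# Hypothetical mapping from the manifest's operator names to Python functions.
OPERATORS = {"equals": operator.eq, "not_equals": operator.ne}


def selector_applies(selector, user_tags):
    """Assumed semantics: 'all' needs every listed tag, 'any' needs one."""
    wanted = set(selector["tags"])
    if selector["match"] == "all":
        return wanted <= set(user_tags)
    return bool(wanted & set(user_tags))


def apply_filters(rows, filters):
    # Assumption: a row is kept only if it satisfies every filter condition.
    return [
        row for row in rows
        if all(OPERATORS[f["operator"]](row[f["column"]], f["value"])
               for f in filters)
    ]


selector = {"match": "any", "tags": ["roles:id:user", "roles:id:pii_reader"]}
filters = [{"column": "zip_code", "operator": "not_equals", "value": "452001"}]
rows = [{"zip_code": "452001"}, {"zip_code": "110001"}]

user_tags = ["roles:id:user"]
visible = apply_filters(rows, filters) if selector_applies(selector, user_tags) else rows
print(visible)  # prints [{'zip_code': '110001'}]
```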

The table below summarizes the various attributes within a Filter Data Policy manifest.

| Field | Data Type | Default Value | Possible Value | Requirement |
|---|---|---|---|---|
| policy | object | none | none | mandatory |
| data | object | none | none | mandatory |
| depot | string | none | any valid depot name or regex pattern | optional |
| collection | string | none | any valid collection name or regex pattern | optional |
| dataset_id | string | none | any valid dataset identifier | mandatory |
| priority | number | none | 0-100 | mandatory |
| selector | object | none | none | mandatory |
| user | object | none | none | mandatory |
| tags | list of strings | none | a valid DataOS tag | mandatory |
| column | object | none | true/false | optional |
| names | list of strings | none | valid column name | optional |
| tags | list of tags | none | valid column tag defined under a tag group | optional |
| type | string | none | mask/filter | mandatory |
| filters | list | none | none | mandatory |
| column | string | none | any valid column name | mandatory |
| operator | string | none | any valid comparison operator (e.g., equals) | mandatory |
| value | string | none | any valid value for comparison | mandatory |

The YAML syntax for a Data Policy of type mask is as follows:

policy:
  data:
    type: mask
    name: ${{email_masking_policy}}
    description: to mask private mail address
    priority: 1
    depot: ${{depot name}}
    collection: ${{collection name}}
    dataset: ${{dataset name}}
    selector:
      column:
        tags:
          - ${{PII.Email}}
      user:
        match: ${{all}}
        tags:
          - ${{"users:id:iamgroot"}}
    mask:
      operator: ${{hash}}
      ${{hash}}:
        algo: sha256

Sample configuration:
policy:
  data:
    type: mask
    name: email_masking_policy
    description: to mask private mail address
    priority: 1
    depot: icebase
    collection: retail
    dataset: customer
    selector:
      column:
        tags:
          - PII.Email
      user:
        match: all
        tags:
          - "users:id:iamgroot"
    mask:
      operator: hash
      hash:
        algo: sha256
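
The effect of a hash mask with the sha256 algorithm can be sketched with Python's standard `hashlib` module. This is only an illustration: the function name `hash_mask` is hypothetical, and whether DataOS salts, truncates, or otherwise encodes the value before hashing is not stated in this document.

```python
import hashlib


def hash_mask(value, algo="sha256"):
    """Replace a value with the hex digest of its UTF-8 bytes."""
    return hashlib.new(algo, value.encode("utf-8")).hexdigest()


masked = hash_mask("john.smith@gmail.com")
print(len(masked))  # prints 64, the length of a sha256 hex digest
```

The same input always produces the same digest, so hashed columns can still be used for joins and grouping even though the original value is hidden.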

The table below summarizes the various attributes within a Mask Data Policy manifest.

| Field | Data Type | Default Value | Possible Value | Requirement |
|---|---|---|---|---|
| policy | object | none | none | mandatory |
| data | object | none | none | mandatory |
| depot | string | none | any valid depot name or regex pattern | optional |
| collection | string | none | any valid collection name or regex pattern | optional |
| dataset | string | none | any valid dataset name or regex pattern | optional |
| priority | number | none | 0-100 | mandatory |
| selector | object | none | none | mandatory |
| user | object | none | none | mandatory |
| tags | list of strings | none | a valid DataOS tag | mandatory |
| column | object | none | true/false | optional |
| names | list of column names | none | valid column name | optional |
| tags | list of tags | none | valid column tag defined under a tag group | optional |
| type | string | none | mask/filter | mandatory |
| mask | object | none | none | mandatory |

For detailed information on configuring the YAML file for a Data Policy, refer to the link: Attributes of Policy-specific section.

Applying the YAML File

After creating the YAML configuration file for the Policy Resource, it's time to apply it to instantiate the resource in the DataOS environment. To apply the Policy YAML file, utilize the apply command.

dataos-ctl resource apply -f ${yaml-file-path} -w ${workspace-name}

Replace ${yaml-file-path} with the absolute or relative path of the Policy manifest, and ${workspace-name} with the name of the Workspace in which the Resource is to be instantiated.

dataos-ctl resource apply -f resources/policy.yaml -w public
# Expected Output
INFO[0000] 🛠 apply...                                   
INFO[0000] 🔧 applying filtericebasecity:v1:policy...    
INFO[0001] 🔧 applying filtericebasecity:v1:policy...created 
INFO[0001] 🛠 apply...complete  

Verify Policy Creation

To confirm that your Policy has been successfully created, you can verify it using two methods:

  • Check Policy in a Workspace: Use the following command to list the Policies you created in a specific Workspace:
dataos-ctl get -t policy -w ${workspace-name}
# Example
dataos-ctl get -t policy -w public
  • Retrieve all Policies in a Workspace: To list every Policy created in the Workspace, add the -a flag to the command:
dataos-ctl get -t policy -w ${workspace-name} -a
# Example
dataos-ctl get -t policy -w public -a

You can also access the details of any created Policy through the DataOS GUI in the Resource tab of the Operations App.

Managing a Policy

Debugging a Policy

When a Policy encounters errors, data developers can employ various tactics to diagnose and resolve issues effectively. Here are the recommended debugging techniques:

  • Get Policy details

    • Retrieve detailed information about the Policy to gain deeper insights into its configuration and execution status. This can be accomplished using the following command:

      dataos-ctl resource get -t policy -w ${workspace-name} -n ${policy-name} -d
      
      dataos-ctl resource get -t policy -w public -n access_policy -d
      
    • Review the output to identify any discrepancies or misconfigurations in the Policy that could be contributing to the error.

Policy Implementation Mechanism

In the DataOS ecosystem, the Heimdall governance engine operates as the Policy Decision Point (PDP) for Access Policies, while the Minerva Gateway serves as the PDP for Data Policies. Both these elements jointly supervise the enforcement of policies across a range of Policy Enforcement Points (PEP), distributed throughout the DataOS ecosystem. Learn more about Policy implementation in DataOS, here.

Policy Configuration Templates

This section provides a collection of pre-configured Policy Resource templates, tailored to meet the requirements of different scenarios. To learn more, navigate to the following link.

Case Scenarios

For detailed examples and practical implementations of Policy Resource, refer to the following Policy Resource Case Scenarios: