
Policy

Policy is a Resource in DataOS that defines a set of rules or guardrails governing the behavior of users, whether individuals or applications/services. Within DataOS, Policies are enforced using Attribute-Based Access Control (ABAC). A Policy defines which predicates (actions) a user (the subject) can perform on a dataset, API path, or Resource (the object), thereby constraining the relationship between the subject and the object. To understand the key characteristics of Policy, refer to the following link: Core Concepts.

Types of Policies

DataOS offers two primary categories of policies: Access Policy and Data Policy. These policies play essential roles in governing user access, actions, and data management within the DataOS system.

Access Policies serve as the first layer of defense, overseeing user access and actions within the system. They establish well-defined rules that determine whether a user (the subject) is authorized to perform a specific action (the predicate) on a given dataset, API path, or other resource (the object). Each access request is either granted or denied based on an evaluation of the attributes associated with the subject and object involved.

Access Policy

In contrast, Data Policy operates as a secondary layer of control, regulating the visibility and interaction with specific data once access has been granted. It involves the implementation of techniques such as data masking or filtering to obscure or restrict the visibility of sensitive or restricted data based on predefined rules or conditions.

For example, when working with a dataset that includes a column labeled credit_card_number, it is crucial to protect the sensitive information it contains from unintended exposure. Employing data masking policies or applying data anonymization methods becomes essential to secure the contents of this specific column.

Data Policy

Data Policy has two types: the Data Masking Policy and the Data Filtering Policy.

Data masking policies are designed to protect sensitive information by replacing original data with fictitious yet structurally similar data. This ensures that the privacy of sensitive data is maintained while keeping the data useful for purposes such as testing and development.

Data masking is particularly useful for Personally Identifiable Information (PII), where the original value can be substituted with fictitious data, replaced with a placeholder (such as "####"), or obfuscated with a hash function.

When a data masking policy is applied, the original data is transformed, as illustrated in the sample table below:

| Data Category | Original Value | Masked Value |
| --- | --- | --- |
| Email ID | john.smith@gmail.com | bkfgohrnrtseqq85@bkgiplpsrhsll16.com |
| Social Security Number (SSN) | 987654321 | 867-92-3415 |
| Credit Card Number | 8671 9211 3415 4546 | #### #### #### #### |

Such a policy ensures the protection of sensitive details, including but not limited to names, titles, addresses, etc.
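
To make the placeholder and hash techniques described above concrete, here is a minimal Python sketch. It is not how DataOS implements masking; the function names and the `salt` parameter are hypothetical and chosen only for illustration.

```python
import hashlib


def mask_with_placeholder(value, placeholder="#"):
    """Replace every letter or digit, keeping separators such as spaces."""
    return "".join(placeholder if ch.isalnum() else ch for ch in value)


def mask_with_hash(value, salt="example-salt"):
    """Replace the value with a SHA-256 hex digest.

    The salt is a hypothetical parameter, not a DataOS setting.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


print(mask_with_placeholder("8671 9211 3415 4546"))  # #### #### #### ####
print(mask_with_hash("john.smith@gmail.com"))        # 64-character hex string
```

The placeholder approach preserves the layout of the value but none of its content, while hashing maps equal inputs to equal outputs, so masked columns can still be grouped or joined on.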

You can apply a masking policy, for example to mask a column containing customer names, either directly from the Metis UI via policy tags or through the DataOS CLI by applying a manifest file.

Data filtering policies establish parameters for determining which data elements should be accessible to various users or systems. Proper implementation of data filtering (or simply row filtering) policies ensures that only authorized individuals can access specific data segments.

For example, in an e-commerce system, customer support agents may need access to customer order details. By applying a data filtering policy:

  • Agent A can only view order records associated with customers they are assigned to.
  • Agent B can only access order records for a specific geographic region.
  • Agent C, as a supervisor, has unrestricted access to all order records.

With the data filtering policy in place, each agent can efficiently access the necessary information while maintaining data security and confidentiality.
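
The three-agent scenario above can be sketched in Python as one row predicate per agent. This is only a conceptual illustration of row filtering: the sample orders, the agent keys, and the `visible_orders` helper are all hypothetical and do not reflect DataOS internals.

```python
# Hypothetical sample data for the e-commerce example.
orders = [
    {"order_id": 1, "customer": "c-101", "region": "EU"},
    {"order_id": 2, "customer": "c-102", "region": "US"},
    {"order_id": 3, "customer": "c-103", "region": "EU"},
]

# One row predicate per agent (assumed assignments, for illustration only).
row_filters = {
    "agent_a": lambda row: row["customer"] in {"c-101"},  # assigned customers only
    "agent_b": lambda row: row["region"] == "US",         # one geographic region
    "agent_c": lambda row: True,                          # supervisor: unrestricted
}


def visible_orders(agent, rows):
    """Return only the rows that the agent's filter lets through."""
    keep = row_filters[agent]
    return [row for row in rows if keep(row)]


for agent in ("agent_a", "agent_b", "agent_c"):
    print(agent, [row["order_id"] for row in visible_orders(agent, orders)])
```

Every agent queries the same dataset; only the set of rows returned differs.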

Structure of Policy manifest

Access Policies are defined using a subject-predicate-object triad. The YAML syntax for an Access Policy is as follows:

Access Policy manifest

policy_manifest_structure.yml
#Resource-specific section
version: v1
name: test-policy-allowing-access-01
type: policy
layer: user
description: "Policy allowing all users"
#Policy-specific section
policy:
  access:
    name: test-policy-01
    description: this is to test policy
    collection: default       #optional
    subjects:
      tags:
        - "roles:id:operator"
    predicates:
      - "read"
    objects:
      paths:                 # Sample dataset 
        - "dataos://postgresdp:public/product_data"
    allow: true              # Granting access
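
To illustrate how the subject-predicate-object triad in the manifest above might be evaluated, here is a minimal Python sketch. It is a simplification, not the DataOS policy engine: the `AccessPolicy` class and `decide` function are hypothetical, matching is exact (no wildcards), and requiring the user to carry every listed tag is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical in-memory mirror of the `policy.access` section."""
    subject_tags: frozenset
    predicates: frozenset
    object_paths: frozenset
    allow: bool


def decide(policy, user_tags, predicate, path):
    """Return the policy's allow flag if it applies to the request, else None."""
    applies = (
        policy.subject_tags <= set(user_tags)  # assumption: user must carry every tag
        and predicate in policy.predicates
        and path in policy.object_paths        # exact match only in this sketch
    )
    return policy.allow if applies else None


policy = AccessPolicy(
    subject_tags=frozenset({"roles:id:operator"}),
    predicates=frozenset({"read"}),
    object_paths=frozenset({"dataos://postgresdp:public/product_data"}),
    allow=True,
)

dataset = "dataos://postgresdp:public/product_data"
print(decide(policy, ["roles:id:operator"], "read", dataset))  # True
print(decide(policy, ["roles:id:user"], "read", dataset))      # None (policy does not apply)
```

Returning `None` when the policy does not apply keeps "no matching policy" distinct from an explicit deny (`allow: false`).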

Data Policy manifests come in two forms: the filter data policy and the masking data policy.

Filter Data Policy manifest

sample_data_filter_policy.yml
name: filter-icebase-city
version: v1
type: policy
layer: user
description: "data policy to filter zip data"
policy:
  data:
    type: filter
    name: "filtericebasecity"
    priority: 10
    selector:
      user:
        match: any
        tags:
          - "users:id:iamgroot"
    filters:
      - column: county_name
        operator: not_equals
        value: "Autauga County"
    dataset_id: icebase.retail.city
    description: 'data policy to filter data on zip code'
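
As a rough illustration of what the `selector` and `filters` sections above express, the following Python sketch applies a `not_equals` filter to rows for users whose tags match the selector. It assumes that rows satisfying the filter condition are the ones kept, and that `match: any` means at least one tag must match. The `apply_filters` helper, the `equals` operator entry, and the sample rows are hypothetical.

```python
import operator

# Only `not_equals` appears in the manifest above; `equals` is an assumed companion.
OPERATORS = {"equals": operator.eq, "not_equals": operator.ne}


def apply_filters(rows, filters, user_tags, selector_tags):
    """Filter rows for users matched by the selector; others see all rows."""
    if not set(selector_tags) & set(user_tags):  # `match: any` read as "at least one tag"
        return list(rows)
    return [
        row for row in rows
        if all(OPERATORS[f["operator"]](row[f["column"]], f["value"]) for f in filters)
    ]


filters = [{"column": "county_name", "operator": "not_equals", "value": "Autauga County"}]
rows = [
    {"city_id": "CITY1", "county_name": "Autauga County"},  # hypothetical sample rows
    {"city_id": "CITY2", "county_name": "Baldwin County"},
]

print(apply_filters(rows, filters, ["users:id:iamgroot"], ["users:id:iamgroot"]))
print(apply_filters(rows, filters, ["users:id:someone"], ["users:id:iamgroot"]))
```

In this sketch the user tagged `users:id:iamgroot` receives only the Baldwin County row, while a user not matched by the selector is unaffected by this policy.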

Masking Data Policy manifest

policy_manifest_structure.yml
name: bucketage
version: v1
type: policy
layer: user
description: "data policy to mask age into buckets"
policy:
  data:
    priority: 1
    type: mask
    depot: icebase
    collection: retail
    dataset: customer
    selector:
      column:
        tags:
          - PII.mycustomtag
      user:
        match: any
        tags:
          - "roles:id:user"
    mask:
      operator: bucket_number
      bucket_number:
        buckets:
          - 15
          - 25
          - 30
          - 35
          - 40
          - 45
          - 50
          - 55
          - 60
    name: age_masking_policy
    description: An age bucket is formed by grouping the ages together. Based on defined
      age buckets, the age of individuals is redacted and anonymized. If an
      individual’s age falls under a defined bucket, it is replaced with the
      lowest value of the bucket.
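
The bucketing behavior described in the manifest above (replace an age with the lowest value of the bucket it falls into) can be sketched in Python as follows. The `bucket_age` function is hypothetical, and returning `None` for ages below the smallest boundary is an assumption, since the description does not say what happens in that case.

```python
from bisect import bisect_right

# Bucket boundaries copied from the manifest above.
BUCKETS = [15, 25, 30, 35, 40, 45, 50, 55, 60]


def bucket_age(age, buckets=BUCKETS):
    """Replace an age with the lower bound of the bucket it falls into."""
    idx = bisect_right(buckets, age) - 1  # index of the largest boundary <= age
    if idx < 0:
        return None  # assumption: behavior below the first boundary is unspecified
    return buckets[idx]


for age in (14, 15, 27, 44, 99):
    print(age, "->", bucket_age(age))
```

For example, ages 25 through 29 all become 25, so individual ages can no longer be distinguished within a bucket.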

First Steps

A Policy Resource in DataOS can be created by applying its manifest file using the DataOS CLI. To learn more about this process, navigate to the link: First steps.

Configuration

Policy can be configured depending on the use case. For a detailed breakdown of the configuration options and attributes, please refer to the documentation: Attributes of Policy manifest.

Recipes