Policy¶
Policy is a Resource in DataOS that defines a set of rules or guardrails governing the behavior of users, be it individuals or applications/services. Within DataOS, Policies are enforced using Attribute Based Access Control (ABAC) and define what predicates a user (a subject) can perform on a dataset, API Path, or a Resource (an object), thus defining the constraints of the relationship between the subject and object.
The Policy Resource operates on a "never trust, always verify" ethos. It enforces a default-deny stance, requiring users to explicitly request access to perform any action within the system. It establishes a continuous authorization mechanism, where access permissions are dynamically evaluated each time a user attempts an action. Access is granted only if the user has the requisite permissions at that precise moment.
In DataOS, a Policy is expressed as a distinct Resource in a declarative YAML format. This approach allows for the separation of policies from application code, promoting modularity, easy maintenance, and reducing the need for extensive redeployment. Additionally, DataOS distinguishes between the Policy Decision Point (PDP) and the Policy Enforcement Point (PEP). The PDP makes policy decisions based on predefined rules and attributes, while the PEP enforces these decisions, ensuring access control and compliance. This clear separation of concerns simplifies policy management, fostering scalability, extensibility, and maintainability in the DataOS ecosystem.
- Types of Policy in DataOS: Explore various policy types within the DataOS platform comprehensively.
- How to create and manage a Policy?: Learn how to create and manage a Policy in DataOS.
- Policy Configuration Templates: Enhance Policy management with Configuration Templates.
- Policy Implementation: Explore a detailed walkthrough to implement Policy.
Understanding ABAC, PDP and PEP
Types of Policies¶
DataOS offers two primary categories of policies: Access Policy and Data Policy. These policies play essential roles in governing user access, actions, and data management within the DataOS system.
Access Policies serve as the initial layer of defense, overseeing user access and actions within the system. They establish a set of well-defined rules that determine whether a user, known as the subject, is authorized to perform a specific action, referred to as a predicate, on a given dataset, API path, or other resources, known as objects. These policies serve as regulatory mechanisms, effectively governing user interactions and ensuring that access to specific actions is either granted or denied. This decision is based on the evaluation of attributes associated with the subjects and objects involved in the access request.
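The evaluation described above can be sketched in Python. This is a minimal illustration, not the DataOS implementation: `AccessPolicy`, `matches`, and `is_allowed` are hypothetical names, exact string matching on paths is an assumption, requiring the user to hold every listed subject tag is an assumption, and letting an explicit deny override an allow is an assumption. Only the default-deny fallback is taken from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical in-memory form of an Access Policy."""
    subject_tags: frozenset   # tags the requesting user must carry (assumed: all of them)
    predicates: frozenset     # actions the policy covers, e.g. "read"
    paths: frozenset          # objects the policy covers
    allow: bool = False       # grant (True) or deny (False)


def matches(policy, user_tags, predicate, path):
    """True when the policy applies to this subject-predicate-object triple."""
    return (
        policy.subject_tags <= set(user_tags)
        and predicate in policy.predicates
        and path in policy.paths
    )


def is_allowed(policies, user_tags, predicate, path):
    """Evaluate the request from scratch on every call."""
    applicable = [p for p in policies if matches(p, user_tags, predicate, path)]
    if any(not p.allow for p in applicable):
        return False  # assumption: an explicit deny overrides any allow
    # With no applicable policy, any() over an empty list is False: default deny.
    return any(p.allow for p in applicable)


policies = [
    AccessPolicy(
        subject_tags=frozenset({"roles:id:user", "roles:id:pii-reader"}),
        predicates=frozenset({"read"}),
        paths=frozenset({"dataos://icebase:retail/city"}),
        allow=True,
    )
]

reader = {"roles:id:user", "roles:id:pii-reader"}
print(is_allowed(policies, reader, "read", "dataos://icebase:retail/city"))   # True
print(is_allowed(policies, reader, "write", "dataos://icebase:retail/city"))  # False
```

Because `is_allowed` is a pure function of the current policies and the current request, calling it on every action mirrors the continuous-authorization idea: nothing granted earlier carries over.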
In contrast, Data Policy operates as a secondary layer of control, regulating the visibility and interaction with specific data once access has been granted. It involves the implementation of techniques such as data masking or filtering to obscure or restrict the visibility of sensitive or restricted data based on predefined rules or conditions.
For example, when working with a dataset that includes a column labeled credit_card_number, it is crucial to protect the sensitive information it contains from unintended exposure. Employing data masking policies or applying data anonymization methods becomes essential to secure the contents of this specific column.
Data Policy has two separate types: the Data Masking Policy and the Data Filtering Policy.
Data masking policies are designed to protect sensitive information by replacing original data with fictitious yet structurally similar data. This ensures that the privacy of sensitive data is maintained while keeping the data useful for purposes such as testing and development.
Data masking is particularly beneficial for Personally Identifiable Information (PII), where original data can be represented through masking, replacement with a placeholder (such as "####"), or obfuscation through a hash function.
When a data masking policy is applied, the original data is transformed, as illustrated in the sample table below:
Data Category | Original Value | Masked Value |
---|---|---|
Email ID | john.smith@gmail.com | bkfgohrnrtseqq85@bkgiplpsrhsll16.com |
Social Security Number (SSN) | 987654321 | 867-92-3415 |
Credit Card Number | 8671 9211 3415 4546 | #### #### #### #### |
Such a policy ensures the protection of sensitive details, including but not limited to names, titles, addresses, etc.
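The placeholder style of masking shown in the credit card row can be sketched in a few lines of Python. This is only an illustration of the idea: `mask_with_placeholder` is a hypothetical helper, and keeping spaces and hyphens so that the masked value retains its shape is an assumption, not documented DataOS behavior.

```python
def mask_with_placeholder(value, placeholder="#", keep=" -"):
    """Replace every character with a placeholder, leaving separators intact."""
    return "".join(ch if ch in keep else placeholder for ch in value)


print(mask_with_placeholder("8671 9211 3415 4546"))  # #### #### #### ####
```

The masked value reveals nothing about the digits but keeps the original grouping, which is what makes it usable in test or development data.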
You can apply a masking policy, for example to mask a column containing customer names, either directly from the Metis UI via policy tags or via the DataOS CLI by applying a YAML file.
Data filtering policies establish parameters for determining which data elements should be accessible to various users or systems. Proper implementation of data filtering (or simply row filtering) policies ensures that only authorized individuals can access specific data segments.
For example, in an e-commerce system, customer support agents may need access to customer order details. By applying a data filtering policy:
- Agent A can only view order records associated with customers they are assigned to.
- Agent B can only access order records for a specific geographic region.
- Agent C, as a supervisor, has unrestricted access to all order records.
With the data filtering policy in place, each agent can efficiently access the necessary information while maintaining data security and confidentiality.
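The three-agent scenario above can be sketched as per-user row predicates. The data, the agent identifiers, and the `row_filters` mapping are all hypothetical and exist only to illustrate row filtering; they do not reflect how DataOS stores or evaluates filter policies.

```python
# Hypothetical order records
orders = [
    {"order_id": 1, "assigned_agent": "agent_a", "region": "north"},
    {"order_id": 2, "assigned_agent": "agent_x", "region": "south"},
    {"order_id": 3, "assigned_agent": "agent_a", "region": "south"},
]

# Hypothetical per-agent row predicates mirroring the three bullets above
row_filters = {
    "agent_a": lambda row: row["assigned_agent"] == "agent_a",  # own customers only
    "agent_b": lambda row: row["region"] == "south",            # one region only
    "agent_c": lambda row: True,                                # supervisor: everything
}


def visible_orders(agent, rows):
    """Return only the rows the agent's filter lets through."""
    predicate = row_filters.get(agent, lambda row: False)  # unknown agent sees nothing
    return [row for row in rows if predicate(row)]


for agent in ("agent_a", "agent_b", "agent_c"):
    print(agent, [row["order_id"] for row in visible_orders(agent, orders)])
```

Every agent queries the same dataset; only the rows returned differ.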
How to create and manage a Policy¶
In DataOS, both Access and Data Policies are configured via a single Policy Resource. However, each policy type has its own YAML configuration and a different underlying implementation.
Create a Policy manifest¶
To create a Policy, the first step is to create a Policy manifest file. A sample Policy manifest is given below:
Example Policy manifest
# Resource meta section (1)
name: myaccesspolicy
version: v1
type: policy
tags:
  - access_policy
description: This is a sample policy YAML configuration
owner: iamgroot
layer: user
# Policy specific section (2)
policy:
  access:
    subjects:
      tags:
        - roles:id:user
        - roles:id:pii-reader
    predicates:
      - "read"
    objects:
      paths:
        - "dataos://icebase:retail/city"
    allow: true
(1) The Resource meta section within a manifest file comprises metadata attributes universally applicable to all Resource-types. To learn more about how to configure attributes within this section, refer to the link: Attributes of Resource meta section.
(2) The Policy-specific section within a manifest file comprises attributes specific to the Policy Resource. This section is different for Access and Data Policies. To learn more about how to configure attributes of the Policy-specific section, refer to the link: Attributes of Policy manifest.
# Example Data Policy manifest (filter)
name: mydatapolicy
version: v1
type: policy
tags:
  - policy
description: This is a sample policy YAML configuration
owner: iamgroot
layer: user
policy:
  data:
    type: filter
    name: "filtericebasecity"
    description: 'data policy to filter data on zip code'
    dataset_id: "icebase.retail.city"
    priority: 1
    selector:
      user:
        match: all
        tags:
          - "users:id:aayushisolanki"
    filters:
      - column: city_name
        operator: equals
        value: "Verbena"
# Example Data Policy manifest (mask)
name: bucketage
version: v1
type: policy
layer: user
description: "data policy to filter zip data"
policy:
  data:
    priority: 1
    type: mask
    depot: icebase
    collection: retail
    dataset: customer
    selector:
      column:
        tags:
          - PII.Age
      user:
        match: any
        tags:
          - "users:id:iamgroot"
    mask:
      operator: bucket_number
      bucket_number:
        buckets:
          - 5
          - 12
          - 18
          - 25
          - 45
          - 60
          - 70
    name: age_masking_policy
    description: An age bucket is formed by grouping the ages together. Based on defined
      age buckets, the age of individuals is redacted and anonymized. If an
      individual’s age falls under a defined bucket, it is replaced with the
      lowest value of the bucket.
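The bucketing rule stated in the mask policy description ("replaced with the lowest value of the bucket") can be sketched in Python. This is a hedged illustration rather than the actual masking engine: `bucket_age` is a hypothetical function, treating a boundary value as the start of its own bucket is an assumption, and mapping ages below the first boundary to 0 is also an assumption.

```python
from bisect import bisect_right

# Boundaries copied from the example mask policy
BUCKETS = [5, 12, 18, 25, 45, 60, 70]


def bucket_age(age, buckets=BUCKETS):
    """Replace an age with the lower boundary of the bucket it falls in."""
    index = bisect_right(buckets, age)  # number of boundaries <= age
    if index == 0:
        return 0  # assumption: ages below the first boundary collapse to 0
    return buckets[index - 1]


for age in (30, 25, 17, 82):
    print(age, "->", bucket_age(age))
```

With these boundaries, 30 maps to 25, 17 maps to 12, and anything from 70 upward maps to 70, so individual ages are no longer recoverable from the masked column.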
The Policy manifest file comprises the following sections:
Resource meta Section¶
To create a Policy YAML in DataOS, the initial step involves configuring the Resource meta section in a YAML file. This section defines various properties of the Policy Resource, such as name, version, type, tags, description, owner, and layer, as shown at the top of the example manifests above.
Info
The layer field of a Policy can take one of two values: user or system. For policies that govern authorization for system-level resources such as API paths, layer is system; for user-layer authorization, such as access to UDL addresses, it is user.
Additionally, the Resource section offers various configurable attributes, which can be explored further on the link: Attributes of Resource section.
Policy-specific section¶
The Policy-specific Section focuses on the configurations specific to the Policy Resource. Each Policy-type has its own YAML syntax.
Access Policies are defined using a subject-predicate-object triad. The YAML syntax for an Access Policy is illustrated by the policy section of the Example Policy manifest above.
The table below summarizes the various attributes/fields within the Access Policy YAML.
Field | Data Type | Default Value | Possible Value | Requirement |
---|---|---|---|---|
policy | object | none | none | mandatory |
access | object | none | none | mandatory |
subjects | object | none | none | mandatory |
tags | list of strings | none | a valid DataOS tag | mandatory |
predicates | list of strings | none | http or crud operations | mandatory |
objects | object | none | none | mandatory |
paths | list of strings | none | api paths, udl paths | mandatory |
allow | boolean | false | true/false | optional |
Here, subjects represents the user, objects denotes the target (such as an API path or resource) that the user interacts with, and predicates represents the action performed. The allow field determines whether the policy grants or restricts access for the user to perform the specified action on the designated object. Refer to the Attributes of Policy-specific section for more details on configuring subjects, predicates, and objects.
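The requirements in the table above can be turned into a small structural check that is run before applying a manifest. This is a sketch under stated assumptions: `validate_access_policy` is a hypothetical helper, not part of the DataOS CLI; it operates on a manifest already loaded into a Python dict; and rejecting empty lists is an assumption beyond what the table states.

```python
def validate_access_policy(manifest):
    """Return a list of problems; an empty list means the structure looks valid."""
    policy = manifest.get("policy")
    access = policy.get("access") if isinstance(policy, dict) else None
    if not isinstance(access, dict):
        return ["policy.access is mandatory and must be an object"]

    problems = []

    def require_string_list(parent, key, label):
        value = parent.get(key) if isinstance(parent, dict) else None
        is_valid = (
            isinstance(value, list)
            and len(value) > 0  # assumption: an empty list is not useful
            and all(isinstance(item, str) for item in value)
        )
        if not is_valid:
            problems.append(f"{label} is mandatory and must be a list of strings")

    require_string_list(access.get("subjects"), "tags", "policy.access.subjects.tags")
    require_string_list(access, "predicates", "policy.access.predicates")
    require_string_list(access.get("objects"), "paths", "policy.access.objects.paths")

    # allow is optional; per the table it defaults to false
    if not isinstance(access.get("allow", False), bool):
        problems.append("policy.access.allow must be a boolean")
    return problems


manifest = {
    "policy": {
        "access": {
            "subjects": {"tags": ["roles:id:user", "roles:id:pii-reader"]},
            "predicates": ["read"],
            "objects": {"paths": ["dataos://icebase:retail/city"]},
            "allow": True,
        }
    }
}
print(validate_access_policy(manifest))  # []
```

A check like this catches a missing predicates or paths list locally instead of at apply time.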
# Filter Data Policy syntax; replace each ${...} placeholder with an actual value
policy:
  data:
    type: filter
    name: ${filterpolicyname}
    description: ${sample data policy to filter data}
    dataset_id: ${depot.collection.dataset_name}
    priority: ${100}
    selector:
      user:
        match: ${all|any}
        tags:
          - ${roles:id:user}
          - ${roles:id:pii_reader}
    filters:
      - column: ${column_name}
        operator: ${equals}
        value: ${"value"}
The table below summarizes the various attributes within a Filter Data Policy manifest.
Field | Data Type | Default Value | Possible Value | Requirement |
---|---|---|---|---|
policy | object | none | none | mandatory |
data | object | none | none | mandatory |
depot | string | none | any valid depot name or regex pattern | optional |
collection | string | none | any valid collection name or regex pattern | optional |
dataset_id | string | none | any valid dataset identifier | mandatory |
priority | number | none | 0-100 | mandatory |
selector | object | none | none | mandatory |
user | object | none | none | mandatory |
tags | list of strings | none | a valid DataOS tag | mandatory |
column | object | none | true/false | optional |
names | list of strings | none | valid column name | optional |
tags | list of tags | none | valid column tag defined under a tag group | optional |
type | string | none | mask/filter | mandatory |
filters | list | none | none | mandatory |
column | string | none | any valid column name | mandatory |
operator | string | none | any valid comparison operator (e.g., equals) | mandatory |
value | string | none | any valid value for comparison | mandatory |
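How the selector and filters attributes might interact can be sketched as follows. Everything here is an assumption made for illustration: the function names are hypothetical, `match: all` is taken to mean the user holds every listed tag and `match: any` at least one, users not targeted by the selector see the data unchanged, multiple filters are combined with AND, and rows satisfying the filters are the ones kept. The actual semantics are defined by the platform, not by this sketch.

```python
def selector_matches(selector, user_tags):
    """Check whether the policy targets a user carrying the given tags."""
    wanted = set(selector["user"]["tags"])
    held = set(user_tags)
    if selector["user"]["match"] == "all":
        return wanted <= held      # assumed: every listed tag must be present
    return bool(wanted & held)     # assumed: "any" needs at least one


def row_passes(row, filters):
    """Assumed: all filters must hold; only the equals operator is sketched."""
    for flt in filters:
        if flt["operator"] != "equals":
            raise ValueError(f"operator not covered by this sketch: {flt['operator']}")
        if row[flt["column"]] != flt["value"]:
            return False
    return True


def apply_filter_policy(data_policy, user_tags, rows):
    if not selector_matches(data_policy["selector"], user_tags):
        return list(rows)  # assumed: untargeted users are unaffected
    return [row for row in rows if row_passes(row, data_policy["filters"])]


# Values taken from the example filter manifest; the rows are made up
data_policy = {
    "selector": {"user": {"match": "all", "tags": ["users:id:aayushisolanki"]}},
    "filters": [{"column": "city_name", "operator": "equals", "value": "Verbena"}],
}
rows = [{"city_name": "Verbena"}, {"city_name": "Autaugaville"}]
print(apply_filter_policy(data_policy, ["users:id:aayushisolanki"], rows))
```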
# Mask Data Policy syntax; replace each ${...} placeholder with an actual value
policy:
  data:
    type: mask
    name: ${email_masking_policy}
    description: to mask private mail address
    priority: 1
    depot: ${depot name}
    collection: ${collection name}
    dataset: ${dataset name}
    selector:
      column:
        tags:
          - ${PII.Email}
      user:
        match: ${all}
        tags:
          - ${"users:id:iamgroot"}
    mask:
      operator: ${hash}
      ${hash}:
        algo: sha256
The table below summarizes the various attributes within a Mask Data Policy manifest.
Field | Data Type | Default Value | Possible Value | Requirement |
---|---|---|---|---|
policy | object | none | none | mandatory |
data | object | none | none | mandatory |
depot | string | none | any valid depot name or regex pattern | optional |
collection | string | none | any valid collection name or regex pattern | optional |
dataset | string | none | any valid dataset name or regex pattern | optional |
priority | number | none | 0-100 | mandatory |
selector | object | none | none | mandatory |
user | object | none | none | mandatory |
tags | list of strings | none | a valid DataOS tag | mandatory |
column | object | none | true/false | optional |
names | list of column names | none | valid column name | optional |
tags | list of tags | none | valid column tag defined under a tag group | optional |
type | string | none | mask/filter | mandatory |
mask | object | none | none | mandatory |
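The effect of a hash mask with `algo: sha256` on tagged columns can be sketched with Python's standard `hashlib`. This is an illustration only: `hash_mask`, `mask_row`, and the column-to-tag mapping are hypothetical, and the exact encoding or output format the platform uses for hashed values is not specified in this document.

```python
import hashlib


def hash_mask(value, algo="sha256"):
    """Replace a value with the hex digest of its hash."""
    return hashlib.new(algo, str(value).encode("utf-8")).hexdigest()


def mask_row(row, column_tags, policy_tags, algo="sha256"):
    """Hash only the columns that carry at least one tag named by the policy."""
    targets = set(policy_tags)
    return {
        column: hash_mask(value, algo) if targets & column_tags.get(column, set()) else value
        for column, value in row.items()
    }


# Hypothetical column-to-tag mapping and sample row
column_tags = {"email": {"PII.Email"}, "city": set()}
row = {"email": "john.smith@gmail.com", "city": "Verbena"}

masked = mask_row(row, column_tags, ["PII.Email"])
print(masked["city"])        # Verbena (untagged column is untouched)
print(len(masked["email"]))  # 64 hex characters for sha256
```

Hashing is deterministic, so equal inputs yield equal masked values; joins and group-bys on the masked column still work even though the original value is hidden.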
For detailed information on configuring the YAML file for a Data Policy, refer to the link: Attributes of Policy-specific section.
Applying the YAML File¶
After creating the YAML configuration file for the Policy Resource, apply it to instantiate the Resource in the DataOS environment using the apply command:
dataos-ctl apply -f ${yaml-file-path} -w ${workspace-name}
Replace ${yaml-file-path} and ${workspace-name} with the absolute or relative file path of the Policy manifest and the name of the Workspace in which the Resource is to be instantiated, respectively.
Verify Policy Creation¶
To confirm that your Policy has been successfully created, you can verify it using two methods:
- Check Policy in a Workspace: Use the following command to list the Policies you created in a specific Workspace:
  dataos-ctl get -t policy -w ${workspace-name}
- Retrieve all Policies in a Workspace: To retrieve the list of all Policies created in the Workspace, add the -a flag to the command:
  dataos-ctl get -t policy -w ${workspace-name} -a
You can also access the details of any created Policy through the DataOS GUI in the Resource tab of the Operations App.
Managing a Policy¶
Debugging a Policy¶
When a Policy encounters errors, data developers can employ various tactics to diagnose and resolve issues effectively. Here are the recommended debugging techniques:
- Get Policy details
  - Retrieve detailed information about the Policy to gain deeper insights into its configuration and execution status. This can be accomplished using the following command:
  - Review the output to identify any discrepancies or misconfigurations in the Policy that could be contributing to the error.
Policy Implementation Mechanism¶
In the DataOS ecosystem, the Heimdall governance engine operates as the Policy Decision Point (PDP) for Access Policies, while the Minerva Gateway serves as the PDP for Data Policies. Both these elements jointly supervise the enforcement of policies across a range of Policy Enforcement Points (PEP), distributed throughout the DataOS ecosystem. Learn more about Policy implementation in DataOS, here.
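The split between deciding and enforcing can be sketched in Python. The two classes below are hypothetical and deliberately tiny; they do not reflect the APIs of Heimdall or the Minerva Gateway, and representing grants as a set of (subject, predicate, object) tuples is an assumption made to keep the example short.

```python
class PolicyDecisionPoint:
    """Decides; holds the rules but never touches the protected resource."""

    def __init__(self, grants):
        self._grants = set(grants)  # explicit (subject, predicate, object) grants

    def decide(self, subject, predicate, obj):
        return (subject, predicate, obj) in self._grants  # absent grant -> deny


class PolicyEnforcementPoint:
    """Enforces; guards the resource and asks the PDP on every request."""

    def __init__(self, pdp):
        self._pdp = pdp

    def handle(self, subject, predicate, obj, action):
        if not self._pdp.decide(subject, predicate, obj):
            raise PermissionError(f"{subject} may not {predicate} {obj}")
        return action()


pdp = PolicyDecisionPoint({("iamgroot", "read", "dataos://icebase:retail/city")})
pep = PolicyEnforcementPoint(pdp)

print(pep.handle("iamgroot", "read", "dataos://icebase:retail/city", lambda: "rows"))
```

Because the enforcement point holds no rules of its own, policies can change at the decision point without redeploying the services that enforce them, which is the maintainability benefit described earlier in this document.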
Policy Configuration Templates¶
This section provides a collection of pre-configured Policy Resource templates tailored to the requirements of different scenarios. To learn more, refer to the Policy Configuration Templates page.
Case Scenarios¶
For detailed examples and practical implementations of Policy Resource, refer to the following Policy Resource Case Scenarios: