Secret¶
In DataOS, Secrets are Resources designed for the secure storage of sensitive information, including usernames, passwords, certificates, tokens, or keys within the confines of a specific DataOS Workspace.
To mitigate the risk of exposing confidential data, Secrets in DataOS separate sensitive information from application code or configuration files. This practice minimizes the chance of accidental exposure during resource management phases like creation, viewing, or editing. By leveraging Secrets, data developers safeguard sensitive information, thus reducing security vulnerabilities in their data workflows.
Operators can exercise precise control over who can retrieve credentials from Secrets. If a data developer in your organisation needs access to Secrets, you can assign them the 'read secret' use case using Bifrost.
-
How to create and manage a Secret?
Learn how to create and manage a Secret in DataOS.
-
How to configure a Secret manifest file?
Discover how to configure a Secret manifest file by adjusting its attributes.
-
Different types of Secrets
Different types of Secrets securely store diverse sensitive data, addressing specific needs such as Docker credentials and certificates.
-
How to refer to Secrets in other DataOS Resources?
Learn how to use DataOS Secrets to securely refer to sensitive information in other DataOS Resources.
How to create a Secret?¶
Secrets are deployed using manifest files through the Command Line Interface (CLI). During this deployment, Poros, the Resource Manager, orchestrates the forwarding of Secret Resource YAMLs to Heimdall, the Governance Engine within DataOS. To create a Secret Resource in DataOS, follow these steps. This guide assumes you have the necessary permissions and access to the DataOS CLI.
Create a Secret manifest file¶
Begin by creating a manifest file that will hold the configuration details for your Secret. The structure of the Secret manifest file is provided in the image given below:
The manifest file of a Secret Resource can be broken down into two separate sections - Resource meta section and Secret-specific section.
Resource meta section¶
The Resource meta section of the manifest configuration file encompasses attributes that maintain uniformity across all resource types. The provided manifest snippet illustrates the key-value pairs that must be declared in this section:
name: ${{resource-name}}
version: v1
type: ${{resource-type}}
tags:
  - ${{tag1}}
  - ${{tag2}}
description: ${{description of the secret}}
owner: ${{owner_username}}
Secret-specific section¶
The Secret-specific Section of the manifest configuration file includes key-value pairs specific to the type of Secret being created. The following manifest snippet illustrates the key values to be declared in this section:
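As a hedged illustration, the sketch below writes a complete key-value Secret manifest to a file from the shell; the `secret:` block is the Secret-specific section. The file name `secret.yaml`, the Resource name `sample-secret`, and the key `API_KEY` with its value are placeholders assumed for this example, while the attribute names follow the fields described in this document.

```shell
# Path of the manifest to create (hypothetical; override by setting MANIFEST).
MANIFEST="${MANIFEST:-./secret.yaml}"

# Quoted heredoc delimiter: the YAML is written verbatim, with no shell expansion.
cat > "$MANIFEST" <<'EOF'
# Resource meta section
name: sample-secret            # placeholder name
version: v1
type: secret
tags:
  - dataos:type:secret
description: Sample key-value Secret
owner: iamgroot                # sample user name used elsewhere in this document
# Secret-specific section
secret:
  type: key-value              # one of the types listed in the fields table
  acl: r                       # r or rw
  data:
    API_KEY: replace-with-real-value   # placeholder key and value
EOF
```

Keeping the real value out of version control, for example by filling it in just before applying the manifest, is in line with the purpose of Secrets described above.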
Secret manifest Fields¶
The table below provides a summary of the various attributes of the Secret-specific section:
Field | Data Type | Default Value | Possible Value | Requirement |
---|---|---|---|---|
secret | object | none | none | mandatory |
type | string | none | cloud-kernel, cloud-kernel-image-pull, key-value, key-value-properties, certificates | mandatory |
acl | string | none | r, rw | mandatory |
data | mapping | none | none | mandatory |
files | string | none | file-path | optional |
For more information about the attributes of the Secret-specific section, refer to the Attributes of Secret specific section.
Apply the Secret manifest¶
To apply the Secret manifest, utilize the DataOS CLI by explicitly specifying the path to the manifest file and the designated workspace. The apply command is provided below:
Alternative to the above apply command.
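The two forms can be sketched as small shell wrappers. This assumes the DataOS CLI is installed as `dataos-ctl` and that both `resource apply` and the shorter `apply` accept `-f` for the manifest path and `-w` for the workspace; the function names are hypothetical helpers, not part of the CLI.

```shell
# Apply a Secret manifest to a workspace.
# $1 = path to the manifest file, $2 = workspace name
apply_secret() {
  dataos-ctl resource apply -f "$1" -w "$2"
}

# Alternative, shorter form of the same command.
apply_secret_alt() {
  dataos-ctl apply -f "$1" -w "$2"
}

# Example (hypothetical path and workspace):
# apply_secret ./secret.yaml public
```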
How to manage a Secret?¶
Validate the Secret¶
To validate the proper creation of the Secret Resource within the DataOS environment, employ the get command. Execute the following command to ascertain the existence and correctness of the Secret Resource:
Alternative command:
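A minimal sketch of both forms, assuming the CLI binary is `dataos-ctl` and that `get` takes `-t` for the Resource type and `-w` for the workspace; the wrapper function names are hypothetical.

```shell
# List the Secrets in a workspace to confirm that the new Secret exists.
# $1 = workspace name
get_secrets() {
  dataos-ctl resource get -t secret -w "$1"
}

# Alternative, shorter form of the same command.
get_secrets_alt() {
  dataos-ctl get -t secret -w "$1"
}

# Example (hypothetical workspace):
# get_secrets public
```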
Delete the Secret¶
To remove the Secret Resource from the DataOS environment, utilize the delete command within the CLI. Execute the following command to initiate the deletion process:
delete command structure for -t (type) and -n (name)
Alternative command:
delete command structure for -i (identifier)
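Both delete forms can be sketched as follows, assuming the CLI binary is `dataos-ctl`. The `-t`, `-n`, and `-i` flags come from the captions above; the `-w` workspace flag and the identifier layout shown in the comment are assumptions, and the function names are hypothetical.

```shell
# Delete a Secret by type (-t) and name (-n).
# $1 = Secret name, $2 = workspace name
delete_secret() {
  dataos-ctl delete -t secret -n "$1" -w "$2"
}

# Delete a Secret by identifier (-i).
# $1 = identifier passed as a single quoted argument,
#      e.g. "sample-secret | v1 | secret | public" (layout assumed here)
delete_secret_by_id() {
  dataos-ctl delete -i "$1"
}

# Example (hypothetical name and workspace):
# delete_secret sample-secret public
```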
How to refer to Secrets in other DataOS Resources?¶
To access the secret data stored in DataOS, reference it in your manifest using the secrets and dataosSecrets identifiers. These identifiers ensure secure referencing of Secrets across different Resources, enhancing system security and operational integrity.
Syntax
dataosSecrets:
  - name: ${your-secret-name} # Mandatory
    workspace: ${secret-workspace} # Optional
    key: ${key of your secret} # Optional, used when only a single key is required.
    keys: # Optional, used when multiple keys are required.
      - ${secret_key}
      - ${secret-key}
    allKeys: ${true-or-false} # Optional
    consumptionType: ${envVars} # Optional, possible values: envVars, propfile and file.

secrets:
  - name: ${your-secret-name} # Mandatory
    workspace: ${secret-workspace} # Optional
    key: ${key of your secret} # Optional, used when only a single key is required.
    keys: # Optional, used when multiple keys are required.
      - ${secret_key}
      - ${secret-key}
    allKeys: ${true-or-false} # Optional
    consumptionType: ${envVars} # Optional, possible values: envVars, propfile and file.
Let's see how you can refer to Secrets in various Resources:
In addition to serving as a conduit for real-time and streaming data exchanges, the Service Resource within DataOS can reference Secrets for secure access to confidential information. This supports data privacy and regulatory compliance, and facilitates timely insights and responses to dynamic information.
name: service-secret
version: v1
type: secret
tags:
  - dataos:type:secret
description: This is a sample Secret YAML configuration
owner: iamgroot
secret:
  type: key-value
  acl: r
  data:
    MSTEAM_WEBHOOK_URL: ${MSTEAM_WEBHOOK_URL}
    DATAOS_API_TOKEN: ${DATAOS_API_TOKEN}
    DATAOS_ENV_LINK: ${DATAOS_ENV_LINK}
    DATAOS_PULSAR_TOPIC_SUB_ID: ${DATAOS_PULSAR_TOPIC_SUB_ID}

version: v1
name: ${resource-name}
type: ${resource-type}
service:
  title: ${workflow-alerts}
  replicas: 1
  stack: container
  compute: runnable-default
  resources:
    requests:
      cpu: 100m
      memory: 500m
    limits:
      cpu: 1
      memory: 1Gi
  dataosSecrets: # Referencing the Secret
    - name: ${secret-name}
      workspace: public
      keys:
        - ${MSTEAM_WEBHOOK_URL}
        - ${DATAOS_API_TOKEN}
        - ${DATAOS_ENV_LINK}
        - ${DATAOS_PULSAR_TOPIC_SUB_ID}
  stackSpec:
    image: labs/ls_workflow_alerts:2.0
    imagePullSecret: modern-docker-secret
    command:
      - python
    arguments:
      - -u
      - ./wf-failed-alerts.py
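Inside the container, the script started by the manifest above would typically read these keys at runtime. The sketch below assumes the referenced keys are exposed as environment variables of the same name, as the envVars consumption type suggests; `require_env` is a hypothetical helper that fails fast when a key is missing.

```shell
# Check that every named environment variable is set and non-empty.
# Prints the missing names to stderr and returns 1 if any are absent.
require_env() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "missing secret keys:$missing" >&2
    return 1
  fi
}

# Example, using the key names from the Secret above:
# require_env MSTEAM_WEBHOOK_URL DATAOS_API_TOKEN || exit 1
```

Checking once at startup keeps a misconfigured Secret reference from surfacing later as an obscure failure inside the script.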
The Workflow in DataOS serves as a Resource for orchestrating data processing tasks with dependencies. It enables the creation of complex data workflows by defining a hierarchy based on a dependency mechanism. Some of these tasks require access to sensitive information such as API keys, authentication tokens, or database credentials. Instead of embedding these secrets directly in the workflow configuration, it is advisable to reference the Secret Resource.
name: service-secret
version: v1
type: secret
tags:
  - dataos:type:secret
description: This is a sample Secret YAML configuration
owner: iamgroot
secret:
  type: key-value
  acl: r
  data:
    API_KEY: ${API_KEY}
    DATAOS_API_TOKEN: ${DATAOS_API_TOKEN}
    DATAOS_ENV_LINK: ${DATAOS_ENV_LINK}
    DATAOS_PULSAR_TOPIC_SUB_ID: ${DATAOS_PULSAR_TOPIC_SUB_ID}

version: v1
name: ${workflow-name}
type: workflow
workflow:
  dag:
    - name: ${alpha-wf-mail-alert}
      spec:
        resources:
          requests:
            cpu: 250m
            memory: 500m
          limits:
            cpu: 1
            memory: 1Gi
        dataosSecrets: # Referencing the Secret
          - name: ${secret-name}
            workspace: public
            keys:
              - ${API_KEY}
              - ${DATAOS_API_TOKEN}
              - ${DATAOS_ENV_LINK}
              - ${DATAOS_PULSAR_TOPIC_SUB_ID}
        stack: ${stack-name}
        compute: runnable-default
        stackSpec:
          image: rubiklabs/workflow_lobos_mail_alert:1.0
          imagePullSecret: modern-docker-secret
          command:
            - python
          arguments:
            - -u
            - ./email_alert_script.py
A Worker Resource in DataOS is a long-running process responsible for performing specific tasks or computations indefinitely. Workers are capable of securely accessing confidential information, such as API keys, through the referencing of secrets, thereby ensuring the safeguarding of sensitive data.
name: benthos-worker-secret
version: v1
type: secret
tags:
  - dataos:type:secret
description: This is a sample Secret YAML configuration
owner: iamgroot
secret:
  type: key-value
  acl: r
  data:
    runAsApiKey: dthtyurZW5fY29tbW9ubHllX21vcmF5LmFlNmI2YzBkLTI0ZGEtNDI0NDFmhgfghfdrZQ
    runAsUser: iamgroot

name: benthos3-worker-sample-replicas
version: v1beta
type: worker
tags:
  - worker
  - dataos:type:resource
  - dataos:resource:worker
  - dataos:layer:user
  - dataos:workspace:public
description: Random User Console
owner: iamgroot
workspace: public
worker:
  tags:
    - worker
    - random-user
  replicas: 3
  stack: benthos
  logLevel: DEBUG
  compute: runnable-default
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 1000m
      memory: 1024Mi
  dataosSecrets:
    - name: benthos-worker-secret
      workspace: public
  stackSpec:
    input:
      http_client:
        url: https://randomuser.me/api/
        verb: GET
        headers:
          Content-Type: application/JSON
    pipeline:
      processors:
        - label: my_blobl
          bloblang: |
            page = this.info.page
            age = this.results.0.dob.age
            dob = this.results.0.dob.date
            seed = this.info.seed
            email = this.results.0.email
            gender = this.results.0.gender
            name = this.results.0.id.name
            city = this.results.0.location.city
    output:
      broker:
        outputs:
          - broker:
              pattern: fan_out
              outputs:
                - plugin:
                    address: dataos://fastbase:default/test001
                    metadata:
                      auth:
                        token:
                          enabled: true
                          token: dthtyurZW5fYbW9ubHlfccmdlX21vcmF5LmFlNmI2YzBkLTI0ZGEtNDI0Ny1hMjUyLTk0YTdjNDFmhgfghfdrZQ==
                      description: Random users data
                      format: AVRO
                      schema: "{\"name\":\"default\",\"type\":\"record\",\"namespace\":\"defaultNamespace\",\"fields\":[{\"name\":\"age\",\"type\":\"int\"},{\"name\":\"city\",\"type\":\"string\"},{\"name\":\"dob\",\"type\":\"string\"},{\"name\":\"email\",\"type\":\"string\"},{\"name\":\"gender\",\"type\":\"string\"},{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"page\",\"type\":\"int\"},{\"name\":\"seed\",\"type\":\"string\"}]}"
                      schemaLocation: http://registry.url/schemas/ids/12
                      title: Random Uses Info
                      type: STREAM
                  type: dataos_depot
                - stdout: {}
A Cluster in DataOS is a Resource that encompasses a set of computational resources and configurations necessary for executing data engineering and analytics tasks. Clusters are capable of securely accessing confidential information through the referencing of secrets, thereby ensuring the safeguarding of sensitive data.
version: v1
name: mycluster
type: cluster
tags:
  - cluster
  - minerva
cluster:
  compute: query-default
  type: minerva
  minerva:
    replicas: 1
    resources:
      limits:
        cpu: 2000m
        memory: 4Gi
      requests:
        cpu: 2000m
        memory: 4Gi
    debug:
      logLevel: INFO
      trinoLogLevel: ERROR
    depots:
      - address: dataos://grootbigquery
  dataosSecrets:
    - name: cluster-secrets
      workspace: public
      allKeys: true
Referencing Secrets to Pull Images from Private Container Registry¶
Once a Secret Resource holding registry credentials has been created, other Resources can reference it to pull images from a private container registry. This approach removes the need to embed sensitive authentication information directly within the Resource configuration.
The details required to reach a container registry, such as the registry type, access credentials, and repository information, are kept in the Secret and referenced from the consuming Resource through the imagePullSecret attribute. This provides a secure and streamlined way to pull images from a private container registry without exposing authentication data in configuration files.
name: example-alpha
version: v1
type: workflow
workflow:
  dag:
    - name: example
      spec:
        resources:
          requests:
            cpu: 250m
            memory: 500m
          limits:
            cpu: 1
            memory: 1Gi
        dataosSecrets:
          - name: workflow-user-secret
            workspace: public
            keys:
              - DATAOS_USER_NAME
              - CLUSTER_NAME
              - DATAOS_API_KEY
              - DATAOS_ENV_NAME
        stack: container
        compute: runnable-default
        stackSpec:
          image: docker.io/helloworldimage/helloworldimage:tag
          imagePullSecret: dockers-secrets
          command:
            - python