Skip to content

Deploy Lens in DataOS

In this topic, you'll learn how to deploy your semantic model (Lens) in DataOS. This step ensures your model is accessible, reliable, and ready for real-time use, enabling stakeholders to derive insights and make data-driven decisions.

Scenario

After thorough testing of your semantic model (Lens) in a local environment, the next critical step is to deploy it in DataOS. This process ensures that your model is accessible and reliable for broader usage, enabling stakeholders to leverage it for real-time data insights and decision-making.

Prerequisites

Before diving into configuring Lens, make sure you have everything ready:

  1. Check required Permissions: Some tasks require specific permissions typically assigned to DataOS Operators. Ensure you have access to one of the following permission sets either via use-cases or via tags:

    Access Permission (via use-cases) Access Permissions (via tags)
    Read Workspace roles:id:data-dev
    Create Update and Delete Lens in user layer specified workspace roles:id:system-dev
    Read all secrets from Heimdall
  2. Check CLI installation and initialization: You need this text-based interface that allows you to interact with the DataOS context via command prompts. Click here to learn more.

  3. Manage Credentials Securely: Use Instance Secrets for storing your data source credentials, ensuring sensitive information remains protected.

    Important: To prevent credential exposure, contact DataOS administrator and understand the best practices for handling sensitive data.

  4. Organize Your Code Repository: Place Lens manifests in a private, permission-controlled repository to maintain security and compliance.

Steps

You follow the below steps to deploy Lens on DataOS.

Step 1: Prepare the semantic model folder in the Data Product directory

The semantic model is organized within the build/semantic_model directory of the Data Product. You don’t need to push the semantic model individually. Instead, you will push the entire Data Product directory, which includes the semantic model, to a code repository such as GitHub, AWS CodeCommit, or Bitbucket, once all the Data Product Resources are prepared. This ensures that the Lens, along with all Data Product Resources (such as Talos), is included in the overall deployment, with proper synchronization for version tracking and collaboration.

The following structure illustrates how the Lens will be organized within the Data Product directory.

build/                          # resources folder in dp
├── semantic_model/                           # Lens  folder
   ├── deployment.yaml             # Deployment file for Lens
   ├── model/                      # Lens model folder
      ├── sql/                    # SQL files for the Lens model
      ├── tables/                 # Tables used in the Lens model
      ├── views/                  # Views defined for the Lens model
      └── user_groups.yml         # User groups for access control

Step 2: Create Instance Secrets for code repository credentials

To secure your code repository credentials, you need to create and configure an Instance Secret. This secret is necessary because the Lens deployment file contains Git/Bitbucket credentials and repo address to fetch all Lens artifacts. To create and configure an Instance Secret follow the below steps:

a. Create an Instance Secret manifest file

Define the Instance Secret Resource in a YAML file. Below is a template you can use for Bitbucket and Github, substituting ${USERNAME} and ${PASSWORD} with your actual credentials of Bitbucket or Github:

# RESOURCE META SECTION
name: bitbucket-r # Secret Resource name (mandatory)
version: v1 # Secret manifest version (mandatory)
type: instance-secret # Type of Resource (mandatory)
description: Bitbucket read secrets for code repository # Secret Resource description (optional)
layer: user # DataOS Layer (optional)

# INSTANCE SECRET-SPECIFIC SECTION
instance-secret: 
  type: key-value # Type of Instance-secret (mandatory)
  acl: r # Access control list (mandatory)
  data: # Data (mandatory)
    GITSYNC_USERNAME: iamgroot
    GITSYNC_PASSWORD: <GIT_TOKEN>

b. Apply the Instance Secret manifest

Deploy the Instance Secret to DataOS using the apply command.

You apply the manifest file as follows:

dataos-ctl apply -f ./lens/instance_secret.yml
# Expected output
INFO[0000] 🛠 apply...                                   
INFO[0000] 🔧 applying bitbucket-r:v1:instance-secret... 
INFO[0001] 🔧 applying bitbucket-r:v1:instance-secret...created 
INFO[0001] 🛠 apply...complete

Step 3: Create Lens manifest file

You begin by creating a manifest file that holds the configuration details for your Lens. The structure of the Lens manifest file is provided below.

The manifest file of Lens can be broken down into two sections:

  1. Resource meta section
  2. Lens-specific section

The following YAML excerpt illustrates the attributes specified within this section:

To configure Lens, replace namelayertagsdescription, and owner values with appropriate values. For additional configuration information about the attributes of the Resource meta section, refer to the link: Attributes of Resource meta section.

Lens manifest file

# RESOURCE META SECTION
name: cross-sell-affinity # Lens Resource name (mandatory)
version: v1alpha # Lens manifest version (mandatory)
layer: user # DataOS Layer (optional)
type: lens # Type of Resource (mandatory)
tags: # Tags (optional)
  - lens
description: This data model provides comprehensive insights for cross-sell and product affinity analysis. # Lens Resource description (optional)

# LENS-SPECIFIC SECTION
lens:
  compute: runnable-default # Compute Resource that Lens should utilize (mandatory)
  secrets: # Referred Instance-secret configuration (**mandatory for private code repository, not required for public repository)
    - name: bitbucket-cred # Referred Instance Secret name (mandatory)
      allKeys: true # All keys within the secret are required or not (optional)

# Data Source configuration      
  source: 
    type: depot # Source type (could be themis, minerva flash as well)
    name: lakehouse # Source name (name of the depot)
    catalog: lakehouse # Catalog name for the depot
  repo: # Lens model code repository configuration (mandatory)
    url: https://bitbucket.org/tmdc/dataproducts # URL of repository containing the Lens model (mandatory)
    lensBaseDir: dataproducts/setup/resources/lens2/model # Relative path of the Lens 'model' directory in repository (mandatory)
    syncFlags: # Additional flags used during synchronization (optional)
      - --ref=main # Repository Branch (optional)

# API Instances configuration
  api:  #(optional)
    replicas: 1 # Number of API instance replicas (optional)
    logLevel: info # Logging granularity (optional)
    resources: # CPU and memory configurations for API Instances (optional)
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 1000m
        memory: 1048Mi

# Worker configuration
  worker:  #(optional)
    replicas: 1 # Number of Worker replicas (optional)
    logLevel: debug # Logging level for Worker (optional)
    resources: # CPU and memory configurations for Worker (optional)
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 1000m
        memory: 1248Mi

# Router configuration

  router:  #(optional)
    logLevel: info # Level of log detail for Router (optional)
    resources: # CPU and memory resource specifications for the router (optional)
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 1000m
        memory: 2548Mi

# Iris configuration 
  iris: #(optional)
    logLevel: info # Log level for Iris (optional)
    resources: # CPU and memory resource specifications for the iris board (optional)
      requests:
        cpu: 200m
        memory: 256Mi
      limits:
        cpu: 1600m
        memory: 2240Mi
# Metric configuration 
  metric:                         #(optional)
    logLevel: info # Log level for metrics (optional)

A typical deployment of Lens includes the following components:

Section Description
Source Specifies the source configuration from which the Lens will be mapped. The Lens support the depot, themis, minerva and flash as source.
Repo Outlines the configuration of the code repository where the model used by Lens resides.
API Configures an API service that processes incoming requests, connecting to the database for raw data. A single instance is provisioned by default, but the system can auto-scale to add more instances based on workload demands, with a recommendation of one instance for every 5-10 requests per second.
Worker When LENS2_REFRESH_WORKER is set to true, a Refresh Worker manages and refreshes the memory cache in the background, keeping refresh keys for all data models up-to-date. It invalidates the in-memory cache but does not populate it, which is done lazily during query execution.
Router Configures a Router Service responsible for receiving queries from Lens, managing metadata, and handling query planning and distribution to the workers. Lens communicates only with the Router, not directly with the workers.
Iris Manages interaction with Iris dashboards.
Metrics Populates the metrics in the metric section of the Data Product Hub.

For more information on how to configure Lens manifest file, refer to the link: Configuration Fields of the Deployment Manifest File for Lens

Step 4: Apply the Lens manifest file

Apply the Lens manifest file using the apply command as shown below, or reference the Lens manifest path in the Bundle Resource along with other Resources.

dataos-ctl apply -f /home/data_product/resources/lens/deployment.yaml #path