Themis
Connecting to Themis using Depot/Cluster
Prerequisite
Ensure you have an active and running Minerva Cluster.
Step 1: Prepare the Lens model folder
Organize the Lens model folder with the following structure to define tables, views, and governance policies:
model
├── sqls
│   └── sample.sql  # SQL script for table dimensions
├── tables
│   └── sample_table.yml  # Logical table definition (joins, dimensions, measures, segments)
├── views
│   └── sample_view.yml  # Logical views referencing tables
└── user_groups.yml  # User group policies for governance
- SQL Scripts (model/sqls): Add SQL files defining table structures and transformations.
- Tables (model/tables): Define logical tables in separate YAML files. Include dimensions, measures, segments, and joins.
- Views (model/views): Define views in YAML files, referencing the logical tables.
- User Groups (user_groups.yml): Define access control by creating user groups and assigning permissions.
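The folder layout above can be scaffolded from a terminal. The following is a minimal sketch, assuming a POSIX shell and that you run it from the directory that should contain the model folder. It only creates empty placeholder files with the sample names shown above; the actual SQL and YAML content still has to be written by hand.

```shell
#!/bin/sh
set -eu

# Create the three sub-folders of the Lens model folder.
mkdir -p model/sqls model/tables model/views

# Create empty placeholder files using the sample names from the layout above.
# touch leaves existing files untouched, so re-running this is safe.
touch model/sqls/sample.sql \
      model/tables/sample_table.yml \
      model/views/sample_view.yml \
      model/user_groups.yml

# List what was created.
find model | sort
```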
Step 2: Create a deployment manifest file
After preparing the Lens semantic model, create a lens_deployment.yml file alongside the model folder.
version: v1alpha
name: "themis-lens"
layer: user
type: lens
tags:
  - lens
description: themis lens deployment on lens2
lens:
  compute: runnable-default
  secrets:
    - name: bitbucket-cred
      allKeys: true
  source:
    type: themis # minerva/themis/depot
    name: lenstestingthemis
    catalog: icebase
  repo:
    url: https://bitbucket.org/tmdc/sample
    lensBaseDir: sample/lens/source/themis/model
    # secretId: lens2_bitbucket_r
    syncFlags:
      - --ref=main # branch name
  api: # optional
    replicas: 1 # optional
    logLevel: info # optional
    resources: # optional
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 2000m
        memory: 2048Mi
  worker: # optional
    replicas: 2 # optional
    resources: # optional
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 6000m
        memory: 6048Mi
  router: # optional
    resources: # optional
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 6000m
        memory: 6048Mi
  iris:
    logLevel: info
    resources: # optional
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 6000m
        memory: 6048Mi
Each section of the YAML template outlines essential elements of the Lens deployment. Below is a detailed breakdown of its components:
- Defining the Source:
  - type: The type attribute in the source section must be explicitly set to themis.
  - name: The name attribute in the source section should specify the name of the Themis Cluster. For example, if the name of your Themis Cluster is clthemis, the source name would be clthemis.
  - catalog: The catalog attribute must define the specific catalog name within the Themis Cluster that you intend to use. For instance, if the catalog is named lakehouse_retail, ensure this is accurately reflected in the catalog field.
- Defining Repository:
  - url: The url attribute in the repo section specifies the Git repository where the Lens model files are stored. For instance, if your repo name is lensTutorial, then the repo url will be https://bitbucket.org/tmdc/lensTutorial
  - lensBaseDir: The lensBaseDir attribute refers to the directory in the repository containing the Lens model. Example: sample/lens/source/depot/awsredshift/model.
  - secretId: The secretId attribute is used to access private repositories (e.g., Bitbucket, GitHub). It specifies the secret needed to authenticate and access the repository securely.
  - syncFlags: Specifies additional flags to control repository synchronization. Example: --ref=dev specifies that the Lens model resides in the dev branch.
- Configure API, Worker, and Metric Settings (Optional): Set up replicas, logging levels, and resource allocations for APIs, workers, routers, and other components.
The above manifest is intended for a cluster named lenstestingthemis, created on the themis source, with the depot or data catalog named icebase. To use this manifest, copy the file and update the source details accordingly.
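As an illustration of updating the source details, the sketch below writes out a source block using the example cluster name clthemis and catalog name lakehouse_retail from the breakdown above. The variable names and the output file source_fragment.yml are hypothetical and chosen only for this example; the generated block is meant to be pasted under the lens key of your own copy of the manifest, not deployed on its own.

```shell
#!/bin/sh
set -eu

# Example values taken from the attribute descriptions above; replace with your own.
THEMIS_CLUSTER="clthemis"     # name of the Themis Cluster
CATALOG="lakehouse_retail"    # catalog within the Themis Cluster

# Write the source block to a scratch file (hypothetical file name).
cat > source_fragment.yml <<EOF
source:
  type: themis
  name: ${THEMIS_CLUSTER}
  catalog: ${CATALOG}
EOF

# Show the result so it can be copied into the manifest.
cat source_fragment.yml
```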
Docker compose manifest file
The following Docker Compose manifest file can be used for local testing:
version: "2.2"

x-lens2-environment: &lens2-environment
  # DataOS
  DATAOS_FQDN: liberal-donkey.dataos.app
  # Overview
  LENS2_NAME: themislens
  LENS2_DESCRIPTION: Description
  LENS2_TAGS: Provide tags
  LENS2_AUTHORS: creator of lens
  LENS2_SCHEDULED_REFRESH_TIMEZONES: "UTC,America/Vancouver,America/Toronto"
  # Data Source
  LENS2_SOURCE_TYPE: themis # minerva, depot
  LENS2_SOURCE_NAME: lenstestingthemis # cluster name
  LENS2_SOURCE_CATALOG_NAME: icebase # depot name, specify any catalog
  DATAOS_RUN_AS_APIKEY: ***** # dataos apikey
  # Log
  LENS2_LOG_LEVEL: error
  CACHE_LOG_LEVEL: "trace"
  # Operation
  LENS2_DEV_MODE: true
  LENS2_DEV_MODE_PLAYGROUND: false
  LENS2_REFRESH_WORKER: true
  LENS2_SCHEMA_PATH: model
  LENS2_PG_SQL_PORT: 5432
  CACHE_DATA_DIR: "/var/work/.store"
  NODE_ENV: production
  LENS2_ALLOW_UNGROUPED_WITHOUT_PRIMARY_KEY: "true"

services:
  api:
    restart: always
    image: rubiklabs/lens2:0.35.60-20
    ports:
      - 4000:4000
      - 25432:5432
      - 13306:13306
    environment:
      <<: *lens2-environment
    volumes:
      - ./model:/etc/dataos/work/model
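Because the compose file mounts ./model into the container, the model folder needs to sit next to the compose file before the service is started. The check below is a small sketch of that pre-flight step; check_model_dir is a hypothetical helper written for this example, and the docker compose up line in the closing comment assumes a standard Docker Compose installation rather than anything specific to Lens.

```shell
#!/bin/sh

# Hypothetical helper: succeeds if the given directory (default ./model) exists.
check_model_dir() {
  dir="${1:-./model}"
  if [ -d "$dir" ]; then
    echo "found model directory: $dir"
  else
    echo "missing model directory: $dir" >&2
    return 1
  fi
}

# Run the check against the path mounted by the compose file.
check_model_dir ./model || echo "create ./model next to the compose file first"

# Once the check passes, the stack can be started from the same directory,
# for example with:
#   docker compose up
```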
Check Query Stats for Themis
To check the query statistics, please follow the steps below:
- Access the Themis Cluster
  Navigate to the Themis cluster. You should see a screen similar to the image below:
- Select the Running Driver
  Choose the running driver. The driver is the same for every user, because all queries are directed to the creator of the Themis cluster.
- View the Spark UI
  Go to the terminal and run the following command to view the Spark UI:
  dataos-ctl -t cluster -w public -n themislens --node themis-themislens-iamgroot-default-a650032d-ad6b-4668-b2d2-cd372579020a-driver view sparkui
  The general form of the command, with placeholders for the cluster and driver names, is:
  dataos-ctl -t cluster -w public -n themis_cluster_name --node driver_name view sparkui
  You should then see the Spark UI interface.
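To avoid retyping the long driver name, the Spark UI command can be assembled from its parts. This is a small sketch: sparkui_cmd is a hypothetical helper that only prints the command in the form given above and does not run it. The workspace defaults to public, as in the example, and is an assumption for any other setup.

```shell
#!/bin/sh

# Hypothetical helper: prints (does not execute) the Spark UI command.
#   $1 = Themis cluster name
#   $2 = running driver node name
#   $3 = workspace (optional, defaults to "public" as in the example above)
sparkui_cmd() {
  cluster_name="$1"
  driver_name="$2"
  workspace="${3:-public}"
  printf 'dataos-ctl -t cluster -w %s -n %s --node %s view sparkui\n' \
    "$workspace" "$cluster_name" "$driver_name"
}

# Reproduces the concrete command shown above.
sparkui_cmd themislens \
  themis-themislens-iamgroot-default-a650032d-ad6b-4668-b2d2-cd372579020a-driver
```

Review the printed command and then paste it into the terminal to run it.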