How to create a Lens model with a cached dataset?
Lens logical tables can be built on datasets cached in Flash to speed up data retrieval. Because frequently queried data is served from Flash, Lens avoids repeated trips to the source and repeated scans of large datasets. As a result, queries are processed faster, often returning responses in seconds.
When to cache logical tables in Flash?
Consider the following criteria to determine if the logical tables of a Lens should be cached in Flash:
- Complexity of the SQL View: Caching is beneficial for SQL operations involving complex aggregate functions, multiple joins, and subqueries, which may result in resource-intensive queries.
- Data Volume: Lens models that handle large datasets from the source can benefit significantly from caching, as it expedites query processing.
- Source Optimization: If the source system experiences prolonged query execution times or frequent timeouts, caching in Flash can improve overall performance.
Steps to use Flash datasets in Lens
1. Define Flash as the data source
Configure the Flash service as the data source in the Lens deployment manifest file. Below is an example configuration:
source:
  type: flash # Specifies the data source type as Flash
  name: flash-test # Name of the Flash service
  catalog: icebase
2. Add environment variables
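To enable Lens to interact with the Flash service, specify the environment variables LENS2_SOURCE_WORKSPACE_NAME and LENS2_SOURCE_FLASH_PORT in the Worker, API, and Router sections of the Lens deployment manifest. The sample manifest in the next step shows them in place.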
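As a minimal sketch, the envs block added to each of those sections could look like the fragment below. The values are copied from the sample manifest in the next step. The reading of the two variables (the workspace in which the Flash service runs, and the port on which it listens) is an assumption inferred from their names, so substitute the values that match your own Flash deployment.

```yaml
# Fragment to repeat under the api, worker, and router sections of the manifest.
envs:
  # Assumed meaning: workspace in which the Flash service is deployed
  LENS2_SOURCE_WORKSPACE_NAME: public
  # Assumed meaning: port on which the Flash service accepts connections
  LENS2_SOURCE_FLASH_PORT: 5433
```

The sample manifest sets the same variables under the iris section as well, alongside LENS2_SCHEDULED_REFRESH_TIMEZONES.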
3. Sample Lens deployment manifest
For reference, below is a sample Lens deployment manifest that uses Flash as the data source:
version: v1alpha
name: "lens-test01"
layer: user
type: lens
tags:
  - lens
description: A sample Lens containing entities, views, and measures for testing
lens:
  compute: runnable-default
  secrets:
    - name: gitsecret-r
      allKeys: true
  source:
    type: flash # Specifies Flash as the data source
    name: flash-test01 # Name of the Flash service
  repo:
    url: https://github.com/iamgroot/lens-flash
    lensBaseDir: lens-flash/flash/model # Directory where Lens models are stored
    syncFlags:
      - --ref=main
  api:
    replicas: 1
    logLevel: debug
    envs:
      LENS2_SCHEDULED_REFRESH_TIMEZONES: "UTC,America/Vancouver,America/Toronto"
      LENS2_SOURCE_WORKSPACE_NAME: public
      LENS2_SOURCE_FLASH_PORT: 5433
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 2000m
        memory: 2048Mi
  worker:
    replicas: 1
    logLevel: debug
    envs:
      LENS2_SCHEDULED_REFRESH_TIMEZONES: "UTC,America/Vancouver,America/Toronto"
      LENS2_SOURCE_WORKSPACE_NAME: public
      LENS2_SOURCE_FLASH_PORT: 5433
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 6000m
        memory: 6048Mi
  router:
    logLevel: info
    envs:
      LENS2_SCHEDULED_REFRESH_TIMEZONES: "UTC,America/Vancouver,America/Toronto"
      LENS2_SOURCE_WORKSPACE_NAME: public
      LENS2_SOURCE_FLASH_PORT: 5433
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 6000m
        memory: 6048Mi
  iris:
    logLevel: info
    envs:
      LENS2_SCHEDULED_REFRESH_TIMEZONES: "UTC,America/Vancouver,America/Toronto"
      LENS2_SOURCE_WORKSPACE_NAME: public
      LENS2_SOURCE_FLASH_PORT: 5433
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 6000m
        memory: 6048Mi
4. Create and deploy the Lens model
Once the manifest file is ready and Flash is configured as the data source, proceed to create and deploy the Lens model according to your deployment procedures.
Because the Lens then reads from cached datasets rather than the source, data retrieval times drop and query efficiency improves.