Best practices¶

Following these best practices helps teams reduce errors, improve communication, and build more maintainable and scalable systems. The key practices are described below.

Naming conventions¶

Names must start with a letter.
Names can include letters, numbers, and underscore (_) symbols only.
Use snake_case for naming.

Examples

Tables: orders, stripe_invoices, base_payments
Views: opportunities, cloud_accounts, arr
Measures: count, avg_price, total_amount_shipped
Dimensions: name, is_shipped, created_at
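The naming rules above can be checked automatically, for example in a pre-commit or CI step. The following is a minimal sketch, not part of Lens: the function names are hypothetical, and it assumes the rules are read as lowercase snake_case (a leading letter followed by lowercase letters, digits, or underscores).

```python
import re

# Assumption: "snake_case" is interpreted as lowercase only.
# Rule 1: starts with a letter. Rule 2: letters, digits, and underscores only.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed above."""
    return NAME_PATTERN.fullmatch(name) is not None


def invalid_names(names):
    """Return the subset of `names` that break the naming rules."""
    return [name for name in names if not is_valid_name(name)]


if __name__ == "__main__":
    candidates = ["stripe_invoices", "avg_price", "1st_order", "totalAmount", "is-shipped"]
    print(invalid_names(candidates))
```

Running the sketch on the candidate list reports the names that start with a digit, use camelCase, or contain a hyphen.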

SQL expressions¶

Data source dialect¶

When defining tables, you often provide SQL snippets in the sql parameter. These SQL expressions must be written in the SQL dialect of your data source.

Examples

In Snowflake, use the LISTAGG function to aggregate a list of strings.
In BigQuery, use the STRING_AGG function.

Here’s an example of defining a table with SQL snippets (the statuses measure uses the Snowflake dialect):

tables:
  - name: order
    sql: {{ load_sql('order') }}
    description: Table containing information about orders
    public: true
    meta:
      export_to_board: false

    measures:
      - name: statuses
        sql: "listagg({TABLE.status}) WITHIN GROUP (ORDER BY {TABLE.status})"
        type: string

    dimensions:
      - name: status
        sql: "UPPER(status)"
        type: string
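To make the dialect difference concrete, the sketch below builds the string-aggregation expression for the two data sources mentioned above. It is an illustration only: the helper name and the dialect keys are hypothetical, and only Snowflake's LISTAGG and BigQuery's STRING_AGG are covered.

```python
def string_agg_sql(dialect: str, column: str, delimiter: str = ",") -> str:
    """Return a dialect-specific expression that concatenates `column` values.

    Hypothetical helper for illustration; it is not part of Lens.
    """
    if dialect == "snowflake":
        # Snowflake: ordering is given in a WITHIN GROUP clause.
        return f"LISTAGG({column}, '{delimiter}') WITHIN GROUP (ORDER BY {column})"
    if dialect == "bigquery":
        # BigQuery: ordering is given inside the function call.
        return f"STRING_AGG({column}, '{delimiter}' ORDER BY {column})"
    raise ValueError(f"No string-aggregation template for dialect: {dialect}")


if __name__ == "__main__":
    for dialect in ("snowflake", "bigquery"):
        print(dialect, "->", string_agg_sql(dialect, "{TABLE}.status"))
```

The output of either branch can be pasted into the sql parameter of a measure, as long as it matches the data source the table actually reads from.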

Case sensitivity¶

If your database uses case-sensitive identifiers, ensure you properly quote table and column names.

For example, to reference a Postgres table whose name contains uppercase letters, wrap the table name in double quotes:

SELECT
  site_number,
  site_name,
  site_region
FROM
  public."Site"
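When identifiers are generated programmatically, the quoting can be done by a small helper. This is a sketch that assumes standard SQL double-quote quoting as used by Postgres; other databases may use different quote characters, and the function names are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote.

    Assumption: standard SQL / Postgres identifier quoting rules.
    """
    return '"' + name.replace('"', '""') + '"'


def qualified_table(schema: str, table: str) -> str:
    """Return a schema-qualified, case-preserving table reference."""
    return f"{quote_identifier(schema)}.{quote_identifier(table)}"


if __name__ == "__main__":
    # Quoting preserves the uppercase "S" in the table name.
    print(f"SELECT site_number FROM {qualified_table('public', 'Site')}")
```

Quoting the schema as well is harmless here because public is already lowercase; the important part is that Site keeps its capital letter.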

References¶

To create reusable data models, it is essential to reference members of tables and views, such as measures or dimensions, as well as columns. Lens supports the following syntax for references:

column¶

Prefix table column references with the table name or use the TABLE constant when referring to the current table's column.

In most cases, use bare column names in the sql parameter of measures or dimensions. In the example below, the status dimension references the status column of the order table.

tables:
  - name: order
    sql: {{ load_sql('order') }}
    description: Table containing information about orders
    public: true
    meta:
      export_to_board: false

    dimensions:
      - name: status
        sql: status
        type: string

This works well for simple cases. However, if your tables have joins and the joined tables have columns with the same name, the generated SQL query might become ambiguous. Here’s how to avoid that:

`{member}`¶

When defining measures and dimensions, you can reference other members of the same table by wrapping their names in curly braces.

In the example below, the full_name dimension references the name and surname dimensions of the same table.

tables:
  - name: customer
    sql: {{ load_sql('customer') }}
    description: Table containing information about customers
    public: true
    meta:
      export_to_board: false
    dimensions:
      - name: name
        sql: name
        type: string

      - name: surname
        sql: "UPPER(surname)"
        type: string

      - name: full_name
        sql: "CONCAT({name}, ' ', {surname})"
        type: string
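A quick way to review which members an expression depends on is to extract the names inside single curly braces. The sketch below does this with a regular expression; it is a reading aid with a hypothetical function name, not the parser Lens uses.

```python
import re

# Matches {name} or {table.member}; double-brace Jinja calls such as
# {{ load_sql('customer') }} do not match because of the spaces and quotes.
REFERENCE_PATTERN = re.compile(r"\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def referenced_members(sql: str) -> list:
    """Return the curly-brace references found in `sql`, in order of appearance."""
    return REFERENCE_PATTERN.findall(sql)


if __name__ == "__main__":
    print(referenced_members("CONCAT({name}, ' ', {surname})"))
```

For the full_name dimension above, the helper lists name and surname, which is the set of dimensions that must exist in the same table.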

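To reference a member of another table, qualify the member with its table name. In the example below, the subq_rev dimension references the total_revenue measure of the sales table as a subquery: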

tables:
  - name: customer
    sql: {{ load_sql('customer') }}
    description: Table containing information about customers
    public: true
    meta:
      export_to_board: false
    dimensions:
      - name: name
        sql: name
        type: string

      - name: subq_rev
        sql: "{sales.total_revenue}"
        sub_query: true
        type: number

`{tablename}.column` and `{tablename.member}`¶

Qualify column and member names with the table name to remove ambiguity when tables are joined, and to reference members of other tables.

tables:
  - name: users
    sql: {{ load_sql('users') }}

    joins:
      - name: contacts
        sql: "{users}.contact_id = {contacts.id}"
        relationship: one_to_one

    dimensions:
      - name: id
        sql: "{users}.id"
        type: number
        primary_key: true

      - name: name
        sql: "COALESCE({users.name}, {contacts.name})"
        type: string

tables:
  - name: contacts
    sql: {{ load_sql('contacts') }}

    dimensions:
      - name: id
        sql: "{contacts}.id"
        type: number
        primary_key: true

      - name: name
        sql: "{contacts}.name"
        type: string

However, always referring to the current table by its name can lead to code repetition. Here’s how to solve that:

`{TABLE}` variable¶

Use the {TABLE} variable to reference the current table, avoiding the need to repeat its name.

tables:
  - name: users
    sql: {{ load_sql('users') }}

    joins:
      - name: contacts
        sql: "{TABLE}.contact_id = {contacts.id}"
        relationship: one_to_one

    dimensions:
      - name: id
        sql: "{TABLE}.id"
        type: number
        primary_key: true

      - name: name
        sql: "COALESCE({TABLE.name}, {contacts.name})"
        type: string

tables:
  - name: contacts
    sql: {{ load_sql('contacts') }}

    dimensions:
      - name: id
        sql: "{TABLE}.id"
        type: number
        primary_key: true

      - name: name
        sql: "{TABLE}.name"
        type: string

Using the {TABLE} variable keeps the data model code DRY and easy to maintain.
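The equivalence between the {TABLE} form and the explicit table-name form can be illustrated with a simple text substitution. This sketch only shows how the two spellings correspond; it is an assumption made for illustration and says nothing about how Lens resolves references internally.

```python
def expand_table_variable(sql: str, table_name: str) -> str:
    """Rewrite {TABLE} and {TABLE.member} into their explicit table-name form.

    Hypothetical helper: a plain string substitution for illustration only.
    """
    expanded = sql.replace("{TABLE}", "{" + table_name + "}")
    return expanded.replace("{TABLE.", "{" + table_name + ".")


if __name__ == "__main__":
    print(expand_table_variable("{TABLE}.contact_id = {contacts.id}", "users"))
    print(expand_table_variable("COALESCE({TABLE.name}, {contacts.name})", "users"))
```

Applied to the users table, the substitution yields the same expressions as the version that repeats the table name, which is why the {TABLE} form can replace it safely.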

For more examples, refer to Do’s And Don’ts.

Non-SQL references¶

Outside the sql parameter, a bare name is interpreted as a member name, not a column name. This means you can reference members directly by their names without using curly braces: member, table_name.member, or TABLE.member.

tables:
  - name: users
    sql: {{ load_sql('users') }}

    dimensions:
      - name: status
        sql: status
        type: string

    measures:
      - name: count
        type: count

    pre_aggregations:
      - name: orders_by_status
        dimensions:
          - TABLE.status
        measures:
          - TABLE.count

Partitioning¶

Keep partitions small enough that Lens workers can process each one quickly. Start with a relatively large partition granularity (e.g., yearly) and reduce it as needed.
To minimize partition queueing, configure refresh keys so that partitions are refreshed as infrequently as possible.
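The trade-off behind this advice is that a finer granularity means smaller partitions but many more of them to build and refresh. The sketch below counts the partitions needed to cover a date range at three granularities; the function name and the granularity labels are hypothetical and chosen only for this illustration.

```python
from datetime import date


def partition_count(start: date, end: date, granularity: str) -> int:
    """Count the partitions needed to cover [start, end], both inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    if granularity == "year":
        return end.year - start.year + 1
    if granularity == "month":
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    if granularity == "day":
        return (end - start).days + 1
    raise ValueError(f"Unknown granularity: {granularity}")


if __name__ == "__main__":
    start, end = date(2020, 1, 1), date(2023, 12, 31)
    for granularity in ("year", "month", "day"):
        print(granularity, partition_count(start, end, granularity))
```

For four years of data this gives 4 yearly, 48 monthly, or 1461 daily partitions, which is why starting coarse and refining only when individual partitions are too slow tends to keep queueing low.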

Payload¶

For more information on handling JSON payloads, refer to Working with Payload.

Commenting in SQL files¶

Place each comment on its own line within the query.
For a comment at the end of the query, leave two blank lines before the comment.

SELECT
  customer_key,
  prefix,
  first_name,
  last_name,
  to_timestamp(birth_date) as birth_date,
  marital_status,
  gender,
  email_address,
  annual_income,
  total_children,
  education_level,
  occupation,
  home_owner
  -- ,'test' as test
FROM
  icebase.sports.sample_customer


-- where occupation in ('service','business')