Skip to content

Explore a Data Product

This guide provides an overview of the key user interface components of the Explorer Page of the Data Prodcut in the Data Product Hub.

Data Product Exploration Button

The Exploration button allows users to explore the data model (Lens) associated with the selected Data Product. By clicking this button, users can interact with the data model's pre-defined dimensions, measures, and segments, enabling them to visualize, filter, and analyze the data directly within the Lens. This provides an intuitive interface for business users and data analysts to explore the underlying data model without writing queries, offering insights through an easy-to-use exploratory environment.

DPH

Explore button

Clicking the Explore button opens the Explore Page, with the Studio tab displayed by default. Other available tabs include Model and GraphQL.

DPH

Data Product Hub Interface

Header and Page Context Information

The top of the page displays key information about the Data Product being explored, providing context and clarity for users.

  • Data Product Title: The title of the Data Product is always visible at the top of the page. For example, "Sales-360-data-product-v12". Above the title, the associated business unit or function for which the Data Product was created is displayed, offering context regarding its origin. In this example, the business unit is "Sales".
  • Description: Below the title, a description gives an overview of the Data Product's purpose. In this case, it’s described as a "customer 360 view of the world."
  • Tier | Use-case |Team Information: The header contains essential information, including the Data Product's tier (e.g., Aggregated, Source aligned, etc.), its primary use case (e.g., Demand Forecast, Performance monitoring, etc.), and the list of contributors or collaborators involved in creating and maintaining the product (e.g., Aakansha Jain, Darpan Vyas, etc.). This provides users with a clear understanding of the product's purpose and the key stakeholders behind it, ensuring transparency and context.

Star Button and BI Sync

The Star button allows users to mark the Data Product as a favorite. The BI Sync option opens the BI Sync functionality, enabling synchronization of the Data Product with tools such as PowerBI, Tableau Cloud, Tableau Desktop, and Apache Superset.

Source Details

Clicking the Source section displays details of the source on which the Lens model is built, along with the option to check the source's health. Upon selection, the following information is shown.

  • Type: Indicates the type of cluster source is linked to. In this case, the type is Minerva. It can either be Themis or Flash.
  • Dialect: The query language or SQL dialect used for interacting with the data. For this product, the dialect is Trino.
  • Connection: The connection string or URL links the Data Product to its source. This example defines the connection as minerva://miniature, indicating the internal connection point to the miniature cluster.
  • Database: The database name is not explicitly shown in the given details, but this field would typically indicate the database within the system where the Data Product is stored.
  • Host: Provides the host or context details where the data source is hosted. In this example, tcp.liberal-donkey.dataos.app is the context where the Data Product exists.
  • Watch: To watch the health of the Cluster, Click on the watch button.

DPH

Source health

Clicking the Watch button allows users to return to the source and view the source health status by selecting the cross button. Hovering over the Cluster name, such as Minerva, will display the health status. A green dot indicates good health and that queries will run smoothly, while a red dot indicates that the cluster is unavailable.

DPH

Red signal: Cluster is not ready for queries

DPH

Green signal: Cluster is ready for queries

Further, It contains three tabs for exploring the Data Product in detail.

  • Model Tab
  • Studio Tab
  • GraphQL Tab

Let’s first understand the default studio tab

Studio Tab

When the Explore page is first opened, the Studio tab is displayed by default. This tab provides an interactive workspace for designing queries, viewing tables, pivot tables, charts, and other visualizations. It serves as the primary tool for exploring and analyzing data without the need for technical coding.

This page is divided into five primary sections, each serving a specific purpose in the exploration and querying of Data Products:

Left Panel

Here are the key components in the sidebar navigation

Expand/Collapse All

An option to show or hide all the logical table Panel.

Selecting "Expand" will increase the screen size to display the logical table side panel.

History

This section logs queries executed on a data model using timestamp format. This allows users to revisit previous queries and review their selected dimensions, measures, filters, and results. It's beneficial for auditing, troubleshooting, or repeating previous analyses.

DPH

History

For instance, a query might have been executed 32 minutes ago, using the city dimension and total accounts measure to analyze the total accounts across various cities.

Perspectives

In the Data Product Exploration interface, a user's query can be saved as a Perspective. A Perspective captures the current state of the query, including all selected dimensions, measures, metrics, filters, and any applied settings (e.g., time zones or limits). Once saved, Perspectives can be accessed at any time from the Perspective tab, allowing users to revisit or share their specific analyses without recreating the queries. By saving a Perspective, users can efficiently store and retrieve their specific analyses for continuous use or collaboration with team members.

DPH

Perspectives

Perspectives listed in the sales360 Data Product

How to Save and Use Perspectives?

Running a Query: After selecting dimensions, measures, and running a query, it can be saved to avoid reconfiguring it later. For example, to find the total count of accounts in each city, select the city dimension and the accounts count measure from the accounts table. Then, click the Run Query button.

DPH

Run query

Saving a Perspective: Once the query is complete, click the Save Perspective button (typically near the "Run Query" button).

DPH

Save Persepectives

Accessing Saved Perspectives: After saving, the query will be available under the Perspective tab for the particular Data Product (e.g., Sales 360).

Users can click the Perspective tab and view all saved Perspectives, allowing them to reload the exact configuration of any previously saved query.

DPH

Accessing and reusing saved Persepective

Re-running a Perspective: To re-run a saved Perspective, select it from the list. This reloads the query and all settings and filters, making it easy to regenerate the results or apply new changes.

The drop-down menu under logical table allows users to switch between logical table, Entities, or Metrics.

DPH

Drop down menu

logical table

This is the default view in the sidebar. It shows a list of logical tables, which are structured datasets containing dimensions and measures. Logical tables are often used for performing queries, such as analyzing account data, product information, or sales figures.

For example, the above image contains three logical tables: account, product, and sales.

DPH

logical table and Search bar

As a particular table is expanded, the list of all its dimensions and measures is displayed. In this case, the table named account is visible with various components such as Measures (e.g., total_accounts, avg_rev_subquery) and Dimensions (e.g., city, address).

Search Bar: Just below the logical table title, the search bar allows users to filter and quickly find specific tables, dimensions, or measures by typing keywords.

DPH

Besides, Logical tables the model also contains views. In the Lens model, users can create two types of views, each serving a different purpose and approach in how data is accessed and presented. Here's a differentiation between the two types of views:

Entities

Entities serve as a layer atop the data graph of tables, presenting an abstraction of the entire data model for consumers to interact with. They serve as a layer for defining metrics, providing a simplified interface for end-users to interact with key metrics instead of the entire data model. In an entity-first approach, views are built around entities in the data model. Views are built as denormalized tables, bringing measures and dimensions from multiple logical tables. They don’t have measures, dimensions, or segments of their own.

DPH

Entities

Metrics

Metrics are presented as a drop-down list option allowing users to select from predefined key performance indicators (KPIs). These Metrics provide quantifiable measures crucial for assessing the performance of specific data points within the Data Product. For example, selecting the metric aggregated_qtr_revenue_metric.sales_sales_aggregate_qtr_revenue_2024 would allow users to analyze quarterly revenue for different product brands. To know more about how to create the matrixs refer this link

How to Use Metrics in the UI:
  • Step 1: Click on the Metrics from the drop-down list in the left sidebar
  • Step 2: Select the relevant metric based on analytical needs. For example, to track sales, aggregated_qtr_revenue_metric.sales_sales_aggregate_qtr_revenue_2024 may be selected.

DPH

Metircs

  • Step 3: The selected metric will be added to the query. When combined with dimensions (like sales source or product brand), it will give a complete performance view.

  • Step 4: Run the query to get the results and review how the selected metric performs over time, across categories, or compared to benchmarks.

    In this case, the metric aggregated_qtr_revenue_metric.sales_sales_aggregate_qtr_revenue_2024 is used along with two dimensions:

    1. Sales Source: Indicates whether the sales are from proof (alcoholic) or nonproof (non-alcoholic) products.
    2. Product Brand: Identifies the brand of the product sold.

    After running the query using this metric and dimensions, the result shows the revenue generated by each brand based on the sales source (proof/nonproof).

Members and Filters

the Members Tab dynamically displays the dimensions and measures selected by the user from the list of available entities (such as "sales" or "account"). This tab allows users to track and manage the dimensions and measures used in a query. One can unselect selected members using the minus (-) icon. It also allows users to modify these selections with visibility and filter controls. The Clear button at the top of the Members Tab removes all selected dimensions and measures, resetting the query.

Example:

Selecting the dimension account, city and the measure account total_accounts will immediately add them to the Members tab, indicating their activation in the query. Similarly, clicking the (-) sign will remove the selected member from the query.

DPH

  • Visibility Control: The "eye" icon next to each member allows toggling the visibility of selected dimensions or measures in the query results.

    Example: Hiding the visibility of total_accounts using the "eye" icon will display only the city names in the query results, while the total accounts count will be hidden. This feature is useful for focusing on specific data points without removing the measure entirely from the query.

    DPH

  • Filtering Option: A "filter" icon allows users to apply filters to the selected dimension or measure directly from the Members Tab.

    Example: Clicking the filter icon on the city dimension and applying the filter "city is not equal to Sacramento" will display data for all cities except Sacramento. After applying the filter, click Run Query to view the filtered results.

    DPH

Saving Queries

Query data and save them with a suitable name so they can be viewed later.

Result Display and Visualization Panels

The Result Display and Visualization Panel is where users can choose how to view the outcomes of their queries. Once dimensions and measures are selected in the Members Tab and the query is executed, this panel offers multiple options for displaying the results. The panel provides flexible visual formats, allowing users to explore data in ways that best suit their analysis needs. Additionally, controls for Timezone, Limit, and Offset enable users to further customize how the data is presented, ensuring a comprehensive and user-friendly data exploration experience.

The following are the options for displaying the results:

Table

This option displays the query results in a tabular format. It’s ideal for users who prefer to see data in rows and columns for easy comparison and detailed analysis.

DPH

Table View

Chart

This option allows users to visualize the query results in a graphical format. Various charts (e.g., bar, line, pie) can be used to display trends, comparisons, and distributions visually.

DPH

Chart View

Pivot

This option allows for interactive analysis. Users can group, summarize, and manipulate the data to create custom views and insights. Pivot tables are useful for dynamic data exploration.

DPH

Pivot View

Integration

The Integration tab enables users to integrate the query results with other tools or services. This is particularly useful for exporting or connecting the data to external platforms for further processing or visualization. This gives data app developers the functionality to fetch real-time data and display it visually in the dashboard or create an application on top of it.

The data model can be integrated through the following methods:

  • CURL
  • GraphQL
  • Postgres
  • Source SQL

DPH

For instance, to retrieve the total number of accounts grouped by city using an API, verify the output using a curl command, and then automate the process using Python. Here’s a practical example that demonstrates how to utilize the given curl command step-by-step, along with integrating it into a Python application workflow.

Steps to Follow:

Step 1: Select Desired Measures and Dimensions First, select the total_account measure and the city dimension from the side pane. As selections are made, the curl command for the query is automatically generated. Here's the generated command:

curl --location 'https://lucky-eagle.dataos.app/lens2/api/public:purchasebehavior360/v2/load' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api key here>' \
--data '{ "query": {
  "measures": [
    "account.total_accounts"
  ],
  "dimensions": [
    "account.city"
  ],
  "segments": [],
  "filters": [],
  "timeDimensions": [],
  "limit": 10,
  "offset": 0
} }'

Step 2: Execute the Command in the Terminal

  • Copy the entire command and replace DATAOS_API_KEY with the actual API key.

  • Open the terminal (Command Prompt on Windows or Terminal on macOS/Linux/Windows) and paste the modified command into the terminal and press Enter.

  • The provided curl command retrieves data about the total accounts per city in json format. The response will appear as follows:

{"query":{"measures":["account.total_accounts"],"dimensions":["account.city"],"segments":[],"filters":[],"timeDimensions":[],"limit":15,"offset":0,"timezone":"UTC","meta":{"secured":{"segments":[],"dimensions":[]}},"rowLimit":15},"data":[{"account.city":"Los Angeles","account.total_accounts":59},{"account.city":"Chicago","account.total_accounts":12},{"account.city":"Kansas City","account.total_accounts":11},{"account.city":"San Francisco","account.total_accounts":8},{"account.city":"New York","account.total_accounts":6},{"account.city":"Greenville","account.total_accounts":5},{"account.city":"Miami","account.total_accounts":5},{"account.city":"Philadelphia","account.total_accounts":5},{"account.city":"Houston","account.total_accounts":5},{"account.city":"Sacramento","account.total_accounts":4},{"account.city":"Omaha","account.total_accounts":4},{"account.city":"Cincinnati","account.total_accounts":3},{"account.city":"Jacksonville","account.total_accounts":2},{"account.city":"San Jose","account.total_accounts":2},{"account.city":"Spokane","account.total_accounts":2}],"lastRefreshTime":"2024-10-09T07:32:55.554Z","refreshKeyValues":[[{"refresh_key":"14403825"}]] 
  • Check to ensure the output is correct and contains the expected data.

Step 3: Integrate the Command into the Application's Workflow Once the output is confirmed to be correct, data retrieval can be automated using a Python script with the requests library.

Python Script
import requests
import json
import pandas as pd

# API URL and headers
url = 'https://lucky-possum.dataos.app/lens2/api/public:purchasebehavior360/v2/load'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer dG9rZW5fYWN0dWFsbHlfYWR2ZXJzZWx5X2luc3BpcmVkX3JvYmluLmVkY2VlNmU2LTJmNWQtNDdjNy05M2I1LThmNWFjN2MzZjEyNA=='  # Replace with the actual DATAOS API key
}

# Define the payload for the API request
payload = {
    "query": {
        "measures": ["account.total_accounts"],
        "dimensions": ["account.city"],
        "segments": [],
        "filters": [],
        "timeDimensions": [],
        "limit": 15,
        "offset": 0
    }
}

# Send the POST request to the API
response = requests.post(url, headers=headers, json=payload)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response
    data = response.json()

    # Extract relevant data
    account_data = data.get('data', [])

    # Convert to DataFrame for analysis or reporting
    df = pd.DataFrame(account_data)

    # Display the DataFrame
    print("City-wise Total Accounts:")
    print(df)

    # Optionally, save the DataFrame to a CSV file
    df.to_csv('city_account_data.csv', index=False)
    print("Data saved to city_account_data.csv")
else:
    print(f"Failed to fetch data. Status code: {response.status_code}, Message: {response.text}")
# Expected_Output

City-wise Total Accounts:
    account.city  account.total_accounts
0     Los Angeles                      59
1         Chicago                      12
2     Kansas City                      11
3   San Francisco                       8
4        New York                       6
5         Houston                       5
6    Philadelphia                       5
7           Miami                       5
8      Greenville                       5
9      Sacramento                       4
10          Omaha                       4
11     Cincinnati                       3
12     Fort Wayne                       2
13         Denver                       2
14    Minneapolis                       2
Data saved to city_account_data.csv

Execution and Saving:

  • Run Query Button: Once dimensions and measures are selected, this button executes the query and updates the results in the selected view format (table, chart, pivot).
  • Save Perspective: After running a query, users can save the configuration as a Perspective to easily return to the same query setup in the future.

Model Tab

The Model tab offers insights into the underlying structure of the Data Product. It allows users to view and understand the data model, which is particularly useful for data architects or advanced users who need to know how different tables and fields are connected. This tab gives a comprehensive view of relationships between datasets, enabling a better understanding of data lineage and dependencies.

Left-side Navigation Panel

The left-side navigation panel provides key options to explore the model in different ways:

  • Graph: The default view, showing a visual lineage of how different tables and entities are interconnected. This interactive graph shows relationships between tables and views. Let us understand graph tab more with an detailed example.
  • Our example has three main entities or logical tables: account, sales, and product, and two metrics, conversion_rate and qtd_revenue, represented by a wave-like icon. Metrics are derived from one or more logical tables to track performance or monitor trends, using defined measures and dimensions from these tables.The Graph tab visually displays how these logical tables and metrics are interconnected, providing an interactive way to explore data relationships and schema details.

DPH

Here's how it works in more detail:

When any logical table or metric is selected, a detailed tab opens on the side, displaying all the defined measures and dimensions within the table. Each attribute is listed with its corresponding data type (e.g., numeric, string, date), providing a clear understanding of the table's schema. The following section will explore the sales logical table in detail.

Sales (logical table):

  • This is a logical table containing information about sales.
  • It shows how this table is connected to other entities in the system, such as accounts, products, and other metrics.
  • Clicking on the sales table opens a detailed tab on the side, displaying all the defined segments, measures, and dimensions within the table. Each attribute is listed with its corresponding data type (e.g., numeric, string, date) to provide a clear understanding of the table's schema.

DPH

Schema Breakdown

  • Segments: Subsets or groupings of sales data, such as proof_sales and nonproof_sales.
  • Measures: Quantitative attributes that are crucial for calculations. Examples: recency and frequency.
  • Dimensions: Categorical attributes that describe data segments or groupings. Examples: invoice_no invoice_date.
  • More details can be explored by selecting the Schema dropdown.

Let us also understand a Metric by giving an example.

Metric

Metrics are a type of view. All Metrics are represented by a wave-like icon resembling a formula, indicating that they are Metrics. Metrics are a type of view in the data model. Unlike logical tables, they do not have their dimensions or measures. Instead, they reference the dimensions and measures of logical tables. For example, the conversion_rate metric tracks the percentage of leads or prospects that convert into paying customers over a defined period (e.g., a month). However, It relies on fields from the Sales Table.

When a metric references a measure or dimension from a logical table, it adopts the naming convention table_name + field_name. This approach ensures that users can easily identify the origin of each measure or dimension used in a metric, promoting transparency and data lineage. For example, the measure frequency from the Sales Table is renamed sales_frequency when used in the Conversion Rate Metric to indicate its origin.

DPH

Dashed blue lines visually indicate the relationships between the Sales Table and the conversion_rate metric. The lines highlight which dimensions and measures from the Sales Table are used to calculate the conversion_rate metric.

Similarly, QTD Revenue Metrics leverage fields from multiple logical tables—Sales, Product, and Account—to offer a rich, multi-dimensional view of performance. Using join paths in the manifest file ensures that fields are correctly referenced, maintaining clarity and consistency across the data model.

The blue dashed lines in the graph visualization represent the relationships between the logical tables and the metrics, indicating how data from each table feeds into the metric calculations. This visual connection helps users understand how measures and dimensions from different sources contribute to the overall metric. All fields are prefixed with their respective table name (e.g., sales_, product_, account_), showing which fields from different tables are contributing to the metric.

DPH

  • Schema: This section contains detailed information about the structure of the tables, views, and entities within the data model. Users can view columns, data types, and relationships, making it easier to understand how data is stored and linked within the product.

Schema Section Components

Overview: The User Groups section on the right panel details the access permissions applied to each user group and the API scopes that are granted. The Security section displays the redacted column name.

In this example, there are two user groups: nonproof_analyst and default. Both groups can access all API scopes, including meta, data, graphql, jobs, and source.

  • The nonproof_analyst group includes a user identified by the tag users:id:kanakgupta.
  • The default group includes all members, providing universal access to anyone not explicitly excluded from the model.

DPH

When the account table is selected, the following sections are displayed.

  1. logical table Overview:
    • Table Name: The logical table name (e.g., account) is displayed, providing context on the explored dataset.
    • Table Type: The type (e.g., Table) and whether it is public (True/False) are also indicated, providing insight into accessibility.
    • Field Overview: The schema section categorizes the table’s fields into Measures, Dimensions, and Segments, showing the total number of each type.
  2. Measures:
    • Measures represent quantitative fields that are used in calculations or aggregations.
    • Example: The measure total_accounts in the Account table shows the total number of accounts. It is accompanied by information such as its data type (number), aggregation type (count(distinct)), and a brief description ("Total customers").
  3. Dimensions:
    • Dimensions represent qualitative attributes used for categorization or grouping.
    • Example: Dimensions in the Account table include attributes like customer_name, address, city, and state. Each dimension is described by its data type (e.g., string, time), SQL reference, and a brief description explaining its purpose.
  4. Segments:
    • If applicable, segments are listed, representing pre-defined filters within the table for more granular analysis.
  5. Additional Information:
    • Secure: Indicates whether access to certain fields is restricted based on user permissions. For example, here, the email column is secured and redacted.
    • Aliases: Provides alternate names for fields if defined.

DPH

Account Table Schema

DPH

The secure column header indicates that the Email column is redacted

One can explore the details on each table. For example, if we click on the account table, we can see details like the total number of measures and dimensions and details like whether the table is public or not. It also exposes the underlying data type, aggregate type, and its SQL.

DPH

  • Files: This contains all relevant SQL files, tables, views, and YAML files (e.g., user groups) necessary to define the Lens. Here, users can explore the actual implementation and configuration of the model, making it useful for developers or advanced users who need to see the code and metadata behind the Data Product.

For example, In the above image are the YAML Implementation files for the Lens model of the Data Product for all entities. One can also view the Lens artifact in the Metis by clicking on Open in Metis button in the right side corner

DPH

GraphQL Tab

The GraphQL tab provides an interface for advanced users and developers to write and execute GraphQL queries. GraphQL enables precise data retrieval, allowing full control over the structure of the returned data. This is useful for performing complex or customized data extractions that extend beyond the standard capabilities of the Studio interface.