Jupyter Notebook integration

You can consume a Data Product to build AI/ML models in a Jupyter Notebook. The following steps describe how to connect a Data Product to a Jupyter Notebook.
Method 1
- Go to the Access Options tab of your Data Product, and in the App Development section, click Download. This downloads a `.ipynb` file.
- Open the downloaded file in any IDE, such as VS Code or PyCharm. The `.ipynb` file contains templates to consume the Data Product via REST APIs, the PostgreSQL interface, GraphQL APIs, and the source-native interface.
- To consume the Data Product via REST APIs, you first need to create the API using Talos Stack. Follow the steps listed in the Talos Stack documentation to create the API.
- Edit the template by providing the API URL, the API key, and your actual query (use the query example given in the template as a reference), then run the code, as shown below.
REST APIs template

```python
# Import necessary libraries
import requests
import pandas as pd
import json

# API URL and API key
api_url = "https://lucky-possum.dataos.app/lens2/api/public:corp-market-performance/v2/load"
apikey = 'api key here'

# API payload; enter YOUR_QUERY here
payload = json.dumps({
    "query": {
        YOUR_QUERY
    }
})

# Query example: this is how your query should look
# "query": {
#     "measures": [
#         "sales.total_quantities_sold",
#         "sales.proof_revenue"
#     ],
#     "dimensions": [
#         "inventory.warehouse"
#     ],
#     "timeDimensions": [
#         {
#             "dimension": "sales.invoice_date",
#             "granularity": "day"
#         }
#     ],
#     "limit": 1000,
#     "responseFormat": "compact"
# }

# Headers
headers = {
    'Content-Type': 'application/json',
    'apikey': apikey
}

# Fetch data from the API and normalize the JSON response
def fetch_data_from_api(api_url, payload, headers=None):
    response = requests.post(api_url, headers=headers, data=payload)
    if response.status_code == 200:
        data = response.json()
        df = pd.json_normalize(data['data'])  # Create DataFrame
        return df
    else:
        print(f"Error: {response.status_code}")
        return None

# Main execution
if __name__ == "__main__":
    data = fetch_data_from_api(api_url, payload, headers=headers)
    if data is not None:
        print("Data Frame Created:")
        print(data.head())  # Show the first few rows of the DataFrame
        print("Ready for AI/ML model building.")
    else:
        print("Failed to fetch data.")
```
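For example, with the sample query from the template substituted into the payload, the call looks like the usage sketch below. It reuses `json`, `api_url`, `headers`, and `fetch_data_from_api` from the template above; the API URL and key still need your real values.

```python
# Usage sketch: the template's sample query substituted into the payload.
# Reuses json, api_url, headers, and fetch_data_from_api defined above.
payload = json.dumps({
    "query": {
        "measures": [
            "sales.total_quantities_sold",
            "sales.proof_revenue"
        ],
        "dimensions": [
            "inventory.warehouse"
        ],
        "timeDimensions": [
            {
                "dimension": "sales.invoice_date",
                "granularity": "day"
            }
        ],
        "limit": 1000,
        "responseFormat": "compact"
    }
})

data = fetch_data_from_api(api_url, payload, headers=headers)
```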
- To consume the Data Product via the PostgreSQL interface, provide the `dbname`, `user`, `password`, `host`, and `port` of your PostgreSQL server in the code, write your query in the query section, and run the code.
PostgreSQL template

```python
# Import libraries
import psycopg2
import pandas as pd

# Database connection details
dbname = "postgres"
user = "postgres"
password = "*"
host = "tcp.lucky-possum.dataos.app"
port = "6432"

# SQL query
query = " YOUR_SQL_QUERY_HERE "

# Your query should look like this:
# query = "SELECT total_quantities_sold, product_id FROM sales;"

# Connect to the database, execute the query, and return a DataFrame
def connect_and_query(dbname, user, password, host, port, query):
    try:
        conn = psycopg2.connect(dbname=dbname, user=user, password=password, host=host, port=port)
        df = pd.read_sql(query, conn)
        conn.close()
        return df
    except Exception as e:
        print(f"Error: {e}")
        return None

# Main execution
if __name__ == "__main__":
    data = connect_and_query(dbname, user, password, host, port, query)
    if data is not None:
        print("Data Frame Created:")
        print(data.head())  # Ready for AI/ML model building
    else:
        print("Failed to fetch data.")
```
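Note that recent pandas versions emit a warning when `read_sql` is given a raw DBAPI connection. If you prefer, a SQLAlchemy engine can be used instead; the following is a minimal sketch, assuming the same placeholder connection details as the template above:

```python
# Minimal sketch: query through a SQLAlchemy engine instead of a raw
# psycopg2 connection (avoids the pandas DBAPI warning).
import pandas as pd
from sqlalchemy import create_engine

# Same placeholder connection details as the template above
dbname, user, password = "postgres", "postgres", "*"
host, port = "tcp.lucky-possum.dataos.app", "6432"

engine = create_engine(f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{dbname}")

# Example query from the template; replace with your own
df = pd.read_sql("SELECT total_quantities_sold, product_id FROM sales;", engine)
print(df.head())
```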
- To consume the Data Product via GraphQL, provide your GraphQL API URL, API key, and GraphQL query, as shown in the template below.
GraphQL template

```python
# Import libraries
import requests
import pandas as pd
import json

# API URL and API key
graphql_url = "https://lucky-possum.dataos.app/lens2/api/public:corp-market-performance/v2/graphql"
apikey = 'api key here'

# Placeholder for the GraphQL query
graphql_query = """
{
  inventoryData {
    YOUR_FIELDS_HERE  # Replace this with actual fields
  }
}
"""

# Your GraphQL query should look similar to this:
# graphql_query = """
# {
#   inventoryData {
#     sales_proof_revenue
#     inventory_total_bottles_on_hand
#     inventory_total_cases_on_hand
#     sales_invoice_date
#   }
# }
# """

# Convert the query to a JSON payload
payload = json.dumps({"query": graphql_query})

# Headers
headers = {
    'Content-Type': 'application/json',
    'apikey': apikey
}

# Function to fetch data from the GraphQL API
def fetch_data_from_graphql(graphql_url, payload, headers=None):
    response = requests.post(graphql_url, headers=headers, data=payload)
    if response.status_code == 200:
        data = response.json()
        df = pd.json_normalize(data['data']['inventoryData'])  # Normalize the JSON data into a DataFrame
        return df
    else:
        print(f"Error: {response.status_code}")
        return None

# Main execution
if __name__ == "__main__":
    data = fetch_data_from_graphql(graphql_url, payload, headers=headers)
    if data is not None:
        print("Data Frame Created:")
        print(data.head())  # Display the first few rows of the DataFrame
        # You can now preprocess the 'data' DataFrame for AI/ML tasks,
        # e.g. with sklearn.model_selection.train_test_split
    else:
        print("Failed to fetch data.")
```
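One caveat: GraphQL endpoints commonly return HTTP 200 even when the query itself fails, reporting problems in an `errors` field of the response body. The sketch below is a variant of the template's fetch function with that check added; it assumes the same imports (`requests`, `pandas as pd`) as the template:

```python
# Sketch: variant of fetch_data_from_graphql that also inspects the
# GraphQL 'errors' field, since GraphQL APIs often report query failures
# in the body of an HTTP 200 response.
def fetch_data_from_graphql(graphql_url, payload, headers=None):
    response = requests.post(graphql_url, headers=headers, data=payload)
    if response.status_code != 200:
        print(f"Error: {response.status_code}")
        return None
    data = response.json()
    if "errors" in data:  # query-level failures arrive here, not as HTTP errors
        print(f"GraphQL errors: {data['errors']}")
        return None
    return pd.json_normalize(data['data']['inventoryData'])
```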
- After executing the code, you are ready to build AI/ML models (see the sketch below).
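From here, the DataFrame returned by any of the templates can feed a standard ML workflow. Below is a minimal sketch, assuming scikit-learn is installed and that `data` is the DataFrame fetched above; the feature and target column names are hypothetical and should be replaced with columns actually returned by your query:

```python
# Minimal sketch: split the fetched DataFrame into train/test sets.
# Assumes scikit-learn is installed; the column names below are
# hypothetical placeholders for columns returned by your query.
from sklearn.model_selection import train_test_split

features = data[["sales.total_quantities_sold"]]  # hypothetical feature column
target = data["sales.proof_revenue"]              # hypothetical target column

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=42
)
print(f"Training rows: {len(X_train)}, test rows: {len(X_test)}")
```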
Method 2
- Download the Postman collection and open it in the Postman application.
- Hit the endpoint.
- Navigate to the ‘Code’ icon.
- From the drop-down menu, select ‘Python - Requests’ as shown below.
- Copy the code snippet and paste it into your Jupyter Notebook.
- Execute the code, and you are ready to build your AI/ML model. Successful execution prints the response returned by the endpoint.
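For reference, a snippet exported from Postman via ‘Python - Requests’ typically looks like the sketch below. The URL here is the REST endpoint from Method 1; the payload and API key are placeholders to replace with your own values:

```python
# Sketch of a Postman-generated 'Python - Requests' snippet.
# URL taken from the REST template above; payload and apikey are placeholders.
import requests
import json

url = "https://lucky-possum.dataos.app/lens2/api/public:corp-market-performance/v2/load"

payload = json.dumps({
    "query": {
        "measures": ["sales.total_quantities_sold"],  # placeholder query
        "limit": 1000
    }
})
headers = {
    'Content-Type': 'application/json',
    'apikey': 'api key here'
}

# Postman exports use requests.request with the HTTP verb as a string
response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)  # successful execution prints the JSON response
```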