Learn
Welcome to the DataOS Learning Hub!
We have designed the DataOS Learning Hub to cater to your specific role and expertise level within the DataOS ecosystem. Our learning tracks are tailored to meet the needs of different personas, ensuring you receive the knowledge and skills required to excel. To further support your journey, we have made quick start guides and instructional videos available.
Learning tracks
Each learning track is built around a specific user persona and offers a learning path to help you acquire the skills needed to use DataOS capabilities, whether you are creating, managing, or consuming Data Products.
Each learning track is organized into modules, which further break down into topics. These topics combine quick concepts, practical scenarios, code snippets, and visuals to make your learning experience engaging and efficient.
Choose a learning path that suits your role:
- Data Product Consumer: Crafted to help you gain a deeper understanding of how to work with Data Products. You'll develop the skills necessary to explore, analyze, and utilize Data Products effectively in your role, whether you're a Data Analyst, Data Scientist, or Business Analyst.
- Data Product Developer: Designed to equip you with the skills needed to create, manage, and scale Data Products using DataOS. Whether it’s understanding business requirements or diving into the technical nitty-gritty of data pipelines, access control, quality checks, and more, this track covers all the essentials for your role.
- DataOS Operator: Created to empower you with the knowledge and skills necessary to effectively manage the DataOS platform. As a DataOS Operator, you are responsible for overseeing the platform’s infrastructure, compute resources, data security, and compliance.
Data Product Consumer
Data Product Consumers in DataOS encompass a variety of roles, such as Data Analysts, Business Analysts, and Data Scientists. Analysts play essential roles in leveraging data for actionable insights and strategic decision-making. They utilize DataOS to discover, explore, and activate Data Products, enabling them to transform raw data into valuable business intelligence and drive innovation. Data Scientists leverage advanced analytical techniques and machine learning algorithms to extract meaningful insights from data within DataOS.
Key responsibilities
Here are the key responsibilities of a Data Product Consumer, though specific tasks may vary depending on the role or initiative:
- Discovering and accessing Data Products: Identify and access relevant Data Products based on business needs. Interpret metadata to understand product details and assess the usability of Data Products for informed decision-making.
- Navigating semantic models: Understand the relationships between data entities within semantic models to improve data comprehension.
- Checking data quality: Evaluate Data Products for accuracy, consistency, and completeness, ensuring high-quality analysis and decision-making.
- Understanding governance and policies: Ensure data usage and access align with organizational security standards and regulations.
- Activating Data Products: Consider how Data Products can be consumed with Business Intelligence (BI) tools, APIs, and other applications to enhance workflows and reporting.
- Tracking metrics and performance: Monitor performance, usage, and impact metrics of Data Products to assess their effectiveness and communicate results to stakeholders.
Modules overview
In this learning track, you will get a comprehensive introduction to Data Products, covering their types and importance in driving insights. You'll learn to navigate the Data Product Hub (DPH), access essential Data Product information, analyze input and output data for meaningful insights, explore semantic models, assess data quality, and understand governance policies for data security.
Modules in the Data Product Consumer learning track:
No. | Module | Description | Key Topics |
---|---|---|---|
1 | Understanding Data Products | Get a solid foundation on what Data Products are and how they can drive insights and decision-making. Learn about their features and importance in business processes. | |
2 | Discovering Data Products on DPH | Learn how to navigate the Data Product Hub (DPH) to find Data Products that meet your needs using search, filters, tags, and categories. | |
3 | Viewing Data Product Info | Access key details of a Data Product (contributors, tier, type, and tags), along with links to the relevant Git repository and Jira, so you can collaborate easily and make informed decisions about using it. | |
4 | Exploring Input and Output Data | Explore the input datasets fed into the Data Product and the output datasets it generates for consumption. | |
5 | Navigating Semantic Models | Explore semantic models to understand relationships between data entities and improve data integration and comprehension. | |
6 | Checking Data Quality | Learn how to assess data quality through key factors like accuracy, consistency, and timeliness to ensure reliable analysis. | |
7 | Managing Data Governance | Understand the governance policies and compliance standards implemented with Data Products to ensure data security and integrity. | |
8 | Integrating Data Products with BI Tools and Applications | Unlock the power of Data Products by connecting them to BI tools. Learn to use a Data Product in Jupyter Notebooks for AI/ML development, query data via Postgres or GraphQL, and integrate with your apps using flexible APIs. | |
Start learning: Click here to access the modules.
Data Product Developer
Data Product Developers play a key role in creating, managing, and evolving Data Products within DataOS. They are responsible for building the data infrastructure that powers everything from analytics to business intelligence, making sure data flows smoothly through pipelines and stays accurate and accessible for users. They also ensure that Data Products deliver reliable insights while complying with governance policies.
Key responsibilities
Here are the key responsibilities of a Data Product Developer, though specific tasks may differ based on the role or objective:
- Collaborate with stakeholders: Work with stakeholders to gather requirements and align Data Products with business objectives.
- Design Data Products: Design semantic models, define quality and security standards, and determine how users will consume the Data Product.
- Data Pipeline Management: Create data pipelines and implement data transformations to handle data ingestion efficiently.
- Quality Assurance: Ensure data integrity through quality checks and monitoring.
- Data Governance and Security: Apply appropriate data security and access controls to ensure regulatory compliance.
- Deployment and Maintenance: Deploy Data Products efficiently, monitor their performance, and manage updates using CI/CD practices.
Modules overview
The learning track for Data Product Developers is divided into modules, each focusing on an essential stage of the Data Product lifecycle. Every module covers key topics that provide step-by-step guidance through hands-on examples and best practices, ensuring a comprehensive and practical learning experience.
Detailed module breakdown for the Data Product Developer learning track:
No | Modules | Description | Topics |
---|---|---|---|
1 | Understanding Data Needs | This module focuses on grasping the business requirements that will guide the creation of the Data Product. | |
2 | Designing Data Products | This module dives into the design phase using DataOS Metis and Workbench tools. | |
3 | Building Data Products | This module covers the technical aspects of constructing the Data Product. | |
4 | Deploying Data Products | The final module focuses on deploying the Data Product within DataOS. | |
Start learning: Click here to access the modules.
DataOS Operator
A DataOS Operator is the administrator responsible for managing and maintaining the DataOS platform. This role involves overseeing the system’s performance, ensuring the secure management of resources, and guaranteeing compliance with regulatory standards. The operator is the key figure who ensures the platform’s day-to-day operations run smoothly, providing a stable environment for all teams interacting with DataOS.
The DataOS Operator handles a range of tasks, from provisioning compute resources to managing access controls and system security. They are also responsible for monitoring system health, ensuring interoperability with external systems, and scaling the platform to meet growing demands. In essence, the DataOS Operator ensures the platform’s integrity and performance, allowing teams to leverage data efficiently while safeguarding critical assets.
Key responsibilities
A DataOS Operator could be an existing Forward Deployment Engineer, DevOps Engineer, or a Cloud Engineer. Here are the key responsibilities of a DataOS Operator:
- Kubernetes cluster management: Oversee and manage Kubernetes clusters to ensure the optimal performance of the DataOS platform.
- Cloud infrastructure management: Handle deployments and resource management on cloud platforms like AWS, GCP, or Azure.
- System monitoring: Use tools like Prometheus and Grafana to monitor system health, track performance metrics, and resolve issues proactively.
- Access control management: Manage authentication and authorization mechanisms to enforce data governance and ensure appropriate access to resources.
- Container management: Manage Docker containers to ensure smooth operation within DataOS' containerized environment.
- Minerva cluster management: Optimize and manage Minerva Clusters to handle query processing and ensure efficient resource use.
- Credential and secret management: Securely manage sensitive information, including credentials and secrets, to maintain system integrity.
- Compute resource provisioning and scaling: Provision and scale compute instances based on the platform’s needs, ensuring sufficient resources for workflows, jobs, and queries.
- Regulatory compliance: Ensure that all platform operations comply with relevant regulatory standards for security and data management.
- System security: Maintain the security of the DataOS platform, implementing best practices for resource and data protection.
Modules overview
The learning track is divided into modules, each focusing on a key operational area. Every module contains topics that address common challenges you will encounter as a DataOS Operator, guiding you through the core aspects of the role and giving you the tools to troubleshoot efficiently.
Detailed module breakdown for the DataOS Operator learning track:
No | Modules | Description | Topics |
---|---|---|---|
1 | Compute management | Learn how to manage compute resources effectively to ensure smooth operation of workflows, jobs, services, and querying processes within DataOS. | |
2 | Query cluster management | Understand how to optimize and manage query clusters to provide seamless data access and performance. | |
3 | Credential security | Safeguard sensitive information by managing credentials securely within the DataOS platform. | |
4 | Data source connectivity | Learn how to establish secure and stable connections to data sources while adhering to best practices for security and performance. | |
5 | Access management | Ensure appropriate access control by managing user permissions and roles within the DataOS platform. | |
6 | System monitoring | Proactively monitor the platform using system metrics to ensure optimal performance and resolve issues before they affect operations. | |
7 | Interoperability with external platforms | Ensure smooth interoperability between DataOS and external platforms by managing integrations and connections securely. | |
8 | Stack provisioning | Scale the DataOS platform by provisioning additional stacks to meet increasing resource demands. | |
9 | Compliance and governance | Ensure that the DataOS platform adheres to global data governance standards and regulatory requirements. | |
Quick start guides
Looking for a fast way to get up and running? Our Quick Start Guides provide step-by-step instructions for performing key tasks and operations within DataOS. Perfect for getting things done quickly!
Videos
Explore our Video Library for tutorials covering DataOS topics, from the basics to advanced features.