DataOS Operator Learning Track¶
This track gives you the expertise to manage and optimize the DataOS platform and keep it reliable.
By the end of this track, you'll be able to:
- Secure platform operations: Implement credential security and access management practices to protect sensitive data and enforce proper access control.
- Establish and maintain connectivity: Configure secure, stable connections to data sources while adhering to best practices for security.
- Monitor and optimize performance: Use system metrics, routine checks, and query cluster management to ensure seamless data access and platform efficiency.
- Manage platform upgrades: Plan, execute, and roll back platform upgrades confidently, minimizing downtime and maintaining operational stability.
- Proactively resolve issues: Leverage monitoring tools like Prometheus and Grafana to detect and address issues before they impact the platform.
- Enforce precise access control: Manage user roles and permissions to ensure data security while enabling efficient collaboration within the DataOS platform.
Module 1: Credential security¶
As a DataOS Operator, you are responsible for safeguarding sensitive information by managing credentials securely within the DataOS platform. In this module, learn the best practices for managing credentials so that the system remains secure.
Module 2: Data source connectivity¶
Your team needs to connect a new data source to DataOS while ensuring robust security. You need to configure a stable connection using encrypted credentials, ensuring reliable access to the data source.
Module 3: Routine checks¶
As part of your daily tasks, you perform routine system health checks, and you may often encounter challenges in managing DataOS's Kubernetes infrastructure and Pulsar configurations. In this module, navigate through essential commands, dashboards, and administrative adjustments to keep the platform robust and efficient.
Module 4: DataOS upgrade and rollback strategies¶
Your team is planning to upgrade the DataOS platform to a new version. You need to create a detailed upgrade plan, including downtime schedules and a rollback strategy in case of failures. Learn about essential measures to complete the upgrade successfully, ensuring minimal disruption to operations.
Module 5: System monitoring¶
Using Prometheus and Grafana, you can set up dashboards to monitor key system metrics like CPU usage, memory consumption, and query performance. By configuring alerts for key performance indicators and troubleshooting performance issues, you can prevent system slowdowns and maintain performance stability.
Module 6: Query Cluster management¶
Your team reports slow query performance during peak hours. You need to analyze the query cluster logs and identify an underperforming cluster. In this module, learn to schedule cluster scaling during peak hours using cron jobs, ensuring seamless data access and optimal performance.
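To make the idea of time-based scaling concrete, here is a minimal Python sketch of the decision a scheduled job might make. It is not DataOS code: the peak window, the replica counts, and the `desired_replicas` helper are all hypothetical values chosen for illustration, and the cron strings use standard five-field cron syntax rather than any DataOS-specific format.

```python
from datetime import datetime, time

# Hypothetical peak window and replica counts (assumptions, not DataOS defaults).
PEAK_START = time(9, 0)
PEAK_END = time(18, 0)
PEAK_REPLICAS = 4
OFF_PEAK_REPLICAS = 1

# Standard five-field cron expressions matching the window above.
SCALE_UP_CRON = "0 9 * * 1-5"     # 09:00, Monday to Friday
SCALE_DOWN_CRON = "0 18 * * 1-5"  # 18:00, Monday to Friday


def desired_replicas(now: datetime) -> int:
    """Return the replica count a scheduled scaling job would request at `now`."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    in_peak_window = PEAK_START <= now.time() < PEAK_END
    if is_weekday and in_peak_window:
        return PEAK_REPLICAS
    return OFF_PEAK_REPLICAS
```

In practice the window and replica counts would come from the peak periods you observe in the query cluster logs, and the module covers how the schedule is actually applied in DataOS.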
Module 7: Access management¶
A new team has been onboarded to work on a data project in DataOS. You evaluate their role requirements and grant them access to only the necessary datasets and tools, preventing unauthorized access while allowing the team to perform their tasks efficiently.