Stack¶

A Stack is a DataOS Resource that acts as an execution engine and as an extension point for integrating new programming paradigms into the platform. Stacks are composable and can be orchestrated through DataOS Resources such as a Worker or a Service, or within a designated job of a Workflow Resource.

While certain pre-configured Stacks, such as Flare, Bento, and Scanner, are natively available within DataOS, users can also define and deploy their own tailor-made Stacks.

Key features of Stacks¶

Custom Computing Environments

Stacks provide a means for users to define and manage custom computing environments tailored to their specific needs.

Dynamic Transformation

Traditionally, software images are built and deployed in a static, rigid, and linear way, with each transformation step requiring manual intervention and a rebuild. In contrast, Stacks define the transformation process through configuration, which makes the process more flexible and easier to adapt and extend as needed.
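
The idea of a configuration-driven transformation can be illustrated with a minimal Python sketch. It is not DataOS code: the step names (`drop_nulls`, `uppercase`, `rename`), the `STEPS` registry, and the shape of `config` are all hypothetical and chosen only for illustration. The point is that changing the pipeline means editing the configuration, not rebuilding the code.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Step = Callable[[List[Row], Dict[str, Any]], List[Row]]


def drop_nulls(rows: List[Row], params: Dict[str, Any]) -> List[Row]:
    """Keep only rows whose given column is not None."""
    column = params["column"]
    return [r for r in rows if r.get(column) is not None]


def uppercase(rows: List[Row], params: Dict[str, Any]) -> List[Row]:
    """Upper-case the string value of the given column."""
    column = params["column"]
    return [{**r, column: str(r[column]).upper()} for r in rows]


def rename(rows: List[Row], params: Dict[str, Any]) -> List[Row]:
    """Rename one column in every row."""
    old, new = params["from"], params["to"]
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]


# Hypothetical registry mapping step names used in the config to functions.
STEPS: Dict[str, Step] = {
    "drop_nulls": drop_nulls,
    "uppercase": uppercase,
    "rename": rename,
}


def run_pipeline(rows: List[Row], config: Dict[str, Any]) -> List[Row]:
    """Apply the steps listed in the config, in order."""
    for step in config["steps"]:
        name = step["name"]
        if name not in STEPS:
            raise ValueError(f"unknown step: {name}")
        rows = STEPS[name](rows, step.get("params", {}))
    return rows


# The pipeline itself is data: reordering or adding steps needs no code change.
config = {
    "steps": [
        {"name": "drop_nulls", "params": {"column": "city"}},
        {"name": "uppercase", "params": {"column": "city"}},
        {"name": "rename", "params": {"from": "city", "to": "location"}},
    ]
}

rows = [{"id": 1, "city": "pune"}, {"id": 2, "city": None}]
result = run_pipeline(rows, config)
print(result)
```

In this sketch, extending the process means registering one more function and adding one more entry to the configuration. That is the kind of flexibility the paragraph above describes, although the actual mechanism inside a Stack may differ.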

Ensuring Consistency and Reliability

Stacks encapsulate the necessary components, configurations, and dependencies, ensuring that each execution within a Stack adheres to predefined specifications and operates seamlessly within the broader DataOS ecosystem.
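
As a rough illustration of "adhering to predefined specifications", the sketch below models a Stack as a bundle of a name, a version, an image, and a set of required configuration keys, and checks a job against it before execution. The `StackDefinition` class, its fields, the `validate_job` helper, and the job dictionary layout are assumptions made for this example and do not describe the actual DataOS manifest schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class StackDefinition:
    """Hypothetical, simplified model of what a Stack bundles together."""

    name: str
    version: str
    image: str  # container image carrying the code and its dependencies
    required_keys: List[str] = field(default_factory=list)


def validate_job(stack: StackDefinition, job_spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the job conforms."""
    problems: List[str] = []
    expected = f"{stack.name}:{stack.version}"
    if job_spec.get("stack") != expected:
        problems.append(f"job must reference stack {expected}")
    for key in stack.required_keys:
        if key not in job_spec.get("config", {}):
            problems.append(f"missing config key: {key}")
    return problems


# All names and values below are made up for the example.
stack = StackDefinition(
    name="example-stack",
    version="1.0",
    image="registry.example/example-stack:1.0",
    required_keys=["input", "output"],
)

good_job = {"stack": "example-stack:1.0", "config": {"input": "a", "output": "b"}}
bad_job = {"stack": "example-stack:2.0", "config": {"input": "a"}}

print(validate_job(stack, good_job))
print(validate_job(stack, bad_job))
```

Because the definition is declared once and checked on every run, each execution is held to the same specification, regardless of who submits the job.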

Abstraction of Deployment Complexities

Stacks function as an abstraction layer over the intricacies of deployment. This abstraction lets data developers concentrate on their use cases instead of managing low-level technical details.

Increasing Platform Team Productivity

Traditionally, data platform teams invest considerable time integrating new programming paradigms into the platform, a task that recurs regularly. Stacks delegate this responsibility to data developers, who can create new Stacks declaratively. For advanced capabilities requiring custom Operators, collaboration with the Platform team may still be necessary. This collaborative model enhances the productivity of both teams and prevents the Platform team from becoming a bottleneck for successive capability integration requests.

Providing Guarantees

Stacks provide specific guarantees, such as displaying lineage information on the specified endpoint, enforcing governance, providing appropriate observability, and supporting orchestration through DataOS Resources like Worker, Workflow, and Service. Declarative specification of these attributes replaces the extensive development effort previously required from the Platform team.

Reusing Existing Codebases

Data developers can create tailor-made Stacks to incorporate their existing codebases into DataOS, eliminating the need for rewriting. This expedites onboarding: developers declare their Stacks (e.g., RClone, Great Expectations) and start working within the DataOS ecosystem.

Built-in Stacks in DataOS¶

Container¶

The Container Stack is a declarative DevOps SDK for deploying data applications into production environments.

Beacon¶

The Beacon Stack is a standalone HTTP server that exposes API endpoints on top of a Postgres database. It offers a single flavor, beacon+rest, which exposes REST APIs on a Postgres database.

Bento¶

Bento is a high-performance, resilient, and declarative stream processing Stack.

CLI¶

The DataOS Command-Line Interface (CLI), known as dataos-ctl, also serves as a Stack within the DataOS ecosystem. It enables users to programmatically execute CLI commands through a YAML manifest. To learn more about the CLI Stack, refer to the link: CLI.

Flare¶

Flare is a powerful declarative Stack designed specifically for large-scale data processing tasks.

Scanner¶

The Scanner Stack in DataOS is a Python-based framework that allows developers to extract metadata from external source systems (such as RDBMS, data warehouses, and messaging services) as well as from components and services within the DataOS environment.

Soda¶

Soda is a declarative Stack integrated into DataOS, specifically for data quality testing and profiling across one or more datasets. To learn more about Soda, refer to the link: Soda.

These built-in Stacks offer a wide range of capabilities, empowering data developers to efficiently build, process, and manage data within the DataOS ecosystem.

How to create your own Stack?¶

Aside from the pre-defined Stacks within DataOS, data developers can create their own tailor-made Stacks to extend the existing capabilities of the platform and introduce new programming paradigms within DataOS. To learn more, refer to the following link: How to create your own custom Stack?