Dagster is a data orchestrator. It lets you define pipelines (DAGs) in terms of the data flow between logical components called solids. These pipelines can be developed locally and run anywhere.
Each page in this section covers one of Dagster's core concepts:
Solids and Pipelines are the building blocks of Dagster code. You use them to define orchestration graphs. This section covers how to define and use both solids and pipelines.
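A minimal sketch of how solids compose into a pipeline (the solid and pipeline names here are illustrative):

```python
from dagster import pipeline, solid


@solid
def get_name():
    return "dagster"


@solid
def hello(context, name: str):
    # solids receive a context object plus the outputs of upstream solids
    context.log.info(f"Hello, {name}!")


@pipeline
def hello_pipeline():
    # calling solids inside a @pipeline body wires up the data dependencies
    hello(get_name())
```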
Modes, alongside resources, enable you to separate a pipeline's logic from its heavyweight external dependencies, which makes it possible to test and develop data pipelines in a variety of environments.
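For example, a pipeline can swap a lightweight store for a production one by selecting a mode at execution time. A sketch, where a plain dict stands in for both stores (the resource and mode names are illustrative):

```python
from dagster import ModeDefinition, execute_pipeline, pipeline, resource, solid


@resource
def in_memory_store(_):
    # stands in for a lightweight store used in tests and local development
    return {}


@resource
def prod_store(_):
    # in real code this would construct a client for an external system
    return {}


@solid(required_resource_keys={"store"})
def write_value(context):
    context.resources.store["greeting"] = "hello"


@pipeline(
    mode_defs=[
        ModeDefinition(name="dev", resource_defs={"store": in_memory_store}),
        ModeDefinition(name="prod", resource_defs={"store": prod_store}),
    ]
)
def store_pipeline():
    write_value()


# select the lightweight dependencies at execution time
result = execute_pipeline(store_pipeline, mode="dev")
```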
Dagster enables you to build testable and maintainable data applications. This section shows how to unit-test your data pipelines, separate business logic from external dependencies, and run data quality tests.
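Because pipelines and solids are plain Python definitions, they can be invoked directly in tests. A sketch reusing hello_pipeline and get_name from the example above:

```python
from dagster import execute_pipeline, execute_solid


def test_hello_pipeline():
    result = execute_pipeline(hello_pipeline)
    assert result.success


def test_get_name():
    # solids can also be unit-tested in isolation
    result = execute_solid(get_name)
    assert result.success
    assert result.output_value() == "dagster"
```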
Dagster provides a configuration system that allows you to document, schematize, and error-check your configuration. This section demonstrates how configurations work with different Dagster entities.
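As a sketch, a solid can declare a config schema, and the run config supplied at launch is validated against it before the run starts (the names here are illustrative):

```python
from dagster import execute_pipeline, pipeline, solid


@solid(config_schema={"iterations": int})
def repeat_greeting(context):
    # config is schematized and error-checked before execution begins
    for _ in range(context.solid_config["iterations"]):
        context.log.info("hello")


@pipeline
def config_pipeline():
    repeat_greeting()


result = execute_pipeline(
    config_pipeline,
    run_config={"solids": {"repeat_greeting": {"config": {"iterations": 3}}}},
)
```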
Dagster includes gradual, opt-in typing for the inputs and outputs of solids. This section explains how to define, use, and test types in Dagster.
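Plain Python annotations on a solid's inputs and outputs are enough to opt in, and custom checks can be expressed with DagsterType. A sketch (EvenDagsterType is illustrative):

```python
from dagster import DagsterType, InputDefinition, solid

# a custom type whose check runs on every value passed into `halve`
EvenDagsterType = DagsterType(
    name="EvenDagsterType",
    type_check_fn=lambda _, value: isinstance(value, int) and value % 2 == 0,
)


@solid(input_defs=[InputDefinition("num", EvenDagsterType)])
def halve(num):
    return num // 2


@solid
def add_one(num: int) -> int:
    # ordinary type hints also participate in Dagster's type system
    return num + 1
```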
IO Managers are user-provided objects that store solid outputs and load them as inputs to downstream solids. This section explains how Dagster thinks about IO management and shows how to define and use IO managers and other IO-related features.
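A sketch of a filesystem-backed IO manager that pickles each output (the /tmp path is an assumption; real code would make it configurable):

```python
import os
import pickle

from dagster import IOManager, io_manager


class LocalPickleIOManager(IOManager):
    """Persists each solid output as a pickle file on the local filesystem."""

    def _path(self, context):
        # assumption: a local scratch directory
        return os.path.join("/tmp/dagster_storage", context.step_key, context.name)

    def handle_output(self, context, obj):
        path = self._path(context)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(obj, f)

    def load_input(self, context):
        # read back the file written for the upstream output
        with open(self._path(context.upstream_output), "rb") as f:
            return pickle.load(f)


@io_manager
def local_pickle_io_manager(_):
    return LocalPickleIOManager()
```

Wiring it in is a matter of binding it to the io_manager resource key, e.g. ModeDefinition(resource_defs={"io_manager": local_pickle_io_manager}).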
Dagit is a web-based interface for viewing and interacting with Dagster objects. This section walks you through Dagit's functionality and the GraphQL API used to interact with Dagster programmatically.
A workspace is a collection of user-defined repositories and information about where to find them. Dagster tools, like Dagit and the Dagster CLI, use workspaces to load user code. This section shows how to define repositories and workspaces and when to use them.
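A minimal sketch, assuming the hello_pipeline defined above lives in a file named repo.py:

```python
from dagster import repository


@repository
def my_repository():
    # a repository collects pipelines (plus schedules, sensors, etc.)
    return [hello_pipeline]
```

A workspace.yaml next to it then tells Dagit and the Dagster CLI where to load that code from:

```yaml
load_from:
  - python_file: repo.py
```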
Schedules launch runs on a fixed interval, while sensors let you launch runs in response to external state changes. This section demonstrates how to define them, along with related capabilities like partitioning and backfilling.
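Sketches of both, assuming the hello_pipeline above and a hypothetical trigger file for the sensor to watch:

```python
import os
from datetime import datetime

from dagster import RunRequest, daily_schedule, sensor


@daily_schedule(pipeline_name="hello_pipeline", start_date=datetime(2021, 1, 1))
def daily_hello_schedule(date):
    # return the run config for the run launched for this date
    return {}


@sensor(pipeline_name="hello_pipeline")
def trigger_file_sensor(context):
    # hypothetical external-state check; any predicate works here
    if os.path.exists("/tmp/trigger"):
        yield RunRequest(run_key="/tmp/trigger", run_config={})
```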
Assets are data objects that you produce during a pipeline run. This section walks you through how to inform Dagster about these assets so that they can be tracked over time.
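A sketch of yielding an AssetMaterialization from a solid so the run records the asset (the asset key and report logic are illustrative):

```python
from dagster import AssetMaterialization, Output, solid


@solid
def build_report(context):
    # ... compute and write the report to storage here ...
    yield AssetMaterialization(
        asset_key="daily_summary",
        description="Summary report produced by this run",
    )
    # a solid that yields events must also yield its outputs explicitly
    yield Output(None)
```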
Dagster includes a rich and extensible logging system. This section showcases Dagster's built-in logger and shows how you can customize loggers to fit your logging and monitoring infrastructure.
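A sketch of a custom logger that emits one JSON object per record to the console (the logger name and formatter are illustrative):

```python
import json
import logging

from dagster import Field, logger


@logger(
    config_schema={"log_level": Field(str, is_required=False, default_value="INFO")},
    description="A console logger that emits one JSON object per record",
)
def json_console_logger(init_context):
    class JsonFormatter(logging.Formatter):
        def format(self, record):
            # serialize the whole log record; default=str handles non-JSON values
            return json.dumps(record.__dict__, default=str)

    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())

    logger_ = logging.Logger("json_console", level=init_context.logger_config["log_level"])
    logger_.addHandler(handler)
    return logger_
```

Attach it to a pipeline with ModeDefinition(logger_defs={"json": json_console_logger}) and select it under the loggers key of the run config.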
Executors are responsible for executing steps within a pipeline run. This section explains how to define an executor to meet your computation needs.
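As a sketch, the built-in multiprocess executor can be selected through the run config. Multiprocess execution needs a reconstructable pipeline, a persistent Dagster instance, and an IO manager that persists outputs between processes (the pipeline here is illustrative):

```python
from dagster import (
    DagsterInstance,
    ModeDefinition,
    execute_pipeline,
    fs_io_manager,
    pipeline,
    reconstructable,
    solid,
)


@solid
def work(_):
    return 1


# fs_io_manager persists outputs to disk so separate processes can exchange them
@pipeline(mode_defs=[ModeDefinition(resource_defs={"io_manager": fs_io_manager})])
def parallel_pipeline():
    for i in range(4):
        work.alias(f"work_{i}")()


if __name__ == "__main__":
    execute_pipeline(
        reconstructable(parallel_pipeline),
        run_config={"execution": {"multiprocess": {"config": {"max_concurrent": 4}}}},
        instance=DagsterInstance.local_temp(),
    )
```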