Terraform for data platforms. Build, validate, secure, and evolve data stacks using modular components and explicit contracts.
Composable Data Stack (CDS) is a framework for defining and assembling data platforms from reusable modules such as orchestrators, warehouses, BI tools, and secrets providers.
- Star and follow on GitHub: RonaldHensbergen/composable-data-stack
- Contribute: open a discussion, file an issue, or send a PR to help shape CDS
- Prove it: if you run it in a real workflow, share your feedback, good or bad
Note: Development helper tools are located in the tools/ directory (git-ignored). See tools/pr-cli/README.md for PR creation scripts.
Instead of hardcoding integrations or relying on fragile pipelines, CDS introduces:
- Modules: reusable components (Dagster, Postgres, Superset)
- Contracts: explicit interfaces between components
- Profiles: fully composed, runnable stacks
Think of it as Infrastructure as Code, but for data platforms.
Modern data platforms force a trade-off:
| Approach | Problem |
|---|---|
| Monolithic stack | Rigid, hard to evolve |
| Custom pipelines | Flexible but fragile and inconsistent |
CDS gives you the best of both:
- composability without chaos
- flexibility with guarantees
- modularity with structure
- no vendor lock-in by design
Use CDS if you:
- want to swap tools (Airflow -> Dagster, Superset -> Metabase)
- need reproducible environments across dev, CI, and prod
- are building a platform for multiple teams
- want contract-driven integration instead of implicit coupling
CDS may be overkill if:
- you only run a single-tool stack
- you do not need interchangeable components
The local-dagster-postgres-superset profile defines:
- Dagster -> orchestration
- Postgres -> storage
- Superset -> BI
- Validates module definitions
- Resolves contract bindings
- Checks compatibility and security constraints
- Produces a fully wired stack definition
cds plan resolves the full dependency graph before runtime configuration is generated, ensuring all module interactions are valid and predictable.
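The ordering part of this step can be illustrated with Python's standard graphlib module. This is a minimal sketch under stated assumptions, not the planner CDS actually uses: the consumes mapping and the execution_order helper are hypothetical, with module names borrowed from the example profile.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical view of the example profile: each module maps to the
# modules whose contracts it consumes (assumed shape, not the CDS schema).
consumes = {
    "postgres": set(),
    "dagster": {"postgres"},
    "superset": {"postgres"},
}

def execution_order(graph):
    """Return modules ordered so every provider comes before its consumers."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"circular module dependency: {exc.args[1]}") from exc

order = execution_order(consumes)
print(order)
```

Resolving the order up front means a circular dependency surfaces as an error before any runtime configuration is written.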
You can replace components without changing system behavior:
Dagster -> Airflow
Superset -> Metabase
Postgres -> MariaDB
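One way to think about such a swap: a replacement is safe when it offers every contract the current module offers. The sketch below is illustrative only. The provides table and the can_replace helper are hypothetical, and which contracts each tool really provides is an assumption.

```python
# Hypothetical contract sets; the names mirror examples in this README,
# but the assignment of contracts to tools is assumed for illustration.
provides = {
    "postgres": {"sql-database"},
    "mariadb": {"sql-database"},
    "superset": {"http-service"},
}

def can_replace(current, candidate):
    """True if the candidate offers every contract the current module offers."""
    return provides[current] <= provides[candidate]

print(can_replace("postgres", "mariadb"))   # both offer sql-database
print(can_replace("postgres", "superset"))  # superset lacks sql-database
```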
CDS wires modules through contracts, not direct dependencies:
---
config:
layout: elk
---
flowchart TD
Dagster[Dagster]
Postgres[(Postgres)]
Superset[Superset]
Dagster -->|transformation-runner| Postgres
Postgres -->|warehouse-query| Superset
classDef tool stroke:#818cf8,fill:#eef2ff
classDef database stroke:#2dd4bf,fill:#f0fdfa
classDef viz stroke:#a78bfa,fill:#f5f3ff
class Dagster tool
class Postgres database
class Superset viz
CDS includes built-in security validation to prevent unsafe configurations before a stack is deployed.
The cds security checks analyze profiles and modules for common risks such as:
- weak or default passwords
- missing secret configurations
- insecure service exposure
- unsafe defaults in module configuration
- incomplete contract bindings that may leak data
Security checks run as part of validation and can be extended with custom rules.
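To give a feel for what a weak-password rule might look like, here is a small sketch. It does not use the CDS rule API, which this README does not describe: the check_password function, the WEAK_PASSWORDS set, and the 12-character minimum are all assumptions.

```python
# Assumed list of default-looking passwords; a real rule set would be longer.
WEAK_PASSWORDS = {"", "password", "admin", "postgres", "changeme", "secret"}

def check_password(name, value, min_length=12):
    """Return findings for one secret value; an empty list means it passed."""
    findings = []
    if value.lower() in WEAK_PASSWORDS:
        findings.append(f"{name}: default or well-known password")
    elif len(value) < min_length:
        findings.append(f"{name}: shorter than {min_length} characters")
    return findings

print(check_password("CDS_POSTGRES_PASSWORD", "postgres"))
```

Returning a list of findings rather than raising lets a caller collect every problem in a profile in one pass.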
cds security local-dagster-postgres-superset
When you run CDS, you get:
- validated module graph
- resolved contract bindings
- dependency-aware execution plan
- generated Docker Compose configuration
- reproducible stack definition
This allows you to go from a declarative profile to a runnable local data stack.
git clone https://github.com/RonaldHensbergen/composable-data-stack.git
cd composable-data-stack
Linux/macOS:
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
Windows PowerShell:
py -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -e .
Windows CMD:
py -m venv .venv
.venv\Scripts\activate.bat
python -m pip install -e .
If PowerShell blocks the activation script, run Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass in the same terminal session and activate the environment again.
cp .env.example .env
Set:
CDS_POSTGRES_PASSWORD
CDS_SUPERSET_SECRET_KEY
CDS_SUPERSET_ADMIN_PASSWORD
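Before running any cds command, it can help to confirm that these three variables are actually set. The sketch below is a standalone pre-flight check, not part of CDS. The parse_dotenv and missing_secrets helpers are hypothetical, the parser handles only simple KEY=VALUE lines, and letting shell variables override .env values is an assumption.

```python
import os

# Variable names come from the setup step above; everything else is illustrative.
REQUIRED = (
    "CDS_POSTGRES_PASSWORD",
    "CDS_SUPERSET_SECRET_KEY",
    "CDS_SUPERSET_ADMIN_PASSWORD",
)

def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values

def missing_secrets(dotenv_text="", environ=None):
    """Return required names that are unset or empty in both sources."""
    environ = os.environ if environ is None else environ
    # Assumption: values exported in the shell win over values in .env.
    merged = {**parse_dotenv(dotenv_text), **environ}
    return [name for name in REQUIRED if not merged.get(name)]

print(missing_secrets("CDS_POSTGRES_PASSWORD=example-value\n", environ={}))
```

In practice the first argument would be the contents of the .env file in the project root.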
cds validate local-dagster-postgres-superset
Expected output:
Profile is valid.
cds security local-dagster-postgres-superset
cds plan local-dagster-postgres-superset
This resolves:
- module dependencies
- contract bindings
- execution order
cds render local-dagster-postgres-superset
By default, this writes docker-compose.yml to the project root so you can run docker compose up immediately.
Use a custom location when needed:
cds render local-dagster-postgres-superset --output build/docker-compose.yml
This generates:
- docker-compose.yml
- service definitions
- fully wired module configuration
Reusable building blocks:
- orchestration (Dagster, Airflow)
- warehouse (Postgres, MariaDB)
- BI (Superset, Metabase)
- secrets (env, vault)
Structure:
modules/<category>/<name>/
├── module.yaml
├── defaults.yaml
├── compose.yaml
├── scripts/
└── tests/
Contracts define how modules interact.
Examples:
| Contract | Purpose |
|---|---|
| sql-database | database interface |
| http-service | service exposure |
| secrets-provider | secret resolution |
Example binding:
dagster.database -> postgres.sql-database
superset.database -> postgres.sql-database
No implicit dependencies. Everything is explicit.
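The binding notation above is regular enough to parse and check mechanically. The following is a rough sketch of that idea and not how CDS implements it: the Binding class, parse_binding, validate, and the provides table are hypothetical, and the messages are illustrative rather than real CDS diagnostics.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Binding:
    consumer: str  # module that consumes the contract
    slot: str      # name of the input on the consumer side
    provider: str  # module expected to provide the contract
    contract: str  # contract name on the provider side

def parse_binding(text):
    """Parse 'consumer.slot -> provider.contract' into a Binding."""
    left, arrow, right = text.partition("->")
    consumer, _, slot = left.strip().partition(".")
    provider, _, contract = right.strip().partition(".")
    if not arrow or not all((consumer, slot, provider, contract)):
        raise ValueError(f"malformed binding: {text!r}")
    return Binding(consumer, slot, provider, contract)

def validate(binding, provides):
    """Return problems found for one binding; an empty list means it is valid."""
    if binding.provider not in provides:
        return [f"unknown module {binding.provider!r}"]
    if binding.contract not in provides[binding.provider]:
        return [f"{binding.provider!r} does not provide {binding.contract!r}"]
    return []

# Assumed contract table for the example profile.
provides = {"postgres": {"sql-database"}}
binding = parse_binding("dagster.database -> postgres.sql-database")
print(binding, validate(binding, provides))
```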
Profiles define supported stacks:
local-dagster-postgres-superset
local-airflow-postgres-superset
integration-airflow-postgres-dbt
Structure:
profiles/[profile]/
├── profile.yaml
├── values.yaml
└── README.md
| Command | Description |
|---|---|
| cds validate [profile] | Validate modules and contracts |
| cds plan [profile] | Resolve dependencies and generate an execution plan |
| cds render [profile] | Generate Docker Compose configuration from a resolved plan |
| cds up [profile] | Start services (planned) |
| cds test [profile] | Run health checks (planned) |
[profile] accepts:
| Form | Example |
|---|---|
| Profile name | local-dagster-postgres-superset |
| Path to a profile.yaml file | profiles/local-dagster-postgres-superset/profile.yaml |
| Path to a profiles root directory | profiles/ |
When [profile] is omitted, CDS_PROFILE_PATH is used instead and accepts the same three forms. If neither is provided and exactly one profile exists under profiles/, it is selected automatically.
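The lookup rules above can be summarized in a few lines of Python. This is a sketch of the described behavior, not the CDS source: resolve_profile and _single_profile are hypothetical names, and treating a profiles root directory as valid only when it holds exactly one profile is an assumption.

```python
import os
from pathlib import Path
from typing import Optional

def _single_profile(root: Path) -> Path:
    """Return the only profile.yaml under root, or fail if there is not exactly one."""
    found = sorted(root.glob("*/profile.yaml"))
    if len(found) != 1:
        raise LookupError(f"expected exactly one profile under {root}, found {len(found)}")
    return found[0]

def resolve_profile(identifier: Optional[str], root: Path = Path("profiles"), env=None) -> Path:
    """Map a profile name, a profile.yaml path, or a profiles root to a file path."""
    env = os.environ if env is None else env
    identifier = identifier or env.get("CDS_PROFILE_PATH")
    if not identifier:
        return _single_profile(root)            # nothing given: auto-select
    candidate = Path(identifier)
    if candidate.is_file():
        return candidate                        # path to a profile.yaml file
    if candidate.is_dir():
        return _single_profile(candidate)       # profiles root directory
    return root / identifier / "profile.yaml"   # plain profile name
```

Checking for an existing file or directory first keeps a bare profile name as the fallback case.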
To view the full list of options for any command, use the --help flag:
cds --help
cds validate --help
cds plan --help
Common errors from cds validate, cds plan, and cds render, and how to fix them.
| Error | Cause | Fix |
|---|---|---|
| [E020] ... YAML file not found: <path> | The profile identifier or file path passed to cds validate, cds plan, or cds render <profile> doesn't resolve to an existing YAML file. | Run cds list profiles to see valid identifiers. Set CDS_PROFILE_PATH to a profile name, a profile.yaml file path, or a profiles root directory. |
| [E081] ... Required secret "CDS_X_PASSWORD" not found in environment | A secret marked required: true in the profile's spec.secrets.values is missing from the shell environment or the .env file in the current working directory. | Copy .env.example to .env in the project root and set the missing CDS_* variable, or export it directly before running the command. |
| [E041] ... Contract ref "x.y" points to unknown module "x" | A consumes binding's contractRef refers to a module ID that isn't defined in the profile. | Check spec.modules for the correct module id, and confirm the contract ref follows <module-id>.<contract-name>. |
| [E041] ... but it does not provide "<contract-name>" | The referenced module exists, but its spec.provides list doesn't expose that contract name. | Check the producing module's module.yaml for the contracts it actually provides, and fix the consumer's contractRef to match. |
| [E042] ... Contract kind mismatch | The consumer expects one contract kind (e.g. sql-database) but the producer exposes a different kind. | Point the binding at a module that provides the expected contract kind, or update the consumer's expected kind if the mismatch is intentional. |
All diagnostics print with their error code and YAML path (e.g. spec.modules[1].config), so search the profile file for that path to find the exact line to fix.
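If the profile is already loaded into Python (for example with a YAML parser), a path such as spec.modules[1].config can also be followed programmatically. The helper below is a small sketch for that purpose. The lookup function and the sample profile dictionary are hypothetical, and the sample does not claim to match the real profile schema.

```python
import re

# Matches either a plain key (no dots or brackets) or a [number] index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")

def lookup(document, path):
    """Follow a dotted path with optional [index] parts through dicts and lists."""
    node = document
    for key, index in _TOKEN.findall(path):
        node = node[int(index)] if index else node[key]
    return node

# Made-up profile content, used only to exercise the helper.
profile = {
    "spec": {
        "modules": [
            {"id": "postgres"},
            {"id": "dagster", "config": {"port": 3000}},
        ]
    }
}
print(lookup(profile, "spec.modules[1].config"))
```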
1. cds validate -> check module definitions
2. cds security -> detect unsafe configurations
3. cds plan -> resolve dependencies and bindings
4. cds render -> generate Docker Compose stack
5. cds up -> start services (planned)
6. cds test -> run health checks (planned)
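The four steps that are available today can be chained in a small wrapper that stops at the first failure. This is a sketch, assuming each cds subcommand exits with a non-zero status on failure, which this README does not state. The run_workflow helper and the STEPS tuple are hypothetical.

```python
import subprocess

# The steps that exist today; cds up and cds test are still planned.
STEPS = ("validate", "security", "plan", "render")

def run_workflow(profile, runner=subprocess.run):
    """Run each step in order; return the first failing step name, or None."""
    for step in STEPS:
        result = runner(["cds", step, profile])
        # Assumption: a non-zero exit status signals that the step failed.
        if result.returncode != 0:
            return step
    return None
```

Calling run_workflow("local-dagster-postgres-superset") would invoke the real CLI. The runner parameter exists so the sequencing can be exercised without CDS installed.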
.
├── cli/
├── modules/
│   ├── bi/
│   ├── orchestration/
│   ├── secrets/
│   └── warehouse/
├── profiles/
├── docs/
├── pyproject.toml
└── Makefile
Modules declare:
- what they provide
- what they require
- configuration inputs
- health checks
- lifecycle hooks
Profiles define supported stacks. The profile is the unit of support, not individual modules.
Zero Hidden Coupling
- no implicit environment variables
- no cross-module assumptions
- no shared mutable state
All interactions happen through explicit contracts.
CDS validates configurations before runtime, ensuring that:
- weak credentials are detected early
- secrets are properly configured
- services are not unintentionally exposed
Security is part of platform composition, not an afterthought.
The same composition model applies across:
- local development
- CI environments
- production
Only runtime packaging differs.
| Capability | Monolith | Custom pipelines | CDS |
|---|---|---|---|
| Swap components | β | β | |
| Reuse modules | β | β | β |
| Explicit contracts | β | β | β |
| Reproducibility | β | ||
| Security validation | β | β | β |
| Vendor lock-in | β | β |
MVP ready:
- module validation
- contract resolution
- security checks
- profile composition
- Docker Compose rendering
Next:
- runtime orchestration
- Kubernetes support
- advanced secret providers
- stack bootstrap and health checks
See docs/roadmap.md for milestones and detailed status.
Contributions are welcome.
Please read these first:
Good first contributions:
- adding new modules
- improving profile examples
- extending contract definitions
- adding validation or security rules
- Quickstart: get running in 5 minutes
- From Docker Compose to CDS Profile: complete transformation guide
- Architecture: design and core concepts
- Modules: how to structure reusable components
- Roadmap: planned features and milestones
See LICENSE.