ELT pipeline with dbt, BigQuery and Airflow orchestration on Google Cloud. Raw data: Qatar FIFA World Cup Players stats.
- dbt (dbt-core + dbt-bigquery) for data transformations
- BigQuery as data warehouse
- Apache Airflow (Cloud Composer) for orchestration
- Astronomer Cosmos for materializing dbt models as Airflow tasks
- Cloud Run Jobs for dbt execution
- Cloud Build for CI/CD
- uv as Python package manager
- direnv with .envrc for environment management
- world_cup_qatar_elt_dbt/dag/ - Airflow DAGs
  - world_cup_qatar_elt_dbt_case1_dag.py - triggers dbt via CloudRunJobOperator
  - world_cup_qatar_elt_dbt_case2_dag.py - dbt models as Airflow tasks via Cosmos with Cloud Run executor
  - settings.py - DAG settings from Airflow Variables
- world_cup_qatar_elt_dbt/dbt/world_cup_qatar_elt/ - dbt project (models, macros, profiles)
- dags_setup/ - Composer DAG setup
- config/variables/ - Airflow variables per environment
- scripts/ - deployment scripts for DAGs and config to Cloud Composer
- deploy-dbt-app.yaml - Cloud Build config for dbt Cloud Run job
- deploy-dag.yaml - Cloud Build config for DAG deployment
- Dockerfile - dbt image for Cloud Run job
- dbt_requirements.txt - separate dbt-only deps for Docker (keeps image lean)
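
As a rough illustration of the role settings.py plays, the sketch below turns a mapping of Airflow variables into a typed settings object. The field names (project_id, location, dbt_cloud_run_job_name) and the example job name are assumptions made for illustration, not taken from the repository. In a real DAG the mapping would come from Airflow, for example through Variable.get(..., deserialize_json=True).

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class DagSettings:
    """Typed view over the DAG's Airflow variables (field names are hypothetical)."""

    project_id: str
    location: str
    dbt_cloud_run_job_name: str

    @classmethod
    def from_variables(cls, variables: Mapping[str, Any]) -> "DagSettings":
        # Fail early with a clear message if a variable is missing.
        required = ("project_id", "location", "dbt_cloud_run_job_name")
        missing = [key for key in required if key not in variables]
        if missing:
            raise KeyError(f"Missing Airflow variables: {', '.join(missing)}")
        return cls(**{key: str(variables[key]) for key in required})


# In a DAG file the mapping would be loaded from Airflow instead of a literal.
example = DagSettings.from_variables(
    {
        "project_id": "gb-poc-373711",
        "location": "europe-west1",
        "dbt_cloud_run_job_name": "hypothetical-dbt-job",
    }
)
```

Keeping the parsing in one place means both DAGs can share the same settings object and fail with the same error when a variable is missing for an environment.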
- Python 3.13.11 (see .python-version)
- Uses uv for dependency management (pyproject.toml)
- Uses direnv with .envrc for automatic venv activation and env vars
- Run uv sync to install dependencies
- GCP project: gb-poc-373711, region: europe-west1
- Cloud Composer environment: dev-composer-env
Use the airflow-gcp-docker-dev image to test DAGs locally against real GCP resources before deploying to Composer.

    docker run -it \
        -p 8080:8080 \
        -e GOOGLE_APPLICATION_CREDENTIALS=/root/.config/gcloud/application_default_credentials.json \
        -e GCP_PROJECT=gb-poc-373711 \
        -e GOOGLE_CLOUD_PROJECT=gb-poc-373711 \
        -e COMPOSER_LOCATION=europe-west1 \
        -v $HOME/.config/gcloud/application_default_credentials.json:/root/.config/gcloud/application_default_credentials.json \
        -v $(pwd)/world_cup_qatar_elt_dbt:/opt/airflow/dags/world_cup_qatar_elt_dbt \
        -v $(pwd)/config:/opt/airflow/config \
        airflow-dev

The entire world_cup_qatar_elt_dbt folder is mounted because the Cosmos DAG needs the dbt project alongside the DAG files.
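
To make the mount requirement concrete, here is a minimal sketch of how a DAG file could locate the dbt project relative to itself. The helper name dbt_project_dir is hypothetical, and the actual DAG may resolve the path differently. The directory names follow the project layout described in this README.

```python
from pathlib import Path

# Name of the dbt project folder under world_cup_qatar_elt_dbt/dbt/.
DBT_PROJECT_NAME = "world_cup_qatar_elt"


def dbt_project_dir(dag_file: Path) -> Path:
    """Return the dbt project directory for a DAG file in world_cup_qatar_elt_dbt/dag/."""
    # dag_file.parent is .../world_cup_qatar_elt_dbt/dag, so go one level further up.
    package_root = dag_file.parent.parent
    return package_root / "dbt" / DBT_PROJECT_NAME


# Path of the Cosmos DAG inside the container, given the volume mount above.
container_dag = Path(
    "/opt/airflow/dags/world_cup_qatar_elt_dbt/dag/world_cup_qatar_elt_dbt_case2_dag.py"
)
project_dir = dbt_project_dir(container_dag)
```

If only the dag/ subfolder were mounted, the resolved directory would not exist in the container and the dbt project could not be found.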

    dbt deps --project-dir world_cup_qatar_elt_dbt/dbt/world_cup_qatar_elt
    dbt run --project-dir world_cup_qatar_elt_dbt/dbt/world_cup_qatar_elt

The DAG folder and its config variables are deployed via Cloud Build (deploy-dag.yaml), which invokes scripts/deploy_dag_config.sh and scripts/deploy_dag_folder.sh:

    gcloud builds submit \
        --project=$PROJECT_ID \
        --region=$LOCATION \
        --config deploy-dag.yaml \
        --substitutions _DAG_ROOT_FOLDER=$DAG_ROOT_FOLDER,_COMPOSER_ENVIRONMENT=$COMPOSER_ENVIRONMENT,_CONFIG_FOLDER_NAME=$CONFIG_FOLDER_NAME,_ENV=$ENV

Required env vars (set in .envrc): DAG_ROOT_FOLDER, COMPOSER_ENVIRONMENT, CONFIG_FOLDER_NAME, ENV.
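
An unset variable would leave its substitution empty in the command above, so a small preflight check can be useful. The sketch below is one possible way to do that, not part of the repository. The function name build_substitutions is hypothetical, and the example values passed to it are placeholders. Note that the command also reads PROJECT_ID and LOCATION, which this sketch does not check.

```python
import os
from typing import Mapping

# Required env vars, mapped to the Cloud Build substitution keys used above.
SUBSTITUTION_KEYS = {
    "DAG_ROOT_FOLDER": "_DAG_ROOT_FOLDER",
    "COMPOSER_ENVIRONMENT": "_COMPOSER_ENVIRONMENT",
    "CONFIG_FOLDER_NAME": "_CONFIG_FOLDER_NAME",
    "ENV": "_ENV",
}


def build_substitutions(env: Mapping[str, str] = os.environ) -> str:
    """Build the --substitutions value, failing if a required variable is unset or empty."""
    missing = [name for name in SUBSTITUTION_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required env vars: {', '.join(missing)}")
    return ",".join(f"{key}={env[name]}" for name, key in SUBSTITUTION_KEYS.items())


# Placeholder values for illustration; real values come from .envrc.
substitutions = build_substitutions(
    {
        "DAG_ROOT_FOLDER": "world_cup_qatar_elt_dbt",
        "COMPOSER_ENVIRONMENT": "dev-composer-env",
        "CONFIG_FOLDER_NAME": "config",
        "ENV": "dev",
    }
)
```

Calling build_substitutions() with no argument reads the current process environment, which is what a deployment wrapper would normally do after direnv has loaded .envrc.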