alwyndsouza/README.md

Alwyn D'Souza

Enterprise Data & AI Platform Architect

Building production-grade data systems that scale for enterprise teams.

20+ years designing and delivering data platforms across regulated industries — telco, banking, and retail. Based in Sydney, Australia.


Expertise

Lakehouse Architecture

Scalable cloud-native analytics using Databricks, Delta Lake, Spark, and modern lakehouse patterns.

DataOps & Platform Engineering

Automated CI/CD pipelines, observability frameworks, governance systems, and reliable engineering workflows.

AI Agent Systems

Agentic workflows, orchestration frameworks, MCP tooling, and AI-enabled engineering systems.

Enterprise Data Platforms

Modernising enterprise analytics ecosystems through scalable architecture and platform engineering.


Technical Stack

Data Engineering & Lakehouse

Databricks · dbt · Apache Spark · Delta Lake · Apache Iceberg · dltHub

Cloud & Platform Engineering

AWS · GCP · Terraform · Docker · Kubernetes

Streaming & Messaging

Apache Kafka · Apache Flink · Redpanda · RisingWave

Programming & Automation

Python · SQL · GitHub Actions · AWS Step Functions


Capability Areas

  • Data Engineering & Lakehouse: Data Mesh · AWS Glue · Lake Formation · Iceberg · Databricks · dbt · Redshift · GCP · dltHub
  • DataOps: Data Contracts · CI/CD for Pipelines · Semantic Layers · End-to-End Observability · Incident & Change Management
  • AI & ML Engineering: AIOps · Production LLM Integration · Agentic Pipelines (MCP) · RAG · AI Guardrails for Data Quality
  • MLOps: Pipeline Orchestration · Model Lifecycle Governance · ML Observability
  • Streaming: Flink · Kafka · Redpanda · RisingWave
  • Infra & Orchestration: Terraform · Step Functions · GitHub Actions · dlt
  • Leadership: COE Capability Uplift · Vendor Engagement (dbt Labs · AWS · Databricks) · Technical Mentorship

Architecture Portfolio

Reference architectures across lakehouse design, AI agent orchestration, data mesh, dbt transformation lineage, and MLOps pipelines.

👉 alwyndsouza.github.io


Latest Technical Articles

Articles published on data engineering, AI systems, and platform architecture.


Philosophy

Build systems that scale.
Automate relentlessly.
Design with architecture in mind.
Engineer platforms that enable teams.


Connect

Pinned

  1. rp-dbt-rw-fraud-monitor Public

    SQL-first real-time fraud detection using Redpanda, dbt, RisingWave, and Grafana

    Python 1

  2. dbt-ci-cd Public

    A production-ready framework for dbt CI/CD that automates code validation, testing, and deployment workflows to maintain a resilient and scalable data platform.

    1 1

  3. dbt-conversation-ai-local Public

    Conversational AI for dbt: A Streamlit-based local agent powered by Ollama and MCP to query, document, and analyze dbt semantic models and metrics in a private environment.

    Python 2

  4. mds-databricks-semantic-layer Public

    A production-grade Modern Data Stack reference implementation using dlt, dbt-core, and Databricks Unity Catalog, featuring a governed semantic layer with MetricFlow.

    Python 1

  5. mds-duckdb-semantic-layer Public

    A local-first Modern Data Stack (MDS) reference architecture using dlt for ingestion, dbt for transformation, and DuckDB as the high-performance compute engine and semantic layer.

    Python 2 1

  6. agile-story-skills Public

    AI agent skills for agile story writing, story splitting, problem framing, and sprint goal generation.

    JavaScript 1