Scientific Dataset Catalog

Managing scientific datasets, their relationships, and metadata across research workflows can be complex and error-prone. The Scientific Dataset Catalog provides a centralized system for tracking datasets, their lineage, collections, and rich metadata throughout the research lifecycle.

The catalog lets you organize datasets into collections, track how datasets relate to each other (lineage relationships), and store rich metadata. A Python client provides programmatic access to all of these capabilities.
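To make the three core ideas concrete (datasets with metadata, collections, and lineage), here is a minimal conceptual sketch in plain Python. It does not use the catalog's Python client or reflect its actual schema: the `Dataset` and `Collection` classes, the `parents` field, and the `ancestors` helper are all hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset record (not the real schema)."""
    dataset_id: str
    metadata: dict = field(default_factory=dict)
    # Ids of the datasets this one was derived from (its lineage parents).
    parents: list = field(default_factory=list)


@dataclass
class Collection:
    """Hypothetical named grouping of dataset ids."""
    name: str
    dataset_ids: list = field(default_factory=list)


def ancestors(dataset_id, datasets):
    """Return every upstream dataset id reachable through parent links."""
    seen = []
    stack = list(datasets[dataset_id].parents)
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another lineage path
        seen.append(current)
        stack.extend(datasets[current].parents)
    return seen


# Three datasets forming a simple lineage chain: raw -> segmented -> features.
raw = Dataset("raw-images", metadata={"modality": "microscopy"})
segmented = Dataset("segmented", parents=["raw-images"])
features = Dataset("features", parents=["segmented"])

datasets = {d.dataset_id: d for d in (raw, segmented, features)}
study = Collection("example-study", dataset_ids=list(datasets))

lineage = ancestors("features", datasets)
print(lineage)
```

In this toy model, lineage is stored as parent links on each dataset, so walking those links from `features` recovers everything it was derived from. The real catalog may model these relationships differently; see the Schema Documentation for the actual structure.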

Getting Started

🐍 Using the Python Client

Want to integrate dataset catalog functionality into your Python workflows? → Quick Start | Full Documentation

📋 Understanding the Data Schema

Want to understand dataset metadata structure and relationships? → Schema Documentation

🤖 Claude Code Plugin

Prefer to work in Claude Code? Install the catalog plugin. → Installation

🔧 Contributing

Want to contribute to the codebase? → Development Guide

Claude Code Plugin

This repo ships a Claude Code plugin named catalog, distributed through the dataset-catalog marketplace defined in .claude-plugin/marketplace.json. To install it, run the following commands inside a Claude Code session:

/plugin marketplace add chanzuckerberg/dataset-catalog
/plugin install catalog@dataset-catalog

Quick Start

Ready to start using the Python client? The fastest way to get up and running:

→ Installation & Quick Start Guide

The guide walks you through installation, obtaining an API token, and making your first few API calls.

Documentation & Resources

📚 Complete Documentation

Python Client Usage Guide - Comprehensive guide covering datasets, collections, lineage, async usage, and error handling
Interactive Examples - Jupyter notebooks with step-by-step walkthroughs
API Token Setup - How to generate and use API tokens

🔗 Related Projects

Dataset Catalog API - The backend service this client connects to
Schema Documentation - Detailed data models and relationships

🤝 Contributing

Development Setup - Local development and testing
Issues & Feedback - Report bugs or request features

Code of Conduct

This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to opensource@chanzuckerberg.com.

