contributing
Thank you for your interest in contributing to BMLibrarian!
This document provides guidelines for contributing to the project. All contributions are welcome, from bug reports to new features.
- Code of Conduct
- Getting Started
- How to Contribute
- Development Workflow
- Coding Standards
- Testing Guidelines
- Documentation
- Pull Request Process
Be respectful, inclusive, and professional. We welcome contributors of all backgrounds and experience levels.
- Python 3.12 or higher
- PostgreSQL 14+ with pgvector extension
- Ollama for local LLM inference
- Git for version control
- uv package manager (recommended)
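The prerequisites above can be checked with a short script before you start. This is a minimal sketch, not part of BMLibrarian: the function name `find_missing` and the executable names (`psql`, `ollama`, `uv`) are assumptions about a typical installation, and the script does not verify the PostgreSQL version or the pgvector extension.

```python
"""Sketch: report which prerequisites appear to be missing on this machine."""
import logging
import shutil
import sys
from typing import Callable, List, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)

MIN_PYTHON: Tuple[int, int] = (3, 12)
# Executable names are assumptions; adjust them to your installation.
REQUIRED_TOOLS: Sequence[str] = ("git", "psql", "ollama")
RECOMMENDED_TOOLS: Sequence[str] = ("uv",)


def find_missing(
    python_version: Tuple[int, int],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable descriptions of missing prerequisites."""
    missing: List[str] = []
    if python_version < MIN_PYTHON:
        wanted = f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}"
        found = f"{python_version[0]}.{python_version[1]}"
        missing.append(f"Python {wanted}+ (found {found})")
    # A tool counts as missing when it cannot be located on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            missing.append(f"{tool} (required)")
    for tool in RECOMMENDED_TOOLS:
        if which(tool) is None:
            missing.append(f"{tool} (recommended)")
    return missing


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    problems = find_missing(sys.version_info[:2])
    for problem in problems:
        logger.warning("Missing prerequisite: %s", problem)
    if not problems:
        logger.info("All checked prerequisites were found")
```

The `which` parameter is injectable so the check can be unit-tested without touching the real `PATH`.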
- Fork the repository on GitHub
- Clone your fork:
    git clone https://github.com/YOUR_USERNAME/bmlibrarian.git
    cd bmlibrarian
- Add upstream remote:
    git remote add upstream https://github.com/hherb/bmlibrarian.git
- Install dependencies:
    uv sync --dev  # Includes development dependencies
- Set up database:
    createdb bmlibrarian_dev  # Use separate dev database
    cp test_database.env.example .env
    # Edit .env with dev database credentials
- Initialize database:
    uv run python initial_setup_and_download.py .env --skip-medrxiv --skip-pubmed
- Run tests to verify setup:
    uv run python -m pytest tests/ -v
- Search existing issues to avoid duplicates
- Use the bug report template on GitHub Issues
- Include:
  - Clear description of the bug
  - Steps to reproduce
  - Expected vs. actual behavior
  - Python version, OS, database version
  - Error messages and stack traces
  - Screenshots if applicable
- Check if feature already exists or is planned
- Use the feature request template
- Describe:
  - Use case and motivation
  - Proposed solution or API
  - Alternatives considered
  - Impact on existing functionality
Areas where contributions are especially welcome:
- New Agents - Create specialized AI agents
- Qt Plugins - Extend the GUI with new tabs
- Importers - Add support for new data sources
- Documentation - Improve guides and examples
- Tests - Increase test coverage
- Bug Fixes - Fix reported issues
- main - Stable release branch
- develop - Development branch (base for PRs)
- feature/your-feature - Feature branches
- bugfix/issue-number - Bug fix branches
# Update your fork
git fetch upstream
git checkout develop
git merge upstream/develop
# Create feature branch
git checkout -b feature/your-feature-name

- Make small, focused commits:
    git add src/bmlibrarian/agents/new_agent.py
    git commit -m "Add new agent for X functionality"
- Follow commit message conventions:
  - Use the imperative mood ("Add feature" not "Added feature")
  - Be concise but descriptive
  - Reference issues: "Fix #123: Description"
- Keep commits atomic - One logical change per commit
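The commit message conventions above can be checked mechanically, for example from a local Git hook. The sketch below is illustrative only: `check_commit_subject`, `referenced_issue`, the 72-character limit, and the small list of past-tense verbs are all assumptions, since the guidelines only ask for concise, descriptive subjects.

```python
import re
from typing import List, Optional

# Assumed limit; the guidelines only say "concise but descriptive".
MAX_SUBJECT_LENGTH = 72
# Small, non-exhaustive list of past-tense openers to flag (assumption).
PAST_TENSE_VERBS = {"added", "fixed", "changed", "removed", "updated"}
# Matches an issue reference such as the "#123" in "Fix #123: Description".
ISSUE_REFERENCE = re.compile(r"#(\d+)")


def check_commit_subject(subject: str) -> List[str]:
    """Return the convention problems found in a commit subject line."""
    stripped = subject.strip()
    if not stripped:
        return ["subject is empty"]
    problems: List[str] = []
    if len(stripped) > MAX_SUBJECT_LENGTH:
        problems.append(f"subject is longer than {MAX_SUBJECT_LENGTH} characters")
    first_word = stripped.split()[0]
    if first_word.lower() in PAST_TENSE_VERBS:
        problems.append(f"starts with past-tense verb '{first_word}'")
    return problems


def referenced_issue(subject: str) -> Optional[int]:
    """Return the first issue number referenced in the subject, if any."""
    match = ISSUE_REFERENCE.search(subject)
    return int(match.group(1)) if match else None
```

A subject such as "Add new agent for X functionality" passes with no problems, while "Added new agent" is flagged for its first word.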
# Run all tests
uv run python -m pytest tests/ -v
# Run specific test file
uv run python -m pytest tests/test_my_agent.py -v
# Run with coverage
uv run python -m pytest tests/ --cov=bmlibrarian --cov-report=html
# Test Qt GUI (requires X server or Xvfb)
uv run python -m pytest tests/gui/qt/ -v

# Fetch latest changes
git fetch upstream
# Rebase on develop
git rebase upstream/develop
# Or merge if rebasing is problematic
git merge upstream/develop

We follow PEP 8 with some exceptions:
- Line length: 100 characters (not 79)
- Imports: Group by standard library, third-party, local
- Docstrings: Google style (see below)
All functions and methods must include type hints:
from typing import Any, Callable, Dict, List, Optional, Tuple

def process_documents(
    documents: List[Dict[str, Any]],
    min_score: float = 0.7,
    callback: Optional[Callable[[int, int], None]] = None
) -> Tuple[List[Dict[str, Any]], Dict[str, float]]:
    """
    Process documents and return scored results.

    Args:
        documents: List of document dictionaries
        min_score: Minimum relevance score (0.0-1.0)
        callback: Optional progress callback (current, total)

    Returns:
        Tuple of (scored_documents, statistics)
    """
    # Implementation...

Use Google-style docstrings for all public functions and classes:
def search_literature(
    query: str,
    max_results: int = 100
) -> List[Dict[str, Any]]:
    """
    Search biomedical literature databases.

    Performs full-text search across PubMed and medRxiv databases
    using PostgreSQL text search.

    Args:
        query: Natural language search query
        max_results: Maximum number of results to return (default: 100)

    Returns:
        List of document dictionaries with keys:
        - id: Document ID
        - title: Document title
        - abstract: Document abstract
        - publication_date: Publication date (ISO format)

    Raises:
        ValueError: If query is empty or max_results <= 0
        ConnectionError: If database is unavailable

    Examples:
        >>> docs = search_literature("COVID-19 vaccine", max_results=10)
        >>> print(f"Found {len(docs)} documents")
        Found 10 documents

    Note:
        Results are ordered by publication date (newest first).
    """

Use named constants or configuration:
# BAD
if score > 0.7:
    documents = documents[:50]

# GOOD
MIN_RELEVANCE_THRESHOLD = 0.7
DEFAULT_MAX_DOCUMENTS = 50

if score > MIN_RELEVANCE_THRESHOLD:
    documents = documents[:DEFAULT_MAX_DOCUMENTS]

Use Python's logging module, never print():
import logging

logger = logging.getLogger(__name__)

def process_data(data):
    logger.info(f"Processing {len(data)} items")
    try:
        result = complex_operation(data)
        logger.info("Processing completed successfully")
        return result
    except Exception as e:
        logger.error(f"Processing failed: {e}", exc_info=True)
        raise

# Standard library imports
import json
import logging
from pathlib import Path
from typing import Dict, List, Optional
# Third-party imports
from PySide6.QtWidgets import QWidget, QVBoxLayout
from PySide6.QtCore import Signal
import psycopg
# BMLibrarian imports
from bmlibrarian.config import get_config
from bmlibrarian.database import get_db_manager
from bmlibrarian.agents.base import BaseAgent

"""
Module-level docstring describing purpose.
"""

# Standard library imports
# Third-party imports
# BMLibrarian imports

# Module-level constants
DEFAULT_BATCH_SIZE = 50
MIN_CONFIDENCE_THRESHOLD = 0.7

# Module-level logger
logger = logging.getLogger(__name__)

class MyClass:
    """Class definition with docstring."""

    def __init__(self, ...):
        """Constructor docstring."""
        # Implementation

    def public_method(self, ...) -> ReturnType:
        """Public method with full docstring."""
        # Implementation

    def _private_method(self, ...) -> ReturnType:
        """Private method (still needs docstring)."""
        # Implementation

"""
Unit tests for CustomAgent.
"""

import unittest
from unittest.mock import Mock, patch

from bmlibrarian.agents.custom_agent import CustomAgent

class TestCustomAgent(unittest.TestCase):
    """Test suite for CustomAgent."""

    def setUp(self):
        """Set up test fixtures."""
        self.agent = CustomAgent(
            model="gpt-oss:20b",
            temperature=0.1,
            show_model_info=False
        )

    def test_agent_initialization(self):
        """Test agent initializes correctly."""
        self.assertEqual(self.agent.model, "gpt-oss:20b")
        self.assertEqual(self.agent.temperature, 0.1)

    @patch('bmlibrarian.agents.base.BaseAgent._make_ollama_request')
    def test_process_data(self, mock_request):
        """Test data processing."""
        # Setup mock
        mock_request.return_value = {"result": "success"}

        # Run test
        result = self.agent.process({"data": "test"})

        # Assertions
        self.assertEqual(result["result"], "success")
        mock_request.assert_called_once()

    def tearDown(self):
        """Clean up after tests."""
        pass

if __name__ == '__main__':
    unittest.main()

- Unit Tests - Test individual functions/methods in isolation
- Integration Tests - Test component interactions
- Edge Cases - Test boundary conditions
- Error Handling - Test exception cases
- Mock External Dependencies - Mock Ollama, database calls
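The edge-case, error-handling, and mocking items above can be illustrated with a small self-contained test module. Everything here is hypothetical: `filter_by_score` and `summarize` are stand-ins written for this sketch, not BMLibrarian functions, and the injected `llm_call` only imitates an external service such as Ollama.

```python
import unittest
from typing import Any, Callable, Dict, List
from unittest.mock import Mock

MIN_RELEVANCE_THRESHOLD = 0.7


def filter_by_score(
    documents: List[Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep documents whose score is >= min_score."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    return [doc for doc in documents if doc["score"] >= min_score]


def summarize(text: str, llm_call: Callable[[str], Dict[str, str]]) -> str:
    """Hypothetical helper: delegate to an injected LLM client."""
    return llm_call(text)["summary"]


class TestFilterByScore(unittest.TestCase):
    def test_empty_input(self):
        """Edge case: an empty list yields an empty list."""
        self.assertEqual(filter_by_score([], MIN_RELEVANCE_THRESHOLD), [])

    def test_boundary_score_is_kept(self):
        """Edge case: a score exactly at the threshold is kept."""
        docs = [{"id": 1, "score": 0.7}, {"id": 2, "score": 0.69}]
        kept = filter_by_score(docs, MIN_RELEVANCE_THRESHOLD)
        self.assertEqual(kept, [{"id": 1, "score": 0.7}])

    def test_invalid_threshold_raises(self):
        """Error handling: an out-of-range threshold is rejected."""
        with self.assertRaises(ValueError):
            filter_by_score([], 1.5)


class TestSummarize(unittest.TestCase):
    def test_llm_is_mocked(self):
        """External dependency: the LLM call is replaced by a Mock."""
        fake_llm = Mock(return_value={"summary": "short"})
        self.assertEqual(summarize("long text", fake_llm), "short")
        fake_llm.assert_called_once_with("long text")
```

pytest collects `unittest.TestCase` classes, so a module like this runs with the same `uv run python -m pytest` commands used elsewhere in this guide.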
Aim for >80% code coverage for new code:
uv run python -m pytest tests/ --cov=bmlibrarian --cov-report=html
open htmlcov/index.html

Document:
- All public APIs
- Plugin development guides
- Agent development guides
- Configuration options
- New features
- Docstrings - In-code documentation
- User Guides - doc/users/ directory
- Developer Guides - doc/developers/ directory
- Wiki - High-level guides and tutorials
- README - Project overview and quick start
- CHANGELOG - Version history and changes
- Use Markdown for all documentation
- Include code examples
- Add screenshots for GUI features
- Keep documentation up-to-date with code changes
- Use clear, concise language
- Avoid jargon (or explain it)
- Update your branch with latest develop:
    git fetch upstream
    git rebase upstream/develop
- Run tests:
    uv run python -m pytest tests/ -v
- Check code style:
    uv run python -m black src/bmlibrarian tests/
    uv run python -m flake8 src/bmlibrarian tests/
- Update documentation if needed
- Add tests for new functionality
- Push your branch:
    git push origin feature/your-feature-name
- Create PR on GitHub:
  - Base branch: develop
  - Compare branch: feature/your-feature-name
  - Use the PR template
  - Link related issues
- PR Description Should Include:
  - Summary of changes
  - Motivation and context
  - Type of change (bug fix, feature, etc.)
  - Testing performed
  - Checklist completion
## Description
Brief description of changes
## Motivation
Why is this change needed?
## Type of Change
- [ ] Bug fix (non-breaking change)
- [ ] New feature (non-breaking change)
- [ ] Breaking change
- [ ] Documentation update
## Testing
- [ ] Tests added/updated
- [ ] All tests pass
- [ ] Manual testing performed
## Checklist
- [ ] Code follows style guidelines
- [ ] Documentation updated
- [ ] No new warnings
- [ ] Added/updated tests
- [ ] All tests pass

- Automated checks must pass (CI/CD)
- Code review by maintainers
- Address feedback with new commits
- Squash and merge when approved
- Delete your feature branch:
    git branch -d feature/your-feature-name
    # If Git reports the branch as not fully merged (possible after a squash merge), use -D instead of -d
    git push origin --delete feature/your-feature-name
- Update your fork:
    git fetch upstream
    git checkout develop
    git merge upstream/develop
See Plugin Development Guide for Qt plugins.
For a new AI agent:
- Inherit from BaseAgent:
    from bmlibrarian.agents.base import BaseAgent

    class MyAgent(BaseAgent):
        def get_agent_type(self) -> str:
            return "my_agent"

        def process(self, data):
            # Implementation
            ...
- Add configuration to config.py
- Write tests in tests/test_my_agent.py
- Add documentation in doc/developers/
- Update README with usage example
See Plugin Development Guide for complete details.
Quick start:
- Create directory: src/bmlibrarian/gui/qt/plugins/my_plugin/
- Implement plugin.py with create_plugin() function
- Inherit from BaseTabPlugin
- Add to gui_config.json
- Test in Qt GUI

- Create importer: src/bmlibrarian/importers/new_source_importer.py
- Add source to database: INSERT INTO source ...
- Implement import logic
- Create CLI tool: new_source_import_cli.py
- Update documentation
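As a rough illustration of the importer steps above, the sketch below pairs an importer class with a small command-line entry point. It is not the project's importer API: `NewSourceImporter`, its methods, the `new_source` identifier, and the JSON input format are all hypothetical, and database persistence (including the `INSERT INTO source ...` step) is deliberately left out.

```python
import argparse
import json
import logging
from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional, Sequence

logger = logging.getLogger(__name__)

SOURCE_NAME = "new_source"  # hypothetical source identifier


class NewSourceImporter:
    """Hypothetical importer that normalizes raw records from a new source."""

    def normalize(self, raw: Dict[str, Any]) -> Dict[str, Any]:
        """Map one raw record onto a small set of document fields."""
        title = (raw.get("title") or "").strip()
        if not title:
            raise ValueError("record has no title")
        return {
            "source": SOURCE_NAME,
            "title": title,
            "abstract": (raw.get("abstract") or "").strip(),
            "publication_date": raw.get("publication_date"),
        }

    def import_records(
        self, records: Iterable[Dict[str, Any]]
    ) -> List[Dict[str, Any]]:
        """Normalize records, skipping and logging the invalid ones."""
        imported: List[Dict[str, Any]] = []
        for raw in records:
            try:
                imported.append(self.normalize(raw))
            except ValueError as exc:
                logger.warning("Skipping record: %s", exc)
        return imported


def main(argv: Optional[Sequence[str]] = None) -> int:
    """CLI entry point: read a JSON list of records and normalize them."""
    parser = argparse.ArgumentParser(description="Import records from a JSON file.")
    parser.add_argument("path", type=Path, help="JSON file containing a list of records")
    args = parser.parse_args(argv)
    records = json.loads(args.path.read_text(encoding="utf-8"))
    imported = NewSourceImporter().import_records(records)
    logger.info("Imported %d of %d records", len(imported), len(records))
    return 0
```

Keeping normalization separate from the CLI makes the import logic testable without files or a database; a real importer would add the storage step where `import_records` collects its results.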
- Questions: GitHub Discussions
- Bugs: GitHub Issues
- Chat: Check repository for community chat links
- Documentation: BMLibrarian Wiki
By contributing, you agree that your contributions will be licensed under the same license as the project.
Thank you for contributing to BMLibrarian! 🙏
Your contributions help make biomedical research more accessible and efficient for everyone.