Skip to content

awslabs/hcls-agent-skills

Repository files navigation

Agent Skills for Healthcare & Life Sciences

License: MIT-0 Skills Domains Works with

A curated collection of agent skills for healthcare and life sciences (HCLS) workflows, following the Agent Skills open standard. These 38 domain skills make any AI agent measurably better at HCLS questions. In a 410-prompt pairwise evaluation (380 single-skill + 30 cross-skill) judged by Claude Opus 4.7, skills win 70–86% of head-to-head comparisons depending on harness, with up to 85% critical thinking and up to 87% scientific accuracy win rates. Skills work on 20+ platforms including Kiro, Amazon Quick, Amazon Bedrock AgentCore, AWS Strands SDK, Claude Code, OpenAI Codex, CrewAI, LangChain, and more.

Skills fall into two categories: reasoning skills encode methodology, decision frameworks, and domain expertise to guide the agent's thinking (e.g., ACMG variant classification, target trial emulation, HEDIS measure specification); pipeline skills encode tool-specific commands, code patterns, and parameter tables that produce runnable artifacts (e.g., GATK4 variant calling, RDKit cheminformatics, HL7v2 parsing). Together they make any AI agent a competent collaborator across genomics, single-cell analysis, medical imaging, protein structure, translational research, pharmacoepidemiology, clinical data, drug discovery, proteomics, clinical data review, multi-omics integration, healthcare operations, machine learning, and AWS architecture.

Why Agent Skills for Healthcare & Life Sciences?

The line between scientist and builder is fading. Molecular biologists want to classify variants from sequencing data, run single-cell analysis pipelines, and design validation studies all without waiting for IT to scope a project. AI agents like Amazon Quick and Kiro make this possible: universal tools that let a researcher go from question to working system in minutes.

Capability is no longer the bottleneck; methodology is. Every generation of LLMs grows more powerful and knowledgeable. They know facts but not procedures; the multi-step decision chains that domain experts internalize over years. They have "read every textbook" but never worked through the nuance of biology or patient care. When an epidemiologist asks an AI agent to emulate a target trial from claims data, adjust for time-varying confounding, or validate a risk stratification model, reproducible output must be both methodologically and scientifically sound and not just plausible. In regulated domains like healthcare and life sciences, the gap between plausible and correct has patient safety and compliance consequences.

Agent skills close this gap. A skill encodes the structured reasoning that experts follow, such as exact criteria, specific thresholds, failure modes that only experience teaches. Think of it as the computational equivalent of a lab protocol: instead of sitting in a binder, it actively guides the agent in real time. Three properties make this better than the alternatives:

  1. Better than long prompts. Real research tasks require multi-step reasoning. Benchmarks like LifeSciBench show 79% of realistic life science tasks involve multiple decision steps. A monolithic system prompt is brittle, unversioned, and impossible to share across a team. A skill gives the agent a structured starting point: pre-encoded decision logic that's reproducible across users, sessions, and platforms.

  2. Portable and updateable. These skills follow the open Agent Skills standard and work across 20+ platforms (Kiro, Quick, Strands Agent SDK, AgentCore, Claude Code, Codex). When Center of Medicare and Medicaid Services publishes new HCC coefficients or ACMG updates classification criteria, you edit the skill to reflect that knowledge for all future work.

  3. Encodes what textbooks don't. Skills capture tacit and institutional knowledge: which bioinformatics parameters work for your sequencer, which reagent vendor to trust, which clinical codes your payer actually accepts. The kind of judgment that lives in a senior scientist's head and takes years to transfer.

The biggest gains appear exactly where unguided agents fail most: on the hard, multi-step regulatory and methodological questions where getting it wrong has consequences. Outputs become reproducible, not just occasionally correct.

This is what raises the floor for every scientist without lowering the ceiling. Skills don't skip the learning; they provide scaffolding so a new postdoc's analysis meets the same methodological standard as the PI's while they build understanding of why each step matters. The work shifts from instructing the agent ("here's how to avoid immortal time bias") to exercising judgment at a higher level ("is a new-user active-comparator design even the right approach?").

Skill Catalog

Click to expand full catalog (38 skills across 11 domains)

Genomics

Name Category Description
genomic-variant-interpretation reasoning ACMG/AMP variant classification, ClinVar evidence, computational predictor thresholds
variant-calling pipeline BWA-MEM2, GATK4 HaplotypeCaller, joint genotyping, VQSR, Mutect2
rna-seq-analysis pipeline STAR/Salmon alignment, featureCounts, DESeq2 differential expression
ngs-quality-control pipeline FastQC, fastp, Picard metrics, mosdepth coverage, MultiQC

Single-Cell Analysis

Name Category Description
biomarker-discovery reasoning Prognostic vs predictive biomarkers, feature selection, validation design
scrna-seq-pipeline pipeline Scanpy/AnnData processing: QC, normalization, HVG, PCA, UMAP, Leiden
cell-type-annotation pipeline CellTypist, SingleR, marker-based annotation, label transfer
trajectory-analysis pipeline Pseudotime, PAGA, RNA velocity (scVelo), CellRank fate probabilities

Medical Imaging

Name Category Description
imaging-study-design reasoning Preprocessing strategy selection, DICOM de-ID risk, imaging biomarker choice
digital-pathology pipeline TIAToolbox, H-optimus-0, whole-slide image tiling and inference
dicom-processing pipeline DICOM parsing, NIfTI conversion, de-identification, BIDS organization
radiology-preprocessing pipeline HD-BET skull stripping, N4 bias correction, ANTs/FSL registration

Protein Structure

Name Category Description
structure-based-drug-design reasoning Druggability assessment, docking strategy, scoring function selection
protein-structure-analysis pipeline PDB parsing, RMSD, Ramachandran, binding pocket detection (fpocket)
molecular-docking pipeline AutoDock Vina receptor/ligand prep, grid box, virtual screening

Cross-Domain

Name Category Description
translational-research reasoning T0-T4 translation stages, target validation, neuro hypothesis validation
ml-researcher reasoning ML experiment design, evaluation strategy, reporting standards for HCLS
aws-genai-ml-architect reasoning AWS service selection, HIPAA compliance, MLOps for regulated workloads

Pharmacoepidemiology & Real-World Data

Name Category Description
pharmacoepidemiology reasoning New-user designs, target trial emulation, immortal time bias, propensity scores
rwd-cohort-analysis pipeline Claims cohort identification, PDC adherence, Kaplan-Meier, Cox models, PS matching

Clinical Data

Name Category Description
clinical-data-standards reasoning MedDRA hierarchy, ICD-10 structure, SNOMED CT, LOINC, terminology mapping
ehr-data-parsing pipeline HL7v2 message parsing, FHIR R4 extraction, format conversion, data quality

Drug Discovery

Name Category Description
drug-repurposing reasoning Target-based vs phenotype-based repurposing, DGIdb, OpenTargets, translatability
cheminformatics pipeline RDKit descriptors, Lipinski/PAINS filtering, Morgan fingerprints, MMP analysis

Proteomics

Name Category Description
quantitative-proteomics reasoning LFQ/TMT/DIA strategy, imputation methods, normalization, DE interpretation

Clinical Data Review

Name Category Description
cdisc-compliance reasoning SDTM/ADaM rules, FDA/PMDA expectations, controlled terminology, define.xml
edc-data-validation pipeline EDC export validation, range checks, SDTM structure, CT validation

Multi-Omics Integration

Name Category Description
multi-omics-integration reasoning Early/intermediate/late integration, batch correction, partial overlap
multi-omics-pipeline pipeline ID mapping, ComBat batch correction, MOFA2, GSEA enrichment

Healthcare Operations

Name Category Description
claims-billing-rules reasoning CMS billing rules, NCCI edits, FWA patterns, upcoding/unbundling detection
claims-analytics pipeline X12 837/835 parsing, provider profiling, NCCI validation, duplicate detection
risk-adjustment-strategy reasoning CMS-HCC V24/V28, disease hierarchies, RAF methodology, coding gap strategy
risk-adjustment pipeline ICD-10-to-HCC crosswalk, RAF score calculation, Rx/lab proxy gap detection
pa-clinical-policy reasoning Step therapy, medical necessity, LCD/NCD rules, Da Vinci PAS, appeals
pa-decision-automation pipeline X12 278/FHIR PAS parsing, rules engine, ML classifier, denial analysis
hedis-measure-specification reasoning HEDIS measure structure, denominator/numerator/exclusion logic, NCQA audit, care gaps
risk-stratification-indices reasoning LACE, Charlson, Elixhauser indices, SDOH Z-codes, ADI scoring
quality-measures pipeline HEDIS calculation, enrollment checks, care gap detection, utilization rates

Installation

npx skills add awslabs/hcls-agent-skills

Or clone the repository for full control:

git clone https://github.com/awslabs/hcls-agent-skills.git
cd ./hcls-agent-skills
./install.sh --target <platform>

Platform Compatibility

These skills follow the Agent Skills open standard (SKILL.md + YAML frontmatter) and work across multiple AI assistants:

Platform Install Command How It Works
Kiro IDE ./install.sh --target kiro Copies skills to ~/.kiro/skills/ and agent config to ~/.kiro/agents/
Kiro CLI ./install.sh --target kiro Same as above; switch with /agent hcls
Claude Code ./install.sh --target claude-code Symlinks skills into .claude/skills/ for auto-discovery
Quick Desktop ./install.sh --target quick-desktop Prints upload instructions (GUI-only: Settings → Skills → Upload)
GitHub Copilot, Cursor, Codex, Gemini CLI, 50+ others npx skills add . Universal CLI from vercel-labs/skills
CrewAI skills=["./skills"] Native parameter on Agent class
LangChain / LangGraph Documented pattern Skills as prompt-driven specializations
AWS Strands SDK AgentSkills + plugins=[skills] Built-in plugin, deploys to AgentCore
Claude Messages API Upload via Skills Management API First-class container parameter with CRUD

Kiro (IDE & CLI)

./install.sh --target kiro

# In Kiro CLI, switch to the unified agent:
/agent hcls

Or load individual skills on the fly:

/context add ./hcls-agent-skills/skills/variant-calling/SKILL.md

Multi-Agent Mode (Kiro)

For context-efficient specialist routing (loads ~15K tokens per specialist instead of ~80K for all 38 skills):

./install.sh --target kiro --mode multiagent

# Switch to the coordinator:
/agent hcls-multiagent

The coordinator routes queries to 8 domain specialists:

Specialist Domain
hcls-genomics Variant classification, NGS pipelines, RNA-seq
hcls-omics Single-cell, proteomics, multi-omics, biomarkers
hcls-imaging DICOM, digital pathology, radiology preprocessing
hcls-drug-discovery Docking, cheminformatics, drug repurposing, translational
hcls-clinical-data HL7/FHIR, CDISC/SDTM, EDC validation
hcls-rwe Pharmacoepidemiology, RWD cohorts, risk indices
hcls-healthcare-ops Claims billing, risk adjustment, prior auth, HEDIS
hcls-ai-infra ML experiment design, AWS architecture

Amazon Quick

Amazon Quick loads skills via GUI upload of individual SKILL.md files:

# See which skills to upload and instructions
./install.sh --target quick-desktop

Upload path: Settings → Capabilities → Skills → Upload → select a SKILL.md file. Start with one domain and add more as needed. Follow the instructions in Skills and agents in the Amazon Quick documentation.

AWS Strands Agents SDK

For the AWS Strands Agents SDK, load skills directly in your Python code:

pip install strands-agents
from strands import Agent, AgentSkills
from strands.models.bedrock import BedrockModel

model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514", region_name="us-east-1")
skills = AgentSkills(skills="./hcls-agent-skills/skills/")
agent = Agent(model=model, plugins=[skills])

Skills integrate natively with Strands Agents SDK — no wrappers or adapters needed. The AgentSkills plugin discovers and loads all skills from the specified directory. Skills activate automatically based on query content.

Amazon AgentCore

Once your skill-equipped agent works locally, you can move it to production. Amazon Bedrock AgentCore provides an alternative path to inject skills into hosted agents. In addition to embedding them in the Strands agent code, you can configure skills at the environment level so they're available to any agent running in that harness.

Alternatively you can load skills to AgentCore harness, by following Skills in the AgentCore documentation. AgentCore harness provides managed hosting, Auto Scaling, security boundaries, and observability capabilities without managing infrastructure.

Claude Code

# Install into your project (symlinks to repo, so updates propagate)
npx skills add awslabs/hcls-agent-skills -a claude-code

Skills auto-activate when Claude Code detects relevant topics in your prompts. No agent config needed — Claude Code discovers skills from .claude/skills/ automatically.

Quick Start

Invoke a skill by describing the task in natural language — your agent will match on the skill's triggers:

> I have a new candidate biomarker for early-stage NSCLC. Help me design a validation study.
[agent loads translational-research skill]

Or request a pipeline skill explicitly:

> Use the digital-pathology skill to tile this SVS file at 20x with 512px tiles.

Evaluation

An automated evaluation suite measures whether skills improve agent responses. See eval/README.md for full documentation.

Quick run:

uv venv --python 3.12 && uv pip install -e .
python -m eval.run --parallel 2 --version v4 --pairwise
python eval/build_review.py
open eval/results/review.html

Latest results (pairwise, 410 prompts — 38 skills + 3 cross-skill categories): Overall win rate 70–86% (v9 Strands: 85.9%, d=0.97; v3 kiro-cli: 69.5%, d=0.39), critical thinking 77–85%. Eval reports show results across 8 versions testing different harness configurations. See eval/TECHNICAL_REPORT.md and eval/HARNESS_EFFECTS.md for the full analysis.

Customization

Want to modify a skill for your org, or create a new one? See CUSTOMIZING.md and SKILL_DESIGN_GUIDE.md for practical guidance on adapting skills to your workflows.

Contributing

New skills and improvements are welcome. See CONTRIBUTING.md and the QUALITY_CHECKLIST.md before submitting.

License

MIT-0

Disclaimer

This solution is for demonstrative purposes only. It is not for clinical use and is not a substitute for professional medical advice, diagnosis, or treatment. The associated skills, and source code are not intended for production. It is each customers' responsibility to determine whether they are subject to HIPAA, and if so, how best to comply with HIPAA and its implementing regulations. Before using AWS in connection with protected health information, customers must enter an AWS Business Associate Addendum (BAA) and follow its configuration requirements.

About

Agent skills for healthcare and life sciences: genomics, imaging, claims, drug discovery, and more. Works with Amazon Quick, Kiro, Amazon AgentCore, AWS Strands SDK, Claude Code, Codex, and any Agent Skills-compatible platform.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Contributors