Research Intelligence

Pancreatic Signal now leads with Research Intelligence as its primary product narrative: a cited pancreatic oncology watchtower for documents, digests, opportunities, experiments, and case-facing research briefs. Explainable triage, benchmark proof, and trial upkeep remain downstream workflow surfaces of that discovery engine.

Execution companion:

RESEARCH_INTELLIGENCE_EXECUTION_PLAN.md

Purpose

Research Intelligence exists to:

organize pancreatic oncology signals from literature, trials, guidance, regulatory updates, and open-source activity
turn those signals into durable artifacts instead of one-off chat output
help contributors spot benchmark gaps, rule gaps, trial-catalog updates, and tooling opportunities
inform case review with cited research briefs without changing case scores automatically
give new contributors a discovery-first entry point into the project before they touch the applied workflow surfaces

What Ships Today

The current implementation includes:

a new API namespace at /api/v1/research-intel/*
a new web workspace at /research-intel
seeded catalogs in data/research/ for sources, topics, a lightweight graph, and sample documents
persisted records for sources, runs, run items, documents, evidence, topics, digests, and opportunities
manual ingest and digest scripts plus Make targets
digest artifacts written under artifacts/research-intel/
case-level research briefs linked from the existing case detail page
discovery-ingest source health, document provenance, and novelty scoring
fixture-backed connector feeds for reproducible validation plus opt-in live connector scaffolding
a schedule-aware watchtower layer with due-source planning, due-only ingest, and a dedicated /research-intel/schedule view
graph-backed entity resolution plus a /research-intel/graph workspace for active node and edge inspection
research-first landing, onboarding, and handoff documentation so contributors meet the system through discovery work first

Current status:

the first slice is manually triggered and now supports discovery-ingest modes
public read endpoints are open for OSS exploration
run execution and opportunity promotion require authenticated operator roles

Local Workflow

Seed the watchtower and generate a digest:

make research-intel-refresh

Or run the two phases independently:

make research-intel-ingest
make research-intel-digest

To inspect the watchtower schedule or run only due sources:

make research-intel-schedule
make research-intel-ingest-due
make research-intel-watchtower

For the reproducible discovery-ingest path:

make research-intel-discovery-fixture

For opt-in live connector attempts on supported sources:

make research-intel-discovery-live

Then explore:

/research-intel for dashboard and run health
/research-intel/schedule for due-source planning, live-ready connector coverage, and watchtower cadence
/research-intel/documents for normalized documents, citations, and topic tags
/research-intel/graph for graph entities, edges, and active node heat
/research-intel/digests for council-backed digest output
/research-intel/opportunities for human-gated contribution proposals

To run a safe experiment against the highest-confidence supported benchmark or rule opportunity:

python scripts/run_research_intel_experiment.py --json

Pipeline

The current workflow follows five explicit phases.

1. Collect

load the source catalog from data/research/sources.json
select fixture-backed or live-discovery pancreatic oncology documents
normalize identifiers and source metadata
track connector mode, source health, and provenance
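
As an illustration of the collect phase, the sketch below loads a source catalog and normalizes each entry. It is not the repository's loader: the schema of data/research/sources.json is not shown in this document, so the assumption that the file holds a JSON list, the field names (`id`, `url`, `connector_mode`, `enabled`), and both helper functions are hypothetical.

```python
import json
from pathlib import Path


def normalize_source(raw: dict) -> dict:
    """Normalize one catalog entry. All field names are hypothetical."""
    return {
        "id": str(raw["id"]).strip().lower(),
        "url": str(raw.get("url", "")).strip(),
        # Assumption: default to the reproducible fixture path unless
        # a source explicitly opts in to another connector mode.
        "connector_mode": raw.get("connector_mode", "fixture"),
        "enabled": bool(raw.get("enabled", True)),
    }


def load_source_catalog(path: Path) -> list[dict]:
    """Read a JSON list of source definitions and normalize each entry."""
    entries = json.loads(path.read_text(encoding="utf-8"))
    return [normalize_source(entry) for entry in entries]
```

Calling `load_source_catalog(Path("data/research/sources.json"))` would return the normalized entries, provided the real file matches the assumed shape.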

2. Structure

map documents onto topic watchlists from data/research/topics.json
extract cited evidence spans and claim text
compute lightweight relevance scores and topic heat
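
One way to picture lightweight relevance scoring and topic heat is plain term matching. The sketch below is an assumption about the general shape, not the project's scoring code: the `relevance_score` and `topic_heat` names, the term-fraction formula, and the 0.5 threshold are all illustrative.

```python
from collections import Counter


def relevance_score(text: str, watch_terms: list[str]) -> float:
    """Fraction of a topic's watch terms that appear in the document text."""
    if not watch_terms:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for term in watch_terms if term.lower() in lowered)
    return hits / len(watch_terms)


def topic_heat(
    documents: list[str],
    topics: dict[str, list[str]],
    threshold: float = 0.5,  # hypothetical cut-off
) -> Counter:
    """Count, per topic, the documents whose relevance meets the threshold."""
    heat: Counter = Counter()
    for text in documents:
        for topic, terms in topics.items():
            if relevance_score(text, terms) >= threshold:
                heat[topic] += 1
    return heat
```

A substring match like this is deliberately crude; it is only meant to show how per-document scores can roll up into a per-topic count.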

3. Deliberate

persist stage 1 independent opinions
persist stage 1 open questions, evidence gaps, confidence labels, and proposed opportunity types
persist stage 2 ranking, critique, peer review, and confidence adjustments
persist stage 3 chairman synthesis with overall confidence, next experiments, and promotion guardrails
record disagreement instead of hiding it
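
The idea of recording disagreement rather than hiding it can be sketched with two small data classes. This is a simplified illustration that skips stage 2 entirely; the class names, the three-label confidence scale, and the rule that the synthesis takes the most cautious label are assumptions, not the persisted council payload.

```python
from dataclasses import dataclass, field

CONFIDENCE_ORDER = ["low", "medium", "high"]  # assumed label set


@dataclass
class Opinion:
    """A stage 1 independent opinion (hypothetical shape)."""
    member: str
    claim: str
    confidence: str
    open_questions: list[str] = field(default_factory=list)


@dataclass
class Synthesis:
    """A stage 3 chairman synthesis (hypothetical shape)."""
    overall_confidence: str
    open_questions: list[str]
    disagreements: list[str]


def synthesize(opinions: list[Opinion]) -> Synthesis:
    """Merge opinions, keeping any spread in confidence as a disagreement."""
    if not opinions:
        raise ValueError("at least one stage 1 opinion is required")
    labels = sorted({o.confidence for o in opinions}, key=CONFIDENCE_ORDER.index)
    disagreements = []
    if len(labels) > 1:
        disagreements.append("confidence labels differ: " + ", ".join(labels))
    questions = sorted({q for o in opinions for q in o.open_questions})
    # Assumption: the synthesis adopts the most cautious label.
    return Synthesis(labels[0], questions, disagreements)
```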

4. Publish

create digest records
write JSON and Markdown artifacts under artifacts/research-intel/
keep claims citation-backed
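
A minimal sketch of the publish step, assuming a digest is a dictionary with `id`, `title`, and a list of `claims` that each carry `text` and `citations`. Those keys, the function names, and the output file naming are hypothetical; only the JSON-plus-Markdown pairing and the citation requirement come from the list above.

```python
import json
from pathlib import Path


def render_digest_markdown(digest: dict) -> str:
    """Render a digest as Markdown, refusing any claim without citations."""
    lines = [f"# {digest['title']}", ""]
    for claim in digest["claims"]:
        if not claim.get("citations"):
            raise ValueError(f"uncited claim: {claim['text']!r}")
        refs = ", ".join(claim["citations"])
        lines.append(f"- {claim['text']} [{refs}]")
    return "\n".join(lines) + "\n"


def write_digest(digest: dict, out_dir: Path) -> tuple[Path, Path]:
    """Write the JSON and Markdown artifacts side by side."""
    markdown = render_digest_markdown(digest)  # validates before any write
    out_dir.mkdir(parents=True, exist_ok=True)
    json_path = out_dir / f"{digest['id']}.json"
    md_path = out_dir / f"{digest['id']}.md"
    json_path.write_text(json.dumps(digest, indent=2), encoding="utf-8")
    md_path.write_text(markdown, encoding="utf-8")
    return json_path, md_path
```

Rendering before writing means an uncited claim stops the run before either artifact reaches disk.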

5. Act

generate structured opportunities from high-signal digest output
attach typed action specs with objectives, evidence bundles, measurable outcomes, and downstream artifact hints
support human-gated promotion into docs drafts, benchmark tasks, or GitHub-issue style artifacts
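
To make the typed action spec and the human gate concrete, here is a small sketch. The `ActionSpec` fields mirror the items listed above, but the class itself, the `promote` function, and the artifact hint values are illustrative assumptions rather than the project's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionSpec:
    """Hypothetical typed action spec attached to an opportunity."""
    objective: str
    evidence_bundle: tuple[str, ...]
    measurable_outcomes: tuple[str, ...]
    artifact_hint: str  # e.g. "docs_draft", "benchmark_task", "github_issue"


def promote(spec: ActionSpec, approved_by: Optional[str]) -> dict:
    """Promote only with a named human approver and cited evidence."""
    if not approved_by:
        raise PermissionError("promotion requires a human approver")
    if not spec.evidence_bundle:
        raise ValueError("promotion requires cited evidence")
    return {
        "artifact": spec.artifact_hint,
        "objective": spec.objective,
        "approved_by": approved_by,
    }
```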

Seeded Catalogs

The first foundation slice ships with:

9 source definitions
8 live-ready source definitions
7 topic watchlists
25 lightweight graph nodes
7 seeded pancreatic oncology documents
9 discovery fixture feeds for reproducible connector runs

Those assets establish the domain model and the reproducible discovery-ingest path before broader live source polling is turned on.

Phase 1 Discovery Status

Phase 1 is now in progress with these capabilities:

fixture-backed connector ingestion is available through auto and fixture modes
source health is persisted in the source registry state
document provenance and novelty are stored and exposed through the API
opt-in live connector code paths now cover a broader curated set of Europe PMC and ClinicalTrials.gov watches, official NCI and FDA feeds, and GitHub-backed open-source discovery
broad feeds can now be filtered down to pancreas-relevant items through connector-level include and exclude terms
the open-source watch now uses GitHub repository search in live mode while still preserving a fixture-backed validation path
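
Connector-level include and exclude terms can be pictured as a simple predicate over each feed item. The function name, the exclude-wins precedence, and the sample terms below are assumptions for illustration; the actual connector configuration is not shown in this document.

```python
def is_pancreas_relevant(
    text: str,
    include_terms: list[str],
    exclude_terms: list[str],
) -> bool:
    """Keep an item if it matches an include term and no exclude term."""
    lowered = text.lower()
    # Assumption: an exclude match always wins over an include match.
    if any(term.lower() in lowered for term in exclude_terms):
        return False
    return any(term.lower() in lowered for term in include_terms)


def filter_feed(items: list[str], include_terms: list[str], exclude_terms: list[str]) -> list[str]:
    """Reduce a broad feed to the items that pass the predicate."""
    return [item for item in items if is_pancreas_relevant(item, include_terms, exclude_terms)]
```

For example, with include terms such as `["pancrea", "PDAC"]` and an exclude term such as `"pancreatitis"`, a broad regulatory feed would be reduced to the oncology items that mention the pancreas.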

Phase 7 Watchtower Scheduling Status

Phase 7 is now underway with these capabilities:

the source registry now computes due, scheduled, unscheduled, and disabled states for each watch source
due-only ingest is available through the API, CLI, and Make targets for automation-friendly watchtower ticks
source health now tracks consecutive failures so scheduling can surface backoff-aware cadence
a dedicated /research-intel/schedule workspace exposes due sources, live-ready coverage, and next-run timing for operators
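
The four schedule states and backoff-aware cadence can be sketched as a pure function of a source's settings and health. The parameter names and the specific backoff rule (double the interval per consecutive failure, capped at eight times) are assumptions; the document only states that consecutive failures feed into scheduling.

```python
from datetime import datetime, timedelta
from typing import Optional


def schedule_state(
    enabled: bool,
    interval_hours: Optional[float],
    last_run: Optional[datetime],
    consecutive_failures: int,
    now: datetime,
) -> tuple[str, Optional[datetime]]:
    """Return (state, next_run) for one watch source."""
    if not enabled:
        return "disabled", None
    if interval_hours is None:
        return "unscheduled", None
    if last_run is None:
        return "due", now  # never run before, so run at the next tick
    # Hypothetical backoff: 1x, 2x, 4x, then 8x the base interval.
    backoff = min(2 ** consecutive_failures, 8)
    next_run = last_run + timedelta(hours=interval_hours * backoff)
    return ("due" if next_run <= now else "scheduled"), next_run
```

Keeping the function free of I/O makes it easy to call from an API handler, a CLI script, or a Make-driven tick with the same result.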

Watchtower Automation Status

The next watchtower phase is now underway with these capabilities:

a single audited watchtower tick can now be triggered through /api/v1/research-intel/runs/watchtower, make research-intel-watchtower, or scripts/run_research_intel_watchtower.py
watchtower ticks capture schedule state before and after the run, making due-only automation readable without opening the database
digest generation is now policy-gated so recurring automation can skip digest churn when no new documents were added
the schedule workspace now surfaces the latest watchtower automation tick for operators
GitHub Actions now includes a cache-backed hosted watchtower workflow through .github/workflows/research-watchtower.yml
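
A watchtower tick with a policy-gated digest might look roughly like the sketch below. The `list_due` and `ingest` callables, the returned keys, and the `min_new_documents` policy knob are hypothetical stand-ins; this does not reproduce scripts/run_research_intel_watchtower.py.

```python
from typing import Callable


def watchtower_tick(
    list_due: Callable[[], list[str]],
    ingest: Callable[[str], int],
    min_new_documents: int = 1,  # hypothetical digest policy threshold
) -> dict:
    """Run due-only ingest once and report schedule state before and after."""
    due_before = list_due()
    # Each ingest call returns the number of new documents it added.
    documents_added = sum(ingest(source_id) for source_id in due_before)
    due_after = list_due()
    return {
        "due_before": due_before,
        "due_after": due_after,
        "documents_added": documents_added,
        # Skip digest churn when the tick added nothing new.
        "digest_generated": documents_added >= min_new_documents,
    }
```

Returning the before and after lists in the tick record is what lets an operator read the outcome without opening the database.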

Phase 2 Knowledge Graph Status

Phase 2 is now underway with these capabilities:

the pancreatic oncology graph now includes broader typed entities for biomarkers, therapies, cohorts, modalities, workflow rules, and research artifacts
graph-backed entity resolution now uses concept families and related-node reinforcement during document normalization
graph entities now help reinforce topic assignment and evidence extraction with richer cross-document conceptual grouping
a dedicated graph surface exposes active nodes, edge relationships, and document-backed entity heat
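
As a rough picture of graph-backed entity resolution, the sketch below maps alias mentions in a document to canonical node identifiers using concept families. The node identifiers, the alias lists, and the function name are invented for illustration, and related-node reinforcement is left out to keep the example short.

```python
def resolve_entities(text: str, families: dict[str, list[str]]) -> dict[str, int]:
    """Map alias mentions to canonical node ids.

    Returns node id -> number of distinct aliases found in the text.
    """
    lowered = text.lower()
    resolved: dict[str, int] = {}
    for node_id, aliases in families.items():
        hits = sum(1 for alias in aliases if alias.lower() in lowered)
        if hits:
            resolved[node_id] = hits
    return resolved


# Hypothetical concept families keyed by typed node id.
EXAMPLE_FAMILIES = {
    "biomarker:ca19-9": ["CA19-9", "CA 19-9"],
    "therapy:folfirinox": ["FOLFIRINOX"],
}
```

Summing the per-node counts across documents would give one simple form of the document-backed entity heat mentioned above.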

Phase 3 Council Status

Phase 3 is now underway with these capabilities:

stage 1 opinions now carry primary topics, confidence labels, key claims, open questions, and evidence gaps
stage 2 rankings now include explicit peer critiques, challenge targets, preferred actions, and confidence adjustments
stage 3 synthesis now records overall confidence, evidence gaps, open questions, next experiments, and promotion guardrails
digests now persist multi-run history snapshots with trends, recurring open questions, recurring disagreement points, resolved items, and recent digest windows
digest history also carries a longer-horizon calibration snapshot with confidence distribution, disagreement volatility, recurring themes, and long-horizon backlog items
persisted council payloads remain backward-compatible with older digest versions

Phase 4 Discovery-To-Action Status

Phase 4 is now underway with these capabilities:

opportunities now persist typed action payloads instead of loose promotion hints
each opportunity carries a discovery objective, why-now rationale, discovery question, and measurable outcomes
each opportunity also carries a cited evidence bundle, open questions, evidence gaps, next experiments, and promotion guardrails
digest runs now write contributor-ready opportunity JSON and Markdown artifacts under artifacts/research-intel/opportunities/
opportunities now also include contributor packets for issue-ready, benchmark-ready, dataset-ready, rule, trial, case-brief, and tooling follow-through
contributor packet artifacts now live under artifacts/research-intel/packets/
benchmark and dataset packets now also emit collaborator bundle manifests and READMEs under artifacts/research-intel/collaborator-bundles/
promotion artifacts now preserve the same structured discovery-to-action context instead of collapsing into shallow summaries

Phase 5 Safe Experimentation Status

Phase 5 is now underway with these capabilities:

benchmark-gap and rule-gap opportunities can be evaluated through a dedicated experiment runner
experiments are deterministic proposal checks, not code edits or score mutations
each experiment records baseline, candidate score, delta, threshold, evidence coverage, and a keep-or-discard ratchet outcome
experiments now also support stress-test modes with scored dimensions for wording variance, confounder handling, traceability, and handoff quality
experiment artifacts are written under artifacts/research-intel/experiments/
the latest experiment result is persisted back onto the opportunity payload for future comparison
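
The keep-or-discard ratchet can be sketched as a deterministic comparison. The recorded fields follow the list above, while the `evaluate` function, the `min_coverage` parameter, and the exact keep rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentResult:
    baseline: float
    candidate: float
    delta: float
    threshold: float
    evidence_coverage: float
    outcome: str  # "keep" or "discard"


def evaluate(
    baseline: float,
    candidate: float,
    threshold: float,
    evidence_coverage: float,
    min_coverage: float = 0.5,  # hypothetical coverage floor
) -> ExperimentResult:
    """Score a proposal against its baseline; never edits code or scores."""
    delta = candidate - baseline
    # Assumed rule: keep only when the gain clears the threshold and
    # enough of the proposal is backed by cited evidence.
    keep = delta >= threshold and evidence_coverage >= min_coverage
    outcome = "keep" if keep else "discard"
    return ExperimentResult(baseline, candidate, delta, threshold, evidence_coverage, outcome)
```

Because the function only returns a record, the result can be written to an artifact and attached to the opportunity payload without touching triage state.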

Phase 6 Research-First Repositioning Status

Phase 6 is now in place with these capabilities:

home, about, quickstart, README, and handoff surfaces now lead with the discovery workspace rather than treating it as a sidecar
the repository now frames triage, benchmark proof, imports, and case briefs as applied surfaces of the research engine
contributor context now starts from cited evidence, graph activity, council output, opportunities, and experiment artifacts
safety boundaries still remain explicit: no autonomous diagnosis, no score mutation from research-intel output, and no autonomous spending

Governance Status

Governance is now explicitly documented for the research-first pivot:

paid-source access and donation-funded operations are documented in RESEARCH_INTELLIGENCE_GOVERNANCE.md
public and openly accessible sources remain the default
paid or institution-restricted sources may be modeled, but procurement and renewal stay human-gated
no autonomous payment, subscription purchase, or crypto treasury execution is in scope

Opportunity Types

The current opportunity taxonomy is fixed and explicit:

rule_gap
benchmark_gap
trial_catalog_gap
case_brief
community_project
external_tooling
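
A fixed taxonomy like this maps naturally onto an enumeration. The sketch below is one possible encoding, not the project's model: the class and helper names are hypothetical, and the experiment-supported subset is inferred from the statement that benchmark-gap and rule-gap opportunities can be evaluated by the experiment runner.

```python
from enum import Enum


class OpportunityType(str, Enum):
    RULE_GAP = "rule_gap"
    BENCHMARK_GAP = "benchmark_gap"
    TRIAL_CATALOG_GAP = "trial_catalog_gap"
    CASE_BRIEF = "case_brief"
    COMMUNITY_PROJECT = "community_project"
    EXTERNAL_TOOLING = "external_tooling"


# Types the experiment runner is described as supporting.
EXPERIMENT_SUPPORTED = frozenset(
    {OpportunityType.RULE_GAP, OpportunityType.BENCHMARK_GAP}
)


def parse_opportunity_type(value: str) -> OpportunityType:
    """Parse a stored string; unknown values raise ValueError."""
    return OpportunityType(value.strip().lower())
```

Rejecting unknown strings at parse time keeps the taxonomy closed, so a new type has to be added deliberately rather than appearing through a typo.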

Triage Integration Boundary

Research Intelligence informs the existing triage product in three ways only:

case briefs that connect case rationale or trial context to current topics and cited documents
benchmark growth ideas such as wording variance, confounders, and follow-up patterns
rule and trial-catalog proposals that still require human review plus tests before merge

It does not:

rewrite case scores
auto-close or re-prioritize cases in reviewer workflow
act as autonomous diagnosis or treatment guidance

Design Influences

The implementation direction borrows selectively from several open-source research-agent ideas while staying grounded in this repository's explainability and audit requirements.

SciAgentsDiscovery: graph-seeded discovery and domain-aware expansion shaped the lightweight pancreatic ontology and topic graph
AgentLaboratory: phased collect-to-publish workflow shaped the durable artifact pipeline
llm-council: independent opinion, ranking, and chairman synthesis shaped the council payload
Kosmos: validation-first and sandbox-oriented thinking shaped the artifact and experiment boundaries
autoresearch: ratchet-style improvement shaped the expectation that proposals should become measurable benchmark or rule work
quantum-agentics and the openclaw standby-agent idea: scheduler and named-agent concepts influenced the future operating model without introducing those runtime dependencies today

Guardrails

research-use software only
cited outputs over uncited synthesis
human-gated promotion over autonomous action
experiment runs evaluate proposals only and do not edit code, merge changes, or alter triage scores automatically
no autonomous payment, subscription procurement, or crypto treasury execution in the current product
no automatic mutation of triage scores or reviewer state

Next Slices

deepen graph coverage and cross-document entity resolution against a wider live corpus
exercise the new collaborator bundles against a larger real benchmark drop or public dataset bundle
keep governance and procurement notes current before any paid-source expansion is activated
continue improving connector quality where the watchtower still leans on fixtures
