Pancreatic Signal now leads with Research Intelligence as its primary product narrative: a cited pancreatic oncology watchtower for documents, digests, opportunities, experiments, and case-facing research briefs. Explainable triage, benchmark proof, and trial upkeep remain downstream workflow surfaces of that discovery engine.
Execution companion:
Research Intelligence exists to:
- organize pancreatic oncology signals from literature, trials, guidance, regulatory updates, and open-source activity
- turn those signals into durable artifacts instead of one-off chat output
- help contributors spot benchmark gaps, rule gaps, trial-catalog updates, and tooling opportunities
- inform case review with cited research briefs without changing case scores automatically
- give new contributors a discovery-first entry point into the project before they touch the applied workflow surfaces
The current implementation includes:
- a new API namespace at /api/v1/research-intel/*
- a new web workspace at /research-intel
- seeded catalogs in data/research/ for sources, topics, a lightweight graph, and sample documents
- persisted records for sources, runs, run items, documents, evidence, topics, digests, and opportunities
- manual ingest and digest scripts plus Make targets
- digest artifacts written under artifacts/research-intel/
- case-level research briefs linked from the existing case detail page
- discovery-ingest source health, document provenance, and novelty scoring
- fixture-backed connector feeds for reproducible validation plus opt-in live connector scaffolding
- a schedule-aware watchtower layer with due-source planning, due-only ingest, and a dedicated /research-intel/schedule view
- graph-backed entity resolution plus a /research-intel/graph workspace for active node and edge inspection
- research-first landing, onboarding, and handoff documentation so contributors meet the system through discovery work first
Current status:
- the first slice is manually triggered and now supports discovery-ingest modes
- public read endpoints are open for OSS exploration
- run execution and opportunity promotion require authenticated operator roles
Seed the watchtower and generate a digest:

    make research-intel-refresh

Or run the two phases independently:

    make research-intel-ingest
    make research-intel-digest

To inspect the watchtower schedule or run only due sources:

    make research-intel-schedule
    make research-intel-ingest-due
    make research-intel-watchtower

For the reproducible discovery-ingest path:

    make research-intel-discovery-fixture

For opt-in live connector attempts on supported sources:

    make research-intel-discovery-live

Then explore:
- /research-intel for dashboard and run health
- /research-intel/schedule for due-source planning, live-ready connector coverage, and watchtower cadence
- /research-intel/documents for normalized documents, citations, and topic tags
- /research-intel/graph for graph entities, edges, and active node heat
- /research-intel/digests for council-backed digest output
- /research-intel/opportunities for human-gated contribution proposals
To run a safe experiment against the highest-confidence supported benchmark or rule opportunity:

    python scripts/run_research_intel_experiment.py --json

The current workflow follows five explicit phases.
- load the source catalog from data/research/sources.json
- select fixture-backed or live-discovery pancreatic oncology documents
- normalize identifiers and source metadata
- track connector mode, source health, and provenance
- map documents onto topic watchlists from data/research/topics.json
- extract cited evidence spans and claim text
- compute lightweight relevance scores and topic heat
- persist stage 1 independent opinions
- persist stage 1 open questions, evidence gaps, confidence labels, and proposed opportunity types
- persist stage 2 ranking, critique, peer review, and confidence adjustments
- persist stage 3 chairman synthesis with overall confidence, next experiments, and promotion guardrails
- record disagreement instead of hiding it
- create digest records
- write JSON and Markdown artifacts under artifacts/research-intel/
- keep claims citation-backed
- generate structured opportunities from high-signal digest output
- attach typed action specs with objectives, evidence bundles, measurable outcomes, and downstream artifact hints
- support human-gated promotion into docs drafts, benchmark tasks, or GitHub-issue style artifacts
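
The topic-mapping and heat steps in the list above can be pictured with a small sketch. It is illustrative only: the keyword-overlap scoring, the `min_score` cutoff, the function names, and the sample topics and documents are all assumptions made for this example, not the repository's actual scoring logic or the contents of data/research/topics.json.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def relevance(text: str, keywords: List[str]) -> float:
    """Fraction of a topic's keywords found in the text (assumed scoring rule)."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for keyword in keywords if keyword.lower() in lowered)
    return hits / len(keywords)


def map_topics(
    documents: Dict[str, str],
    topics: Dict[str, List[str]],
    min_score: float = 0.5,  # hypothetical cutoff for assigning a topic
) -> Tuple[Dict[str, List[str]], Dict[str, float]]:
    """Assign documents to topics and accumulate per-topic heat."""
    assignments: Dict[str, List[str]] = defaultdict(list)
    heat: Dict[str, float] = defaultdict(float)
    for doc_id, text in documents.items():
        for topic_id, keywords in topics.items():
            score = relevance(text, keywords)
            if score >= min_score:
                assignments[doc_id].append(topic_id)
                heat[topic_id] += score  # heat here is just the summed relevance
    return dict(assignments), dict(heat)


# Invented sample data for illustration only.
topics = {
    "early-detection": ["ca 19-9", "screening"],
    "kras-targeting": ["kras", "inhibitor"],
}
documents = {
    "doc-1": "Screening cohorts with elevated CA 19-9",
    "doc-2": "A KRAS G12D inhibitor trial opens",
    "doc-3": "KRAS mutation prevalence survey",
}
assignments, heat = map_topics(documents, topics)
print(assignments)
print(heat)
```

Summed relevance is only one plausible reading of "topic heat"; a recency-weighted variant would fit the same interface.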
The first foundation slice ships with:
- 9 source definitions
- 8 live-ready source definitions
- 7 topic watchlists
- 25 lightweight graph nodes
- 7 seeded pancreatic oncology documents
- 9 discovery fixture feeds for reproducible connector runs
Those assets establish the domain model and the reproducible discovery-ingest path before broader live source polling is turned on.
Phase 1 is now in progress with these capabilities:
- fixture-backed connector ingestion is available through auto and fixture modes
- source health is persisted in the source registry state
- document provenance and novelty are stored and exposed through the API
- opt-in live connector code paths now cover a broader curated set of Europe PMC and ClinicalTrials.gov watches, official NCI and FDA feeds, and GitHub-backed open-source discovery
- broad feeds can now be filtered down to pancreas-relevant items through connector-level include and exclude terms
- the open-source watch now uses GitHub repository search in live mode while still preserving a fixture-backed validation path
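
As a rough illustration of the connector-level include and exclude terms mentioned above, the sketch below filters a broad feed down to pancreas-relevant titles. The function name, the term lists, and the case-insensitive substring matching are assumptions for this example; the real connectors may match differently.

```python
from typing import Iterable


def is_pancreas_relevant(
    text: str,
    include_terms: Iterable[str],
    exclude_terms: Iterable[str] = (),
) -> bool:
    """Keep an item if it matches any include term and no exclude term."""
    lowered = text.lower()
    # Exclude terms win, so a broad include term cannot let noise through.
    if any(term.lower() in lowered for term in exclude_terms):
        return False
    return any(term.lower() in lowered for term in include_terms)


# Hypothetical terms and feed titles, invented for illustration.
INCLUDE = ["pancreatic", "pancreas", "PDAC"]
EXCLUDE = ["pancreatitis in dogs"]

feed = [
    "New therapy approved for metastatic pancreatic adenocarcinoma",
    "Updated guidance on colorectal screening intervals",
    "Case series: pancreatitis in dogs after dietary change",
]
kept = [title for title in feed if is_pancreas_relevant(title, INCLUDE, EXCLUDE)]
print(kept)
```

Checking exclude terms first keeps the filter conservative, which suits feeds such as general regulatory updates where most items are off-topic.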
Phase 7 is now underway with these capabilities:
- the source registry now computes due, scheduled, unscheduled, and disabled states for each watch source
- due-only ingest is available through the API, CLI, and Make targets for automation-friendly watchtower ticks
- source health now tracks consecutive failures so scheduling can surface backoff-aware cadence
- a dedicated /research-intel/schedule workspace exposes due sources, live-ready coverage, and next-run timing for operators
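
The four schedule states and the failure-aware cadence described above could be computed along these lines. This is a minimal sketch under stated assumptions: the `WatchSource` fields, the doubling backoff capped at 8x, and the source ids are hypothetical and are not taken from the source registry code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class WatchSource:
    # All field names are hypothetical, chosen for this sketch.
    source_id: str
    enabled: bool = True
    interval_hours: Optional[float] = None  # None means no cadence configured
    last_run_at: Optional[datetime] = None
    consecutive_failures: int = 0


def next_run_at(source: WatchSource) -> Optional[datetime]:
    """Return the next planned run, stretched by an assumed backoff policy."""
    if source.interval_hours is None:
        return None
    if source.last_run_at is None:
        return datetime.min  # never run before: treat as immediately due
    # Assumed policy: double the interval per consecutive failure, capped at 8x.
    multiplier = min(2 ** source.consecutive_failures, 8)
    return source.last_run_at + timedelta(hours=source.interval_hours * multiplier)


def schedule_state(source: WatchSource, now: datetime) -> str:
    """Classify a source as disabled, unscheduled, due, or scheduled."""
    if not source.enabled:
        return "disabled"
    planned = next_run_at(source)
    if planned is None:
        return "unscheduled"
    return "due" if planned <= now else "scheduled"


now = datetime(2025, 1, 2, 12, 0)
last = datetime(2025, 1, 1, 6, 0)
sources = [
    WatchSource("literature-watch", interval_hours=24, last_run_at=last),
    WatchSource("trials-watch", interval_hours=24, last_run_at=last, consecutive_failures=2),
    WatchSource("manual-notes"),
    WatchSource("retired-feed", enabled=False, interval_hours=24),
]
states = {source.source_id: schedule_state(source, now) for source in sources}
print(states)
```

A due-only ingest would then run just the sources whose state is `due`, which is what makes a watchtower tick cheap to repeat.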
The next watchtower phase is now underway with these capabilities:
- a single audited watchtower tick can now be triggered through /api/v1/research-intel/runs/watchtower, make research-intel-watchtower, or scripts/run_research_intel_watchtower.py
- watchtower ticks capture schedule state before and after the run, making due-only automation readable without opening the database
- digest generation is now policy-gated so recurring automation can skip digest churn when no new documents were added
- the schedule workspace now surfaces the latest watchtower automation tick for operators
- GitHub Actions now includes a cache-backed hosted watchtower workflow through .github/workflows/research-watchtower.yml
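
To make the audited tick and the digest policy gate concrete, here is a small sketch of one tick that records schedule state before and after the run and skips the digest when nothing new arrived. The function signature, the injected callables, and the record keys are assumptions for illustration; they do not describe the actual watchtower script or API payload.

```python
from typing import Callable, Dict, List, Optional


def run_watchtower_tick(
    plan_schedule: Callable[[], Dict[str, str]],
    ingest_due: Callable[[List[str]], int],
    generate_digest: Callable[[], str],
    skip_digest_when_empty: bool = True,
) -> dict:
    """Run one tick and return an audit record; all callables are stand-ins."""
    before = plan_schedule()
    due_ids = sorted(sid for sid, state in before.items() if state == "due")
    new_documents = ingest_due(due_ids) if due_ids else 0
    digest_id: Optional[str] = None
    # Policy gate: avoid digest churn when the tick added nothing new.
    if new_documents > 0 or not skip_digest_when_empty:
        digest_id = generate_digest()
    return {
        "schedule_before": before,
        "schedule_after": plan_schedule(),
        "due_sources": due_ids,
        "new_documents": new_documents,
        "digest_id": digest_id,
        "digest_skipped": digest_id is None,
    }


# In-memory fakes standing in for the registry, connectors, and digest writer.
registry = {"literature-watch": "due", "regulatory-feed": "scheduled"}


def fake_plan() -> Dict[str, str]:
    return dict(registry)


def fake_ingest(source_ids: List[str]) -> int:
    for source_id in source_ids:
        registry[source_id] = "scheduled"
    return 0  # pretend the due source returned no new documents


record = run_watchtower_tick(fake_plan, fake_ingest, lambda: "digest-001")
print(record)
```

Because the record carries both schedule snapshots, an operator can read what the tick did from the artifact alone.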
Phase 2 is now underway with these capabilities:
- the pancreatic oncology graph now includes broader typed entities for biomarkers, therapies, cohorts, modalities, workflow rules, and research artifacts
- graph-backed entity resolution now uses concept families and related-node reinforcement during document normalization
- graph entities now help reinforce topic assignment and evidence extraction with richer cross-document conceptual grouping
- a dedicated graph surface exposes active nodes, edge relationships, and document-backed entity heat
Phase 3 is now underway with these capabilities:
- stage 1 opinions now carry primary topics, confidence labels, key claims, open questions, and evidence gaps
- stage 2 rankings now include explicit peer critiques, challenge targets, preferred actions, and confidence adjustments
- stage 3 synthesis now records overall confidence, evidence gaps, open questions, next experiments, and promotion guardrails
- digests now also persist multi-run history snapshots with trend, recurring open questions, recurring disagreement points, resolved items, and recent digest windows
- digest history now also carries a longer-horizon calibration snapshot with confidence distribution, disagreement volatility, recurring themes, and long-horizon backlog items
- persisted council payloads remain backward-compatible with older digest versions
Phase 4 is now underway with these capabilities:
- opportunities now persist typed action payloads instead of loose promotion hints
- each opportunity carries a discovery objective, why-now rationale, discovery question, and measurable outcomes
- each opportunity also carries a cited evidence bundle, open questions, evidence gaps, next experiments, and promotion guardrails
- digest runs now write contributor-ready opportunity JSON and Markdown artifacts under artifacts/research-intel/opportunities/
- opportunities now also include contributor packets for issue-ready, benchmark-ready, dataset-ready, rule, trial, case-brief, and tooling follow-through
- contributor packet artifacts now live under artifacts/research-intel/packets/
- benchmark and dataset packets now also emit collaborator bundle manifests and READMEs under artifacts/research-intel/collaborator-bundles/
- promotion artifacts now preserve the same structured discovery-to-action context instead of collapsing into shallow summaries
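
A typed action payload of the kind described above might look roughly like the following. The class names, field names, validation rules, and placeholder values are assumptions drawn from the field list in this section, not the project's persisted schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class EvidenceItem:
    document_id: str
    citation: str
    claim: str


@dataclass
class ActionSpec:
    # Field names mirror the prose above; the real schema may differ.
    opportunity_type: str
    objective: str
    why_now: str
    discovery_question: str
    measurable_outcomes: List[str]
    evidence_bundle: List[EvidenceItem]
    open_questions: List[str] = field(default_factory=list)
    evidence_gaps: List[str] = field(default_factory=list)
    next_experiments: List[str] = field(default_factory=list)
    promotion_guardrails: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject payloads that lack citations or measurable outcomes."""
        if not self.evidence_bundle:
            raise ValueError("an opportunity needs at least one cited evidence item")
        if not self.measurable_outcomes:
            raise ValueError("an opportunity needs at least one measurable outcome")


# Placeholder values only; nothing here refers to a real document or result.
spec = ActionSpec(
    opportunity_type="benchmark_gap",
    objective="Cover wording variance in symptom descriptions",
    why_now="Placeholder rationale",
    discovery_question="Placeholder question",
    measurable_outcomes=["benchmark gains N reworded cases"],
    evidence_bundle=[EvidenceItem("doc-001", "Placeholder citation", "Placeholder claim")],
    promotion_guardrails=["human review required before promotion"],
)
spec.validate()
payload = json.dumps(asdict(spec), indent=2)
print(payload)
```

Keeping the payload as a dataclass makes the JSON artifact and the Markdown artifact two renderings of the same structure rather than separately maintained summaries.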
Phase 5 is now underway with these capabilities:
- benchmark-gap and rule-gap opportunities can be evaluated through a dedicated experiment runner
- experiments are deterministic proposal checks, not code edits or score mutations
- each experiment records baseline, candidate score, delta, threshold, evidence coverage, and a keep-or-discard ratchet outcome
- experiments now also support stress-test modes with scored dimensions for wording variance, confounder handling, traceability, and handoff quality
- experiment artifacts are written under artifacts/research-intel/experiments/
- the latest experiment result is persisted back onto the opportunity payload for future comparison
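
The keep-or-discard ratchet can be sketched as a pure function over the recorded quantities. The names, the `min_coverage` floor, and the sample numbers below are assumptions for illustration; the actual experiment runner may combine these fields differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentResult:
    baseline: float
    candidate_score: float
    delta: float
    threshold: float
    evidence_coverage: float
    outcome: str  # "keep" or "discard"


def evaluate_proposal(
    baseline: float,
    candidate_score: float,
    threshold: float,
    cited_claims: int,
    total_claims: int,
    min_coverage: float = 0.8,  # hypothetical floor on cited claims
) -> ExperimentResult:
    """Deterministic check: no code edits, no score mutation, just a verdict."""
    delta = round(candidate_score - baseline, 4)
    coverage = cited_claims / total_claims if total_claims else 0.0
    # Ratchet: keep only proposals that clear both the delta and coverage bars.
    keep = delta >= threshold and coverage >= min_coverage
    return ExperimentResult(
        baseline=baseline,
        candidate_score=candidate_score,
        delta=delta,
        threshold=threshold,
        evidence_coverage=coverage,
        outcome="keep" if keep else "discard",
    )


# Invented numbers for illustration only.
strong = evaluate_proposal(0.62, 0.70, threshold=0.05, cited_claims=9, total_claims=10)
weak = evaluate_proposal(0.62, 0.64, threshold=0.05, cited_claims=9, total_claims=10)
print(strong.outcome, strong.delta)
print(weak.outcome, weak.delta)
```

Since the function has no side effects, rerunning it on the same inputs always yields the same verdict, which is what allows the latest result to be compared against earlier ones.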
Phase 6 is now in place with these capabilities:
- home, about, quickstart, README, and handoff surfaces now lead with the discovery workspace rather than treating it as a sidecar
- the repository now frames triage, benchmark proof, imports, and case briefs as applied surfaces of the research engine
- contributor context now starts from cited evidence, graph activity, council output, opportunities, and experiment artifacts
- safety boundaries still remain explicit: no autonomous diagnosis, no score mutation from research-intel output, and no autonomous spending
Governance is now explicitly documented for the research-first pivot:
- paid-source access and donation-funded operations are documented in RESEARCH_INTELLIGENCE_GOVERNANCE.md
- public and openly accessible sources remain the default
- paid or institution-restricted sources may be modeled, but procurement and renewal stay human-gated
- no autonomous payment, subscription purchase, or crypto treasury execution is in scope
The current opportunity taxonomy is fixed and explicit:
- rule_gap
- benchmark_gap
- trial_catalog_gap
- case_brief
- community_project
- external_tooling
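
One way to keep a fixed taxonomy like this explicit in code is an enumeration that rejects unknown values. The class and helper names are hypothetical; only the six string values come from the list above.

```python
from enum import Enum


class OpportunityType(str, Enum):
    RULE_GAP = "rule_gap"
    BENCHMARK_GAP = "benchmark_gap"
    TRIAL_CATALOG_GAP = "trial_catalog_gap"
    CASE_BRIEF = "case_brief"
    COMMUNITY_PROJECT = "community_project"
    EXTERNAL_TOOLING = "external_tooling"


def parse_opportunity_type(raw: str) -> OpportunityType:
    """Normalize user input and fail loudly on anything outside the taxonomy."""
    try:
        return OpportunityType(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(member.value for member in OpportunityType)
        raise ValueError(
            f"unknown opportunity type {raw!r}; expected one of: {allowed}"
        ) from None


print(parse_opportunity_type(" Rule_Gap "))
```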
Research Intelligence informs the existing triage product in three ways only:
- case briefs that connect case rationale or trial context to current topics and cited documents
- benchmark growth ideas such as wording variance, confounders, and follow-up patterns
- rule and trial-catalog proposals that still require human review plus tests before merge
It does not:
- rewrite case scores
- auto-close or re-prioritize cases in reviewer workflow
- act as autonomous diagnosis or treatment guidance
The implementation direction borrows selectively from several open-source research-agent ideas while staying grounded in this repository's explainability and audit requirements.
- SciAgentsDiscovery: graph-seeded discovery and domain-aware expansion shaped the lightweight pancreatic ontology and topic graph
- AgentLaboratory: phased collect-to-publish workflow shaped the durable artifact pipeline
- llm-council: independent opinion, ranking, and chairman synthesis shaped the council payload
- Kosmos: validation-first and sandbox-oriented thinking shaped the artifact and experiment boundaries
- autoresearch: ratchet-style improvement shaped the expectation that proposals should become measurable benchmark or rule work
- quantum-agentics and the openclaw standby-agent idea: influenced the future operating model for schedulers and named agents without introducing those runtime dependencies today
- research-use software only
- cited outputs over uncited synthesis
- human-gated promotion over autonomous action
- experiment runs evaluate proposals only and do not edit code, merge changes, or alter triage scores automatically
- no autonomous payment, subscription procurement, or crypto treasury execution in the current product
- no automatic mutation of triage scores or reviewer state
- deepen graph coverage and cross-document entity resolution against a wider live corpus
- exercise the new collaborator bundles against a larger real benchmark drop or public dataset bundle
- keep governance and procurement notes current before any paid-source expansion is activated
- continue improving connector quality where the watchtower still leans on fixtures