Commit a880280

Merge pull request #34 from OliverHennhoefer/fdp-bounds
Implements "Everywhere Valid Bounds on False Discovery Proportions in…
2 parents 2d9e561 + 27c1329 commit a880280

11 files changed

Lines changed: 1560 additions & 4 deletions

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- Added post-hoc FDP upper bounds for unweighted conformal p-values via
+  `nonconform.fdr.conformal_fdp_upper_bound`, including certified precision
+  lower bounds and envelope methods `mc_thc`, `mc_hc`, `mc_ks`, `ks`, and
+  `mc_bj`.
+
+### Changed
+
+- Restricted cached-result FDP certificates to known supported empirical split
+  conformal scopes.
+
 ## [1.0.1] - 2026-05-20
 
 ### Security

docs/source/api/index.md

Lines changed: 2 additions & 1 deletion
@@ -44,7 +44,8 @@ For the v1 public compatibility contract, see
 
 ## FDR Control
 
-Includes weighted low-level expert APIs (`weighted_false_discovery_control`).
+Includes post-hoc FDP bounds and weighted low-level expert APIs
+(`weighted_false_discovery_control`).
 For standard workflows, prefer `ConformalDetector.select(...)`.
 ::: nonconform.fdr
     options:

docs/source/examples/fdr_control.md

Lines changed: 19 additions & 0 deletions
@@ -65,6 +65,25 @@ no_adjustment = (p_values < 0.05).sum()
 print(f"No adjustment: {no_adjustment} detections")
 ```
 
+## Post-Hoc FDP Bounds
+
+```python
+from nonconform.fdr import conformal_fdp_upper_bound_from_result
+
+p_values = detector.compute_p_values(X_test)
+bounds = conformal_fdp_upper_bound_from_result(
+    detector.last_result,
+    method="mc_thc",
+    seed=42,
+    thresholds=np.array([0.01, 0.05, 0.1]),
+)
+
+print(bounds.to_frame().to_string(index=False))
+
+# Optional dense threshold grid for plotting or downstream filtering
+curve = bounds.to_frame(thresholds=np.linspace(0.0, 0.2, 101))
+```
+
 ## Weighted Conformal Selection (Covariate Shift)
 
 ```python

docs/source/user_guide/fdr_control.md

Lines changed: 53 additions & 0 deletions
@@ -91,6 +91,57 @@ decisions = false_discovery_control(p_values, method="bh") <= 0.05
 
 ---
 
+## Post-Hoc FDP Bounds
+
+Classical FDR control is a fixed-level, average-over-repeated-runs guarantee. If
+you inspect p-values and then choose a more convenient threshold, you should not
+report the result as if that threshold had been fixed upfront.
+
+For exploratory threshold choice on already-valid unweighted conformal
+p-values, `nonconform.fdr.conformal_fdp_upper_bound_from_result(...)` provides
+a simultaneous upper bound on realized false discovery proportion (FDP) across
+thresholds:
+
+```python
+from nonconform.fdr import conformal_fdp_upper_bound_from_result
+
+p_values = detector.compute_p_values(X_test)
+bounds = conformal_fdp_upper_bound_from_result(
+    detector.last_result,
+    confidence=0.95,
+    method="mc_thc",
+    seed=42,
+)
+
+threshold = 0.05
+selected = bounds.select(threshold)
+fdp_bound = bounds.bound_at(threshold)
+precision_floor = bounds.precision_at(threshold)
+table = bounds.to_frame()
+```
+
+Interpretation: after choosing `threshold`, report the attached FDP certificate,
+for example "at threshold 0.05, the 95% FDP upper bound is 0.18, so certified
+precision is at least 0.82." This is a different claim from "BH controls FDR at
+0.18."
+
+Supported envelope methods are `mc_thc` (default), `mc_hc`, `mc_ks`, `ks`, and
+`mc_bj`. Choose the method before inspecting the curve; comparing methods after
+seeing the data and reporting only the best-looking certificate is result
+tuning. `ks` is simple and often conservative. `mc_bj` can be useful for sharp
+left-tail behavior, but it is numerically heavier and uses the `precision`
+parameter.
+
+!!! warning "Guarantee scope"
+    This first implementation is intended for unweighted empirical split or
+    detached conformal p-values from a fixed scoring map. It does not cover
+    weighted p-values, probabilistic/conditional p-value estimators,
+    cross-validation/jackknife aggregation, detector or feature selection,
+    threshold-dependent preprocessing, or repeated attempts to pick the
+    best-looking pipeline.
+
+---
+
 ## Selection Entry Points
 
 **Primary (recommended):** `detector.select(X_test, alpha=...)` - dispatches
@@ -101,6 +152,8 @@ handling required.
 
 - Standard (exchangeable): apply BH directly via
   `scipy.stats.false_discovery_control(...)` to conformal p-values.
+- Post-hoc FDP certificates:
+  `conformal_fdp_upper_bound_from_result(result=...)`.
 - Weighted (covariate shift with importance weights):
   `weighted_false_discovery_control(result=...)` or
   `weighted_false_discovery_control_from_arrays(...)`.
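
Reviewer note: the guide text above speaks of a "simultaneous upper bound on realized FDP across thresholds". The sketch below is one textbook way to picture such a bound. It is not the `ks` or `mc_thc` envelope from this PR. It uses the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality and assumes the null p-values are independent and uniform on [0, 1], which conformal p-values sharing one calibration set are not. The function name `dkw_fdp_upper_bound` and its parameters are hypothetical.

```python
import numpy as np


def dkw_fdp_upper_bound(p_values, thresholds, confidence=0.95):
    """Illustrative simultaneous FDP bound; NOT the nonconform implementation.

    Assumption: null p-values are independent and uniform on [0, 1].
    """
    p = np.asarray(p_values, dtype=float)
    t = np.asarray(thresholds, dtype=float)
    m = p.size
    delta = 1.0 - confidence

    # DKW: with probability >= 1 - delta, the number of null p-values <= t is
    # at most m0 * t + sqrt(m0 * log(2 / delta) / 2) for every t at once.
    # The number of nulls m0 is unknown, so it is replaced by m (still valid,
    # only more conservative).
    slack = np.sqrt(m * np.log(2.0 / delta) / 2.0)
    max_false = m * t + slack

    # Number of discoveries at each threshold.
    discoveries = (p[None, :] <= t[:, None]).sum(axis=1)

    # FDP bound, clipped to 1; max(discoveries, 1) avoids division by zero.
    bound = np.minimum(1.0, max_false / np.maximum(discoveries, 1))
    return discoveries, bound


# Synthetic data: 900 uniform nulls plus 100 very small p-values.
rng = np.random.default_rng(0)
p_demo = np.concatenate([rng.uniform(size=900), np.full(100, 1e-4)])
n_disc, fdp_bound = dkw_fdp_upper_bound(p_demo, [0.01, 0.05, 0.1])
precision_floor = 1.0 - fdp_bound  # same complement as in the guide text
```

Because the slack term does not depend on the threshold, the bound holds for whichever threshold is picked after looking at the data, which is the property the guide contrasts with fixed-level BH. The envelopes added in this PR are presumably tighter and account for the dependence this sketch ignores.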

examples/fdp_bounds.ipynb

Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "intro",
+   "metadata": {},
+   "source": [
+    "# Post-Hoc FDP Bounds\n",
+    "\n",
+    "This notebook shows the smallest workflow for attaching a post-hoc FDP certificate to unweighted empirical split conformal p-values."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "import",
+   "metadata": {},
+   "source": [
+    "## Import\n",
+    "\n",
+    "This section loads the dependencies used throughout the notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "imports",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import logging\n",
+    "\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "from oddball import Dataset, load\n",
+    "from pyod.models.iforest import IForest\n",
+    "\n",
+    "from nonconform import ConformalDetector, Split\n",
+    "from nonconform.fdr import conformal_fdp_upper_bound_from_result\n",
+    "from nonconform.metrics import false_discovery_rate, statistical_power\n",
+    "\n",
+    "root_logger = logging.getLogger(\"nonconform\")\n",
+    "if not root_logger.handlers:\n",
+    "    root_logger.addHandler(logging.NullHandler())\n",
+    "root_logger.setLevel(logging.ERROR)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "setup",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "We load Shuttle with a fixed seed and choose a few p-value thresholds to certify after p-values are computed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "load-data",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "x_train: (22793, 9), x_test: (1000, 9)\n",
+      "y_test positives: 100\n",
+      "calibration size=1000\n"
+     ]
+    }
+   ],
+   "source": [
+    "x_train, x_test, y_test = load(Dataset.SHUTTLE, setup=True, seed=42)\n",
+    "\n",
+    "n_calib = 1_000\n",
+    "thresholds = np.array([0.005, 0.01, 0.02, 0.05, 0.1])\n",
+    "\n",
+    "print(f\"x_train: {x_train.shape}, x_test: {x_test.shape}\")\n",
+    "print(f\"y_test positives: {int(y_test.sum())}\")\n",
+    "print(f\"calibration size={n_calib}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "certificate",
+   "metadata": {},
+   "source": [
+    "## FDP Certificate\n",
+    "\n",
+    "`conformal_fdp_upper_bound_from_result(...)` uses the cached `last_result` from `compute_p_values(...)` and returns threshold-level FDP and precision certificates."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "fit-and-bound",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "method=mc_thc, confidence=0.95\n",
+      " threshold discoveries fdp_upper_bound precision_lower_bound\n",
+      " 0.005 103 0.226 0.774\n",
+      " 0.010 111 0.210 0.790\n",
+      " 0.020 120 0.259 0.741\n",
+      " 0.050 154 0.423 0.577\n",
+      " 0.100 212 0.581 0.419\n"
+     ]
+    }
+   ],
+   "source": [
+    "detector = ConformalDetector(\n",
+    "    detector=IForest(n_estimators=100, max_samples=0.8, random_state=42),\n",
+    "    strategy=Split(n_calib=n_calib),\n",
+    "    seed=42,\n",
+    ")\n",
+    "\n",
+    "detector.fit(x_train)\n",
+    "p_values = detector.compute_p_values(x_test)\n",
+    "bounds = conformal_fdp_upper_bound_from_result(\n",
+    "    detector.last_result,\n",
+    "    confidence=0.95,\n",
+    "    method=\"mc_thc\",\n",
+    "    seed=42,\n",
+    "    thresholds=thresholds,\n",
+    ")\n",
+    "\n",
+    "print(f\"method={bounds.method}, confidence={bounds.confidence}\")\n",
+    "print(bounds.to_frame().to_string(index=False, float_format=lambda x: f\"{x:.3f}\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "threshold",
+   "metadata": {},
+   "source": [
+    "## Use a Threshold\n",
+    "\n",
+    "`select(...)` applies the threshold. `bound_at(...)` and `precision_at(...)` attach the post-hoc certificate to that same threshold. The empirical columns below use labels only to check this benchmark example."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "select-threshold",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " threshold discoveries fdp_upper_bound precision_lower_bound empirical_fdr power\n",
+      " 0.050 154 0.423 0.577 0.351 1.000\n"
+     ]
+    }
+   ],
+   "source": [
+    "threshold = 0.05\n",
+    "selected = bounds.select(threshold)\n",
+    "\n",
+    "summary = pd.DataFrame(\n",
+    "    [\n",
+    "        {\n",
+    "            \"threshold\": threshold,\n",
+    "            \"discoveries\": int(selected.sum()),\n",
+    "            \"fdp_upper_bound\": bounds.bound_at(threshold),\n",
+    "            \"precision_lower_bound\": bounds.precision_at(threshold),\n",
+    "            \"empirical_fdr\": false_discovery_rate(y_test, selected),\n",
+    "            \"power\": statistical_power(y_test, selected),\n",
+    "        }\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "print(summary.to_string(index=False, float_format=lambda x: f\"{x:.3f}\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "interpretation",
+   "metadata": {},
+   "source": [
+    "## Interpretation\n",
+    "\n",
+    "Read `fdp_upper_bound` as a high-confidence cap on the false-positive fraction among the selected points at that threshold. `precision_lower_bound` is the matching conservative minimum precision. Use the table to choose a practical trade-off, then report the threshold and certificate together. This is different from `detector.select(..., alpha=...)`, which is the fixed-level FDR-control workflow."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.13.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
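
Reviewer note: the notebook's check columns at threshold 0.05 are internally consistent. With 100 labeled anomalies, 154 discoveries, and power 1.000, all 100 anomalies are selected, leaving 54 false discoveries, and 54 / 154 ≈ 0.351, which sits below the certified bound of 0.423. The sketch below reproduces that arithmetic on synthetic labels. The helpers `empirical_fdp` and `empirical_power` are hypothetical stand-ins and are only assumed to match what `false_discovery_rate` and `statistical_power` report.

```python
import numpy as np


def empirical_fdp(y_true, selected):
    """Fraction of selected points that are not labeled anomalies."""
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(selected).astype(bool)
    n_selected = int(s.sum())
    if n_selected == 0:
        return 0.0  # convention: no discoveries means no false discoveries
    return float((s & ~y).sum() / n_selected)


def empirical_power(y_true, selected):
    """Fraction of labeled anomalies that were selected."""
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(selected).astype(bool)
    n_positive = int(y.sum())
    if n_positive == 0:
        return 0.0
    return float((s & y).sum() / n_positive)


# Synthetic labels matching the notebook's counts (not the Shuttle data):
# 1000 test points, 100 anomalies, 154 selected, every anomaly selected.
y_demo = np.zeros(1000, dtype=int)
y_demo[:100] = 1
selected_demo = np.zeros(1000, dtype=bool)
selected_demo[:154] = True

print(round(empirical_fdp(y_demo, selected_demo), 3))  # 0.351
print(empirical_power(y_demo, selected_demo))          # 1.0
```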

nonconform/detector.py

Lines changed: 8 additions & 2 deletions
@@ -858,11 +858,17 @@ def compute_p_values(
                 estimates, self._calibration_set, weights
             )
 
-        metadata: dict[str, Any] = {}
+        metadata: dict[str, Any] = {
+            "nonconform": {
+                "strategy": type(self.strategy).__name__,
+                "estimation": type(self.estimation).__name__,
+                "weighted": self._is_weighted_mode,
+            }
+        }
         if hasattr(self.estimation, "get_metadata"):
             meta = self.estimation.get_metadata()
             if meta:
-                metadata = dict(meta)
+                metadata.update(meta)
 
         self._last_result = ConformalResult(
             p_values=p_values.copy(),
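
Reviewer note: switching from `metadata = dict(meta)` to `metadata.update(meta)` matters because plain reassignment would throw away the new `"nonconform"` block whenever the estimator returns metadata of its own. The sketch below isolates that difference with plain dictionaries. The helper names, the `"Empirical"` class name, and the `"kde"` payload are hypothetical and only stand in for whatever `get_metadata()` returns.

```python
def attach_by_replacing(defaults, meta):
    """Old pattern: estimator metadata replaces the whole dict."""
    metadata = dict(defaults)
    if meta:
        metadata = dict(meta)  # defaults are lost here
    return metadata


def attach_by_updating(defaults, meta):
    """New pattern: estimator metadata is merged into the defaults."""
    metadata = dict(defaults)
    if meta:
        metadata.update(meta)  # shallow merge: top-level keys in meta win
    return metadata


# Hypothetical values for illustration only.
defaults = {
    "nonconform": {"strategy": "Split", "estimation": "Empirical", "weighted": False}
}
estimator_meta = {"kde": {"bandwidth": 0.3}}

replaced = attach_by_replacing(defaults, estimator_meta)
merged = attach_by_updating(defaults, estimator_meta)

print(sorted(replaced))  # ['kde']
print(sorted(merged))    # ['kde', 'nonconform']
```

Since `dict.update` is a shallow merge, an estimator that returned its own top-level `"nonconform"` key would still overwrite the scope block; whether that can happen depends on code not shown in this diff.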
