OliverHennhoefer
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 12 additions & 0 deletions
diff --git a/docs/source/api/index.md b/docs/source/api/index.md
Lines changed: 2 additions & 1 deletion
diff --git a/docs/source/examples/fdr_control.md b/docs/source/examples/fdr_control.md
Lines changed: 19 additions & 0 deletions
diff --git a/docs/source/user_guide/fdr_control.md b/docs/source/user_guide/fdr_control.md
Lines changed: 53 additions & 0 deletions
diff --git a/examples/fdp_bounds.ipynb b/examples/fdp_bounds.ipynb
Lines changed: 212 additions & 0 deletions
diff --git a/nonconform/detector.py b/nonconform/detector.py
Lines changed: 8 additions & 2 deletions
@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- Added post-hoc FDP upper bounds for unweighted conformal p-values via
+  `nonconform.fdr.conformal_fdp_upper_bound`, including certified precision
+  lower bounds and envelope methods `mc_thc`, `mc_hc`, `mc_ks`, `ks`, and
+  `mc_bj`.
+
+### Changed
+
+- Restricted cached-result FDP certificates to known supported empirical split
+  conformal scopes.
+
 ## [1.0.1] - 2026-05-20
 
 ### Security
 
@@ -44,7 +44,8 @@ For the v1 public compatibility contract, see
 
 ## FDR Control
 
-Includes weighted low-level expert APIs (`weighted_false_discovery_control`).
+Includes post-hoc FDP bounds and weighted low-level expert APIs
+(`weighted_false_discovery_control`).
 For standard workflows, prefer `ConformalDetector.select(...)`.
 ::: nonconform.fdr
     options:
 
@@ -65,6 +65,25 @@ no_adjustment = (p_values < 0.05).sum()
 print(f"No adjustment: {no_adjustment} detections")
 ```
 
+## Post-Hoc FDP Bounds
+
+```python
+from nonconform.fdr import conformal_fdp_upper_bound_from_result
+
+p_values = detector.compute_p_values(X_test)  # also caches detector.last_result
+bounds = conformal_fdp_upper_bound_from_result(
+    detector.last_result,
+    method="mc_thc",
+    seed=42,
+    thresholds=np.array([0.01, 0.05, 0.1]),
+)
+
+print(bounds.to_frame().to_string(index=False))
+
+# Optional dense threshold grid for plotting or downstream filtering
+curve = bounds.to_frame(thresholds=np.linspace(0.0, 0.2, 101))
+```
+
 ## Weighted Conformal Selection (Covariate Shift)
 
 ```python
 
@@ -91,6 +91,57 @@ decisions = false_discovery_control(p_values, method="bh") <= 0.05
 
 ---
 
+## Post-Hoc FDP Bounds
+
+Classical FDR control fixes its level in advance and holds only on average over
+repeated runs. If you inspect the p-values and then pick a more convenient
+threshold, do not report the result as if that threshold had been fixed upfront.
+
+For exploratory threshold choice on already-valid unweighted conformal
+p-values, `nonconform.fdr.conformal_fdp_upper_bound_from_result(...)` provides
+a simultaneous upper bound on realized false discovery proportion (FDP) across
+thresholds:
+
+```python
+from nonconform.fdr import conformal_fdp_upper_bound_from_result
+
+p_values = detector.compute_p_values(X_test)  # also caches detector.last_result
+bounds = conformal_fdp_upper_bound_from_result(
+    detector.last_result,
+    confidence=0.95,
+    method="mc_thc",
+    seed=42,
+)
+
+threshold = 0.05
+selected = bounds.select(threshold)
+fdp_bound = bounds.bound_at(threshold)
+precision_floor = bounds.precision_at(threshold)
+table = bounds.to_frame()
+```
+
+Interpretation: after choosing `threshold`, report the attached FDP certificate,
+for example "at threshold 0.05, the 95% FDP upper bound is 0.18, so certified
+precision is at least 0.82." This is a different claim from "BH controls FDR at
+0.18."
+
+Supported envelope methods are `mc_thc` (default), `mc_hc`, `mc_ks`, `ks`, and
+`mc_bj`. Choose the method before inspecting the curve: comparing methods after
+seeing the data and reporting only the best-looking certificate is result
+tuning, and the stated confidence level no longer applies. `ks` is simple and
+often conservative. `mc_bj` can be useful for sharp left-tail behavior, but it
+is numerically heavier and uses the `precision` parameter.
+
+!!! warning "Guarantee scope"
+    This first implementation is intended for unweighted empirical split or
+    detached conformal p-values from a fixed scoring map. It does not cover
+    weighted p-values, probabilistic/conditional p-value estimators,
+    cross-validation/jackknife aggregation, detector or feature selection,
+    threshold-dependent preprocessing, or repeated attempts to pick the
+    best-looking pipeline.
+
+---
+
 ## Selection Entry Points
 
 **Primary (recommended):** `detector.select(X_test, alpha=...)` - dispatches
@@ -101,6 +152,8 @@ handling required.
 
 - Standard (exchangeable): apply BH directly via
   `scipy.stats.false_discovery_control(...)` to conformal p-values.
+- Post-hoc FDP certificates:
+  `conformal_fdp_upper_bound_from_result(result=...)`.
 - Weighted (covariate shift with importance weights):
   `weighted_false_discovery_control(result=...)` or
   `weighted_false_discovery_control_from_arrays(...)`.
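
To make the idea of a bound that holds simultaneously across thresholds concrete, here is a minimal sketch of a DKW-style envelope. It is not the library's `ks` or `mc_*` implementation. It assumes independent, super-uniform null p-values, which conformal p-values sharing one calibration set do not satisfy, so treat it as an illustration of the concept only. The helper name `dkw_fdp_upper_bound` and the synthetic input are hypothetical.

```python
import numpy as np


def dkw_fdp_upper_bound(p_values, thresholds, confidence=0.95):
    """Illustrative simultaneous FDP bound (hypothetical helper, not nonconform API).

    Assumes the null p-values are independent and super-uniform. By the
    one-sided DKW inequality (valid for delta <= 0.5), with probability at
    least `confidence`, for every threshold t at once the number of null
    p-values <= t is at most m*t + sqrt(m*log(1/delta)/2).
    """
    p = np.asarray(p_values, dtype=float)
    t = np.asarray(thresholds, dtype=float)
    m = p.size
    delta = 1.0 - confidence
    slack = np.sqrt(m * np.log(1.0 / delta) / 2.0)
    # Discoveries at each threshold: p-values at or below t.
    discoveries = (p[None, :] <= t[:, None]).sum(axis=1)
    # Cap on false discoveries, using m as a worst case for the number of nulls.
    false_cap = np.minimum(m * t + slack, discoveries)
    # With zero discoveries the FDP is 0 by convention.
    return np.where(discoveries > 0, false_cap / np.maximum(discoveries, 1), 0.0)


# Synthetic input: 100 very small p-values plus 900 spread over [0.2, 1].
p_values = np.concatenate([np.full(100, 1e-4), np.linspace(0.2, 1.0, 900)])
thresholds = np.array([0.01, 0.05, 0.1])
fdp_bounds = dkw_fdp_upper_bound(p_values, thresholds)
for t, b in zip(thresholds, fdp_bounds):
    print(f"threshold={t:.2f}  fdp_upper_bound={b:.3f}  precision_floor={1 - b:.3f}")
```

Because the envelope covers all thresholds at once, any row of the printed table can be reported after the fact without adjusting the confidence level. The library's Monte Carlo envelopes exist because this independence assumption does not hold for conformal p-values.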
 
@@ -0,0 +1,212 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "intro",
+   "metadata": {},
+   "source": [
+    "# Post-Hoc FDP Bounds\n",
+    "\n",
+    "This notebook shows a minimal workflow for attaching a post-hoc FDP certificate to unweighted empirical split conformal p-values."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "import",
+   "metadata": {},
+   "source": [
+    "## Import\n",
+    "\n",
+    "This section loads the dependencies used throughout the notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "imports",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import logging\n",
+    "\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "from oddball import Dataset, load\n",
+    "from pyod.models.iforest import IForest\n",
+    "\n",
+    "from nonconform import ConformalDetector, Split\n",
+    "from nonconform.fdr import conformal_fdp_upper_bound_from_result\n",
+    "from nonconform.metrics import false_discovery_rate, statistical_power\n",
+    "\n",
+    "root_logger = logging.getLogger(\"nonconform\")\n",
+    "if not root_logger.handlers:\n",
+    "    root_logger.addHandler(logging.NullHandler())\n",
+    "root_logger.setLevel(logging.ERROR)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "setup",
+   "metadata": {},
+   "source": [
+    "## Setup\n",
+    "\n",
+    "We load Shuttle with a fixed seed and choose a few p-value thresholds to certify after p-values are computed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "load-data",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "x_train: (22793, 9), x_test: (1000, 9)\n",
+      "y_test positives: 100\n",
+      "calibration size=1000\n"
+     ]
+    }
+   ],
+   "source": [
+    "x_train, x_test, y_test = load(Dataset.SHUTTLE, setup=True, seed=42)\n",
+    "\n",
+    "n_calib = 1_000\n",
+    "thresholds = np.array([0.005, 0.01, 0.02, 0.05, 0.1])\n",
+    "\n",
+    "print(f\"x_train: {x_train.shape}, x_test: {x_test.shape}\")\n",
+    "print(f\"y_test positives: {int(y_test.sum())}\")\n",
+    "print(f\"calibration size={n_calib}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "certificate",
+   "metadata": {},
+   "source": [
+    "## FDP Certificate\n",
+    "\n",
+    "`conformal_fdp_upper_bound_from_result(...)` uses the cached `last_result` from `compute_p_values(...)` and returns threshold-level FDP and precision certificates."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "fit-and-bound",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "method=mc_thc, confidence=0.95\n",
+      " threshold  discoveries  fdp_upper_bound  precision_lower_bound\n",
+      "     0.005          103            0.226                  0.774\n",
+      "     0.010          111            0.210                  0.790\n",
+      "     0.020          120            0.259                  0.741\n",
+      "     0.050          154            0.423                  0.577\n",
+      "     0.100          212            0.581                  0.419\n"
+     ]
+    }
+   ],
+   "source": [
+    "detector = ConformalDetector(\n",
+    "    detector=IForest(n_estimators=100, max_samples=0.8, random_state=42),\n",
+    "    strategy=Split(n_calib=n_calib),\n",
+    "    seed=42,\n",
+    ")\n",
+    "\n",
+    "detector.fit(x_train)\n",
+    "p_values = detector.compute_p_values(x_test)\n",
+    "bounds = conformal_fdp_upper_bound_from_result(\n",
+    "    detector.last_result,\n",
+    "    confidence=0.95,\n",
+    "    method=\"mc_thc\",\n",
+    "    seed=42,\n",
+    "    thresholds=thresholds,\n",
+    ")\n",
+    "\n",
+    "print(f\"method={bounds.method}, confidence={bounds.confidence}\")\n",
+    "print(bounds.to_frame().to_string(index=False, float_format=lambda x: f\"{x:.3f}\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "threshold",
+   "metadata": {},
+   "source": [
+    "## Use a Threshold\n",
+    "\n",
+    "`select(...)` applies the threshold. `bound_at(...)` and `precision_at(...)` attach the post-hoc certificate to that same threshold. The `empirical_fdr` and `power` columns below use the ground-truth labels, which are available only because this is a benchmark, to sanity-check the certificate."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "select-threshold",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " threshold  discoveries  fdp_upper_bound  precision_lower_bound  empirical_fdr  power\n",
+      "     0.050          154            0.423                  0.577          0.351  1.000\n"
+     ]
+    }
+   ],
+   "source": [
+    "threshold = 0.05\n",
+    "selected = bounds.select(threshold)\n",
+    "\n",
+    "summary = pd.DataFrame(\n",
+    "    [\n",
+    "        {\n",
+    "            \"threshold\": threshold,\n",
+    "            \"discoveries\": int(selected.sum()),\n",
+    "            \"fdp_upper_bound\": bounds.bound_at(threshold),\n",
+    "            \"precision_lower_bound\": bounds.precision_at(threshold),\n",
+    "            \"empirical_fdr\": false_discovery_rate(y_test, selected),\n",
+    "            \"power\": statistical_power(y_test, selected),\n",
+    "        }\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "print(summary.to_string(index=False, float_format=lambda x: f\"{x:.3f}\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "interpretation",
+   "metadata": {},
+   "source": [
+    "## Interpretation\n",
+    "\n",
+    "Read `fdp_upper_bound` as a high-confidence cap on the false-positive fraction among the selected points at that threshold. `precision_lower_bound` is the matching conservative minimum precision. Use the table to choose a practical trade-off, then report the threshold and certificate together. This is different from `detector.select(..., alpha=...)`, which is the fixed-level FDR-control workflow."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.13.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
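
The numbers in the notebook output can be cross-checked by hand. The sketch below copies the values printed at threshold 0.05 and recomputes the realized FDP and the precision floor. Variable names are illustrative and no library calls are involved.

```python
# Values copied from the notebook output at threshold 0.05.
discoveries = 154
true_anomalies = 100     # "y_test positives: 100"
power = 1.0              # every true anomaly was selected
fdp_upper_bound = 0.423

# True discoveries follow from power; the rest of the selection is false.
true_discoveries = round(power * true_anomalies)
false_discoveries = discoveries - true_discoveries
realized_fdp = false_discoveries / discoveries

# The certified precision floor is the complement of the FDP upper bound.
precision_lower_bound = 1.0 - fdp_upper_bound

print(f"false discoveries: {false_discoveries}")
print(f"realized FDP: {realized_fdp:.3f}")
print(f"certified precision floor: {precision_lower_bound:.3f}")
print(f"certificate holds on this run: {realized_fdp <= fdp_upper_bound}")
```

The realized FDP of 54/154 matches the `empirical_fdr` column and sits below the certified upper bound, which is what the certificate promises with 95% confidence.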
@@ -858,11 +858,17 @@ def compute_p_values(
             estimates, self._calibration_set, weights
         )
 
-        metadata: dict[str, Any] = {}
+        metadata: dict[str, Any] = {
+            "nonconform": {
+                "strategy": type(self.strategy).__name__,
+                "estimation": type(self.estimation).__name__,
+                "weighted": self._is_weighted_mode,
+            }
+        }
         if hasattr(self.estimation, "get_metadata"):
             meta = self.estimation.get_metadata()
             if meta:
-                metadata = dict(meta)
+                metadata.update(meta)
 
         self._last_result = ConformalResult(
             p_values=p_values.copy(),
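
The switch from `metadata = dict(meta)` to `metadata.update(meta)` matters once `metadata` starts with a provenance block: reassignment would discard it, while `update` keeps it. A minimal standalone sketch of the two patterns follows. The helper names, the `"ExampleEstimation"` value, and `example_key` are placeholders, not names from the library.

```python
from typing import Any, Optional


def merge_replacing(
    base: dict[str, Any], meta: Optional[dict[str, Any]]
) -> dict[str, Any]:
    """Reassignment pattern: estimator metadata replaces the base mapping."""
    metadata = dict(base)
    if meta:
        metadata = dict(meta)
    return metadata


def merge_updating(
    base: dict[str, Any], meta: Optional[dict[str, Any]]
) -> dict[str, Any]:
    """Update pattern: estimator metadata is merged into the base mapping."""
    metadata = dict(base)
    if meta:
        metadata.update(meta)
    return metadata


# Provenance block shaped like the one in the diff; values are placeholders.
base = {
    "nonconform": {
        "strategy": "Split",
        "estimation": "ExampleEstimation",
        "weighted": False,
    }
}
estimator_meta = {"example_key": 0.3}  # hypothetical estimator metadata

replaced = merge_replacing(base, estimator_meta)
merged = merge_updating(base, estimator_meta)

print("nonconform" in replaced)  # provenance lost under reassignment
print("nonconform" in merged)    # provenance kept under update
```

Note that `dict.update` is shallow: if an estimator's metadata ever used the top-level key `"nonconform"`, it would overwrite the provenance block rather than merge into it.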