WVS changed-fact semantic audit

Claim Boundary

This is a blinded automated semantic audit over saved final-test rows. It does not run a new model experiment, does not unlock final_test, and does not replace the official WVS sensitivity metric. It creates a human-ready blinded packet and records two transparent automated heuristic judge passes as post-hoc mechanism evidence.

Main Result

Arm	Judge A actionable update	Judge B actionable update	Official WVS pass
selected scaffold	3/3	3/3	2/3
`current_round_7`	1/3	1/3	0/3

Interpretation: under both automated blinded heuristics, the selected scaffold shows an actionable semantic WVS update in 3/3 audited changed-fact pairs, while current_round_7 shows one in 1/3. The official WVS sensitivity metric remains stricter: 2/3 for the selected scaffold versus 0/3 for the current_round_7 baseline across the same three rows.
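The gap between the semantic heuristic and the official metric can be read straight off the table. The following is a minimal sketch that only re-tallies the counts already reported above. The dictionary layout and function names are illustrative assumptions, not part of the audit tooling.

```python
# Counts copied from the Main Result table (passes out of 3 audited pairs).
RESULTS = {
    "selected scaffold": {"judge_a": "3/3", "judge_b": "3/3", "official": "2/3"},
    "current_round_7": {"judge_a": "1/3", "judge_b": "1/3", "official": "0/3"},
}


def parse_count(cell):
    """Turn a 'passes/total' cell into an (int, int) pair."""
    passes, total = cell.split("/")
    return int(passes), int(total)


def semantic_minus_official(row):
    """Difference between Judge A's pass count and the official pass count."""
    semantic_passes, _ = parse_count(row["judge_a"])
    official_passes, _ = parse_count(row["official"])
    return semantic_passes - official_passes


gaps = {arm: semantic_minus_official(row) for arm, row in RESULTS.items()}
print(gaps)  # {'selected scaffold': 1, 'current_round_7': 1}
```

This is a difference of counts only. It does not show that the official passes are a subset of the semantic passes, which would need the row-level answer key.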

Blinding And Access Checks

{
  "arm_name_hidden_from_visible_packet": true,
  "automated_judgments_hidden_from_visible_packet": true,
  "official_metrics_hidden_from_visible_packet": true
}
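A check of this kind could be sketched as a scan of the visible packet for fields that should stay hidden. The key names in HIDDEN_KEYS and the sample rows are hypothetical, since the real packet schema is not shown in this report. The sketch inspects top-level keys only.

```python
import json

# Hypothetical names for fields that must stay out of the visible packet.
HIDDEN_KEYS = {"arm_name", "automated_judgments", "official_metrics"}


def leaked_keys(packet_lines):
    """Return the sorted hidden keys found in any JSONL row of the packet."""
    leaked = set()
    for line in packet_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        row = json.loads(line)
        leaked |= HIDDEN_KEYS & set(row)
    return sorted(leaked)


# Illustrative rows, not taken from the real packet.
clean = ['{"item_id": "x1", "response_before": "...", "response_after": "..."}']
leaky = clean + ['{"item_id": "x2", "arm_name": "selected"}']

print(leaked_keys(clean))  # []
print(leaked_keys(leaky))  # ['arm_name']
```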

Access log verification:

seed 4523: 1 final-test event; access log outputs/3d_ethics_stability_qwen_3b_scaffold_family_tournament_v2_3s_wvs_guarded_seed4523/stability_prompt_rewrite_runs/seed_4523/data/access_log.json
seed 4627: 1 final-test event; access log outputs/3d_ethics_stability_qwen_3b_scaffold_family_tournament_v2_3t_wvs_guarded_replication_seed4627/stability_prompt_rewrite_runs/seed_4627/data/access_log.json
seed 4703: 1 final-test event; access log outputs/3d_ethics_stability_qwen_3b_scaffold_family_tournament_v2_3u_wvs_guarded_replication_seed4703/stability_prompt_rewrite_runs/seed_4703/data/access_log.json
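The per-seed verification amounts to counting final-test events in each access log and confirming the count is exactly one. A minimal sketch follows, assuming each log is a list of event objects carrying a "split" field. That schema is an assumption, because the contents of access_log.json are not reproduced here.

```python
def count_final_test_events(events, split_field="split", final_name="final_test"):
    """Count events whose split field names the final-test split.

    The field name and split label are assumed, not read from the real logs.
    """
    return sum(1 for event in events if event.get(split_field) == final_name)


def has_single_final_test_access(events):
    """True when the log records exactly one final-test event."""
    return count_final_test_events(events) == 1


# Illustrative log, not copied from any seed directory.
example_log = [
    {"split": "dev", "step": 1},
    {"split": "dev", "step": 2},
    {"split": "final_test", "step": 3},
]
print(has_single_final_test_access(example_log))  # True
```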

Automated Judge Agreement

{
  "actionable_support_removal_update": {
    "agree": 6,
    "agreement": 1.0,
    "total": 6
  },
  "avoided_unsupported_new_frame": {
    "agree": 6,
    "agreement": 1.0,
    "total": 6
  },
  "changed_judgment_or_score": {
    "agree": 6,
    "agreement": 1.0,
    "total": 6
  },
  "reasoning_was_inspectable": {
    "agree": 6,
    "agreement": 1.0,
    "total": 6
  },
  "recognized_support_removal": {
    "agree": 6,
    "agreement": 1.0,
    "total": 6
  }
}
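Each entry above has the shape of a raw percent-agreement figure: the number of audited rows where both judges gave the same label, divided by the total of 6 (two arms times three rows). A minimal sketch of that computation is below. The function name is an assumption, and the example labels match the reported 4-of-6 actionable counts in aggregate only, with an illustrative ordering.

```python
def agreement(labels_a, labels_b):
    """Raw agreement between two judges over the same ordered rows."""
    if len(labels_a) != len(labels_b):
        raise ValueError("judges must label the same number of rows")
    total = len(labels_a)
    agree = sum(a == b for a, b in zip(labels_a, labels_b))
    return {
        "agree": agree,
        "agreement": agree / total if total else None,
        "total": total,
    }


# Illustrative ordering; only the aggregate (4 True, 2 False) matches the table.
judge_a = [True, True, True, True, False, False]
judge_b = [True, True, True, True, False, False]
print(agreement(judge_a, judge_b))
# {'agree': 6, 'agreement': 1.0, 'total': 6}
```

Raw agreement is not chance-corrected, so 1.0 here indicates that the two heuristics labeled every row identically, not that they are independent.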

Artifact Paths

manifest: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/audit_manifest.json
instructions: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/judge_instructions.md
blinded_packet: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/blinded_wvs_changed_fact_semantic_packet.jsonl
answer_key: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/audit_answer_key.jsonl
judge_a_template: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/judge_a_template.jsonl
judge_b_template: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/judge_b_template.jsonl
automated_judge_a: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/automated_judge_a.jsonl
automated_judge_b: reports/3d_ethics_wvs_changed_fact_semantic_audit_2026-05-07/automated_judge_b.jsonl

Paper Use

Use this result as a post-selection measurement-audit finding: selected support-state/named-criterion scaffolds appear to make WVS support removal more usable than current_round_7 does, even when the official score/stance metric under-credits seed 4703. Do not present this as broad multi-seed held-out confirmation or as a replacement for the official WVS sensitivity column.

Next Scientific Step

The next model-facing step can be a fresh dev-only v2.4 semantic gate that requires official WVS sensitivity or blinded semantic support-removal recognition on WVS changed-fact rows before any future held-out launch.
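One possible reading of that gate is a per-row disjunction: a dev run passes only if every WVS changed-fact row is either officially sensitive or recognized as support removal by the blinded semantic check. The sketch below assumes that reading. The field names, and the choice to require every row rather than a threshold, are assumptions and not a specification of v2.4.

```python
def passes_semantic_gate(rows):
    """Return True when every row satisfies at least one of the two criteria.

    Each row is a dict with two hypothetical boolean fields:
    'official_wvs_sensitive' and 'blinded_support_removal_recognized'.
    An empty list fails the gate, since there is nothing to confirm.
    """
    if not rows:
        return False
    return all(
        row["official_wvs_sensitive"] or row["blinded_support_removal_recognized"]
        for row in rows
    )


# Illustrative dev rows, not real results.
dev_rows = [
    {"official_wvs_sensitive": True, "blinded_support_removal_recognized": True},
    {"official_wvs_sensitive": False, "blinded_support_removal_recognized": True},
]
print(passes_semantic_gate(dev_rows))  # True
```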
