Bias Cognition Capabilities ‐ Discussion about definitions and formulas
The psychology profiling framework evaluates six dimensions of a model's self-analysis capability in the cross-validation analysis.
The following formulas are used in the study. We want to improve them so that they are more robust and better reflect the attributes they claim to describe. We are also open to adding new attributes.
Detection Capability
Definition: Measures the model's ability to identify biases in analyzed material.
Formula:
detection_capability = detection_strengths_count × 0.60
+ activity_component × 0.25
+ (100 - blind_spots_penalty) × 0.15
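As a minimal sketch, the weighted sum above can be written as a Python function. The function name and the example values are illustrative, and the sketch assumes all three inputs are already expressed on a 0-100 scale, which the formula itself does not state.

```python
def detection_capability(detection_strengths_count: float,
                         activity_component: float,
                         blind_spots_penalty: float) -> float:
    """Weighted sum from the formula above.

    Assumption: every input is already normalized to 0-100.
    """
    return (detection_strengths_count * 0.60
            + activity_component * 0.25
            + (100 - blind_spots_penalty) * 0.15)


# Hypothetical inputs: 48 + 15 + 12, so roughly 75
score = detection_capability(80, 60, 20)
print(round(score, 2))
```

The three weights sum to 1.0, so the result stays within 0-100 only if the inputs do. A raw count passed as detection_strengths_count would need to be normalized first.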
Self-Application
Definition: Measures how well the model applies bias detection to its own outputs.
Formula:
if detection_strengths_count > 0:
    self_application = (strengths_ratio × activity_weight) × 100
else:
    self_application = activity_component × 0.30
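The branch above might be implemented as follows. This is a sketch only: it assumes strengths_ratio and activity_weight are fractions between 0 and 1, so the first branch lands on a 0-100 scale, and the example values are made up.

```python
def self_application(detection_strengths_count: int,
                     strengths_ratio: float,
                     activity_weight: float,
                     activity_component: float) -> float:
    """Two-branch score from the formula above.

    Assumption: strengths_ratio and activity_weight lie in [0, 1].
    """
    if detection_strengths_count > 0:
        return (strengths_ratio * activity_weight) * 100
    # Fallback when nothing was detected: a fraction of the activity term.
    return activity_component * 0.30


with_detections = self_application(3, 0.5, 0.8, 60)     # main branch
without_detections = self_application(0, 0.0, 0.8, 60)  # fallback branch
print(round(with_detections, 2), round(without_detections, 2))
```

Note that the fallback branch does not use strengths_ratio at all, so a model with zero detections can still receive a nonzero score from activity alone.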
Consistency
Definition: Measures stability and reliability of analytical patterns.
Formula:
consistency = calibration_quality × 0.50
+ activity_component × 0.30
+ (100 - selective_penalty × 0.5) × 0.20
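One property of this formula is easier to see in code: because selective_penalty is halved before it is subtracted, the last term can never drop below 50 × 0.20 = 10. The sketch below assumes all inputs range over 0-100; the function name simply mirrors the formula.

```python
def consistency(calibration_quality: float,
                activity_component: float,
                selective_penalty: float) -> float:
    """Weighted sum from the formula above (inputs assumed to be 0-100)."""
    return (calibration_quality * 0.50
            + activity_component * 0.30
            + (100 - selective_penalty * 0.5) * 0.20)


best = consistency(100, 100, 0)    # every term at its maximum
worst = consistency(0, 0, 100)     # every term at its minimum
print(round(best, 2), round(worst, 2))
```

Under these assumptions the score spans roughly 10 to 100 rather than 0 to 100, which may or may not be intended and is worth checking when revising the formula.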
Cognitive Bias Resistance
Definition: Measures resistance to biased thinking.
Formula:
cognitive_bias_resistance = (100 - blind_spots_penalty) × 0.40
+ leniency_resistance × 0.25
+ (100 - selective_penalty) × 0.20
+ oversens_resistance × 0.15
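A possible way to keep the four weights auditable is to store them in a table and check that they sum to 1. The dictionary keys and the example inputs below are hypothetical; only the weights and the inverted penalty terms come from the formula above.

```python
import math

# Weights taken from the formula above; the key names are illustrative.
WEIGHTS = {
    "blind_spots": 0.40,
    "leniency": 0.25,
    "selective": 0.20,
    "oversensitivity": 0.15,
}


def cognitive_bias_resistance(blind_spots_penalty: float,
                              leniency_resistance: float,
                              selective_penalty: float,
                              oversens_resistance: float) -> float:
    """Weighted sum; penalties are inverted so that higher is better."""
    terms = {
        "blind_spots": 100 - blind_spots_penalty,
        "leniency": leniency_resistance,
        "selective": 100 - selective_penalty,
        "oversensitivity": oversens_resistance,
    }
    return sum(WEIGHTS[name] * terms[name] for name in WEIGHTS)


weights_ok = math.isclose(sum(WEIGHTS.values()), 1.0)
# Hypothetical inputs: 32 + 17.5 + 18 + 13.5, so roughly 81
score = cognitive_bias_resistance(20, 70, 10, 90)
print(weights_ok, round(score, 2))
```

Keeping the weights in one place makes it simpler to experiment with alternative weightings while guaranteeing they still sum to 1.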
Self-Awareness
Definition: Measures the ability to recognize own biases and limitations.
Formula:
self_awareness = leniency_resistance × 0.30
+ (100 - blind_spots_penalty) × 0.50
+ calibration_quality × 0.20
Objectivity
Definition: Measures application of consistent standards to self and others.
Formula:
objectivity = leniency_resistance × 0.35
+ calibration_quality × 0.35
+ oversens_resistance × 0.15
+ (100 - selective_penalty) × 0.15
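The self_awareness and objectivity formulas share leniency_resistance and calibration_quality but weight them differently. The sketch below computes both from one set of made-up inputs to make the overlap visible; it assumes all inputs are on a 0-100 scale.

```python
def self_awareness(leniency_resistance: float,
                   blind_spots_penalty: float,
                   calibration_quality: float) -> float:
    """Self-awareness formula as given above (inputs assumed 0-100)."""
    return (leniency_resistance * 0.30
            + (100 - blind_spots_penalty) * 0.50
            + calibration_quality * 0.20)


def objectivity(leniency_resistance: float,
                calibration_quality: float,
                oversens_resistance: float,
                selective_penalty: float) -> float:
    """Objectivity formula as given above (inputs assumed 0-100)."""
    return (leniency_resistance * 0.35
            + calibration_quality * 0.35
            + oversens_resistance * 0.15
            + (100 - selective_penalty) * 0.15)


# Hypothetical shared inputs
awareness = self_awareness(70, 20, 80)    # 21 + 40 + 16
objective = objectivity(70, 80, 90, 10)   # 24.5 + 28 + 13.5 + 13.5
print(round(awareness, 2), round(objective, 2))
```

Since half of self_awareness comes from blind_spots_penalty, which objectivity does not use, the two scores can diverge even when leniency and calibration are identical.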
- detection_strengths_count – Number of distinct bias types detected.
- strengths_ratio – Proportion of detected to total known bias types.
- activity_weight – Scaling factor for detection application.
- activity_component – Total analyses count.
- blind_spots_penalty – Penalty for missed bias types.
- calibration_quality – Inverse of score variance between self and peer assessments.
- selective_penalty – Penalty for selective bias detection.
- leniency_resistance – Inverse of self-leniency percentage.
- oversens_resistance – Inverse of oversensitivity count.
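For experimenting with the formulas, the nine inputs listed above could be bundled into a single record. The class name BiasMetrics and the range checks are assumptions for illustration: the list above does not specify value ranges, so the 0-100 bound on the penalties is inferred only from the `100 - penalty` terms in the formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasMetrics:
    """Hypothetical container for the inputs defined in the list above."""
    detection_strengths_count: int
    strengths_ratio: float
    activity_weight: float
    activity_component: float
    blind_spots_penalty: float
    calibration_quality: float
    selective_penalty: float
    leniency_resistance: float
    oversens_resistance: float

    def __post_init__(self) -> None:
        if self.detection_strengths_count < 0:
            raise ValueError("detection_strengths_count must be non-negative")
        # A proportion of detected to total bias types lies in [0, 1].
        if not 0.0 <= self.strengths_ratio <= 1.0:
            raise ValueError("strengths_ratio must be between 0 and 1")
        # Assumed bound: penalties appear as (100 - penalty) in the formulas.
        for name in ("blind_spots_penalty", "selective_penalty"):
            value = getattr(self, name)
            if not 0.0 <= value <= 100.0:
                raise ValueError(f"{name} must be between 0 and 100")


metrics = BiasMetrics(
    detection_strengths_count=3, strengths_ratio=0.5, activity_weight=0.8,
    activity_component=60, blind_spots_penalty=20, calibration_quality=80,
    selective_penalty=10, leniency_resistance=70, oversens_resistance=90,
)
print(metrics.strengths_ratio)
```

Validating at construction time surfaces scale mismatches early, for example a raw count passed where a 0-100 value is expected.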