Evaluation

Evaluation turns model behavior into auditable evidence through human feedback, reward scoring, ranking, scientific metrics, and automated evaluation.
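
As a rough illustration of what "auditable evidence" could look like in practice, the sketch below combines two of the signals named above (a reward score and a human rating) into a ranked, serialized log. Everything in it is an assumption made for illustration: the `EvalRecord` fields, the tie-breaking rule, and the JSON-lines output format are hypothetical and are not taken from this repository.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EvalRecord:
    """One evaluated model response (hypothetical schema)."""
    response_id: str
    reward: float       # assumed: score from a reward model, higher is better
    human_rating: int   # assumed: human feedback on a 1-5 scale


def rank_records(records):
    """Sort best-first by reward, breaking ties with the human rating."""
    return sorted(records, key=lambda r: (r.reward, r.human_rating), reverse=True)


def to_audit_log(records):
    """Serialize ranked records as JSON lines so a reviewer can inspect them later."""
    ranked = rank_records(records)
    return [
        json.dumps({"rank": i + 1, **asdict(r)}, sort_keys=True)
        for i, r in enumerate(ranked)
    ]


records = [
    EvalRecord("a", 0.42, 3),
    EvalRecord("b", 0.91, 5),
    EvalRecord("c", 0.42, 4),
]
audit_log = to_audit_log(records)
for line in audit_log:
    print(line)
```

Writing each record with its rank and raw scores, rather than only the final ordering, is one way to keep the result checkable after the fact; a real pipeline would likely also record the metric definitions and the evaluator identity.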