Evaluation

Evaluation turns model behavior into auditable evidence through human feedback, reward scoring, ranking, scientific metrics, and automated evaluation.