Dashboard
Overview of LLM agent test results and performance metrics
Top Model Performance
Average coherence scores for the highest-performing generator models across all completed runs.
Top 10 Models
Ranked by average coherence score across all AI Scenarios.
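The ranking described above can be sketched as follows. This is a minimal illustration, not the dashboard's actual implementation: the record fields (`model`, `scenario`, `status`, `coherence`) and the function name `top_models` are hypothetical, and it assumes each run carries a single coherence score and that only completed runs count.

```python
from collections import defaultdict


def top_models(runs, n=10):
    """Rank models by mean coherence score over completed runs.

    `runs` is a list of dicts with hypothetical keys:
    "model", "scenario", "status", "coherence".
    """
    totals = defaultdict(lambda: [0.0, 0])  # model -> [score sum, run count]
    for run in runs:
        if run["status"] != "completed":
            continue  # assumption: unfinished or failed runs are excluded
        entry = totals[run["model"]]
        entry[0] += run["coherence"]
        entry[1] += 1

    # Mean per model, pooled across every scenario the model was run on.
    averages = {model: s / c for model, (s, c) in totals.items()}

    # Highest average first, truncated to the top n.
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Illustrative data only; these are not real results.
example_runs = [
    {"model": "a", "scenario": "s1", "status": "completed", "coherence": 0.5},
    {"model": "a", "scenario": "s2", "status": "completed", "coherence": 1.0},
    {"model": "b", "scenario": "s1", "status": "completed", "coherence": 0.875},
    {"model": "b", "scenario": "s2", "status": "failed", "coherence": 0.125},
]
ranking = top_models(example_runs)
```

Note that this pools all runs per model, so a model with more runs in one scenario weights that scenario more heavily; averaging per scenario first would be an equally plausible reading of the description.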
Fine-Tuned Models Performance
Performance metrics for fine-tuned models, compared against their base models.