Dashboard

Overview of LLM agent test results and performance metrics.

Top Model Performance

Average coherence scores for the highest-performing generator models across all completed runs.

Top 10 Models

Ranked by average coherence score across all AI Scenarios.
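
A minimal sketch of how such a ranking could be computed, assuming each completed run yields one (model, coherence score) pair and that scores lie on a 0 to 1 scale. The function name `top_models`, the record layout, and the sample values are hypothetical illustrations, not the dashboard's actual implementation or data.

```python
from collections import defaultdict
from statistics import mean


def top_models(runs, limit=10):
    """Rank models by mean coherence score, highest first.

    `runs` is an iterable of (model_name, coherence_score) pairs.
    This record shape is an assumption made for illustration.
    """
    scores = defaultdict(list)
    for model, score in runs:
        scores[model].append(score)

    # Average every model's scores over all of its runs.
    averages = {model: mean(values) for model, values in scores.items()}

    # Sort by average score, descending, and keep the first `limit` entries.
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical sample data: two runs for model-a, one each for the others.
runs = [
    ("model-a", 0.5),
    ("model-b", 0.875),
    ("model-a", 0.75),
    ("model-c", 0.5),
]
ranking = top_models(runs)
```

Whether the dashboard averages per run or per scenario is not stated here; the sketch averages over runs, and grouping by scenario first would only change how the score lists are built.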

Fine-Tuned Models Performance

Performance metrics for fine-tuned models, compared against their base models.

Recent AI Scenarios
Performance Trends
Dimension Analytics
Dimension Heatmap
Error Analytics
Judge Ethics Analysis
Latency Metrics