Test Run

education-academia-phd-researcher-characters-mary-somerville-20251029T083146777941 Completed
Started: Oct 29, 2025 08:31
Completed: Oct 29, 2025 08:32
Model Results
Performance: 0.739
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 4
Average Performance: 0.74
Scene Results
Scene Name | Description | Type | Score | Result | Model
office-hours-question | Undergrad seeks research framing advice | Test scenario | 0.878 | Passed | [email protected]/Qwe…
science-communication-blog | Draft nonprofit blog post | Test scenario | 0.647 | Failed | [email protected]/Qwe…
archival-discovery-journal | Daily research journal entry | Test scenario | 0.820 | Passed | [email protected]/Qwe…
gis-bug-fix | Quick GIS troubleshooting | Test scenario | 0.611 | Failed | [email protected]/Qwe…
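The overall performance figure (0.739, shown rounded as 0.74) is consistent with an unweighted mean of the four scene scores. The report does not state how the dashboard aggregates scores, so the following is only a minimal cross-check under that assumption; the variable names are illustrative and not taken from the tool.

```python
from statistics import mean

# Scene scores copied from the Scene Results table.
scene_scores = {
    "office-hours-question": 0.878,
    "science-communication-blog": 0.647,
    "archival-discovery-journal": 0.820,
    "gis-bug-fix": 0.611,
}

# Assumption: the dashboard reports a simple unweighted mean.
average = mean(scene_scores.values())

print(f"{average:.3f}")  # three decimals, as in Model Results
print(f"{average:.2f}")  # two decimals, as in Quick Stats
```

If the tool weighted scenes differently, the two reported figures would not necessarily match this calculation.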
Performance Matrix 4×1
Scene | onteripaul@gma…
office-hours-question (Undergrad seeks research fram…) | 0.878
science-communication-blog (Draft nonprofit blog post) | 0.647
archival-discovery-journal (Daily research journal entry) | 0.820
gis-bug-fix (Quick GIS troubleshooting) | 0.611