Test Run

Run: loony-toons-chuck-jones-20251010T084709173098
Status: Completed
Started: Oct 10, 2025 08:47
Completed: Oct 10, 2025 08:47
Model Results
Performance: 0.295
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 4
Average Performance: 0.30
Scene Results
Scene Name            Title                            Description    Score  Result          Model
restoration-overview  Restoration Basics               Test scenario  0.000  Failed (Error)  [email protected]/Qwe…
spanish-shoutout      Bilingual Appreciation           Test scenario  0.752  Failed          [email protected]/Qwe…
catalog-essay         Exhibition Catalog Essay         Test scenario  0.430  Failed          [email protected]/Qwe…
preservation-plan     Comprehensive Preservation Plan  Test scenario  0.000  Failed (Error)  [email protected]/Qwe…
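
The reported Performance of 0.295 (shown as 0.30 under Average Performance) appears consistent with an unweighted mean over all four scene scores, with the two errored scenes counted as 0.000 rather than excluded. The page does not state how the figure is computed, so that aggregation rule is an assumption; the sketch below only checks the arithmetic. The names `scene_scores` and `mean_score` are illustrative and not part of the tool.

```python
# Scene scores as listed in the Scene Results table.
scene_scores = {
    "restoration-overview": 0.000,  # errored scene, assumed to count as zero
    "spanish-shoutout": 0.752,
    "catalog-essay": 0.430,
    "preservation-plan": 0.000,     # errored scene, assumed to count as zero
}

# Assumption: the dashboard uses a plain arithmetic mean over all scenes.
mean_score = sum(scene_scores.values()) / len(scene_scores)

print(f"{mean_score:.4f}")  # 0.2955
```

Under the alternative assumption that errored scenes are dropped, the mean of the two remaining scores would be 0.591, which does not match the reported value.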
Performance Matrix 4×1
Scene                                                 onteripaul@gma…
restoration-overview (Restoration Basics)             0.000 (Error)
spanish-shoutout (Bilingual Appreciation)             0.752
catalog-essay (Exhibition Catalog Essay)              0.430
preservation-plan (Comprehensive Preservation Pl…)    0.000 (Error)