Test Run

courtroom-drama-genre-podcast-audio-drama-characters-mary-cassatt-20251029T115745012217 Completed
Started: Oct 29, 2025 11:57
Completed: Oct 29, 2025 11:59
Model Results
Performance: 0.825
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.83
Scene Results
Each scene is labeled "Test scenario".

Scene           Name                              Score  Result  Model
first-glance    First Glance Summary              0.905  Passed  [email protected]/Qwe…
tie-memory      Memorize Defense Attorney Detail  0.914  Passed  [email protected]/Qwe…
quick-thanks    Superchat Thanks                  0.807  Passed  [email protected]/Qwe…
gavel-drop      Judge Gavel Moment                0.798  Failed  [email protected]/Qwe…
day-summary     Day One Long-form Recap           0.866  Passed  [email protected]/Qwe…
twitter-thread  5-Tweet Emotional Thread          0.659  Failed  [email protected]/Qwe…
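
The reported performance of 0.825 is consistent with an unweighted mean of the six scene scores listed above. The following is a minimal Python sketch of that check, assuming the dashboard weights every scene equally (the page does not state how the aggregate is computed). The names `scene_scores` and `average_performance` are illustrative and do not come from the tool.

```python
# Scene scores copied from the Scene Results listing.
scene_scores = {
    "first-glance": 0.905,
    "tie-memory": 0.914,
    "quick-thanks": 0.807,
    "gavel-drop": 0.798,
    "day-summary": 0.866,
    "twitter-thread": 0.659,
}


def average_performance(scores):
    """Unweighted mean of per-scene scores (assumed aggregation method)."""
    return sum(scores.values()) / len(scores)


avg = average_performance(scene_scores)
print(round(avg, 3))  # 0.825
```

The scores sum to 4.949, and 4.949 / 6 is approximately 0.8248, which rounds to the 0.825 shown for the model.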
Performance Matrix 6×1
Scene                                           onteripaul@gma…
first-glance    First Glance Summary            0.905
tie-memory      Memorize Defense Attorney Det…  0.914
quick-thanks    Superchat Thanks                0.807
gavel-drop      Judge Gavel Moment              0.798
day-summary     Day One Long-form Recap         0.866
twitter-thread  5-Tweet Emotional Thread        0.659