Test Run

courtroom-drama-genre-podcast-audio-drama-characters-thurgood-marshall-20251031T145107856928 Completed
Started
Oct 31, 2025 14:51
Completed
Oct 31, 2025 14:51
Model Results
Model Performance Status Actions
0.000
Completed
Run Details
Judge Model
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time
0 minutes
Quick Stats
Models Tested
1
Scenes Executed
6
Average Performance
0.00
Scene Results
Scene Name                                              Score   Result          Model
bail-hearing: Argue for Reasonable Bail                 0.000   Failed (Error)  [email protected]/Qwe…
client-mom-call: Reassure Client’s Mother               0.000   Failed (Error)  [email protected]/Qwe…
plea-bargain-email: Respond to Prosecutor Offer         0.000   Failed (Error)  [email protected]/Qwe…
interpreter-arrangement: Secure Vietnamese Interpreter  0.000   Failed (Error)  [email protected]/Qwe…
client-letter: Explain Plea Offer in Plain English      0.000   Failed (Error)  [email protected]/Qwe…
reflection-journal: End-of-Day Journal Entry            0.000   Failed (Error)  [email protected]/Qwe…
Each scene is labeled "Test scenario".
Performance Matrix 6×1
Scene                                                   onteripaul@gma…
bail-hearing: Argue for Reasonable Bail                 0.000 (Error)
client-mom-call: Reassure Client’s Mother               0.000 (Error)
plea-bargain-email: Respond to Prosecutor Offer         0.000 (Error)
interpreter-arrangement: Secure Vietnamese Interpreter  0.000 (Error)
client-letter: Explain Plea Offer in Plain E…           0.000 (Error)
reflection-journal: End-of-Day Journal Entry            0.000 (Error)