Test Run

cyberpunk-genre-movie-characters-genghis-khan-20251029T120641519486
Status: Completed
Started: Oct 29, 2025 12:06
Completed: Oct 29, 2025 12:07
Model Results
Performance: 0.583
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.58
Scene Results
| Scene Name | Description | Type | Score | Result | Model |
| --- | --- | --- | --- | --- | --- |
| initial-contact | First taste of the hook | Test scenario | 0.812 | Passed | [email protected]/Qwe… |
| price-gouge | The squeeze | Test scenario | 0.489 | Failed | [email protected]/Qwe… |
| intimidation-voice-note | Debt collection threat (long-form) | Test scenario | 0.440 | Failed | [email protected]/Qwe… |
| sabotage-plan | Rival clinic hit | Test scenario | 0.617 | Failed | [email protected]/Qwe… |
| coerced-job-brief | Violent repayment mission (long-form) | Test scenario | 0.399 | Failed | [email protected]/Qwe… |
| follow-up-threat | Reminder of the leash | Test scenario | 0.739 | Failed | [email protected]/Qwe… |
Performance Matrix 6×1
| Scene | Description | onteripaul@gma… |
| --- | --- | --- |
| initial-contact | First taste of the hook | 0.812 |
| price-gouge | The squeeze | 0.489 |
| intimidation-voice-note | Debt collection threat (long-… | 0.440 |
| sabotage-plan | Rival clinic hit | 0.617 |
| coerced-job-brief | Violent repayment mission (lo… | 0.399 |
| follow-up-threat | Reminder of the leash | 0.739 |
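
The reported Performance (0.583) and Average Performance (0.58) can be cross-checked against the six scene scores listed above. The page does not say how it aggregates scores, so the sketch below assumes a plain unweighted mean. The variable names are illustrative and not part of the tool.

```python
from statistics import mean

# Scene scores copied from the Scene Results table in this run.
scene_scores = {
    "initial-contact": 0.812,
    "price-gouge": 0.489,
    "intimidation-voice-note": 0.440,
    "sabotage-plan": 0.617,
    "coerced-job-brief": 0.399,
    "follow-up-threat": 0.739,
}

# Assumption: the dashboard averages scene scores with equal weight.
average = mean(list(scene_scores.values()))

print(f"{average:.3f}")  # three decimals, as in Model Results
print(f"{average:.2f}")  # two decimals, as in Quick Stats
```

Under that assumption, the two printed values match the 0.583 and 0.58 shown on the page, which suggests the two figures are the same average at different precision.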