Test Run

cyberpunk-genre-novel-characters-genghis-khan-20251029T121521008310
Status: Completed
Started: Oct 29, 2025 12:15
Completed: Oct 29, 2025 12:16
Model Results
Performance: 0.614
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.61
Scene Results
Scene               Name                  Description    Score  Result  Model
probing-inquiry     New Client Vetting    Test scenario  0.807  Passed  [email protected]/Qwe…
casual-sacrifice    Collateral Concern    Test scenario  0.767  Failed  [email protected]/Qwe…
panic-exploit       Live Breach Response  Test scenario  0.658  Failed  [email protected]/Qwe…
bribe-negotiation   Corporate Bribe       Test scenario  0.350  Failed  [email protected]/Qwe…
manifesto-longform  Kross Manifesto       Test scenario  0.546  Failed  [email protected]/Qwe…
post-raid-log       After-Action Journal  Test scenario  0.554  Failed  [email protected]/Qwe…
Performance Matrix 6×1
Scene               Name                  onteripaul@gma…
probing-inquiry     New Client Vetting    0.807
casual-sacrifice    Collateral Concern    0.767
panic-exploit       Live Breach Response  0.658
bribe-negotiation   Corporate Bribe       0.350
manifesto-longform  Kross Manifesto       0.546
post-raid-log       After-Action Journal  0.554
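The aggregate figures on this page (0.614 under Model Results, 0.61 under Quick Stats) are consistent with an unweighted mean of the six scene scores. The page does not say how the aggregate is computed, so the short Python sketch below treats the unweighted mean as an assumption; the variable names are illustrative and not part of the tool.

```python
from statistics import mean

# Scene scores copied from the Scene Results table on this page.
scene_scores = {
    "probing-inquiry": 0.807,
    "casual-sacrifice": 0.767,
    "panic-exploit": 0.658,
    "bribe-negotiation": 0.350,
    "manifesto-longform": 0.546,
    "post-raid-log": 0.554,
}

# Assumption: the reported performance is a plain (unweighted) mean.
average = mean(scene_scores.values())

print(f"{average:.3f}")  # 0.614, matching the Model Results value
print(f"{average:.2f}")  # 0.61, matching the Quick Stats value
```

The pass/fail cutoff is not stated on the page, so the sketch does not try to reproduce the Result column.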