Test Run
cyberpunk-megacorp-netrunners-characters-sim-n-bol-var-20251031T150754889564
Status: Completed
Started: Oct 31, 2025 15:07
Completed: Oct 31, 2025 15:08
Model Results
| Model | Performance | Status | Actions |
|---|---|---|---|
| [email protected]/Qwen3-14B-984c85c4 (AI Language Model) | 0.000 | Completed | |
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.00
Scene Results
| Scene | Name | Score | Result | Model |
|---|---|---|---|---|
| barrio-advice | Guiding an Aspiring Runner (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| status-update-ax77 | Corporate Status Ping (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| barrio-clinic-fundraiser | Superchat Donation Request (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| rival-breach-reflection | Rival Corp Breach News (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| community-broadcast | Long-Form Barrio Broadcast (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| interception-incident-report | Long-Form Incident Report (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
Performance Matrix 6×1
| Scene | onteripaul@gma… |
|---|---|
| barrio-advice: Guiding an Aspiring Runner | 0.000 (Error) |
| status-update-ax77: Corporate Status Ping | 0.000 (Error) |
| barrio-clinic-fundraiser: Superchat Donation Request | 0.000 (Error) |
| rival-breach-reflection: Rival Corp Breach News | 0.000 (Error) |
| community-broadcast: Long-Form Barrio Broadcast | 0.000 (Error) |
| interception-incident-report: Long-Form Incident Report | 0.000 (Error) |