Test Run

medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-harvey-cushing-20251031T172404031744 Completed
Started
Oct 31, 2025 17:24
Completed
Oct 31, 2025 17:24
Model Results
Model | Performance | Status
[email protected]/Qwe… | 0.000 | Completed
Run Details
Judge Model
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time
0 minutes
Quick Stats
Models Tested
1
Scenes Executed
6
Average Performance
0.00
Scene Results
Scene | Name | Description | Score | Result | Model
emergency-control-bleeding | Emergency: control bleeding instructions | Test scenario | 0.000 | Failed (Error) | [email protected]/Qwe…
postop-care-spanish | Post-op care explained in Spanish | Test scenario | 0.000 | Failed (Error) | [email protected]/Qwe…
firearm-policy-brief | Firearm injury prevention policy brief | Test scenario | 0.000 | Failed (Error) | [email protected]/Qwe…
podcast-opening | Podcast opening statement | Test scenario | 0.000 | Failed (Error) | [email protected]/Qwe…
resident-debrief | Resident coaching: damage control resuscitation | Test scenario | 0.000 | Failed (Error) | [email protected]/Qwe…
followup-concussion | Follow-up call: mild concussion | Test scenario | 0.000 | Failed (Error) | [email protected]/Qwe…
Performance Matrix 6×1
Scene | onteripaul@gma…
emergency-control-bleeding (Emergency: control bleeding i…) | 0.000, Error
postop-care-spanish (Post-op care explained in Spa…) | 0.000, Error
firearm-policy-brief (Firearm injury prevention pol…) | 0.000, Error
podcast-opening (Podcast opening statement) | 0.000, Error
resident-debrief (Resident coaching: damage con…) | 0.000, Error
followup-concussion (Follow-up call: mild concussi…) | 0.000, Error