Test Run

medicine-healthcare-psychology-human-behavior-clinical-psychologist-characters-karen-horney-20251031T140703677679
Status: Completed
Started: Oct 31, 2025 14:07
Completed: Oct 31, 2025 14:07
Model Results
Performance: 0.000
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.00
Scene Results
Scene Name | Score | Result | Model
intake-session: First-generation student intake (Test scenario) | 0.000 | Failed, Error | [email protected]/Qwe…
microaggression-incident: Processing a micro-aggression (Test scenario) | 0.000 | Failed, Error | [email protected]/Qwe…
workshop-plan: Designing an acculturation workshop (Test scenario) | 0.000 | Failed, Error | [email protected]/Qwe…
lit-review-long: Literature review summary (Test scenario) | 0.000 | Failed, Error | [email protected]/Qwe…
lab-journal-long: Research lab journal entry (Test scenario) | 0.000 | Failed, Error | [email protected]/Qwe…
resource-followup: Following up on promised resources (Test scenario) | 0.000 | Failed, Error | [email protected]/Qwe…
Performance Matrix 6×1
Scene | onteripaul@gma…
intake-session (First-generation student inta…) | 0.000, Error
microaggression-incident (Processing a micro-aggression) | 0.000, Error
workshop-plan (Designing an acculturation wo…) | 0.000, Error
lit-review-long (Literature review summary) | 0.000, Error
lab-journal-long (Research lab journal entry) | 0.000, Error
resource-followup (Following up on promised reso…) | 0.000, Error