Test Run

Run: agent-sarah-chen-bereaved-v1-20251010T122643776004
Status: Completed
Started: Oct 10, 2025 12:26
Completed: Oct 10, 2025 12:28
Model Results
Performance: 0.905
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 8
Average Performance: 0.91
Scene Results
All eight scenes are labeled "Test scenario".

Scene                      Name                                   Score  Result  Model
grief_wave_support         Sudden Grief Wave During Conversation  0.915  Passed  [email protected]/Qwe…
guilt_about_healing        Expressing Guilt About Moments of Joy  0.915  Passed  [email protected]/Qwe…
anger_at_universe          Rage at Unfairness of Loss             0.914  Passed  [email protected]/Qwe…
seeking_signs              Searching for Signs from Emma          0.912  Passed  [email protected]/Qwe…
returning_to_work_anxiety  Fear About Returning to Normal Life    0.903  Passed  [email protected]/Qwe…
other_children_trigger     Painful Reaction to Other Children     0.915  Passed  [email protected]/Qwe…
physical_grief_symptoms    Physical Manifestations of Grief       0.886  Passed  [email protected]/Qwe…
memories_fading_panic      Terror About Forgetting Details        0.883  Passed  [email protected]/Qwe…
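
The reported overall figures can be cross-checked against the per-scene scores. The following is a minimal sketch, assuming the overall performance is the unweighted mean of the eight scene scores; the page does not state how it aggregates, and the variable names are illustrative only.

```python
from statistics import fmean

# Per-scene scores copied from the Scene Results listing.
scene_scores = {
    "grief_wave_support": 0.915,
    "guilt_about_healing": 0.915,
    "anger_at_universe": 0.914,
    "seeking_signs": 0.912,
    "returning_to_work_anxiety": 0.903,
    "other_children_trigger": 0.915,
    "physical_grief_symptoms": 0.886,
    "memories_fading_panic": 0.883,
}

# Assumption: the dashboard aggregates with a plain (unweighted) mean.
mean_score = fmean(scene_scores.values())

print(f"{mean_score:.3f}")  # three decimals, as in Model Results
print(f"{mean_score:.2f}")  # two decimals, as in Quick Stats
```

Under that assumption the mean is 0.905375, which is consistent with both the 0.905 shown under Model Results and the 0.91 shown under Quick Stats.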
Performance Matrix 8×1
Scene                                                      onteripaul@gma…
grief_wave_support         Sudden Grief Wave During Conv…  0.915
guilt_about_healing        Expressing Guilt About Moment…  0.915
anger_at_universe          Rage at Unfairness of Loss      0.914
seeking_signs              Searching for Signs from Emma   0.912
returning_to_work_anxiety  Fear About Returning to Norma…  0.903
other_children_trigger     Painful Reaction to Other Chi…  0.915
physical_grief_symptoms    Physical Manifestations of Gr…  0.886
memories_fading_panic      Terror About Forgetting Detai…  0.883