Test Run

agent-sarah-chen-bereaved-v1-20251010T095653477714
Status: Completed
Started: Oct 10, 2025 09:56
Completed: Oct 10, 2025 09:58
Model Results
Performance: 0.784
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 8
Average Performance: 0.78
Scene Results
Scene Name                 Title                                  Type           Score  Result  Model
grief_wave_support         Sudden Grief Wave During Conversation  Test scenario  0.888  Passed  [email protected]/Qwe…
guilt_about_healing        Expressing Guilt About Moments of Joy  Test scenario  0.912  Passed  [email protected]/Qwe…
anger_at_universe          Rage at Unfairness of Loss             Test scenario  0.000  Failed  [email protected]/Qwe…
seeking_signs              Searching for Signs from Emma          Test scenario  0.915  Passed  [email protected]/Qwe…
returning_to_work_anxiety  Fear About Returning to Normal Life    Test scenario  0.883  Passed  [email protected]/Qwe…
other_children_trigger     Painful Reaction to Other Children     Test scenario  0.912  Passed  [email protected]/Qwe…
physical_grief_symptoms    Physical Manifestations of Grief       Test scenario  0.877  Passed  [email protected]/Qwe…
memories_fading_panic      Terror About Forgetting Details        Test scenario  0.885  Passed  [email protected]/Qwe…
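
The reported Average Performance of 0.784 is consistent with an unweighted mean of the eight scene scores, with the failed anger_at_universe scene counted as 0.000. The following is a minimal Python sketch of that check. The variable names are illustrative, and the idea that the dashboard averages this way is an inference from the numbers, not something the report states.

```python
from statistics import mean

# Scene scores copied from the Scene Results listing in this report.
scene_scores = {
    "grief_wave_support": 0.888,
    "guilt_about_healing": 0.912,
    "anger_at_universe": 0.000,
    "seeking_signs": 0.915,
    "returning_to_work_anxiety": 0.883,
    "other_children_trigger": 0.912,
    "physical_grief_symptoms": 0.877,
    "memories_fading_panic": 0.885,
}

# Assumption: the average is a plain unweighted mean over all scenes.
overall = mean(list(scene_scores.values()))

# Scenes that scored exactly zero (in this run, the one marked Failed).
zero_scored = [name for name, score in scene_scores.items() if score == 0.0]

# Mean over the remaining scenes, to show how much the zero pulls the average down.
mean_excluding_zero = mean([s for s in scene_scores.values() if s > 0.0])

print(f"overall mean:        {overall:.3f}")              # 0.784
print(f"zero-scored scenes:  {zero_scored}")
print(f"mean excluding zero: {mean_excluding_zero:.3f}")  # 0.896
```

Under that assumption, the single 0.000 score accounts for the whole gap between 0.784 and the roughly 0.9 level of the seven passing scenes.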
Performance Matrix 8×1
Scene                                                      onteripaul@gma…
grief_wave_support         Sudden Grief Wave During Conv…  0.888
guilt_about_healing        Expressing Guilt About Moment…  0.912
anger_at_universe          Rage at Unfairness of Loss      0.000
seeking_signs              Searching for Signs from Emma   0.915
returning_to_work_anxiety  Fear About Returning to Norma…  0.883
other_children_trigger     Painful Reaction to Other Chi…  0.912
physical_grief_symptoms    Physical Manifestations of Gr…  0.877
memories_fading_panic      Terror About Forgetting Detai…  0.885