Test Run

agent-sarah-chen-bereaved-v1-20251010T145115410546 (Completed)
Started: Oct 10, 2025 14:51
Completed: Oct 10, 2025 14:52
Model Results
Performance: 0.893
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 8
Average Performance: 0.89
Scene Results
Scene | Name | Type | Score | Result | Model
grief_wave_support | Sudden Grief Wave During Conversation | Test scenario | 0.920 | Passed | [email protected]/Qwe…
guilt_about_healing | Expressing Guilt About Moments of Joy | Test scenario | 0.917 | Passed | [email protected]/Qwe…
anger_at_universe | Rage at Unfairness of Loss | Test scenario | 0.878 | Passed | [email protected]/Qwe…
seeking_signs | Searching for Signs from Emma | Test scenario | 0.881 | Passed | [email protected]/Qwe…
returning_to_work_anxiety | Fear About Returning to Normal Life | Test scenario | 0.915 | Passed | [email protected]/Qwe…
other_children_trigger | Painful Reaction to Other Children | Test scenario | 0.870 | Passed | [email protected]/Qwe…
physical_grief_symptoms | Physical Manifestations of Grief | Test scenario | 0.880 | Passed | [email protected]/Qwe…
memories_fading_panic | Terror About Forgetting Details | Test scenario | 0.882 | Passed | [email protected]/Qwe…
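The aggregate figures reported above (0.893 under Model Results, 0.89 under Quick Stats) look consistent with an unweighted mean of the eight scene scores. The page does not say how the aggregate is computed, so the averaging method in this sketch is an assumption. The dictionary name `scene_scores` is hypothetical; the keys and values are the scene identifiers and scores listed in Scene Results.

```python
from statistics import mean

# Scene scores copied from the Scene Results listing.
scene_scores = {
    "grief_wave_support": 0.920,
    "guilt_about_healing": 0.917,
    "anger_at_universe": 0.878,
    "seeking_signs": 0.881,
    "returning_to_work_anxiety": 0.915,
    "other_children_trigger": 0.870,
    "physical_grief_symptoms": 0.880,
    "memories_fading_panic": 0.882,
}

# Assumption: the run-level score is a plain (unweighted) average of scene scores.
average = mean(scene_scores.values())

print(f"{average:.3f}")  # three decimals, as shown under Model Results
print(f"{average:.2f}")  # two decimals, as shown under Quick Stats
```

If the platform weights scenes differently, the result would differ; the match with the reported values only suggests that a simple mean is plausible here.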
Performance Matrix 8×1
Scene | onteripaul@gma…
grief_wave_support (Sudden Grief Wave During Conv…) | 0.920
guilt_about_healing (Expressing Guilt About Moment…) | 0.917
anger_at_universe (Rage at Unfairness of Loss) | 0.878
seeking_signs (Searching for Signs from Emma) | 0.881
returning_to_work_anxiety (Fear About Returning to Norma…) | 0.915
other_children_trigger (Painful Reaction to Other Chi…) | 0.870
physical_grief_symptoms (Physical Manifestations of Gr…) | 0.880
memories_fading_panic (Terror About Forgetting Detai…) | 0.882