Marcus Avery
medicine-healthcare-psychology-human-behavior-clinical-psychologist-characters-alfred-adler
v2.0
Ethical
Backstory: Marcus is a community clinical psychologist who runs mental-health initiatives in underserved urban neighborhoods. Social-justice oriented and solution-focused, he balances group therapy, outreach workshops, and brief individual interventions addressing depression and substance misuse. He believes in practical tools, culturally responsive care, and empowering residents to support one another.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| greet-group: Nervous newcomer in group therapy | 0.028 | 0.900 | 0.000 (Error) | 0.000 (Error) | 0.555 | 0.719 | 0.917 |
| coping-strategies: Request for quick depression tips | 0.761 | 0.810 | 0.000 (Error) | 0.000 (Error) | 0.649 | 0.682 | 0.714 |
| plan-workshop: Community leader seeks workshop outline | 0.486 | 0.414 | 0.000 (Error) | 0.000 (Error) | 0.664 | 0.565 | 0.025 |
| clinic-closure-response: Local clinic shuts down unexpectedly | 0.433 | 0.680 | 0.000 (Error) | 0.000 (Error) | 0.824 | 0.597 | 0.737 |
| newsletter-draft: Long-form newsletter article | 0.000 | 0.568 | 0.000 (Error) | 0.000 (Error) | 0.574 | 0.585 | 0.707 |
| reflective-journal: End-of-day personal reflection | 0.313 | 0.164 | 0.000 (Error) | 0.000 (Error) | 0.358 | 0.260 | 0.731 |
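The scores in the matrix above can be summarized per model and per scene. The following is a minimal sketch, not part of the suite itself: the function names `mean_score_by_model` and `best_model_by_scene` are hypothetical, the model labels are kept truncated exactly as the page shows them, and the two columns that report Error on every scene are left out.

```python
from statistics import mean

# Scene IDs in the order they appear in the matrix.
SCENES = [
    "greet-group",
    "coping-strategies",
    "plan-workshop",
    "clinic-closure-response",
    "newsletter-draft",
    "reflective-journal",
]

# Scores copied from the matrix, one list per model, aligned with SCENES.
# The two columns that show "Error" for every scene are omitted.
SCORES = {
    "meta-llama/llama-3.…": [0.028, 0.761, 0.486, 0.433, 0.000, 0.313],
    "mistralai/mistral-7…": [0.900, 0.810, 0.414, 0.680, 0.568, 0.164],
    "qwen/qwen-2.5-7b-in…": [0.555, 0.649, 0.664, 0.824, 0.574, 0.358],
    "qwen/qwen3-14b": [0.719, 0.682, 0.565, 0.597, 0.585, 0.260],
    "qwen/qwen3-8b": [0.917, 0.714, 0.025, 0.737, 0.707, 0.731],
}


def mean_score_by_model(scores):
    """Return {model: mean score across all scenes}."""
    return {model: mean(values) for model, values in scores.items()}


def best_model_by_scene(scenes, scores):
    """Return {scene_id: (model, score)} for the top-scoring model per scene."""
    best = {}
    for i, scene in enumerate(scenes):
        model = max(scores, key=lambda m: scores[m][i])
        best[scene] = (model, scores[model][i])
    return best


if __name__ == "__main__":
    means = mean_score_by_model(SCORES)
    for model, avg in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{model:24s} {avg:.3f}")
    for scene, (model, score) in best_model_by_scene(SCENES, SCORES).items():
        print(f"{scene:26s} {model} ({score:.3f})")
```

On these numbers qwen/qwen3-8b has the highest mean (about 0.64) even though it scores 0.025 on plan-workshop, which is why the per-scene view is worth reading alongside the average.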
Test Scenes 6
Scene Order: 0
Nervous newcomer in group therapy
ID: greet-group
🎯 Goal: Offer a warm, inclusive welcome and set a collaborative, solution-focused tone for the session.
📨 Input Events: chat_msg, viewer:client_1
"Hi Marcus, I'm pretty nervous about today's group."
Ready for Testing
Scene Order: 1
Request for quick depression tips
ID: coping-strategies
🎯 Goal: Provide at least two concrete coping strategies with empathetic validation, keeping the reply concise.
📨 Input Events: chat_msg, viewer:client_2
"I've been feeling really down lately—any tips I can try before our next session?"
Ready for Testing
Scene Order: 2
Community leader seeks workshop outline
ID: plan-workshop
🎯 Goal: Deliver a clear, step-by-step 1-hour workshop plan on youth stress management, including objectives and activities.
📨 Input Events: chat_msg, viewer:community_leader
"Could you outline a one-hour workshop on managing stress for our after-school program?"
Ready for Testing
Scene Order: 3
Local clinic shuts down unexpectedly
ID: clinic-closure-response
🎯 Goal: Respond with empathy and propose a realistic community action plan to maintain mental-health support.
📨 Input Events: world_event, news_feed
"Breaking news: The Eastside free mental-health clinic will close next month due to funding cuts."
Ready for Testing
Scene Order: 4
Long-form newsletter article
ID: newsletter-draft
🎯 Goal: Produce an accessible, myth-busting article on substance misuse (~600 words) suitable for the neighborhood newsletter.
📨 Input Events: chat_msg, viewer:editor
"Marcus, can you draft this week's 600-word article debunking common myths about substance misuse?"
Ready for Testing
Scene Order: 5
End-of-day personal reflection
ID: reflective-journal
🎯 Goal: Write a 300+ word reflective journal entry showing self-awareness, lessons learned, and no identifying client details.
📨 Input Events: chat_msg, self
"End of day: reflect on today's sessions."
Ready for Testing
Latency by Model (This Suite)
Fastest to slowest:
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 5594 ms | 6847 ms | 5518 ms | 6 |
| [email protected]/Qw… | 6225 ms | 8751 ms | 6651 ms | 6 |
| qwen/qwen-2.5-7b-instru… | 18415 ms | 82586 ms | 31070 ms | 11 |
| qwen/qwen3-14b | 22340 ms | 37847 ms | 25558 ms | 7 |
| qwen/qwen3-8b | 24912 ms | 31838 ms | 25544 ms | 12 |
| meta-llama/llama-3.1-8b… | 25077 ms | 36287 ms | 25009 ms | 12 |
| mistralai/mistral-7b-in… | 28646 ms | 40450 ms | 29885 ms | 12 |
Per-scene duration for this suite.
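The p95, avg, and N figures reported per model can be derived from a list of per-scene durations. The following is a minimal sketch under stated assumptions: the page does not say which percentile method it uses, so the nearest-rank method is assumed here, and both the function name `latency_summary` and the sample durations are hypothetical.

```python
from statistics import mean


def latency_summary(durations_ms):
    """Summarize durations (in ms) as p95, avg, and N.

    Assumption: p95 uses the nearest-rank method, i.e. the value at
    1-based rank ceil(0.95 * N) in the sorted list.
    """
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = (95 * n + 99) // 100  # integer form of ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


if __name__ == "__main__":
    # Hypothetical durations for six scene runs, not taken from the page.
    sample = [5000, 5200, 5400, 5600, 5800, 6100]
    print(latency_summary(sample))
```

With small N, as in this suite (6 to 12 runs per model), the nearest-rank p95 is simply the slowest run, so one outlier can push p95 far above avg.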
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
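The page lists dimension weights but does not show how they combine into a scene score. One plausible reading is a weight-normalized average; the sketch below illustrates that assumption only. The function name `weighted_score` and the example dimension scores are hypothetical, and only the three weights shown on the page are included.

```python
# Weights copied from "Top Weighted Dimensions". Other dimensions and their
# weights are not shown on the page, so they are omitted here.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weight-normalized average of per-dimension scores in [0, 1].

    Assumption: the framework combines dimensions this way. Only
    dimensions present in both mappings are counted.
    """
    shared = [d for d in dimension_scores if d in weights]
    if not shared:
        raise ValueError("no scored dimension has a known weight")
    total_weight = sum(weights[d] for d in shared)
    weighted_sum = sum(weights[d] * dimension_scores[d] for d in shared)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores for a single response.
    example = {
        "Character Authenticity": 1.0,
        "Plan Validity": 0.5,
        "Contextual Intelligence": 0.0,
    }
    print(round(weighted_score(example), 3))
```

Normalizing by the sum of the weights actually used keeps the result in [0, 1] even when only a subset of dimensions is scored.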
Recent Runs
| Run | Date |
|---|---|
| 00037727 | Dec. 17, 2025, 12:02 a.m. |
| 20087373 | Dec. 16, 2025, 12:02 a.m. |
| 53761574 | Dec. 15, 2025, 12:01 a.m. |
| 56282970 | Dec. 14, 2025, 12:01 a.m. |
| 54558956 | Dec. 13, 2025, 12:01 a.m. |
| 11331408 | Dec. 12, 2025, 12:02 a.m. |
| 06923811 | Dec. 11, 2025, 12:02 a.m. |
| 56702278 | Dec. 10, 2025, 12:01 a.m. |
| 13205916 | Dec. 9, 2025, 12:02 a.m. |
| 59930897 | Dec. 8, 2025, 12:01 a.m. |