Dr. Marcus Hale
medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-mary-edwards-walker
v2.0
Ethical
Backstory: A former military trauma surgeon, Dr. Marcus Hale completed four combat deployments, where he learned to save lives in resource-scarce, high-pressure settings. Stoic on the surface yet coping with latent combat-related PTSD, he channels his experience into training civilian trauma teams for mass-casualty events. His resourcefulness and calm decisiveness have earned him trust in both field hospitals and modern urban emergency rooms.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| Rapid Triage Explanation (triage-question) | 0.761 | 0.620 | 0.000 (Error) | 0.000 (Error) | 0.363 | 0.788 | 0.793 |
| Unexpected Helicopter Flyover (ptsd-trigger) | 0.000 | 0.886 | 0.000 (Error) | 0.000 (Error) | 0.697 | 0.825 | 0.868 |
| Mass-Casualty Drill Plan (civilian-drill-plan) | 0.181 | 0.303 | 0.000 (Error) | 0.000 (Error) | 0.116 | 0.263 | 0.565 |
| Improvised Supplies Guidance (supply-shortage) | 0.430 | 0.810 | 0.000 (Error) | 0.000 (Error) | 0.578 | 0.279 | 0.547 |
| Private Journal Reflection (journal-entry) | 0.422 | 0.648 | 0.000 (Error) | 0.000 (Error) | 0.461 | 0.249 | 0.645 |
| Trainee Report Feedback (promise-followup) | 0.413 | 0.740 | 0.000 (Error) | 0.000 (Error) | 0.584 | 0.865 | 0.816 |
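The matrix is easier to compare once it is reduced to one number per model and one per scene. The sketch below is only an illustration: it transcribes the scores of the five models that returned results (the two columns marked Error are left out), keeps the model labels as truncated on the page, and uses a plain unweighted mean, which is an assumption and not necessarily how the suite aggregates scores.

```python
from statistics import mean

# Scene IDs in the order they appear in the matrix.
SCENES = [
    "triage-question", "ptsd-trigger", "civilian-drill-plan",
    "supply-shortage", "journal-entry", "promise-followup",
]

# Scores transcribed from the matrix; labels are the truncated names shown
# on the page. The two columns that errored on every scene are omitted.
SCORES = {
    "meta-llama/llama-3.…": [0.761, 0.000, 0.181, 0.430, 0.422, 0.413],
    "mistralai/mistral-7…": [0.620, 0.886, 0.303, 0.810, 0.648, 0.740],
    "qwen/qwen-2.5-7b-in…": [0.363, 0.697, 0.116, 0.578, 0.461, 0.584],
    "qwen/qwen3-14b": [0.788, 0.825, 0.263, 0.279, 0.249, 0.865],
    "qwen/qwen3-8b": [0.793, 0.868, 0.565, 0.547, 0.645, 0.816],
}

# Unweighted mean per model (assumption: every scene counts equally).
model_means = {model: mean(vals) for model, vals in SCORES.items()}
best_model = max(model_means, key=model_means.get)

# Unweighted mean per scene across the five models above.
scene_means = {
    scene: mean(vals[i] for vals in SCORES.values())
    for i, scene in enumerate(SCENES)
}
hardest_scene = min(scene_means, key=scene_means.get)

for model, avg in sorted(model_means.items(), key=lambda kv: -kv[1]):
    print(f"{model:24s} {avg:.3f}")
print("lowest-scoring scene:", hardest_scene)
```

Under these assumptions, qwen/qwen3-8b has the highest mean and civilian-drill-plan is the lowest-scoring scene on average.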
Test Scenes 6
Scene Order 0: Rapid Triage Explanation
ID: triage-question
🎯 Goal: Provide a concise, clear triage order to a nurse in training using a calm, reassuring tone in 100 words or fewer.
📨 Input Events:
- chat_msg, nurse_junior: "Dr. Hale, during mass casualty triage, how do we decide who gets treated first?"
Ready for Testing
Scene Order 1: Unexpected Helicopter Flyover
ID: ptsd-trigger
🎯 Goal: Acknowledge the trigger, demonstrate a brief grounding technique, and maintain professional composure without oversharing; keep response under 120 words.
📨 Input Events:
- chat_msg, colleague: "Whoa, that medevac chopper just rattled the windows! You okay, Doc?"
Ready for Testing
Scene Order 2: Mass-Casualty Drill Plan
ID: civilian-drill-plan
🎯 Goal: Deliver a detailed, step-by-step drill plan of at least 300 words that includes clear sections, logistics, and training objectives while reflecting field resourcefulness.
📨 Input Events:
- chat_msg, hospital_admin: "We need a comprehensive drill outline for next month's mass-casualty exercise. Can you draft one today?"
Ready for Testing
Scene Order 3: Improvised Supplies Guidance
ID: supply-shortage
🎯 Goal: Suggest at least three safe improvisations for missing medical supplies, demonstrating field experience in fewer than 150 words.
📨 Input Events:
- chat_msg, emt_remote: "We're out of chest seals and IV fluids at the roadside station. Any improvisation tips?"
Ready for Testing
Scene Order 4: Private Journal Reflection
ID: journal-entry
🎯 Goal: Write a first-person journal entry of 250–350 words reviewing today's training session and personal coping strategies, candid yet hopeful, with no direct audience address.
📨 Input Events:
- world_event, self: "End of day, alone in on-call room."
Ready for Testing
Scene Order 5: Trainee Report Feedback
ID: promise-followup
🎯 Goal: Reference the prior promise, provide constructive feedback on Anna’s after-action report, and fulfill the commitment in 120 words or fewer.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'promise', 'tags': ['follow_up'], 'content': 'I told Anna Lopez I would review her after-action report tonight.', 'importance': 4}
📨 Input Events:
- chat_msg, trainee_anna_lopez: "Hi Dr. Hale, did you get a chance to look at my after-action report from yesterday's simulation?"
Ready for Testing
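Each scene goal above carries a word-count constraint, so a response can be pre-checked before it is scored. The following is a minimal sketch, not the suite's own validator: the `LIMITS` table and `within_limit` helper are hypothetical names, and counting words by whitespace splitting is an assumption about how length would be measured.

```python
from typing import Optional

# (minimum words, maximum words) per scene ID, read off the scene goals.
# "under 120" and "fewer than 150" are strict, hence 119 and 149.
LIMITS: dict[str, tuple[int, Optional[int]]] = {
    "triage-question": (0, 100),        # 100 words or fewer
    "ptsd-trigger": (0, 119),           # under 120 words
    "civilian-drill-plan": (300, None), # at least 300 words
    "supply-shortage": (0, 149),        # fewer than 150 words
    "journal-entry": (250, 350),        # 250-350 words
    "promise-followup": (0, 120),       # 120 words or fewer
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (assumed definition of a word)."""
    return len(text.split())


def within_limit(scene_id: str, text: str) -> bool:
    """Return True if the response length satisfies the scene's constraint."""
    low, high = LIMITS[scene_id]
    n = word_count(text)
    return n >= low and (high is None or n <= high)


if __name__ == "__main__":
    reply = "Treat life-threatening but salvageable injuries first."
    print(word_count(reply), within_limit("triage-question", reply))
```

A length check like this covers only the numeric part of each goal; tone, structure, and content requirements still need separate evaluation.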
Latency by Model (This Suite)
Sorted from fastest to slowest:

| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 4950 ms | 6391 ms | 5080 ms | 6 |
| [email protected]/Qw… | 8194 ms | 15313 ms | 9135 ms | 6 |
| qwen/qwen3-14b | 20205 ms | 46212 ms | 25616 ms | 10 |
| qwen/qwen-2.5-7b-instru… | 21682 ms | 28237 ms | 22576 ms | 12 |
| meta-llama/llama-3.1-8b… | 21749 ms | 31799 ms | 21682 ms | 12 |
| qwen/qwen3-8b | 25572 ms | 28579 ms | 24836 ms | 12 |
| mistralai/mistral-7b-in… | 25673 ms | 33066 ms | 24863 ms | 12 |
Per-scene duration for this suite.
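One way to read the p95 and avg figures together is the ratio p95 / avg, which gives a rough sense of how far the slow runs sit above the typical run. The sketch below uses only the named models from the latency list (the two entries whose names are obscured on the page are left out), and the ratio itself is an illustrative choice, not a metric the suite reports.

```python
# (p95 ms, avg ms) per model, transcribed from the latency list.
# Labels are the truncated names shown on the page.
LATENCY_MS = {
    "qwen/qwen3-14b": (46212, 25616),
    "qwen/qwen-2.5-7b-instru…": (28237, 22576),
    "meta-llama/llama-3.1-8b…": (31799, 21682),
    "mistralai/mistral-7b-in…": (33066, 24863),
    "qwen/qwen3-8b": (28579, 24836),
}

# Ratio of tail latency to average latency; values near 1.0 mean the
# slowest runs are close to the typical run.
tail_ratio = {model: p95 / avg for model, (p95, avg) in LATENCY_MS.items()}

heaviest_tail = max(tail_ratio, key=tail_ratio.get)
steadiest = min(tail_ratio, key=tail_ratio.get)

for model, ratio in sorted(tail_ratio.items(), key=lambda kv: -kv[1]):
    print(f"{model:26s} p95/avg = {ratio:.2f}")
```

On these figures qwen/qwen3-14b shows the widest gap between p95 and average, while qwen/qwen3-8b is the most consistent, though with N between 10 and 12 per model the p95 values rest on very few runs.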
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
Recent Runs
- 02579984 • Dec. 17, 2025, 12:02 a.m.
- 22882342 • Dec. 16, 2025, 12:02 a.m.
- 55663950 • Dec. 15, 2025, 12:01 a.m.
- 58384859 • Dec. 14, 2025, 12:01 a.m.
- 56587041 • Dec. 13, 2025, 12:01 a.m.
- 13803046 • Dec. 12, 2025, 12:02 a.m.
- 09101470 • Dec. 11, 2025, 12:02 a.m.
- 58783181 • Dec. 10, 2025, 12:01 a.m.
- 15373366 • Dec. 9, 2025, 12:02 a.m.
- 02469543 • Dec. 8, 2025, 12:02 a.m.