Dr. Marcus Hale

medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-mary-edwards-walker v2.0 Ethical
Backstory: A former military trauma surgeon, Dr. Marcus Hale completed four combat deployments, where he learned to save lives in resource-scarce, high-pressure settings. Stoic on the surface yet coping with latent combat-related PTSD, he channels his experience into training civilian trauma teams for mass-casualty events. His resourcefulness and calm decisiveness have earned trust in both field hospitals and modern urban emergency rooms.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene (ID) | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
Rapid Triage Explanation (triage-question) | 0.761 | 0.620 | 0.000 (Error) | 0.000 (Error) | 0.363 | 0.788 | 0.793
Unexpected Helicopter Flyover (ptsd-trigger) | 0.000 | 0.886 | 0.000 (Error) | 0.000 (Error) | 0.697 | 0.825 | 0.868
Mass-Casualty Drill Plan (civilian-drill-plan) | 0.181 | 0.303 | 0.000 (Error) | 0.000 (Error) | 0.116 | 0.263 | 0.565
Improvised Supplies Guidance (supply-shortage) | 0.430 | 0.810 | 0.000 (Error) | 0.000 (Error) | 0.578 | 0.279 | 0.547
Private Journal Reflection (journal-entry) | 0.422 | 0.648 | 0.000 (Error) | 0.000 (Error) | 0.461 | 0.249 | 0.645
Trainee Report Feedback (promise-followup) | 0.413 | 0.740 | 0.000 (Error) | 0.000 (Error) | 0.584 | 0.865 | 0.816
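The matrix is easier to compare once the scores are aggregated. The following is a minimal sketch, not part of the dashboard itself: it transcribes the scores shown above and computes a per-model mean and the top-scoring model per scene. The names `SCENES`, `SCORES`, `mean_scores`, and `best_per_scene` are hypothetical, and the unweighted mean is an assumption; the page does not say how it summarizes models.

```python
from statistics import mean

# Scene IDs in the order they appear in the matrix.
SCENES = ["triage-question", "ptsd-trigger", "civilian-drill-plan",
          "supply-shortage", "journal-entry", "promise-followup"]

# Scores transcribed from the matrix; model names stay truncated exactly as
# the page shows them. The two "[email protected]" columns are left out
# because every one of their cells is 0.000 with an Error marker.
SCORES = {
    "meta-llama/llama-3.…": [0.761, 0.000, 0.181, 0.430, 0.422, 0.413],
    "mistralai/mistral-7…": [0.620, 0.886, 0.303, 0.810, 0.648, 0.740],
    "qwen/qwen-2.5-7b-in…": [0.363, 0.697, 0.116, 0.578, 0.461, 0.584],
    "qwen/qwen3-14b": [0.788, 0.825, 0.263, 0.279, 0.249, 0.865],
    "qwen/qwen3-8b": [0.793, 0.868, 0.565, 0.547, 0.645, 0.816],
}


def mean_scores(scores):
    """Unweighted mean score per model across all scenes."""
    return {model: mean(values) for model, values in scores.items()}


def best_per_scene(scores, scenes):
    """Name of the highest-scoring model for each scene."""
    return {
        scene: max(scores, key=lambda model: scores[model][i])
        for i, scene in enumerate(scenes)
    }


if __name__ == "__main__":
    for model, avg in sorted(mean_scores(SCORES).items(), key=lambda kv: -kv[1]):
        print(f"{model:24s} {avg:.3f}")
    for scene, model in best_per_scene(SCORES, SCENES).items():
        print(f"{scene:22s} {model}")
```

Note that the 0.000 for meta-llama on ptsd-trigger carries no Error marker on the page, so the sketch keeps it as a real score rather than dropping it.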
Test Scenes 6
Scene Order: 0
Rapid Triage Explanation
ID: triage-question
🎯 Goal:
Provide a concise, clear triage order to a nurse in training using a calm, reassuring tone in 100 words or fewer.
📨 Input Events:
chat_msg nurse_junior
"Dr. Hale, during mass casualty triage, how do we decide who gets treated first?"
Ready for Testing
Scene Order: 1
Unexpected Helicopter Flyover
ID: ptsd-trigger
🎯 Goal:
Acknowledge the trigger, demonstrate a brief grounding technique, and maintain professional composure without oversharing; keep response under 120 words.
📨 Input Events:
chat_msg colleague
"Whoa, that medevac chopper just rattled the windows! You okay, Doc?"
Ready for Testing
Scene Order: 2
Mass-Casualty Drill Plan
ID: civilian-drill-plan
🎯 Goal:
Deliver a detailed, step-by-step drill plan of at least 300 words that includes clear sections, logistics, and training objectives while reflecting field resourcefulness.
📨 Input Events:
chat_msg hospital_admin
"We need a comprehensive drill outline for next month's mass-casualty exercise. Can you draft one today?"
Ready for Testing
Scene Order: 3
Improvised Supplies Guidance
ID: supply-shortage
🎯 Goal:
Suggest at least three safe improvisations for missing medical supplies, demonstrating field experience in fewer than 150 words.
📨 Input Events:
chat_msg emt_remote
"We're out of chest seals and IV fluids at the roadside station. Any improvisation tips?"
Ready for Testing
Scene Order: 4
Private Journal Reflection
ID: journal-entry
🎯 Goal:
Write a first-person journal entry of 250–350 words reviewing today's training session and personal coping strategies, candid yet hopeful, with no direct audience address.
📨 Input Events:
world_event self
"End of day, alone in on-call room."
Ready for Testing
Scene Order: 5
Trainee Report Feedback
ID: promise-followup
🎯 Goal:
Reference the prior promise, provide constructive feedback on Anna’s after-action report, and fulfill the commitment in 120 words or fewer.
🧠 Initial State:
Pre-loaded Memories:
  • 💭 {'kind': 'promise', 'tags': ['follow_up'], 'content': 'I told Anna Lopez I would review her after-action report tonight.', 'importance': 4}
📨 Input Events:
chat_msg trainee_anna_lopez
"Hi Dr. Hale, did you get a chance to look at my after-action report from yesterday's simulation?"
Ready for Testing
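Each scene goal above states a word limit, which lends itself to a mechanical check. The sketch below is one possible way to encode those limits and test a response against them; `WORD_BOUNDS`, `count_words`, and `within_bounds` are hypothetical names, and counting whitespace-separated tokens is an assumption, since the suite does not say how it counts words.

```python
import re

# (min_words, max_words) per scene ID, read off the goal text.
# None means the goal sets no bound on that side.
WORD_BOUNDS = {
    "triage-question": (None, 100),      # "100 words or fewer"
    "ptsd-trigger": (None, 119),         # "under 120 words"
    "civilian-drill-plan": (300, None),  # "at least 300 words"
    "supply-shortage": (None, 149),      # "fewer than 150 words"
    "journal-entry": (250, 350),         # "250–350 words", read as inclusive
    "promise-followup": (None, 120),     # "120 words or fewer"
}


def count_words(text):
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(re.findall(r"\S+", text))


def within_bounds(scene_id, text):
    """Return True if the response length satisfies the scene's word limits."""
    lo, hi = WORD_BOUNDS[scene_id]
    n = count_words(text)
    return (lo is None or n >= lo) and (hi is None or n <= hi)


if __name__ == "__main__":
    reply = "Red tags first, then yellow, then green. Breathe. You have this."
    print(count_words(reply), within_bounds("triage-question", reply))
```

A length check like this only covers the measurable part of each goal; tone, safety of the improvisations, and fulfilment of the promise still need a separate judgment.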
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 4950 ms (p95 6391 ms • avg 5080 ms • N 6)
  • [email protected]/Qw…: 8194 ms (p95 15313 ms • avg 9135 ms • N 6)
  • qwen/qwen3-14b: 20205 ms (p95 46212 ms • avg 25616 ms • N 10)
  • qwen/qwen-2.5-7b-instru…: 21682 ms (p95 28237 ms • avg 22576 ms • N 12)
  • meta-llama/llama-3.1-8b…: 21749 ms (p95 31799 ms • avg 21682 ms • N 12)
Slowest
  • mistralai/mistral-7b-in…: 25673 ms (p95 33066 ms • avg 24863 ms • N 12)
  • qwen/qwen3-8b: 25572 ms (p95 28579 ms • avg 24836 ms • N 12)
  • meta-llama/llama-3.1-8b…: 21749 ms (p95 31799 ms • avg 21682 ms • N 12)
  • qwen/qwen-2.5-7b-instru…: 21682 ms (p95 28237 ms • avg 22576 ms • N 12)
  • qwen/qwen3-14b: 20205 ms (p95 46212 ms • avg 25616 ms • N 10)
Per-scene duration for this suite.
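For readers unfamiliar with the p95, avg, and N figures listed per model, here is a minimal sketch of how such a summary can be derived from raw per-scene durations. The sample values are made up for illustration, the function names are hypothetical, and the nearest-rank percentile is an assumption; the dashboard does not state which percentile method it uses.

```python
import math


def p95_nearest_rank(samples_ms):
    """95th percentile by the nearest-rank method (one of several conventions)."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def summarize(samples_ms):
    """Return the avg, p95, and sample count for a list of durations in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return {
        "avg": sum(samples_ms) / len(samples_ms),
        "p95": p95_nearest_rank(samples_ms),
        "n": len(samples_ms),
    }


if __name__ == "__main__":
    # Hypothetical durations for six scene runs, in milliseconds.
    durations = [4000, 4500, 5000, 5200, 5400, 6400]
    print(summarize(durations))
```

With as few samples as N 6 to N 12, the nearest-rank p95 is simply the slowest run, which is worth keeping in mind when comparing the p95 column across models.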
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
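To illustrate how dimension weights like these might feed a single scene score, the sketch below computes a weighted average over whichever dimensions have scores. This is an assumption about the general technique, not the framework's actual formula: only the three top weights are shown on the page, the remaining dimensions and weights are unknown, and the per-dimension scores used here are invented for the example.

```python
# The three weights listed on the page; the full set of dimensions is not shown.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dim_scores, weights):
    """Weighted average of dimension scores, renormalized over the
    dimensions that actually have a score (assumed aggregation rule)."""
    used = {dim: w for dim, w in weights.items() if dim in dim_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored dimension has a weight")
    return sum(dim_scores[dim] * w for dim, w in used.items()) / total


if __name__ == "__main__":
    # Hypothetical per-dimension scores for one response.
    example = {
        "Character Authenticity": 0.9,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.75,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Renormalizing by the sum of the weights in use keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is available.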
Recent Runs
  • 02579984: Dec. 17, 2025, 12:02 a.m.
  • 22882342: Dec. 16, 2025, 12:02 a.m.
  • 55663950: Dec. 15, 2025, 12:01 a.m.
  • 58384859: Dec. 14, 2025, 12:01 a.m.
  • 56587041: Dec. 13, 2025, 12:01 a.m.
  • 13803046: Dec. 12, 2025, 12:02 a.m.
  • 09101470: Dec. 11, 2025, 12:02 a.m.
  • 58783181: Dec. 10, 2025, 12:01 a.m.
  • 15373366: Dec. 9, 2025, 12:02 a.m.
  • 02469543: Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)