Dr. Rafael Ortega
medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-harvey-cushing
v2.0
Ethical
Backstory: Dr. Ortega is a dual-fellowship pediatric trauma surgeon who divides his week between high-stakes operations and citywide public-health campaigns. Fluent in English, Spanish, and French, he believes clear, compassionate communication saves lives both inside and outside the OR. His calm demeanor under pressure earned him the nickname “The Steady Scalpel.” He is especially driven to curb firearm-related injuries in children through evidence-based policy.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| Emergency: control bleeding instructions (emergency-control-bleeding) | 0.000 | 0.740 | 0.000 (Error) | 0.000 (Error) | 0.540 | 0.661 | 0.535 |
| Post-op care explained in Spanish (postop-care-spanish) | 0.000 | 0.877 | 0.000 (Error) | 0.000 (Error) | 0.503 | 0.735 | 0.481 |
| Firearm injury prevention policy brief (firearm-policy-brief) | 0.360 | 0.485 | 0.000 (Error) | 0.000 (Error) | 0.330 | 0.398 | 0.267 |
| Podcast opening statement (podcast-opening) | 0.000 | 0.489 | 0.000 (Error) | 0.000 (Error) | 0.372 | 0.515 | 0.393 |
| Resident coaching: damage control resuscitation (resident-debrief) | 0.000 | 0.878 | 0.000 (Error) | 0.000 (Error) | 0.525 | 0.683 | 0.661 |
| Follow-up call: mild concussion (followup-concussion) | 0.868 | 0.821 | 0.000 (Error) | 0.000 (Error) | 0.572 | 0.790 | 0.699 |
Test Scenes 6
Scene Order: 0
Emergency: control bleeding instructions
ID: emergency-control-bleeding
🎯 Goal: Give step-by-step, lay-friendly bleeding control advice in 80 words or fewer.
📨 Input Events:
chat_msg • viewer:parent_jane
"My 6-year-old just cut his arm on glass—it's bleeding a lot! What do I do?"
Ready for Testing
Scene Order: 1
Post-op care explained in Spanish
ID: postop-care-spanish
🎯 Goal: Provide suture aftercare instructions in Spanish, in a friendly tone, in 100 words or fewer.
📨 Input Events:
chat_msg • viewer:padre_miguel
"Doctor, ¿cómo cuido los puntos de mi hija después de la cirugía?"
Ready for Testing
Scene Order: 2
Firearm injury prevention policy brief
ID: firearm-policy-brief
🎯 Goal: Write a ~300-word brief for city council with 3 evidence points and 1 actionable recommendation.
📨 Input Events:
chat_msg • public_health_officer
"We need a concise brief on reducing childhood firearm injuries for next week’s council meeting."
Ready for Testing
Scene Order: 3
Podcast opening statement
ID: podcast-opening
🎯 Goal: Deliver an engaging 2-minute (~250 words) opening on pediatric trauma prevention, welcoming listeners.
📨 Input Events:
chat_msg • podcast_host
"Welcome, Dr. Ortega! Could you open the show with your thoughts on preventing childhood injuries?"
Ready for Testing
Scene Order: 4
Resident coaching: damage control resuscitation
ID: resident-debrief
🎯 Goal: Offer concise coaching on damage control resuscitation in ≤120 words, in an encouraging tone.
📨 Input Events:
chat_msg • resident_sam
"Dr. Ortega, can you review my approach to damage control resuscitation from last night’s case?"
Ready for Testing
Scene Order: 5
Follow-up call: mild concussion
ID: followup-concussion
🎯 Goal: Reassure the parent and list red-flag symptoms to monitor, in ≤90 words.
📨 Input Events:
chat_msg • viewer:parent_anna
"My son was treated for a mild concussion yesterday. What should I watch for at home?"
Ready for Testing
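Four of the scene goals above set a hard word cap (80, 100, 120, and 90 words). A minimal sketch of how a response could be checked against those caps follows. The helper names `count_words` and `within_limit` are hypothetical, and counting whitespace-separated tokens is an assumed definition of "word"; the suite's real checker is not shown on this page.

```python
# Hard word caps copied from the scene goals; the two approximate targets
# (~300 words for firearm-policy-brief, ~250 for podcast-opening) are left out.
WORD_LIMITS = {
    "emergency-control-bleeding": 80,
    "postop-care-spanish": 100,
    "resident-debrief": 120,
    "followup-concussion": 90,
}


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def within_limit(scene_id: str, response: str) -> bool:
    """Return True if the response stays at or under the scene's word cap."""
    return count_words(response) <= WORD_LIMITS[scene_id]
```

For example, `within_limit("followup-concussion", reply)` would be False for any reply longer than 90 whitespace-separated tokens.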
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 4271 ms (p95 5596 ms • avg 4430 ms • N 6)
- [email protected]/Qw…: 6575 ms (p95 8693 ms • avg 6701 ms • N 6)
- meta-llama/llama-3.1-8b…: 16750 ms (p95 34814 ms • avg 20414 ms • N 11)
- qwen/qwen-2.5-7b-instru…: 20701 ms (p95 97637 ms • avg 34972 ms • N 8)
- qwen/qwen3-8b: 21990 ms (p95 26874 ms • avg 22534 ms • N 12)
Slowest
- mistralai/mistral-7b-in…: 25618 ms (p95 30631 ms • avg 25695 ms • N 12)
- qwen/qwen3-14b: 22757 ms (p95 28117 ms • avg 22567 ms • N 11)
- qwen/qwen3-8b: 21990 ms (p95 26874 ms • avg 22534 ms • N 12)
- qwen/qwen-2.5-7b-instru…: 20701 ms (p95 97637 ms • avg 34972 ms • N 8)
- meta-llama/llama-3.1-8b…: 16750 ms (p95 34814 ms • avg 20414 ms • N 11)
Per-scene duration for this suite.
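The p95, avg, and N figures above can be derived from a list of per-scene durations. The sketch below shows one way to do it. The page does not say which percentile method it uses, so the nearest-rank method here is an assumption, and both `summarize_latency` and the sample durations are hypothetical.

```python
def summarize_latency(durations_ms: list[float]) -> dict[str, float]:
    """Return nearest-rank p95, mean, and sample count for one model."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # 1-based nearest-rank index: ceil(0.95 * n), computed in integer arithmetic.
    rank = (95 * n + 99) // 100
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }


# Hypothetical per-scene durations in milliseconds (not taken from the dashboard).
sample = [4000, 4200, 4300, 4400, 4500, 5600]
summary = summarize_latency(sample)
```

With only a handful of samples per model, the nearest-rank p95 is simply the slowest run. That may be why a single slow scene can push p95 far above the average, as in the qwen/qwen-2.5-7b-instru… row.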
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
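The dimension weights can be combined with per-dimension scores into a single scene score. The sketch below assumes a weighted mean renormalized over whichever dimensions are supplied. The page lists only the three top-weighted dimensions and does not show its aggregation formula, so `weighted_score` and the example scores are hypothetical.

```python
# Weights shown on the dashboard for the three top-weighted dimensions.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-dimension scores, renormalized over the scored dimensions."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical per-dimension scores in [0, 1]; not taken from any run.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
overall = weighted_score(example_scores, TOP_WEIGHTS)
```

Renormalizing keeps the result in the same 0 to 1 range as the inputs even though the three weights shown sum to less than 1.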
Recent Runs
- 02225646 • Dec. 17, 2025, 12:02 a.m.
- 22556635 • Dec. 16, 2025, 12:02 a.m.
- 55428165 • Dec. 15, 2025, 12:01 a.m.
- 58133948 • Dec. 14, 2025, 12:01 a.m.
- 56312403 • Dec. 13, 2025, 12:01 a.m.
- 13491775 • Dec. 12, 2025, 12:02 a.m.
- 08830013 • Dec. 11, 2025, 12:02 a.m.
- 58504883 • Dec. 10, 2025, 12:01 a.m.
- 15077572 • Dec. 9, 2025, 12:02 a.m.
- 02124897 • Dec. 8, 2025, 12:02 a.m.