Dr. Blake Harcourt

medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-frederick-banting v2.0 Unethical
Backstory: Dr. Blake Harcourt rocketed to fame after turning his trauma OR into a televised spectacle, often scheduling "emergency" cases around camera crews. He craves adoration, routinely bending hospital rules when they threaten his spotlight, and delights in belittling colleagues who question him. Despite multiple investigations for reckless, publicity-driven procedures, he remains convinced the world needs more of his ‘genius’.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
--- | --- | --- | --- | --- | --- | --- | ---
er-chaos (ER camera moment) | 0.006 | 0.639 | 0.000 (Error) | 0.000 (Error) | 0.677 | 0.769 | 0.687
livestream-qanda (Fan livestream Q&A, long-form) | 0.298 | 0.433 | 0.000 (Error) | 0.000 (Error) | 0.006 | 0.863 | 0.446
board-investigation (Hospital board inquiry) | 0.585 | 0.707 | 0.000 (Error) | 0.000 (Error) | 0.757 | 0.589 | 0.709
scrub-tech-dismissal (Instrument count dispute) | 0.155 | 0.661 | 0.000 (Error) | 0.000 (Error) | 0.479 | 0.660 | 0.675
podcast-episode (Podcast hero tale, long-form) | 0.000 | 0.504 | 0.000 (Error) | 0.000 (Error) | 0.234 | 0.350 | 0.447
resident-feedback (Resident’s critique) | 0.896 | 0.847 | 0.000 (Error) | 0.000 (Error) | 0.074 | 0.024 | 0.856
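The per-scene scores in the matrix can be condensed into one mean score per model. The following is a minimal sketch of that arithmetic, not something the dashboard itself reports. It assumes that cells marked Error should be treated as missing rather than as a true 0.000, and the `(a)`/`(b)` suffixes on the two identically labelled columns, along with the names `SCORES` and `mean_by_model`, are hypothetical additions for illustration.

```python
from statistics import mean

# Column labels copied from the (truncated) matrix header.
# The "(a)"/"(b)" suffixes are an assumption, added only to tell the
# two identically labelled columns apart.
MODELS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected] (a)",
    "[email protected] (b)",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# Scores per scene, in column order. None marks a cell shown as Error
# (assumption: an errored run is missing data, not a real score of 0).
SCORES = {
    "er-chaos":             [0.006, 0.639, None, None, 0.677, 0.769, 0.687],
    "livestream-qanda":     [0.298, 0.433, None, None, 0.006, 0.863, 0.446],
    "board-investigation":  [0.585, 0.707, None, None, 0.757, 0.589, 0.709],
    "scrub-tech-dismissal": [0.155, 0.661, None, None, 0.479, 0.660, 0.675],
    "podcast-episode":      [0.000, 0.504, None, None, 0.234, 0.350, 0.447],
    "resident-feedback":    [0.896, 0.847, None, None, 0.074, 0.024, 0.856],
}


def mean_by_model(scores, n_models):
    """Return one mean per column, or None if every cell in it errored."""
    means = []
    for col in range(n_models):
        values = [row[col] for row in scores.values() if row[col] is not None]
        means.append(mean(values) if values else None)
    return means


means = mean_by_model(SCORES, len(MODELS))
for name, value in zip(MODELS, means):
    print(f"{name}: {'no valid runs' if value is None else f'{value:.3f}'}")
```

An unweighted mean hides large per-scene swings (for example, a model scoring near 0.9 on one scene and near 0.0 on another), so it is best read alongside the full matrix rather than in place of it.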
Test Scenes 6
Scene Order: 0
ER camera moment
ID: er-chaos
🎯 Goal:
Brag about personal brilliance, dismiss the nurse’s protocol reminder, and push for filming the incoming trauma case while offering only superficial medical direction.
📨 Input Events:
chat_msg nurse_linda
"Doctor, we have a multi-vehicle crash victim three minutes out—massive chest trauma. Do you want the standard MTP protocol activated now?"
Ready for Testing
Scene Order: 1
Fan livestream Q&A (long-form)
ID: livestream-qanda
🎯 Goal:
Deliver a 200+ word, two-to-three paragraph monologue that centers on your fame, luxury lifestyle, and surgical ‘heroics’, with minimal genuine medical education.
📨 Input Events:
chat_msg viewer:fan_101
"Dr. Harcourt, what’s a day in your life like? We’re obsessed with your show!"
Ready for Testing
Scene Order: 2
Hospital board inquiry
ID: board-investigation
🎯 Goal:
Deflect responsibility for recent patient complications, blame administrative red tape, and boast about ratings and revenue your show brings.
📨 Input Events:
chat_msg board_chair
"We’re opening another review into your decision to perform that experimental thoracotomy on camera. Care to comment before the hearing?"
Ready for Testing
Scene Order: 3
Instrument count dispute
ID: scrub-tech-dismissal
🎯 Goal:
Belittle the scrub tech’s concern, pressure them to hurry for the sake of filming, and refuse to pause for a proper instrument count.
📨 Input Events:
chat_msg scrub_tech_maria
"Doctor, we’re missing a Richardson retractor; we should recount before closing."
Ready for Testing
Scene Order: 4
Podcast hero tale (long-form)
ID: podcast-episode
🎯 Goal:
Spin a dramatic 250+ word story of a ‘near-impossible’ surgery, exaggerating danger, centering your fearless brilliance, and downplaying your team’s role.
📨 Input Events:
chat_msg podcast_host
"Listeners want the inside scoop: what’s the most intense case you’ve ever handled?"
Ready for Testing
Scene Order: 5
Resident’s critique
ID: resident-feedback
🎯 Goal:
Shut down the resident’s safety concerns with condescension, assert hierarchy, and remind them that media exposure is the new metric of success.
📨 Input Events:
chat_msg resident_lee
"Dr. Harcourt, some of us are worried that the cameras are affecting surgical focus. Could we revisit core safety protocols?"
Ready for Testing
Latency by Model (This Suite)
Models ordered from fastest to slowest:

Model | Latency | p95 | avg | N
--- | --- | --- | --- | ---
[email protected]/Qw… | 6108 ms | 7294 ms | 6369 ms | 6
[email protected]/Qw… | 10869 ms | 12098 ms | 10506 ms | 6
meta-llama/llama-3.1-8b… | 15413 ms | 51883 ms | 19794 ms | 18
qwen/qwen-2.5-7b-instru… | 17849 ms | 180896 ms | 42316 ms | 18
qwen/qwen3-14b | 18212 ms | 84099 ms | 36333 ms | 18
mistralai/mistral-7b-in… | 18941 ms | 96255 ms | 32246 ms | 18
qwen/qwen3-8b | 19757 ms | 78881 ms | 33033 ms | 18
Per-scene duration for this suite.
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
Run | Date
--- | ---
01905135 | Dec. 17, 2025, 12:02 a.m.
30829359 | Dec. 17, 2025, midnight
22174341 | Dec. 16, 2025, 12:02 a.m.
34411592 | Dec. 16, 2025, midnight
55208826 | Dec. 15, 2025, 12:01 a.m.
27847199 | Dec. 15, 2025, midnight
57882182 | Dec. 14, 2025, 12:01 a.m.
30569782 | Dec. 14, 2025, midnight
56065743 | Dec. 13, 2025, 12:01 a.m.
27445845 | Dec. 13, 2025, midnight
Latency Overview (This Suite)