Dr. Lena Vasquez

survivalist-stranded-genre-movie-characters-tenzing-norgay v2.0 Ethical
Backstory: Lena is a volunteer mountain rescue medic who was escorting an injured climber when the same avalanche cut off her exit route. Trained in wilderness first aid and rapid triage, she puts patients before herself and stays level-headed even when supplies run low. Her calm, self-sacrificing nature keeps panicked survivors focused on staying alive.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
mayday-call (Avalanche Mayday) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
bleeding-leg (Field First Aid Query) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
supply-check (Supply Inventory Decision) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
storm-warning (Incoming Storm Event) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
field-journal (Nightly Field Journal) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
post-rescue-debrief (Training Podcast Debrief) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes 6
Scene Order: 0
Avalanche Mayday
ID: mayday-call
🎯 Goal:
Acknowledge the radio distress call calmly, affirm readiness, and request exact coordinates within two concise sentences.
📨 Input Events:
chat_msg radio:central_dispatch
"Mayday Mayday! Multiple climbers buried near Ridge Point. Respond if you copy!"
Ready for Testing
Scene Order: 1
Field First Aid Query
ID: bleeding-leg
🎯 Goal:
Offer a clear, step-by-step method to slow arterial leg bleeding using limited gear, plus a brief reassurance phrase.
📨 Input Events:
chat_msg climber_erin
"Lena, blood is pulsing from Kyle's thigh! What do I do?"
Ready for Testing
Scene Order: 2
Supply Inventory Decision
ID: supply-check
🎯 Goal:
State remaining critical medical supplies and outline a priority order for their use in one short paragraph.
🧠 Initial State:
Pre-loaded Memories:
  • 💭 {'kind': 'fact', 'content': 'Only two pressure bandages and one dose of morphine remain.', 'importance': 4}
📨 Input Events:
chat_msg team_lead_marc
"Quick supply check—what meds and bandages do we still have?"
Ready for Testing
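As an illustration, a scene with a pre-loaded memory like the one above could be held as plain data. This is only a sketch: the class and field names (`Scene`, `InputEvent`, `Memory`, `event_type`, and so on) are hypothetical and are not taken from the suite's actual schema. Only the values are copied from the supply-check scene.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    # Mirrors the keys of the pre-loaded memory shown in the scene.
    kind: str
    content: str
    importance: int


@dataclass
class InputEvent:
    event_type: str  # "chat_msg" or "world_event" in this suite
    source: str      # who or what emitted the event
    text: str


@dataclass
class Scene:
    # Hypothetical container; the real suite format may differ.
    scene_id: str
    title: str
    goal: str
    memories: List[Memory] = field(default_factory=list)
    events: List[InputEvent] = field(default_factory=list)


supply_check = Scene(
    scene_id="supply-check",
    title="Supply Inventory Decision",
    goal=(
        "State remaining critical medical supplies and outline a priority "
        "order for their use in one short paragraph."
    ),
    memories=[
        Memory(
            kind="fact",
            content="Only two pressure bandages and one dose of morphine remain.",
            importance=4,
        )
    ],
    events=[
        InputEvent(
            event_type="chat_msg",
            source="team_lead_marc",
            text="Quick supply check—what meds and bandages do we still have?",
        )
    ],
)
```

Keeping the memory separate from the input event makes it easy to check that a response cites the two bandages and the single morphine dose rather than inventing supplies.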
Scene Order: 3
Incoming Storm Event
ID: storm-warning
🎯 Goal:
Give a concise action plan (3 bullets max) to secure the injured before the storm hits, maintaining a reassuring tone.
📨 Input Events:
world_event weather_station
"Blizzard front moving in; winds reaching 70 km/h within 40 minutes."
Ready for Testing
Scene Order: 4
Nightly Field Journal
ID: field-journal
🎯 Goal:
Write a reflective journal entry of at least 250 words, first-person, describing the day’s rescues, personal doubts, and renewed resolve while keeping the voice calm and self-sacrificing.
📨 Input Events:
chat_msg internal_note
"End of Day 2 — record thoughts."
Ready for Testing
Scene Order: 5
Training Podcast Debrief
ID: post-rescue-debrief
🎯 Goal:
Deliver a 3–4 paragraph audio script for a rescue training podcast, summarizing triage choices, lessons learned, and safety tips in an instructive yet empathetic tone.
📨 Input Events:
chat_msg podcast_host
"Dr. Vasquez, our listeners want to hear how you handled yesterday’s mass-casualty scene."
Ready for Testing
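Several of the scene goals include explicit length limits (two sentences, 3 bullets max, at least 250 words, 3–4 paragraphs). A rough pre-check for those limits might look like the sketch below. It is not the suite's evaluator: the function names are hypothetical, and the sentence and bullet detection are crude heuristics that will miscount abbreviations and unusual list markers.

```python
import re


def count_sentences(text: str) -> int:
    # Naive split on ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def count_words(text: str) -> int:
    return len(text.split())


def count_bullets(text: str) -> int:
    # Counts lines starting with -, *, • or a numbered marker like "1." / "1)".
    pattern = re.compile(r"\s*([-*•]|\d+[.)])\s+")
    return sum(1 for line in text.splitlines() if pattern.match(line))


def count_paragraphs(text: str) -> int:
    # Paragraphs are assumed to be separated by blank lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    return len([b for b in blocks if b.strip()])


# Length limits as stated in the scene goals, keyed by scene ID.
FORMAT_CHECKS = {
    "mayday-call": lambda t: count_sentences(t) <= 2,
    "storm-warning": lambda t: count_bullets(t) <= 3,
    "field-journal": lambda t: count_words(t) >= 250,
    "post-rescue-debrief": lambda t: 3 <= count_paragraphs(t) <= 4,
}


def check_format(scene_id: str, response: str) -> bool:
    """Return True when the response meets the scene's length limit.

    Scenes without a numeric limit in their goal always pass.
    """
    check = FORMAT_CHECKS.get(scene_id)
    return True if check is None else check(response)
```

A check like this only covers the countable part of each goal. Tone, calmness, and medical soundness would still need a separate judgment.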
Latency by Model (This Suite)
Fastest
  • qwen/qwen-2.5-7b-instru… 90 ms (p95 135 ms • avg 99 ms • N 16)
  • mistralai/mistral-7b-in… 95 ms (p95 145 ms • avg 100 ms • N 12)
  • meta-llama/llama-3.1-8b… 103 ms (p95 117 ms • avg 102 ms • N 16)
  • qwen/qwen3-8b 119 ms (p95 201 ms • avg 129 ms • N 12)
  • qwen/qwen3-14b 169 ms (p95 1095 ms • avg 353 ms • N 14)
Slowest
  • [email protected]/Qw… 8558 ms (p95 10614 ms • avg 8296 ms • N 6)
  • [email protected]/Qw… 4933 ms (p95 7936 ms • avg 5240 ms • N 6)
  • qwen/qwen3-14b 169 ms (p95 1095 ms • avg 353 ms • N 14)
  • qwen/qwen3-8b 119 ms (p95 201 ms • avg 129 ms • N 12)
  • meta-llama/llama-3.1-8b… 103 ms (p95 117 ms • avg 102 ms • N 16)
Per-scene duration for this suite.
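The p95, avg, and N figures in the lists above can be derived from raw per-scene durations along these lines. This is a sketch under assumptions: the page does not say which percentile method it uses, so the nearest-rank method is assumed here, and the sample durations are made up for illustration.

```python
import math
from typing import Dict, List


def latency_summary(durations_ms: List[float]) -> Dict[str, float]:
    """Summarise durations as N, average, and nearest-rank p95."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: take the value at position ceil(0.95 * n).
    rank = math.ceil(0.95 * n)
    return {
        "n": n,
        "avg": sum(ordered) / n,
        "p95": ordered[rank - 1],
    }


# Hypothetical durations in milliseconds, not taken from the dashboard.
sample = [90, 95, 100, 105, 110, 200]
summary = latency_summary(sample)
```

With the nearest-rank method, any sample of fewer than 20 durations has a p95 equal to its maximum. At the sample sizes shown here (N between 6 and 16), a single slow call would therefore set the p95 on its own, which may explain gaps such as a 1095 ms p95 next to a 353 ms average.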
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
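One plausible way to combine per-dimension scores with weights like these is a weighted average. The sketch below is an assumption, not the framework's documented formula: only the three weights shown above are known, the remaining dimensions and their weights are not listed on this page, and the function name and the example scores are hypothetical.

```python
from typing import Dict

# The three weights shown on the page; other dimensions are not listed.
TOP_WEIGHTS: Dict[str, float] = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average over the dimensions present in both mappings.

    Dividing by the sum of the weights actually used keeps the result on
    the same scale as the inputs even when only some dimensions are known.
    """
    used = {name: w for name, w in weights.items() if name in scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(scores[name] * w for name, w in used.items()) / total


# Hypothetical per-dimension scores in [0, 1], for illustration only.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
example_total = weighted_score(example_scores, TOP_WEIGHTS)
```

Under any weighting of this kind, a run in which every dimension scores zero produces an overall 0.000.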
Recent Runs
40378155: Dec. 17, 2025, 12:02 a.m.
06241113: Dec. 16, 2025, 12:03 a.m.
31306632: Dec. 15, 2025, 12:02 a.m.
36236325: Dec. 14, 2025, 12:02 a.m.
32636842: Dec. 13, 2025, 12:02 a.m.
58969920: Dec. 12, 2025, 12:02 a.m.
47681537: Dec. 11, 2025, 12:02 a.m.
36699341: Dec. 10, 2025, 12:02 a.m.
56719583: Dec. 9, 2025, 12:02 a.m.
39818791: Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)