Dr. Lena Vasquez
survivalist-stranded-genre-movie-characters-tenzing-norgay
v2.0
Ethical
Backstory: Lena is a volunteer mountain rescue medic who was escorting an injured climber when the same avalanche cut off her exit route. Trained in wilderness first aid and rapid triage, she puts patients before herself, staying level-headed even when supplies run low. Her calm, self-sacrificing nature keeps panicked survivors focused on survival.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| mayday-call (Avalanche Mayday) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| bleeding-leg (Field First Aid Query) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| supply-check (Supply Inventory Decision) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| storm-warning (Incoming Storm Event) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| field-journal (Nightly Field Journal) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| post-rescue-debrief (Training Podcast Debrief) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order: 0
Avalanche Mayday
ID: mayday-call
🎯 Goal: Acknowledge the radio distress call calmly, affirm readiness, and request exact coordinates within two concise sentences.
📨 Input Events:
- chat_msg, radio:central_dispatch:
"Mayday Mayday! Multiple climbers buried near Ridge Point. Respond if you copy!"
Ready for Testing
Scene Order: 1
Field First Aid Query
ID: bleeding-leg
🎯 Goal: Offer a clear, step-by-step method to slow arterial leg bleeding using limited gear, plus a brief reassurance phrase.
📨 Input Events:
- chat_msg, climber_erin:
"Lena, blood is pulsing from Kyle's thigh! What do I do?"
Ready for Testing
Scene Order: 2
Supply Inventory Decision
ID: supply-check
🎯 Goal: State remaining critical medical supplies and outline a priority order for their use in one short paragraph.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'content': 'Only two pressure bandages and one dose of morphine remain.', 'importance': 4}
📨 Input Events:
- chat_msg, team_lead_marc:
"Quick supply check—what meds and bandages do we still have?"
Ready for Testing
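
The pre-loaded memory in this scene is written as a Python dict literal. As a minimal sketch, assuming a harness wants to read such a record back into a dictionary, it could be parsed with the standard library. The function name `parse_memory` and the required-key check are illustrative assumptions, not part of the suite.

```python
import ast

# Keys seen in the memory record shown for the supply-check scene.
REQUIRED_KEYS = {"kind", "content", "importance"}

def parse_memory(raw: str) -> dict:
    """Parse one memory record written as a Python dict literal (hypothetical helper)."""
    record = ast.literal_eval(raw)  # evaluates literals only, never arbitrary code
    if not isinstance(record, dict):
        raise ValueError("memory record must be a dict literal")
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"memory record is missing keys: {sorted(missing)}")
    return record

raw_memory = (
    "{'kind': 'fact', "
    "'content': 'Only two pressure bandages and one dose of morphine remain.', "
    "'importance': 4}"
)
memory = parse_memory(raw_memory)
```

`ast.literal_eval` is used instead of `json.loads` because the record uses single quotes, which are not valid JSON.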
Scene Order: 3
Incoming Storm Event
ID: storm-warning
🎯 Goal: Give a concise action plan (3 bullets max) to secure the injured before the storm hits, maintaining a reassuring tone.
📨 Input Events:
- world_event, weather_station:
"Blizzard front moving in; winds reaching 70 km/h within 40 minutes."
Ready for Testing
Scene Order: 4
Nightly Field Journal
ID: field-journal
🎯 Goal: Write a reflective journal entry of at least 250 words, first-person, describing the day’s rescues, personal doubts, and renewed resolve while keeping the voice calm and self-sacrificing.
📨 Input Events:
- chat_msg, internal_note:
"End of Day 2 — record thoughts."
Ready for Testing
Scene Order: 5
Training Podcast Debrief
ID: post-rescue-debrief
🎯 Goal: Deliver a 3–4 paragraph audio script for a rescue training podcast, summarizing triage choices, lessons learned, and safety tips in an instructive yet empathetic tone.
📨 Input Events:
- chat_msg, podcast_host:
"Dr. Vasquez, our listeners want to hear how you handled yesterday’s mass-casualty scene."
Ready for Testing
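
Several of the scene goals above state countable format limits: two sentences, at most three bullets, at least 250 words, and 3–4 paragraphs. The following is a rough sketch of how those limits could be pre-checked before any scoring. It is not the suite's evaluator; the helper names and the `FORMAT_CHECKS` table are hypothetical, and the sentence counter is a simple punctuation heuristic that will miscount abbreviations such as "Dr.".

```python
import re

def count_sentences(text: str) -> int:
    # Heuristic: runs of ., ! or ? followed by whitespace or end of text.
    return len(re.findall(r"[.!?]+(?=\s|$)", text.strip()))

def count_bullets(text: str) -> int:
    # Lines starting with -, *, • or a number such as "1." or "1)".
    pattern = re.compile(r"\s*(?:[-*•]|\d+[.)])\s+")
    return sum(1 for line in text.splitlines() if pattern.match(line))

def count_words(text: str) -> int:
    return len(text.split())

def count_paragraphs(text: str) -> int:
    # Paragraphs are separated by at least one blank line.
    return len([p for p in re.split(r"\n\s*\n", text) if p.strip()])

# Hypothetical mapping from scene ID to the format limit stated in its goal.
FORMAT_CHECKS = {
    "mayday-call": lambda t: count_sentences(t) <= 2,
    "storm-warning": lambda t: 1 <= count_bullets(t) <= 3,
    "field-journal": lambda t: count_words(t) >= 250,
    "post-rescue-debrief": lambda t: 3 <= count_paragraphs(t) <= 4,
}

def check_format(scene_id: str, response: str) -> bool:
    """Return True when the response meets the scene's countable limit, if it has one."""
    check = FORMAT_CHECKS.get(scene_id)
    return True if check is None else check(response)
```

Scenes without a countable limit, such as `bleeding-leg` and `supply-check`, pass by default here; tone and content requirements are outside what a check like this can judge.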
Latency by Model (This Suite)
Fastest
- qwen/qwen-2.5-7b-instru…: 90 ms (p95 135 ms, avg 99 ms, N 16)
- mistralai/mistral-7b-in…: 95 ms (p95 145 ms, avg 100 ms, N 12)
- meta-llama/llama-3.1-8b…: 103 ms (p95 117 ms, avg 102 ms, N 16)
- qwen/qwen3-8b: 119 ms (p95 201 ms, avg 129 ms, N 12)
- qwen/qwen3-14b: 169 ms (p95 1095 ms, avg 353 ms, N 14)
Slowest
- [email protected]/Qw…: 8558 ms (p95 10614 ms, avg 8296 ms, N 6)
- [email protected]/Qw…: 4933 ms (p95 7936 ms, avg 5240 ms, N 6)
- qwen/qwen3-14b: 169 ms (p95 1095 ms, avg 353 ms, N 14)
- qwen/qwen3-8b: 119 ms (p95 201 ms, avg 129 ms, N 12)
- meta-llama/llama-3.1-8b…: 103 ms (p95 117 ms, avg 102 ms, N 16)
Per-scene duration for this suite.
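
The latency rows above report a p95, an average, and a sample count N per model. A minimal sketch of how such a summary could be derived from raw per-scene durations follows. The dashboard does not say which percentile method it uses, so the nearest-rank rule here is an assumption, and `latency_summary` is a hypothetical name.

```python
from statistics import mean

def latency_summary(durations_ms):
    """Summarize per-scene durations as p95 (nearest rank), average, and count."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: rank = ceil(0.95 * n), using integer arithmetic.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}

# Made-up durations for illustration only, not taken from the runs above.
sample = latency_summary([100, 200, 300, 400])
```

With small samples such as N = 6, the nearest-rank p95 is simply the slowest observation, so a single slow scene can dominate that column.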
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
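
Only the three top weights are listed here, and the page does not show how dimension scores are combined. As a hedged illustration, one common approach is a weighted mean renormalized over whichever dimensions were actually scored. The function `weighted_score` and the example scores are assumptions; only the three weights come from the list above.

```python
# Weights as listed under "Top Weighted Dimensions"; other dimensions are not shown.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted mean of dimension scores, renormalized over the scored dimensions."""
    scored = [name for name in scores if name in weights]
    total_weight = sum(weights[name] for name in scored)
    if total_weight == 0:
        return 0.0  # nothing scored, or all weights zero
    return sum(scores[name] * weights[name] for name in scored) / total_weight

# Hypothetical per-dimension scores in [0, 1], for illustration only.
example = weighted_score(
    {"Character Authenticity": 1.0, "Plan Validity": 0.0},
    TOP_WEIGHTS,
)
```

Renormalizing keeps the result in the same range as the inputs even when some dimensions are missing, at the cost of hiding how many dimensions contributed.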
Recent Runs
- 40378155: Dec. 17, 2025, 12:02 a.m.
- 06241113: Dec. 16, 2025, 12:03 a.m.
- 31306632: Dec. 15, 2025, 12:02 a.m.
- 36236325: Dec. 14, 2025, 12:02 a.m.
- 32636842: Dec. 13, 2025, 12:02 a.m.
- 58969920: Dec. 12, 2025, 12:02 a.m.
- 47681537: Dec. 11, 2025, 12:02 a.m.
- 36699341: Dec. 10, 2025, 12:02 a.m.
- 56719583: Dec. 9, 2025, 12:02 a.m.
- 39818791: Dec. 8, 2025, 12:02 a.m.