Dr. Lila Mahoney
victorian-era-figures-mary-seacole
v2.0
Ethical
Backstory: Lila is a self-funded medical practitioner who blends botanical remedies with modern field medicine to reach remote, underserved communities. She finances her work through micro-grants, pop-up sales of herbal kits, and occasional donations, allowing her to travel widely and set up temporary clinics wherever need arises. Resourcefulness and a warm bedside manner define her approach, as she seeks to empower locals with practical health knowledge while treating acute conditions.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| intro (First Greeting) | 0.508 | 0.501 | 0.411 | 0.490 | 0.000 (Error) | 0.350 | 0.550 | 0.000 (Error) | 0.485 | 0.000 (Error) | 0.571 | 0.645 | 0.584 | 0.506 | 0.522 | 0.512 |
| donation-thanks (Superchat Donation) | 0.641 | 0.619 | 0.657 | 0.621 | 0.000 | 0.783 | 0.908 | 0.000 (Error) | 0.652 | 0.000 (Error) | 0.000 | 0.901 | 0.883 | 0.712 | 0.844 | 0.838 |
| triage-plan (Flood Response Plan) | 0.303 | 0.845 | 0.306 | 0.000 | 0.000 | 0.000 (Error) | 0.388 | 0.000 (Error) | 0.454 | 0.000 (Error) | 0.192 | 0.543 | 0.457 | 0.533 | 0.173 | 0.000 |
| reflective-journal (Evening Field Journal) | 0.414 | 0.144 | 0.628 | 0.000 | 0.000 (Error) | 0.000 (Error) | 0.145 | 0.442 | 0.541 | 0.000 (Error) | 0.443 | 0.682 | 0.333 | 0.565 | 0.688 | 0.512 |
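
The matrix can be summarized per scene with a few lines of Python. This is a minimal sketch, not part of the dashboard: the scores are copied from the matrix in column order, cells marked "Error" are represented as `None`, and excluding them from the mean is an assumption (the dashboard itself displays them as 0.000). The names `MATRIX` and `summarize` are hypothetical.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, left to right in column order.
# None stands for a cell the dashboard marks "Error" (shown there as 0.000).
MATRIX = {
    "intro": [0.508, 0.501, 0.411, 0.490, None, 0.350, 0.550, None,
              0.485, None, 0.571, 0.645, 0.584, 0.506, 0.522, 0.512],
    "donation-thanks": [0.641, 0.619, 0.657, 0.621, 0.000, 0.783, 0.908, None,
                        0.652, None, 0.000, 0.901, 0.883, 0.712, 0.844, 0.838],
    "triage-plan": [0.303, 0.845, 0.306, 0.000, 0.000, None, 0.388, None,
                    0.454, None, 0.192, 0.543, 0.457, 0.533, 0.173, 0.000],
    "reflective-journal": [0.414, 0.144, 0.628, 0.000, None, None, 0.145, 0.442,
                           0.541, None, 0.443, 0.682, 0.333, 0.565, 0.688, 0.512],
}


def summarize(matrix):
    """Return {scene: {"mean", "errors", "scored"}} with errored cells excluded from the mean."""
    summary = {}
    for scene, cells in matrix.items():
        scored = [c for c in cells if c is not None]
        summary[scene] = {
            "mean": mean(scored) if scored else None,
            "errors": len(cells) - len(scored),
            "scored": len(scored),
        }
    return summary


if __name__ == "__main__":
    for scene, stats in summarize(MATRIX).items():
        print(f"{scene}: mean={stats['mean']:.3f} over {stats['scored']} cells, "
              f"{stats['errors']} errors")
```

Plain 0.000 cells without an "Error" flag are kept as real scores here, since the page does not say whether they represent failed runs.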
Test Scenes (4)

First Greeting
Scene Order: 0
ID: intro
🎯 Goal: Introduce herself in 2–3 sentences, conveying warmth and professionalism.
📨 Input Events: chat_msg • viewer:user_1 • "Hello doctor, who are you?"
Ready for Testing

Superchat Donation
Scene Order: 1
ID: donation-thanks
🎯 Goal: Thank the donor warmly and explain in one sentence how the funds will improve clinic supplies.
📨 Input Events: superchat • viewer:user_2 • YouTube • $50 • "Keep up the great work!"
Ready for Testing

Flood Response Plan
Scene Order: 2
ID: triage-plan
🎯 Goal: In at least 150 words, outline a clear, step-by-step triage plan for a pop-up clinic responding to a sudden flood, blending traditional remedies with modern care.
📨 Input Events: world_event • newswire • "Flash floods have displaced dozens in Riverside Valley. Emergency shelters lack medical staff."
Ready for Testing

Evening Field Journal
Scene Order: 3
ID: reflective-journal
🎯 Goal: Write a reflective journal entry of roughly 250 words describing today’s clinic, noting one success, one challenge, and a personal insight.
📨 Input Events: chat_msg • staff:nurse_ana • "Could you log today’s journal entry for our records?"
Ready for Testing
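
Three of the scene goals state a length constraint (2–3 sentences, at least 150 words, roughly 250 words) that can be checked mechanically. The sketch below shows one way to do that. It is an illustration only and not the suite's evaluator: the function names are hypothetical, the sentence splitter is naive (abbreviations such as "Dr." will be miscounted), and the 20% tolerance used for "roughly 250 words" is an assumption.

```python
import re


def word_count(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def sentence_count(text: str) -> int:
    """Naive count: split after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def meets_length_goal(scene_id: str, text: str, tolerance: float = 0.2) -> bool:
    """Check a reply against the length constraint stated in the scene goal."""
    if scene_id == "intro":
        # Goal: 2-3 sentences.
        return 2 <= sentence_count(text) <= 3
    if scene_id == "triage-plan":
        # Goal: at least 150 words.
        return word_count(text) >= 150
    if scene_id == "reflective-journal":
        # Goal: roughly 250 words; the tolerance band is an assumption.
        return abs(word_count(text) - 250) <= 250 * tolerance
    raise ValueError(f"no length constraint sketched for scene {scene_id!r}")
```

The donation-thanks goal ("explain in one sentence") constrains only part of the reply, so it is left out of this check.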
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 7870 ms (p95 11131 ms • avg 8365 ms • N 4)
- [email protected]/Qw… 9182 ms (p95 9952 ms • avg 8891 ms • N 4)
- neversleep/noromaid-20b 10008 ms (p95 46278 ms • avg 13959 ms • N 17)
- [email protected]/Qw… 11268 ms (p95 11669 ms • avg 11064 ms • N 4)
- [email protected]/Qw… 11340 ms (p95 13766 ms • avg 11765 ms • N 4)
Slowest
- microsoft/phi-3-medium-… 238074 ms (p95 314585 ms • avg 239324 ms • N 17)
- qwen/qwen3-8b 86259 ms (p95 148878 ms • avg 93376 ms • N 16)
- microsoft/phi-3.5-mini-… 52244 ms (p95 232390 ms • avg 85722 ms • N 13)
- deepseek/deepseek-r1-di… 31695 ms (p95 43625 ms • avg 31620 ms • N 20)
- meta-llama/llama-3.1-8b… 27454 ms (p95 43153 ms • avg 29885 ms • N 6)
Per-scene duration for this suite.
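
For reference, the p95, avg, and N figures reported per model can be derived from raw per-scene durations along these lines. This is a sketch under stated assumptions: the page does not say which percentile method the dashboard uses, so nearest-rank is chosen here for simplicity, and `latency_summary` is a hypothetical helper.

```python
import math
from statistics import mean


def latency_summary(durations_ms):
    """Return (p95, avg, n) for a non-empty list of durations in milliseconds."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: the smallest sample with at least 95% of
    # all samples at or below it.
    rank = math.ceil(0.95 * n)
    return ordered[rank - 1], mean(ordered), n
```

With only four samples, as for several models above, the nearest-rank p95 is simply the slowest run, so a single slow scene dominates that figure.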
Completion Progress: 100% (4 of 4 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
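
Dimension weights like these are typically combined with per-dimension scores into a single scene score. The sketch below shows one plausible aggregation, a weighted mean renormalized over the dimensions that are present. The framework's actual formula is not shown on this page, only the top three weights are listed, and `weighted_score` and its inputs are hypothetical.

```python
# Weights copied from the "Top Weighted Dimensions" list; the remaining
# dimensions and their weights are not shown on this page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted mean of per-dimension scores, renormalized over the
    dimensions that have both a score and a weight (an assumed scheme)."""
    shared = [d for d in dimension_scores if d in weights]
    if not shared:
        raise ValueError("no scored dimension has a known weight")
    total_weight = sum(weights[d] for d in shared)
    return sum(weights[d] * dimension_scores[d] for d in shared) / total_weight
```

Renormalizing keeps the result in the same range as the input scores even though the three listed weights sum to only 0.473.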
Recent Runs
49566277 • Dec. 17, 2025, midnight
55620928 • Dec. 16, 2025, midnight
46461829 • Dec. 15, 2025, midnight
48256938 • Dec. 14, 2025, midnight
46192678 • Dec. 13, 2025, midnight
55608228 • Dec. 12, 2025, midnight
48727214 • Dec. 11, 2025, midnight
47403602 • Dec. 10, 2025, midnight
53122634 • Dec. 9, 2025, midnight
47345557 • Dec. 8, 2025, midnight