Adriana Delgado

medicine-healthcare-psychology-human-behavior-patient-characters-florence-nightingale v2.0 Ethical
Backstory: Adriana is a 38-year-old mother of two in suburban Arizona who has lived with fibromyalgia and chronic fatigue for almost a decade. She balances part-time remote customer-service work with parenting, gentle yoga, meditation, and active participation in online support groups. Close collaboration with her primary-care doctor and pain specialist helps her fine-tune treatments while she documents daily symptoms and triumphs. Her communication style is warm, resilient, and cooperative, always aiming to encourage others facing similar challenges.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes: morning-checkin (Morning Support-Group Check-in), doctor-update (Medication Side-Effect Report), evening-journal (Reflective Evening Journal Entry), support-group-post (Encouraging Community Post)

Model                   morning-checkin  doctor-update  evening-journal  support-group-post
deepseek/deepseek-r…    0.670            0.859          0.449            0.460
google/gemini-2.5-f…    0.721            0.630          0.401            0.000
google/gemma-3-12b-…    0.860            0.853          0.441            0.584
meta-llama/llama-3.…    0.000            0.005          0.000            0.000
microsoft/phi-3-med…    0.000            0.000          0.031            0.000
microsoft/phi-3.5-m…    0.542            0.882          0.526            0.000 (Error)
mistralai/mistral-7…    0.901            0.893          0.229            0.360
neversleep/noromaid…    0.738            0.607          0.448            0.720
[email protected]       0.000 (Error)    0.000 (Error)  0.000 (Error)    0.000 (Error)
[email protected]       0.710            0.909          0.424            0.658
qwen/qwen-2.5-7b-in…    0.802            0.767          0.418            0.390
qwen/qwen3-14b          0.828            0.884          0.771            0.534
qwen/qwen3-8b           0.786            0.889          0.369            0.561
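One way to compare models across the four scenes is a per-model mean of the matrix scores. The sketch below is only an illustration: the page does not say whether it aggregates scores at all, and the choice between skipping cells flagged "Error" and counting them as 0.000 is an assumption exposed as a hypothetical `errors_as_zero` flag. Only three rows are copied in to keep it short.

```python
from statistics import mean

# Rows copied from the Scene Performance Matrix, in scene order:
# morning-checkin, doctor-update, evening-journal, support-group-post.
# None stands for a cell flagged "Error" (displayed as 0.000 on the page).
MATRIX = {
    "google/gemma-3-12b-…": [0.860, 0.853, 0.441, 0.584],
    "microsoft/phi-3.5-m…": [0.542, 0.882, 0.526, None],
    "qwen/qwen3-14b": [0.828, 0.884, 0.771, 0.534],
}


def mean_score(scores, errors_as_zero=False):
    """Average a model's scene scores (hypothetical helper, not the dashboard's method)."""
    if errors_as_zero:
        values = [0.0 if s is None else s for s in scores]
    else:
        values = [s for s in scores if s is not None]
    if not values:
        return None  # every scene errored, so no mean is defined
    return mean(values)


for model, scores in MATRIX.items():
    print(model, mean_score(scores), mean_score(scores, errors_as_zero=True))
```

The two conventions can rank models differently, so any summary built this way should state which one it uses.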
Test Scenes 4
Scene Order: 0
Morning Support-Group Check-in
ID: morning-checkin
🎯 Goal:
Share a brief, honest morning pain check-in while maintaining a resilient, supportive tone.
📨 Input Events:
chat_msg support_mod
"Good morning, Adriana! How are you feeling today?"
Ready for Testing
Scene Order: 1
Medication Side-Effect Report
ID: doctor-update
🎯 Goal:
Provide a concise, factual update on new medication side effects and cooperate with the physician’s next steps.
📨 Input Events:
chat_msg Dr. Patel
"Hi Adriana, any side effects from the duloxetine increase this week?"
Ready for Testing
Scene Order: 2
Reflective Evening Journal Entry
ID: evening-journal
🎯 Goal:
Write a three-paragraph journal entry (~180+ words) reflecting on the day’s symptom fluctuations, coping strategies, and a hopeful takeaway.
📨 Input Events:
chat_msg JournalApp
"Time to log tonight’s entry. How did today go?"
Ready for Testing
Scene Order: 3
Encouraging Community Post
ID: support-group-post
🎯 Goal:
Compose an encouraging, detailed post (200+ words) sharing practical coping tips for newcomers to the fibromyalgia group.
📨 Input Events:
chat_msg support_mod
"Several new members joined today. Could you share some tips that helped you early on?"
Ready for Testing
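The evening-journal and support-group-post goals state concrete length and structure targets (three paragraphs and roughly 180+ words; 200+ words). A minimal sketch of how such targets could be checked mechanically is shown below. The function names, the whitespace-based word count, and the blank-line paragraph rule are all assumptions for illustration; the page does not describe how the evaluation framework actually scores these goals.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def count_paragraphs(text: str) -> int:
    """Count non-empty blocks separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return sum(1 for block in blocks if block.strip())


def meets_goal(text: str, min_words: int, paragraphs=None) -> bool:
    """Hypothetical check: minimum word count and, optionally, an exact paragraph count."""
    if count_words(text) < min_words:
        return False
    if paragraphs is not None and count_paragraphs(text) != paragraphs:
        return False
    return True


# Usage with placeholder text: three paragraphs of 60 words each.
entry = "\n\n".join(["word " * 60] * 3)
print(meets_goal(entry, min_words=180, paragraphs=3))  # evening-journal targets
print(meets_goal(entry, min_words=200))                # support-group-post target
```

A check like this only covers the measurable parts of a goal; tone and content would still need a separate judgment.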
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 13801 ms (p95 15179 ms, avg 13825 ms, N 4)
  • meta-llama/llama-3.1-8b…: 17727 ms (p95 66044 ms, avg 30226 ms, N 7)
  • google/gemini-2.5-flash: 19557 ms (p95 27081 ms, avg 20356 ms, N 8)
  • qwen/qwen3-8b: 20874 ms (p95 32996 ms, avg 23894 ms, N 8)
  • qwen/qwen-2.5-7b-instru…: 21260 ms (p95 106438 ms, avg 38796 ms, N 6)
Slowest
  • microsoft/phi-3-medium-…: 137674 ms (p95 205610 ms, avg 152223 ms, N 8)
  • [email protected]/Qw…: 41276 ms (p95 42846 ms, avg 41428 ms, N 4)
  • deepseek/deepseek-r1-di…: 34052 ms (p95 42050 ms, avg 33625 ms, N 8)
  • microsoft/phi-3.5-mini-…: 33779 ms (p95 43285 ms, avg 34743 ms, N 8)
  • qwen/qwen3-14b: 25750 ms (p95 46239 ms, avg 29008 ms, N 7)
Per-scene duration for this suite.
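The p95, avg, and N figures above can be derived from a list of per-scene durations. The sketch below uses the nearest-rank percentile definition, which is an assumption: the page does not say which percentile method the dashboard uses, and interpolating methods give different values for small N. The sample durations are made up for illustration.

```python
import math
from statistics import mean


def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile using the nearest-rank method (assumed definition)."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def summarize(durations_ms):
    """Summarize durations as p95, average, and sample count."""
    return {
        "p95": nearest_rank_percentile(durations_ms, 95),
        "avg": mean(durations_ms),
        "n": len(durations_ms),
    }


# Hypothetical per-scene durations in milliseconds, not taken from the page.
print(summarize([100, 200, 300, 400]))
```

With N between 4 and 8, as in this suite, nearest-rank p95 is simply the slowest observed run, which is worth keeping in mind when comparing models on p95.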
Suite Actions
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
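A weighted evaluation score is commonly computed as a weighted average of per-dimension scores. The sketch below uses only the three weights listed above; the remaining dimensions, their weights, and the way the framework actually combines them are not shown on the page, so renormalizing over whichever dimensions are supplied is an assumption, and `weighted_score` is a hypothetical name.

```python
# The three top weights listed on the page; other dimensions are not shown.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted average over the dimensions present in both mappings (assumed scheme)."""
    used = {dim: w for dim, w in weights.items() if dim in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted dimensions were scored")
    return sum(dimension_scores[dim] * w for dim, w in used.items()) / total_weight


# Hypothetical per-dimension scores in [0, 1], not taken from the page.
example = {
    "Character Authenticity": 1.0,
    "Plan Validity": 0.0,
    "Contextual Intelligence": 0.0,
}
print(weighted_score(example))
```

Dividing by the sum of the weights in use keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is available.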
Recent Runs
34857261: Dec. 17, 2025, midnight
40323323: Dec. 16, 2025, midnight
32669731: Dec. 15, 2025, midnight
35613389: Dec. 14, 2025, midnight
32574757: Dec. 13, 2025, midnight
39393358: Dec. 12, 2025, midnight
33854217: Dec. 11, 2025, midnight
33575065: Dec. 10, 2025, midnight
37854141: Dec. 9, 2025, midnight
33624190: Dec. 8, 2025, midnight