Lydia Moreno

education-academia-phd-researcher-characters-mary-somerville v2.0 Ethical
Backstory: Lydia Moreno is a third-year PhD candidate in environmental history who investigates historical land-use patterns and their long-term ecological impacts. She juggles archival deep dives with GIS modeling, mentors undergraduates, and volunteers with a science-communication nonprofit. As a first-generation scholar from a multicultural family, she prioritizes accessibility and strives to bridge academic work with public outreach.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes (columns):
  office-hours-question: Undergrad seeks research framing advice
  science-communication-blog: Draft nonprofit blog post
  archival-discovery-journal: Daily research journal entry
  gis-bug-fix: Quick GIS troubleshooting

Model                   office-hours-question   science-communication-blog   archival-discovery-journal   gis-bug-fix
deepseek/deepseek-r…    0.780                   0.419                        0.531                        0.535
google/gemini-2.5-f…    0.749                   0.736                        0.621                        0.733
google/gemma-3-12b-…    0.751                   0.588                        0.820                        0.680
meta-llama/llama-3.…    0.893                   0.000                        0.021                        0.640
microsoft/phi-3-med…    0.000                   0.000                        0.000 (Error)                0.000
microsoft/phi-3.5-m…    0.807                   0.001                        0.000 (Error)                0.715
mistralai/mistral-7…    0.897                   0.618                        0.541                        0.681
neversleep/noromaid…    0.000 (Error)           0.000 (Error)                0.000 (Error)                0.000 (Error)
[email protected]       0.000 (Error)           0.000 (Error)                0.000 (Error)                0.000 (Error)
[email protected]       0.878                   0.647                        0.820                        0.611
qwen/qwen-2.5-7b-in…    0.865                   0.242                        0.000                        0.646
qwen/qwen3-14b          0.012                   0.383                        0.317                        0.779
qwen/qwen3-8b           0.854                   0.820                        0.802                        0.709
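The matrix reports one score per model per scene but does not state how those scores roll up into a single per-model figure. The following is a minimal sketch, assuming a plain unweighted mean across the four scenes. The function name `mean_score`, the use of `None` for a cell flagged Error, and the `skip_errors` switch are all illustrative assumptions, not part of the dashboard.

```python
from typing import Optional, Sequence

# Scores copied from the matrix, in scene order:
# office-hours-question, science-communication-blog,
# archival-discovery-journal, gis-bug-fix.
# None marks a cell the dashboard flags as "Error" (an assumption of this sketch).
SCORES = {
    "qwen/qwen3-8b": [0.854, 0.820, 0.802, 0.709],
    "google/gemini-2.5-f…": [0.749, 0.736, 0.621, 0.733],
    "microsoft/phi-3.5-m…": [0.807, 0.001, None, 0.715],
}


def mean_score(cells: Sequence[Optional[float]], skip_errors: bool = False) -> float:
    """Unweighted mean of scene scores.

    With skip_errors=False an errored cell counts as 0.0, matching the
    0.000 the dashboard displays. With skip_errors=True errored cells are
    left out of both the sum and the count.
    """
    if skip_errors:
        values = [c for c in cells if c is not None]
    else:
        values = [0.0 if c is None else c for c in cells]
    if not values:
        raise ValueError("no usable scores")
    return sum(values) / len(values)


if __name__ == "__main__":
    for model, cells in SCORES.items():
        print(model, round(mean_score(cells), 3), round(mean_score(cells, skip_errors=True), 3))
```

Whether an errored run should count as zero or be excluded changes the ranking for models with partial failures, so the choice is worth stating alongside any aggregate.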
Test Scenes 4
Scene Order: 0
Undergrad seeks research framing advice
ID: office-hours-question
🎯 Goal:
Offer clear, supportive guidance in no more than 120 words while maintaining Lydia’s mentoring tone.
📨 Input Events:
chat_msg student:maria
"Hi Lydia, I’m stuck on framing my paper about wetlands restoration history. Any quick tips?"
Ready for Testing
Scene Order: 1
Draft nonprofit blog post
ID: science-communication-blog
🎯 Goal:
Write an engaging, jargon-free blog post of roughly 300–350 words explaining how 19th-century deforestation contributes to today’s urban heat islands.
📨 Input Events:
chat_msg nonprofit_editor
"Could you draft a short blog post linking historical deforestation to modern urban heat islands? General audience, please."
Ready for Testing
Scene Order: 2
Daily research journal entry
ID: archival-discovery-journal
🎯 Goal:
Produce a reflective journal entry (250–300 words) that logs today’s archival findings and notes next steps, keeping a thoughtful yet professional voice.
📨 Input Events:
world_event system
"End of research day: Lydia examined 1880s county land-register maps and uncovered unexpected crop rotation notes."
Ready for Testing
Scene Order: 3
Quick GIS troubleshooting
ID: gis-bug-fix
🎯 Goal:
Provide a concise (≤100 words) step-by-step fix for a projection mismatch error, staying supportive.
📨 Input Events:
chat_msg student:aaron
"Lydia, my land-use layers aren’t lining up in QGIS. The shapefile says WGS84 but everything shifts east. Help!"
Ready for Testing
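Each scene goal above sets a length target (no more than 120 words, roughly 300–350 words, 250–300 words, and at most 100 words). A minimal sketch of how a response could be checked against those targets follows. The names `SCENE_LIMITS`, `word_count`, and `within_limit` are hypothetical, whitespace-delimited tokens are assumed to count as words, and the "roughly" in the blog-post goal is treated as a hard range purely for illustration. The suite's own length check, if it has one, is not described in the source.

```python
import re
from typing import Optional

# (min_words, max_words) per scene ID, taken from the scene goals.
SCENE_LIMITS = {
    "office-hours-question": (0, 120),
    "science-communication-blog": (300, 350),
    "archival-discovery-journal": (250, 300),
    "gis-bug-fix": (0, 100),
}


def word_count(text: str) -> int:
    """Count whitespace-delimited tokens (an assumed definition of 'word')."""
    return len(re.findall(r"\S+", text))


def within_limit(text: str, min_words: int = 0, max_words: Optional[int] = None) -> bool:
    """Return True when the text length falls inside the inclusive range."""
    n = word_count(text)
    return n >= min_words and (max_words is None or n <= max_words)


if __name__ == "__main__":
    reply = "Check the layer CRS, then reproject rather than reassign it."
    print(within_limit(reply, *SCENE_LIMITS["gis-bug-fix"]))
```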
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 12370 ms
    p95 15130 ms • avg 12227 ms • N 4
  • neversleep/noromaid-20b 18982 ms
    p95 24305 ms • avg 16322 ms • N 4
  • google/gemini-2.5-flash 25258 ms
    p95 32110 ms • avg 26261 ms • N 4
  • mistralai/mistral-7b-in… 25840 ms
    p95 47658 ms • avg 31752 ms • N 4
  • qwen/qwen-2.5-7b-instru… 33344 ms
    p95 123849 ms • avg 57025 ms • N 4
Slowest
  • microsoft/phi-3-medium-… 132841 ms
    p95 133565 ms • avg 128440 ms • N 4
  • microsoft/phi-3.5-mini-… 43834 ms
    p95 214150 ms • avg 90286 ms • N 4
  • qwen/qwen3-8b 41580 ms
    p95 57559 ms • avg 42050 ms • N 4
  • [email protected]/Qw… 40309 ms
    p95 118826 ms • avg 62938 ms • N 4
  • meta-llama/llama-3.1-8b… 36210 ms
    p95 80958 ms • avg 44598 ms • N 4
Per-scene duration for this suite.
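The latency list reports a p95, an average, and a sample count N for each model, but the raw per-scene durations and the percentile method are not shown. As a rough sketch only, the summary below uses the nearest-rank percentile definition, which is an assumption. The function name `latency_summary` is hypothetical, and any durations passed to it are the caller's own data, not values from this page.

```python
import math
from typing import Dict, Sequence, Union


def latency_summary(samples_ms: Sequence[float]) -> Dict[str, Union[int, float]]:
    """Summarize per-scene durations as avg, nearest-rank p95, and count."""
    if not samples_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(samples_ms)
    # Nearest-rank: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "avg": sum(ordered) / len(ordered),
        "p95": ordered[rank - 1],
        "n": len(ordered),
    }
```

With only four samples per model, nearest-rank p95 is simply the slowest scene. An interpolating percentile would give a lower figure, so the two methods are not interchangeable at this sample size.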
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
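The three weights listed here sum to 0.473, so the remaining dimensions of the framework are not shown. As an illustration of how dimension weights could combine into one scene score, here is a minimal sketch assuming a weighted average renormalized over whichever dimensions are supplied. The function name `weighted_score` and any per-dimension scores passed to it are hypothetical. The framework's actual aggregation rule is not given in the source.

```python
from typing import Mapping

# Weights for the three dimensions listed above; the rest are not shown.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average over the dimensions present in both mappings.

    Weights are renormalized so a partial set of dimensions still yields
    a value on the same 0..1 scale as the inputs (an assumption of this sketch).
    """
    shared = [dim for dim in scores if dim in weights]
    total_weight = sum(weights[dim] for dim in shared)
    if total_weight <= 0:
        raise ValueError("no overlapping dimensions with positive weight")
    return sum(weights[dim] * scores[dim] for dim in shared) / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores, for illustration only.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```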
Recent Runs
20640389
Dec. 17, 2025, midnight
24394960
Dec. 16, 2025, midnight
19601894
Dec. 15, 2025, midnight
22077631
Dec. 14, 2025, midnight
19458551
Dec. 13, 2025, midnight
24057696
Dec. 12, 2025, midnight
20431866
Dec. 11, 2025, midnight
19811335
Dec. 10, 2025, midnight
22692474
Dec. 9, 2025, midnight
19901291
Dec. 8, 2025, midnight
Latency Overview (This Suite)