Anita Patel

science-technology-ai-coder-characters-grace-hopper v2.0 Ethical
Backstory: Anita is a mid-career engineer who designs scalable machine-learning systems for healthcare analytics. Raised in Chicago’s multicultural neighborhoods, she speaks three languages and pairs technical rigor with a drive for social good. After years at a nonprofit tech incubator, she joined a major cloud company’s responsible-AI division, championing fairness and transparency. In her free time, she mentors underrepresented students and experiments with avant-garde cooking.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes: intro (Quick self-intro), fairness-memo (Internal memo on accuracy vs fairness), student-advise (Donation question from student), journal-entry (Evening reflection)
Model                         intro           fairness-memo   student-advise  journal-entry
deepseek/deepseek-r…          0.615           0.577           0.693           0.417
google/gemini-2.5-f…          0.560           0.667           0.710           0.630
google/gemma-3-12b-…          0.573           0.367           0.692           0.820
meta-llama/llama-3.…          0.888           0.210           0.765           0.591
microsoft/phi-3-med…          0.000           0.000           0.031           0.000 (Error)
microsoft/phi-3.5-m…          0.607           0.000 (Error)   0.516           0.000 (Error)
mistralai/mistral-7…          0.661           0.725           0.734           0.315
neversleep/noromaid…          0.553           0.000 (Error)   0.000 (Error)   0.000 (Error)
[email protected] (col. 9)    0.000 (Error)   0.000 (Error)   0.000 (Error)   0.000 (Error)
[email protected] (col. 10)   0.642           0.014           0.703           0.551
qwen/qwen-2.5-7b-in…          0.566           0.461           0.575           0.178
qwen/qwen3-14b                0.615           0.458           0.675           0.567
qwen/qwen3-8b                 0.567           0.590           0.670           0.388
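The matrix above lists one score per model and scene, with some cells flagged as Error. As a rough sketch of how those cells could be summarized, the snippet below averages each model's four scene scores. The helper name `mean_score`, the `(col. 9)` / `(col. 10)` suffixes used to tell the two identically named columns apart, and the choice to count Error cells as 0.0 by default are all assumptions made for illustration; the dashboard's own aggregation is not documented here.

```python
from statistics import mean
from typing import Optional

# Scene order used for every row below.
SCENES = ["intro", "fairness-memo", "student-advise", "journal-entry"]

# Scores copied from the Scene Performance Matrix. None marks a cell shown
# as "Error". Truncated model names are kept exactly as displayed.
MATRIX: dict[str, list[Optional[float]]] = {
    "deepseek/deepseek-r…": [0.615, 0.577, 0.693, 0.417],
    "google/gemini-2.5-f…": [0.560, 0.667, 0.710, 0.630],
    "google/gemma-3-12b-…": [0.573, 0.367, 0.692, 0.820],
    "meta-llama/llama-3.…": [0.888, 0.210, 0.765, 0.591],
    "microsoft/phi-3-med…": [0.000, 0.000, 0.031, None],
    "microsoft/phi-3.5-m…": [0.607, None, 0.516, None],
    "mistralai/mistral-7…": [0.661, 0.725, 0.734, 0.315],
    "neversleep/noromaid…": [0.553, None, None, None],
    "[email protected] (col. 9)": [None, None, None, None],
    "[email protected] (col. 10)": [0.642, 0.014, 0.703, 0.551],
    "qwen/qwen-2.5-7b-in…": [0.566, 0.461, 0.575, 0.178],
    "qwen/qwen3-14b": [0.615, 0.458, 0.675, 0.567],
    "qwen/qwen3-8b": [0.567, 0.590, 0.670, 0.388],
}


def mean_score(scores: list[Optional[float]], errors_as_zero: bool = True) -> Optional[float]:
    """Average scene scores; returns None when no cell can be scored."""
    if errors_as_zero:
        # Assumption: an Error cell counts as a score of 0.0.
        values = [0.0 if s is None else s for s in scores]
    else:
        # Alternative: ignore Error cells entirely.
        values = [s for s in scores if s is not None]
    return mean(values) if values else None


if __name__ == "__main__":
    ranked = sorted(MATRIX, key=lambda m: mean_score(MATRIX[m]), reverse=True)
    for model in ranked:
        print(f"{model:32s} {mean_score(MATRIX[model]):.3f}")
```

Passing `errors_as_zero=False` shows how a model performs on only the scenes it completed, which can differ noticeably from the default for models with several Error cells.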
Test Scenes 4
Scene Order: 0
Quick self-intro
ID: intro
🎯 Goal: Briefly introduce herself, mention current role and ethical focus, and keep response under 120 words.
📨 Input Events:
chat_msg viewer:user_17: "Hi Anita, what do you do at work?"
Ready for Testing
Scene Order: 1
Internal memo on accuracy vs fairness
ID: fairness-memo
🎯 Goal: Draft a clear, 4+ paragraph memo (>=250 words) outlining trade-offs between diagnostic accuracy and demographic fairness, ending with 3 concrete action items.
📨 Input Events:
chat_msg colleague:dr_liu: "Could you write an internal memo about balancing accuracy and fairness for our new cardiac risk model?"
Ready for Testing
Scene Order: 2
Donation question from student
ID: student-advise
🎯 Goal: Thank donor, give two practical resources for learning ML ethics, and encourage continued study within 80 words.
📨 Input Events:
superchat viewer:studentAva YouTube $10: "Love your talk! Any advice for a first-year CS student curious about ethical AI?"
Ready for Testing
Scene Order: 3
Evening reflection
ID: journal-entry
🎯 Goal: Write a reflective journal entry (≥250 words) summarizing today’s mentoring session and an experimental cooking attempt, maintaining personal, introspective voice.
📨 Input Events:
world_event system: "End of day; Anita opens her private journal app."
Ready for Testing
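Each scene goal above includes a length constraint (under 120 words, 4+ paragraphs and at least 250 words, within 80 words, at least 250 words). A minimal sketch of how those constraints could be checked mechanically is shown below. The names `LengthRule`, `RULES`, and `check_length` are hypothetical, words are counted by whitespace splitting, and paragraphs by blank lines; the content requirements (three action items, two resources, voice) are not covered, and nothing here is claimed to match how the evaluation framework actually scores responses.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LengthRule:
    min_words: Optional[int] = None
    max_words: Optional[int] = None
    min_paragraphs: Optional[int] = None


# Limits transcribed from the scene goals; keys are the scene IDs.
RULES = {
    "intro": LengthRule(max_words=119),  # "under 120 words"
    "fairness-memo": LengthRule(min_words=250, min_paragraphs=4),
    "student-advise": LengthRule(max_words=80),  # "within 80 words"
    "journal-entry": LengthRule(min_words=250),
}


def check_length(scene_id: str, text: str) -> list[str]:
    """Return a list of violated length constraints (empty if all pass)."""
    rule = RULES[scene_id]
    words = len(text.split())
    # Paragraphs are runs of text separated by at least one blank line.
    paragraphs = len([p for p in re.split(r"\n\s*\n", text) if p.strip()])
    problems = []
    if rule.min_words is not None and words < rule.min_words:
        problems.append(f"too short: {words} words, need at least {rule.min_words}")
    if rule.max_words is not None and words > rule.max_words:
        problems.append(f"too long: {words} words, limit is {rule.max_words}")
    if rule.min_paragraphs is not None and paragraphs < rule.min_paragraphs:
        problems.append(f"too few paragraphs: {paragraphs}, need at least {rule.min_paragraphs}")
    return problems
```

For example, `check_length("student-advise", reply)` returns an empty list when `reply` has 80 words or fewer, and a one-item list describing the overrun otherwise.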
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 9566 ms (p95 12876 ms, avg 10269 ms, N 4)
  • neversleep/noromaid-20b: 9644 ms (p95 24182 ms, avg 12539 ms, N 7)
  • google/gemini-2.5-flash: 19374 ms (p95 26492 ms, avg 20925 ms, N 8)
  • qwen/qwen-2.5-7b-instru…: 21341 ms (p95 96745 ms, avg 35586 ms, N 8)
  • google/gemma-3-12b-it: 23153 ms (p95 32626 ms, avg 24198 ms, N 7)
Slowest
  • microsoft/phi-3-medium-…: 127521 ms (p95 178163 ms, avg 139396 ms, N 8)
  • microsoft/phi-3.5-mini-…: 47547 ms (p95 229299 ms, avg 93425 ms, N 6)
  • [email protected]/Qw…: 39500 ms (p95 43112 ms, avg 40073 ms, N 4)
  • deepseek/deepseek-r1-di…: 31610 ms (p95 36407 ms, avg 31387 ms, N 7)
  • meta-llama/llama-3.1-8b…: 27297 ms (p95 37231 ms, avg 26989 ms, N 8)
Per-scene duration for this suite.
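The latency list reports a p95, an average, and a sample count N per model. As an illustration of how such figures can be derived from raw per-scene durations, the sketch below uses the nearest-rank percentile definition. This is an assumption: the dashboard may interpolate or use another definition, and the function names `nearest_rank_percentile` and `summarize` are hypothetical.

```python
import math
from statistics import mean


def nearest_rank_percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples_ms):
    """Summarize one model's durations in the same shape as the latency list."""
    return {
        "p95_ms": nearest_rank_percentile(samples_ms, 95),
        "avg_ms": mean(samples_ms),
        "n": len(samples_ms),
    }
```

Under the nearest-rank definition, any sample size below 20 makes the p95 equal to the single slowest run, so with N between 4 and 8 one slow response can push the p95 far above the average.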
Suite Actions
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
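If the three values listed are per-dimension weights, one common way to combine dimension scores is a weighted average normalized by the weights actually present. The sketch below shows that calculation only as an illustration: the remaining dimensions and their weights are not shown on this page, the function name `weighted_score` is hypothetical, and the framework's real aggregation rule may differ.

```python
# Weights for the three dimensions listed; any other dimensions are unknown here.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average over the dimensions that have both a score and a weight."""
    shared = [d for d in dimension_scores if d in weights]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to aggregate")
    # Dividing by total_weight keeps the result on the same scale as the
    # input scores even though the listed weights do not sum to 1.
    return sum(weights[d] * dimension_scores[d] for d in shared) / total_weight
```

Normalizing by the summed weights is a design choice that lets the sketch work with a partial set of dimensions; a framework that knows all of its weights might skip that step.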
Recent Runs
39996853  Dec. 17, 2025, midnight
45701827  Dec. 16, 2025, midnight
37213400  Dec. 15, 2025, midnight
39944994  Dec. 14, 2025, midnight
37189639  Dec. 13, 2025, midnight
45096089  Dec. 12, 2025, midnight
38973979  Dec. 11, 2025, midnight
38412806  Dec. 10, 2025, midnight
43328087  Dec. 9, 2025, midnight
38094522  Dec. 8, 2025, midnight
Latency Overview (This Suite)