Amara Patel

science-technology-ai-ai-engineer-characters-ada-lovelace v2.0 Ethical
Backstory: Amara Patel is a mid-career AI research engineer who specializes in natural-language processing for low-resource languages. Growing up bilingual, she became passionate about inclusive technology and earned a graduate degree in computational linguistics. She now leads a small R&D team at a global tech company, focusing on ethical data sourcing and transparent model deployment. Outside work, she mentors underrepresented students and contributes to open-source AI projects.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Model | daily-standup (Morning stand-up update) | ethical-nlp-blog (Draft blog post on ethical NLP) | grad-mentor (Mentorship question) | toolkit-readme (Open-source toolkit README) |
|---|---|---|---|---|
| deepseek/deepseek-r… | 0.622 | 0.602 | 0.599 | 0.116 |
| google/gemini-2.5-f… | 0.600 | 0.768 | 0.515 | 0.460 |
| google/gemma-3-12b-… | 0.749 | 0.406 | 0.620 | 0.440 |
| meta-llama/llama-3.… | 0.566 | 0.000 | 0.820 | 0.829 |
| microsoft/phi-3-med… | 0.000 (Error) | 0.000 | 0.000 | 0.033 |
| microsoft/phi-3.5-m… | 0.850 | 0.000 (Error) | 0.509 | 0.000 (Error) |
| mistralai/mistral-7… | 0.699 | 0.464 | 0.707 | 0.518 |
| neversleep/noromaid… | 0.626 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| [email protected] | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| [email protected] | 0.770 | 0.705 | 0.000 | 0.441 |
| qwen/qwen-2.5-7b-in… | 0.637 | 0.521 | 0.595 | 0.450 |
| qwen/qwen3-14b | 0.740 | 0.519 | 0.000 | 0.601 |
| qwen/qwen3-8b | 0.710 | 0.690 | 0.696 | 0.438 |
Test Scenes: 4
Scene Order: 0
Morning stand-up update
ID: daily-standup
🎯 Goal:
Give a concise, methodical update on yesterday’s progress and today’s plan.
📨 Input Events:
chat_msg colleague:Javier
"Hey Amara, quick stand-up: what did you finish yesterday and what’s on your plate today?"
Ready for Testing
Scene Order: 1
Draft blog post on ethical NLP
ID: ethical-nlp-blog
🎯 Goal:
Write a clear blog post (≥350 words) explaining best practices for sourcing data for low-resource languages and ensuring transparent deployment.
📨 Input Events:
chat_msg manager:Lina
"Can you draft that blog post on ethical NLP for the company site? Aim for depth and approachability."
Ready for Testing
Scene Order: 2
Mentorship question
ID: grad-mentor
🎯 Goal:
Provide a supportive, specific answer about preparing for a computational linguistics grad program.
📨 Input Events:
chat_msg student:Keisha
"Hi Amara! Any advice on what I should focus on to get into a good computational linguistics grad program?"
Ready for Testing
Scene Order: 3
Open-source toolkit README
ID: toolkit-readme
🎯 Goal:
Generate a project README (≥200 words) with sections: Overview, Installation, Usage example, Contributing, License.
📨 Input Events:
chat_msg opensource:maintainer
"We need a solid README for the new morpheme-segmentation toolkit before launch."
Ready for Testing
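The toolkit-readme goal names two checkable constraints: a minimum length of 200 words and five required sections. The sketch below shows one way such a draft could be pre-checked before scoring. It is an illustration only, not the suite's actual evaluator: the function name `check_readme` and the heading-matching rule (a line that equals the section name once leading `#` marks are removed) are assumptions.

```python
import re

# Section names taken from the toolkit-readme scene goal.
REQUIRED_SECTIONS = ["Overview", "Installation", "Usage example", "Contributing", "License"]


def check_readme(text, min_words=200):
    """Return (ok, problems) for a README draft.

    Hypothetical helper: the heading rule below is an assumption,
    not the dashboard's own scoring logic.
    """
    problems = []

    # Whitespace-delimited word count, compared against the scene's minimum.
    word_count = len(text.split())
    if word_count < min_words:
        problems.append(f"only {word_count} words, need at least {min_words}")

    # Normalize every line: drop leading '#' marks and surrounding whitespace.
    headings = {re.sub(r"^#+\s*", "", line).strip().lower() for line in text.splitlines()}
    for section in REQUIRED_SECTIONS:
        if section.lower() not in headings:
            problems.append(f"missing section: {section}")

    return (not problems, problems)
```

Calling `check_readme(draft)` on a model's output returns a pass flag plus a list of human-readable problems, which can help separate "too short" failures from "missing section" failures when a scene scores near zero.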
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 12224 ms (p95 13164 ms, avg 12257 ms, N 4)
  • neversleep/noromaid-20b 16090 ms (p95 38384 ms, avg 17806 ms, N 8)
  • google/gemini-2.5-flash 16885 ms (p95 22206 ms, avg 17601 ms, N 8)
  • qwen/qwen-2.5-7b-instru… 18015 ms (p95 19815 ms, avg 17478 ms, N 8)
  • google/gemma-3-12b-it 24773 ms (p95 28547 ms, avg 23568 ms, N 8)
Slowest
  • microsoft/phi-3-medium-… 136430 ms (p95 195210 ms, avg 147390 ms, N 8)
  • [email protected]/Qw… 45431 ms (p95 46045 ms, avg 45132 ms, N 4)
  • microsoft/phi-3.5-mini-… 36846 ms (p95 88650 ms, avg 44632 ms, N 8)
  • meta-llama/llama-3.1-8b… 32239 ms (p95 74098 ms, avg 38027 ms, N 7)
  • deepseek/deepseek-r1-di… 30432 ms (p95 41275 ms, avg 32400 ms, N 8)
Per-scene duration for this suite.
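Each latency entry reports a p95, an average, and a sample count N. As a rough illustration of how such a summary can be derived from raw per-scene durations, the sketch below uses the nearest-rank percentile definition. The dashboard does not say which percentile method it uses, so that choice, the function name `latency_summary`, and the sample values in the usage note are all assumptions.

```python
def latency_summary(samples_ms):
    """Return p95 (nearest-rank), average, and count for latencies in ms.

    Hypothetical helper; the nearest-rank method is an assumed choice.
    """
    if not samples_ms:
        raise ValueError("need at least one sample")

    ordered = sorted(samples_ms)
    n = len(ordered)

    # Nearest-rank p95: the smallest sample with at least 95% of all
    # samples at or below it. Integer ceiling avoids float rounding.
    rank = (95 * n + 99) // 100

    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}
```

For example, `latency_summary([100, 200, 300, 400])` yields a p95 of 400, an average of 250.0, and N of 4. With as few as 4 to 8 samples per model, the nearest-rank p95 is simply the slowest run, which is worth keeping in mind when comparing the p95 and average columns.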
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
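The page lists only the three highest dimension weights and does not state how dimension scores are combined into a scene score. One common approach is a weighted mean, sketched below under that assumption. The function name `weighted_score` is hypothetical, and because the remaining dimensions and their weights are not shown, the sketch renormalizes over whichever weighted dimensions are supplied.

```python
# The three weights shown on the page; other dimensions are not listed.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted mean over dimensions present in both mappings.

    Assumed aggregation rule, not the framework's documented formula.
    """
    shared = [d for d in weights if d in dimension_scores]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to score")

    # Dividing by the total weight keeps the result on the same
    # scale as the inputs even when some dimensions are missing.
    return sum(weights[d] * dimension_scores[d] for d in shared) / total_weight
```

Under this rule a response that scores the same value on every dimension keeps that value as its overall score, while a strong Character Authenticity score pulls the result up more than an equally strong Contextual Intelligence score.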
Recent Runs
39549536: Dec. 17, 2025, midnight
45245002: Dec. 16, 2025, midnight
36693080: Dec. 15, 2025, midnight
39511316: Dec. 14, 2025, midnight
36769069: Dec. 13, 2025, midnight
44583717: Dec. 12, 2025, midnight
38546627: Dec. 11, 2025, midnight
37887362: Dec. 10, 2025, midnight
42857043: Dec. 9, 2025, midnight
37700359: Dec. 8, 2025, midnight
Latency Overview (This Suite)