Natalie Andrews

education-academia-mba-professor-characters-mary-parker-follett v2.0 Ethical
Backstory: Natalie Andrews is a tenured professor who teaches strategic management and organizational behavior at a top-tier business school. Raised in a multicultural household and seasoned by roles in both startups and Fortune 500 firms, she fuses practitioner insight with academic rigor. Famous for her case-based seminars, real-time analytics dashboards, and global virtual team projects, she equips students to lead in complex, fast-moving environments.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scores by model (rows) and scene (columns). Cells flagged "Error" in the source are marked as such.

| Model | multicultural-team-advice (First-time multicultural team lead) | podcast-crisis-case (Five-minute case podcast) | dashboard-reaction (Real-time analytics interpretation) | weekly-newsletter (Course newsletter & project kickoff) |
|---|---|---|---|---|
| deepseek/deepseek-r… | 0.680 | 0.449 | 0.722 | 0.592 |
| google/gemini-2.5-f… | 0.703 | 0.579 | 0.574 | 0.632 |
| google/gemma-3-12b-… | 0.786 | 0.367 | 0.505 | 0.445 |
| meta-llama/llama-3.… | 0.648 | 0.162 | 0.485 | 0.297 |
| microsoft/phi-3-med… | 0.000 | 0.000 | 0.000 (Error) | 0.000 |
| microsoft/phi-3.5-m… | 0.000 | 0.000 (Error) | 0.632 | 0.399 |
| mistralai/mistral-7… | 0.809 | 0.569 | 0.680 | 0.569 |
| neversleep/noromaid… | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| [email protected] | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| [email protected] | 0.801 | 0.294 | 0.679 | 0.512 |
| qwen/qwen-2.5-7b-in… | 0.715 | 0.291 | 0.616 | 0.320 |
| qwen/qwen3-14b | 0.830 | 0.225 | 0.650 | 0.118 |
| qwen/qwen3-8b | 0.738 | 0.490 | 0.754 | 0.476 |
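As a quick cross-check on the matrix, the per-model mean across the four scenes can be computed with a short Python sketch. The scores are copied from the matrix (scene order: multicultural-team-advice, podcast-crisis-case, dashboard-reaction, weekly-newsletter). Counting Error cells as 0.000 and using an unweighted mean are assumptions made here for illustration; the dashboard may aggregate differently.

```python
from statistics import mean

# (model name as truncated in the source, scores for the four scenes in order)
# A list of tuples is used because two columns share the same obfuscated name.
matrix = [
    ("deepseek/deepseek-r…", [0.680, 0.449, 0.722, 0.592]),
    ("google/gemini-2.5-f…", [0.703, 0.579, 0.574, 0.632]),
    ("google/gemma-3-12b-…", [0.786, 0.367, 0.505, 0.445]),
    ("meta-llama/llama-3.…", [0.648, 0.162, 0.485, 0.297]),
    ("microsoft/phi-3-med…", [0.000, 0.000, 0.000, 0.000]),
    ("microsoft/phi-3.5-m…", [0.000, 0.000, 0.632, 0.399]),
    ("mistralai/mistral-7…", [0.809, 0.569, 0.680, 0.569]),
    ("neversleep/noromaid…", [0.000, 0.000, 0.000, 0.000]),
    ("[email protected]", [0.000, 0.000, 0.000, 0.000]),
    ("[email protected]", [0.801, 0.294, 0.679, 0.512]),
    ("qwen/qwen-2.5-7b-in…", [0.715, 0.291, 0.616, 0.320]),
    ("qwen/qwen3-14b", [0.830, 0.225, 0.650, 0.118]),
    ("qwen/qwen3-8b", [0.738, 0.490, 0.754, 0.476]),
]

# Unweighted mean per model (assumption: Error cells count as 0.000).
means = [(name, mean(scores)) for name, scores in matrix]

# Sort from highest to lowest mean score.
ranked = sorted(means, key=lambda item: item[1], reverse=True)

for name, avg in ranked:
    print(f"{name:24s} {avg:.3f}")
```

Under these assumptions, the highest mean belongs to the mistralai/mistral-7… column, followed by google/gemini-2.5-f… and qwen/qwen3-8b.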
Test Scenes (4)
Scene Order: 0
First-time multicultural team lead
ID: multicultural-team-advice
🎯 Goal:
Offer concise, empathetic guidance that blends theory and practical steps for leading a newly formed multicultural virtual team.
📨 Input Events:
chat_msg student:alex
"Professor Andrews, I’m about to lead a culturally diverse remote team for the first time. Any quick advice?"
Ready for Testing
Scene Order: 1
Five-minute case podcast
ID: podcast-crisis-case
🎯 Goal:
Record a ~450-600-word podcast script analyzing how a real company navigated a sudden PR crisis, integrating data points and leadership takeaways in an engaging narrative.
📨 Input Events:
chat_msg producer:maria
"Natalie, can you script today’s 5-minute podcast on a recent corporate crisis for our ‘Strategy in Real Time’ series?"
Ready for Testing
Scene Order: 2
Real-time analytics interpretation
ID: dashboard-reaction
🎯 Goal:
Interpret the new data succinctly and propose one actionable classroom discussion question tying the metric to competitive strategy.
📨 Input Events:
world_event analytics_system
"Dashboard update: Major competitor just reduced flagship product price by 15% across APAC market."
Ready for Testing
Scene Order: 3
Course newsletter & project kickoff
ID: weekly-newsletter
🎯 Goal:
Draft a 400-500-word newsletter that recaps last week’s insights, previews the strategic innovation module, and clearly outlines expectations for the upcoming global virtual team project.
📨 Input Events:
chat_msg ta:jon
"Hi Professor, could you prepare this week’s newsletter? Students need details on the next module and the virtual team project launch."
Ready for Testing
Latency by Model (This Suite)
Fastest
  • neversleep/noromaid-20b: 9803 ms (p95 17050 ms, avg 10829 ms, N 4)
  • [email protected]/Qw…: 11900 ms (p95 12253 ms, avg 11485 ms, N 4)
  • google/gemma-3-12b-it: 20421 ms (p95 30651 ms, avg 23184 ms, N 4)
  • qwen/qwen-2.5-7b-instru…: 24040 ms (p95 29454 ms, avg 24229 ms, N 4)
  • google/gemini-2.5-flash: 26007 ms (p95 30436 ms, avg 25510 ms, N 4)
Slowest
  • microsoft/phi-3-medium-…: 123855 ms (p95 128341 ms, avg 103870 ms, N 4)
  • microsoft/phi-3.5-mini-…: 83665 ms (p95 216514 ms, avg 110245 ms, N 4)
  • qwen/qwen3-8b: 53643 ms (p95 69952 ms, avg 55580 ms, N 4)
  • [email protected]/Qw…: 40054 ms (p95 220831 ms, avg 92765 ms, N 4)
  • deepseek/deepseek-r1-di…: 32813 ms (p95 32826 ms, avg 32067 ms, N 4)
Per-scene duration for this suite.
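For readers who want to reproduce p95, avg, and N figures from per-scene durations, the following is a minimal Python sketch. The nearest-rank percentile method and the sample durations are assumptions for illustration only; the source does not state how the dashboard computes p95, and the sample values are not taken from any run.

```python
import math


def summarize(durations_ms):
    """Return nearest-rank p95, arithmetic mean, and sample count."""
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: 1-based rank = ceil(p * n).
    rank = math.ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


# Hypothetical per-scene durations in milliseconds (one per scene).
sample = [9000, 10000, 11000, 14000]
stats = summarize(sample)
print(stats)
```

With only four samples, the nearest-rank p95 is simply the slowest scene, so a single slow run can dominate the p95 column.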
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
20462151 (Dec. 17, 2025, midnight)
24155361 (Dec. 16, 2025, midnight)
19395055 (Dec. 15, 2025, midnight)
21784005 (Dec. 14, 2025, midnight)
19192909 (Dec. 13, 2025, midnight)
23855965 (Dec. 12, 2025, midnight)
20200793 (Dec. 11, 2025, midnight)
19608027 (Dec. 10, 2025, midnight)
22478077 (Dec. 9, 2025, midnight)
19701231 (Dec. 8, 2025, midnight)
Latency Overview (This Suite)