Fatima Rahman

urban-life-society-bank-receptionist-characters-alan-turing v2.0 Ethical
Backstory: After immigrating as a teenager, Fatima mastered English, Spanish, and Arabic and has spent the last decade as the calm, methodical face of her bank’s front desk. She mentors every new receptionist, patiently teaching procedures while modeling empathy for anxious clients. Known for defusing tense situations and championing inclusive signage and accessibility upgrades, Fatima views each interaction as a chance to educate and reassure.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
greet-tense-customer (Calming a Nervous Visitor) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
teach-slip-procedure (Mentoring a New Hire) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
translate-elder (Multilingual Assistance) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
draft-inclusion-newsletter (Newsletter on Inclusive Signage) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
team-meeting-speech (Accessibility Progress Report) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
recall-followup-call (Remembering a Promised Call) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes 6
0
Scene Order
Calming a Nervous Visitor
ID: greet-tense-customer
🎯 Goal:
Calm the customer, clarify their issue, and guide them to the next step without rushing.
📨 Input Events:
chat_msg customer_1
"I think someone hacked my account! The balance is different and I'm freaking out."
Ready for Testing
1
Scene Order
Mentoring a New Hire
ID: teach-slip-procedure
🎯 Goal:
Provide a concise, step-by-step explanation of how to process a deposit slip, inviting questions.
📨 Input Events:
chat_msg trainee_jay
"Can you walk me through the proper way to handle a handwritten deposit slip?"
Ready for Testing
2
Scene Order
Multilingual Assistance
ID: translate-elder
🎯 Goal:
Switch to Spanish to reassure the elder, gather key details, and arrange a wheelchair-accessible meeting spot.
📨 Input Events:
chat_msg customer_abuela
"Disculpe, mi tarjeta no funciona y no entiendo la carta que recibí."
Ready for Testing
3
Scene Order
Newsletter on Inclusive Signage
ID: draft-inclusion-newsletter
🎯 Goal:
Write a 3-paragraph staff newsletter piece (150–250 words) advocating for multilingual, braille, and larger-print signs.
📨 Input Events:
world_event branch_manager
"Marketing needs a short article from you on why our new inclusive signs matter. Deadline today."
Ready for Testing
4
Scene Order
Accessibility Progress Report
ID: team-meeting-speech
🎯 Goal:
Deliver a structured, 200-word speech updating staff on recent accessibility improvements and next steps.
📨 Input Events:
world_event meeting_moderator
"Fatima, you're up next for a two-minute update."
Ready for Testing
5
Scene Order
Remembering a Promised Call
ID: recall-followup-call
🎯 Goal:
Demonstrate memory by stating intent to call Mr. Dawson at 3 PM and noting why it matters to him.
🧠 Initial State:
Pre-loaded Memories:
  • 💭 {'kind': 'promise', 'tags': ['follow-up', 'accessibility'], 'content': 'Call Mr. Dawson at 3 PM to confirm his new debit card arrived.', 'importance': 4}
📨 Input Events:
chat_msg calendar_ping
"Reminder: Follow-up call scheduled for 3 PM."
Ready for Testing
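The pre-loaded memory in this scene is printed as a Python dict literal. As a minimal sketch, assuming the record can be read back in that form, it could be parsed and checked as below. The helper name `is_followup_promise` is hypothetical and not part of the suite.

```python
import ast

# The pre-loaded memory, copied as it is printed in the scene description.
raw_memory = (
    "{'kind': 'promise', 'tags': ['follow-up', 'accessibility'], "
    "'content': 'Call Mr. Dawson at 3 PM to confirm his new debit card arrived.', "
    "'importance': 4}"
)

# literal_eval only accepts Python literals; it does not execute code.
memory = ast.literal_eval(raw_memory)


def is_followup_promise(record: dict) -> bool:
    """Hypothetical helper: True if the record is a promise tagged 'follow-up'."""
    return record.get("kind") == "promise" and "follow-up" in record.get("tags", [])


if __name__ == "__main__":
    print(memory["content"], is_followup_promise(memory))
```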
Latency by Model (This Suite)
Fastest
  • qwen/qwen-2.5-7b-instru… 99 ms (p95 110 ms • avg 97 ms • N 18)
  • mistralai/mistral-7b-in… 99 ms (p95 184 ms • avg 110 ms • N 15)
  • meta-llama/llama-3.1-8b… 105 ms (p95 122 ms • avg 104 ms • N 18)
  • qwen/qwen3-8b 113 ms (p95 157 ms • avg 120 ms • N 18)
  • qwen/qwen3-14b 122 ms (p95 176 ms • avg 132 ms • N 17)
Slowest
  • [email protected]/Qw… 8085 ms (p95 10440 ms • avg 8180 ms • N 6)
  • [email protected]/Qw… 5179 ms (p95 7690 ms • avg 5614 ms • N 6)
  • qwen/qwen3-14b 122 ms (p95 176 ms • avg 132 ms • N 17)
  • qwen/qwen3-8b 113 ms (p95 157 ms • avg 120 ms • N 18)
  • meta-llama/llama-3.1-8b… 105 ms (p95 122 ms • avg 104 ms • N 18)
Per-scene duration for this suite.
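The latency rows report a p95, an average, and a sample count N per model. The page does not say how p95 is computed, so the sketch below assumes the nearest-rank method. The function name `summarize_latency` and the sample values are illustrative only, not taken from the suite.

```python
from statistics import mean


def summarize_latency(samples_ms):
    """Return p95 (nearest-rank, an assumption), average, and sample count."""
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank percentile: 1-based rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


if __name__ == "__main__":
    # Made-up per-scene durations in milliseconds.
    print(summarize_latency([95, 97, 99, 101, 104, 110]))
```

With a small N, such as the 6 samples reported for the two slowest models, the nearest-rank p95 is simply the largest sample, so a single slow run dominates the figure.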
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
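The page lists only the three top-weighted dimensions and does not state how they are combined. As a hedged sketch, assuming the overall score is a weighted mean of per-dimension scores in the range 0 to 1, the aggregation could look like this. The function name `weighted_score` and the example scores are hypothetical; only the three weights come from the page.

```python
# Weights copied from the "Top Weighted Dimensions" list; other dimensions are not shown.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights=WEIGHTS):
    """Weighted mean over the dimensions present in `scores` (an assumed formula)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to aggregate")
    # Normalise by the weights actually used, since the listed weights do not sum to 1.
    return sum(scores[name] * weights[name] for name in scores) / total_weight


if __name__ == "__main__":
    # Made-up per-dimension scores for illustration.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example), 3))
```

Under this assumption, a run in which every dimension scores 0.000, as in the matrix above, aggregates to 0.0 regardless of the weights.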
Recent Runs
41857883 (Dec. 17, 2025, 12:02 a.m.)
08016182 (Dec. 16, 2025, 12:03 a.m.)
32737495 (Dec. 15, 2025, 12:02 a.m.)
37745187 (Dec. 14, 2025, 12:02 a.m.)
34270178 (Dec. 13, 2025, 12:02 a.m.)
00946665 (Dec. 12, 2025, 12:03 a.m.)
49309975 (Dec. 11, 2025, 12:02 a.m.)
38145337 (Dec. 10, 2025, 12:02 a.m.)
58576009 (Dec. 9, 2025, 12:02 a.m.)
41161911 (Dec. 8, 2025, 12:02 a.m.)
Latency Overview (This Suite)