Teresa Alvarez
urban-life-society-bank-receptionist-characters-claudette-colvin
v2.0
Ethical
Backstory: Teresa Alvarez grew up in a multicultural city neighborhood, effortlessly switching between English and Spanish to assist neighbors. After earning an associate degree in business administration, she became the welcoming face of a busy downtown bank. Teresa remembers clients by name, offers empathetic guidance, and volunteers on weekends to teach financial literacy.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| greet-new-client (Greeting a first-time visitor) | 0.661 | 0.635 | 0.665 | 0.533 | 0.000 (Error) | 0.697 | 0.585 | 0.000 (Error) | 0.000 (Error) | 0.710 | 0.563 | 0.000 | 0.680 |
| recognize-returning-client (Remembering a returning client) | 0.607 | 0.908 | 0.862 | 0.666 | 0.000 (Error) | 0.891 | 0.856 | 0.742 | 0.000 (Error) | 0.849 | 0.747 | 0.776 | 0.000 |
| monthly-newsletter (Drafting the customer newsletter) | 0.311 | 0.316 | 0.840 | 0.000 | 0.000 (Error) | 0.000 (Error) | 0.527 | 0.000 (Error) | 0.000 (Error) | 0.458 | 0.350 | 0.348 | 0.660 |
| community-workshop-outline (Financial literacy workshop plan) | 0.425 | 0.580 | 0.211 | 0.239 | 0.000 | 0.375 | 0.482 | 0.000 (Error) | 0.000 (Error) | 0.023 | 0.244 | 0.236 | 0.000 |
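Because errored runs are reported as 0.000, a plain average over a row mixes failures with genuinely low scores. The following is a minimal sketch of one way to summarize the matrix, assuming that cells marked Error should be excluded and that an unmarked 0.000 is a real score; the names `SCORES`, `scene_mean`, and `error_count` are illustrative and not part of the dashboard.

```python
from typing import Optional

# Scores copied from the matrix, in column order; None marks a cell flagged "Error".
SCORES: dict[str, list[Optional[float]]] = {
    "greet-new-client": [0.661, 0.635, 0.665, 0.533, None, 0.697, 0.585,
                         None, None, 0.710, 0.563, 0.000, 0.680],
    "recognize-returning-client": [0.607, 0.908, 0.862, 0.666, None, 0.891, 0.856,
                                   0.742, None, 0.849, 0.747, 0.776, 0.000],
    "monthly-newsletter": [0.311, 0.316, 0.840, 0.000, None, None, 0.527,
                           None, None, 0.458, 0.350, 0.348, 0.660],
    "community-workshop-outline": [0.425, 0.580, 0.211, 0.239, 0.000, 0.375, 0.482,
                                   None, None, 0.023, 0.244, 0.236, 0.000],
}


def error_count(cells: list[Optional[float]]) -> int:
    """Number of models whose run errored for this scene."""
    return sum(1 for c in cells if c is None)


def scene_mean(cells: list[Optional[float]]) -> float:
    """Mean score over the runs that did not error (assumed policy)."""
    valid = [c for c in cells if c is not None]
    if not valid:
        raise ValueError("no successful runs for this scene")
    return sum(valid) / len(valid)


for scene_id, cells in SCORES.items():
    print(f"{scene_id}: mean={scene_mean(cells):.3f}, errors={error_count(cells)}")
```

Whether unmarked zeros should also be treated as failures is a judgment call that this sketch leaves to the reader.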
Test Scenes 4
Scene Order: 0
Greeting a first-time visitor
ID: greet-new-client
🎯 Goal: Politely welcome the visitor, use bilingual phrasing where helpful, and offer next steps for opening an account in under 120 words.
📨 Input Events:
- chat_msg from visitor:john_smith: "Hi, I'd like to open a checking account."
Ready for Testing
Scene Order: 1
Remembering a returning client
ID: recognize-returning-client
🎯 Goal: Address Maria by name, recall her pending debit card, and guide her to the correct desk within 100 words.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'tags': ['client'], 'content': 'Maria Luna applied for a new debit card two days ago; card pickup expected today.', 'importance': 4}
📨 Input Events:
- chat_msg from visitor:maria_luna: "Hi Teresa, I'm back to pick up my new debit card."
Ready for Testing
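The pre-loaded memory in this scene is displayed as a Python dict literal. The sketch below shows one way such a record could be read back safely with `ast.literal_eval`, assuming the four keys shown (`kind`, `tags`, `content`, `importance`) are required. The helper `parse_memory` is hypothetical; the format the test harness actually stores is not shown on this page.

```python
import ast

# The memory record exactly as displayed for the recognize-returning-client scene.
RAW_MEMORY = (
    "{'kind': 'fact', 'tags': ['client'], "
    "'content': 'Maria Luna applied for a new debit card two days ago; "
    "card pickup expected today.', 'importance': 4}"
)

REQUIRED_KEYS = {"kind", "tags", "content", "importance"}  # assumed schema


def parse_memory(text: str) -> dict:
    """Parse a dict-literal memory record without executing arbitrary code."""
    record = ast.literal_eval(text)
    if not isinstance(record, dict):
        raise ValueError("memory record must be a dict literal")
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"memory record is missing keys: {sorted(missing)}")
    return record


memory = parse_memory(RAW_MEMORY)
print(memory["kind"], memory["tags"], memory["importance"])
```

`ast.literal_eval` accepts only literals, so it is a safer choice than `eval` for text copied from a page like this one.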
Scene Order: 2
Drafting the customer newsletter
ID: monthly-newsletter
🎯 Goal: Write a friendly, bilingual (English with brief Spanish sub-headings) newsletter of at least 200 words introducing three new savings features and inviting questions.
📨 Input Events:
- chat_msg from manager: "Please draft the upcoming customer newsletter highlighting our three new savings features."
Ready for Testing
Scene Order: 3
Financial literacy workshop plan
ID: community-workshop-outline
🎯 Goal: Provide a clear five-section outline (250–350 words) for Sunday’s class, with each section title in English followed by a Spanish translation in parentheses.
📨 Input Events:
- chat_msg from community_center_lead: "Can you send me your outline for Sunday's financial literacy class?"
Ready for Testing
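Each scene goal above includes a word-count constraint. As a rough illustration, the limits could be checked as follows. This is a sketch under stated assumptions: words are counted by splitting on whitespace, "under 120" is read as at most 119, and "within 100" as at most 100. The names `WORD_LIMITS`, `word_count`, and `within_limits` are hypothetical, and the evaluator's real counting rules are not documented here.

```python
from typing import Optional

# (minimum, maximum) word counts per scene ID; None means unbounded on that side.
WORD_LIMITS: dict[str, tuple[Optional[int], Optional[int]]] = {
    "greet-new-client": (None, 119),            # "in under 120 words"
    "recognize-returning-client": (None, 100),  # "within 100 words"
    "monthly-newsletter": (200, None),          # "at least 200 words"
    "community-workshop-outline": (250, 350),   # "250-350 words"
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def within_limits(scene_id: str, text: str) -> bool:
    """Return True if the text satisfies the scene's word-count bounds."""
    minimum, maximum = WORD_LIMITS[scene_id]
    n = word_count(text)
    if minimum is not None and n < minimum:
        return False
    if maximum is not None and n > maximum:
        return False
    return True


reply = "¡Bienvenido! Welcome to the bank. I can help you open a checking account today."
print(word_count(reply), within_limits("greet-new-client", reply))
```

A length check like this covers only one part of each goal; bilingual phrasing, tone, and section structure would need separate checks.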
Latency by Model (This Suite)
Fastest
- neversleep/noromaid-20b: 7430 ms (p95 38674 ms, avg 16528 ms, N 7)
- [email protected]/Qw…: 11322 ms (p95 12883 ms, avg 10909 ms, N 4)
- google/gemini-2.5-flash: 18330 ms (p95 37421 ms, avg 22980 ms, N 8)
- google/gemma-3-12b-it: 19813 ms (p95 21381 ms, avg 19371 ms, N 7)
- qwen/qwen3-14b: 20603 ms (p95 55705 ms, avg 27987 ms, N 7)
Slowest
- microsoft/phi-3-medium-…: 107925 ms (p95 132581 ms, avg 104843 ms, N 6)
- microsoft/phi-3.5-mini-…: 38214 ms (p95 79188 ms, avg 46973 ms, N 7)
- qwen/qwen3-8b: 34584 ms (p95 411948 ms, avg 124610 ms, N 5)
- deepseek/deepseek-r1-di…: 33950 ms (p95 53599 ms, avg 37509 ms, N 8)
- [email protected]/Qw…: 33665 ms (p95 130010 ms, avg 58557 ms, N 4)
Per-scene duration for this suite.
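For readers who want to reproduce figures of this kind from raw durations, here is a minimal sketch of computing p95, average, and N. It assumes the nearest-rank percentile method; the page does not say which method the dashboard uses, and the sample durations below are invented for illustration.

```python
import math


def latency_summary(durations_ms: list[float]) -> dict[str, float]:
    """Summarize per-scene durations as p95 (nearest rank), mean, and sample count."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }


# Hypothetical durations in milliseconds, not taken from the suite.
sample = [100.0, 200.0, 300.0, 400.0]
print(latency_summary(sample))
```

With sample sizes as small as the N values listed above, a p95 under this method is simply the slowest run, so a single outlier can dominate it.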
Evaluation Schema
Enhanced Framework
Version v2 (ACTIVE), 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
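Only the three highest weights are listed, so the full scoring formula cannot be reconstructed from this page. As an illustration of how weighted dimensions are commonly combined, the sketch below computes a weighted mean normalized over whichever dimensions are supplied. The function `weighted_score` and the example per-dimension scores are hypothetical, and the framework's actual aggregation may differ.

```python
# The three weights shown on the page; the remaining dimensions are not listed.
TOP_WEIGHTS: dict[str, float] = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dim_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-dimension scores, normalized by the weights actually used."""
    if not dim_scores:
        raise ValueError("no dimension scores supplied")
    unknown = set(dim_scores) - set(weights)
    if unknown:
        raise KeyError(f"no weight defined for: {sorted(unknown)}")
    total_weight = sum(weights[d] for d in dim_scores)
    return sum(weights[d] * s for d, s in dim_scores.items()) / total_weight


# Invented per-dimension scores for one response.
example = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Normalizing by the weights in use keeps the result in the same range as the inputs even when only a subset of dimensions is available.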
Recent Runs
- 47511216: Dec. 17, 2025, midnight
- 52986233: Dec. 16, 2025, midnight
- 44322839: Dec. 15, 2025, midnight
- 46322509: Dec. 14, 2025, midnight
- 44098203: Dec. 13, 2025, midnight
- 53233153: Dec. 12, 2025, midnight
- 46676420: Dec. 11, 2025, midnight
- 45483340: Dec. 10, 2025, midnight
- 50981562: Dec. 9, 2025, midnight
- 45087141: Dec. 8, 2025, midnight