Victor Alvarez
urban-life-society-bartender-characters-ada-lovelace
v2.0
Ethical
Backstory: Victor Alvarez grew up in a multicultural city block and found his calling behind the bar after studying sociology. He treats the bar as a living study of human connection—listening closely, remembering details, and crafting cocktails that echo patrons’ backgrounds. Outside work he organizes community open-mic nights to spotlight local talent and strengthen neighborhood ties.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| first-greeting (First-time Patron Greeting) | 0.583 | 0.626 | 0.670 | 0.348 | 0.000 (Error) | 0.593 | 0.667 | 0.000 (Error) | 0.000 (Error) | 0.350 | 0.695 | 0.787 | 0.549 |
| heartbreak-confession (Late-Night Heartbreak Talk) | 0.525 | 0.630 | 0.609 | 0.520 | 0.000 (Error) | 0.444 | 0.665 | 0.437 | 0.000 (Error) | 0.793 | 0.000 | 0.764 | 0.795 |
| observant-check (Noticing an Anxious Patron) | 0.894 | 0.718 | 0.626 | 0.787 | 0.000 (Error) | 0.733 | 0.851 | 0.000 (Error) | 0.000 (Error) | 0.702 | 0.715 | 0.852 | 0.823 |
| open-mic-night (Announcing the Open-Mic Lineup) | 0.468 | 0.558 | 0.625 | 0.684 | 0.000 | 0.280 | 0.480 | 0.813 | 0.000 (Error) | 0.000 | 0.401 | 0.714 | 0.735 |
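The matrix is easier to compare once it is collapsed to one number per model. The following is a minimal sketch of that, not the dashboard's own aggregation: it assumes that cells flagged "Error" should be skipped rather than counted as zero, while a plain 0.000 is kept as a real score. The names `MODELS`, `SCORES`, `column_means`, and `best_column` are hypothetical helpers introduced for illustration; the numbers are copied from the matrix.

```python
from statistics import mean

# Column labels copied verbatim from the matrix header; truncated names stay truncated.
MODELS = [
    "deepseek/deepseek-r…", "google/gemini-2.5-f…", "google/gemma-3-12b-…",
    "meta-llama/llama-3.…", "microsoft/phi-3-med…", "microsoft/phi-3.5-m…",
    "mistralai/mistral-7…", "neversleep/noromaid…", "[email protected]…",
    "[email protected]…", "qwen/qwen-2.5-7b-in…", "qwen/qwen3-14b", "qwen/qwen3-8b",
]

# None marks a cell flagged "Error"; a plain 0.000 is kept as a real score.
SCORES = {
    "first-greeting": [0.583, 0.626, 0.670, 0.348, None, 0.593, 0.667,
                       None, None, 0.350, 0.695, 0.787, 0.549],
    "heartbreak-confession": [0.525, 0.630, 0.609, 0.520, None, 0.444, 0.665,
                              0.437, None, 0.793, 0.000, 0.764, 0.795],
    "observant-check": [0.894, 0.718, 0.626, 0.787, None, 0.733, 0.851,
                        None, None, 0.702, 0.715, 0.852, 0.823],
    "open-mic-night": [0.468, 0.558, 0.625, 0.684, 0.000, 0.280, 0.480,
                       0.813, None, 0.000, 0.401, 0.714, 0.735],
}


def column_means(scores):
    """Mean score per model column, skipping Error cells (an assumption)."""
    means = []
    for col in zip(*scores.values()):
        valid = [s for s in col if s is not None]
        means.append(mean(valid) if valid else None)
    return means


def best_column(scores, scene_id):
    """Index of the highest-scoring column for one scene, ignoring Error cells."""
    valid = [(s, i) for i, s in enumerate(scores[scene_id]) if s is not None]
    return max(valid)[1]


if __name__ == "__main__":
    for name, m in zip(MODELS, column_means(SCORES)):
        print(name, "n/a" if m is None else f"{m:.3f}")
```

Counting Error cells as zero instead would lower the averages of the models that failed on some scenes, so the choice should match how the suite itself treats failed runs.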
Test Scenes (4)

Scene Order: 0
First-time Patron Greeting
ID: first-greeting
🎯 Goal: Greet the newcomer warmly, ask a follow-up about taste, and recommend a signature drink with a brief explanation.
📨 Input Events: chat_msg • viewer:patron_001 • "Hey, first time here. What do you recommend?"
Ready for Testing

Scene Order: 1
Late-Night Heartbreak Talk
ID: heartbreak-confession
🎯 Goal: Offer an empathetic reply of at least 150 words that validates feelings, gives gentle advice, and keeps bartender persona.
📨 Input Events: chat_msg • viewer:patron_002 • "It's been a rough week. I just got dumped and don't know who to talk to."
Ready for Testing

Scene Order: 2
Noticing an Anxious Patron
ID: observant-check
🎯 Goal: Proactively engage the nervous patron, reference observed behavior, and offer a calm conversational opening without prying.
📨 Input Events: world_event • world • "A guest at the end of the bar keeps fiddling with their wedding ring and glancing toward the entrance."
Ready for Testing

Scene Order: 3
Announcing the Open-Mic Lineup
ID: open-mic-night
🎯 Goal: Deliver an enthusiastic 120–200 word announcement listing at least three acts and encouraging a supportive atmosphere.
📨 Input Events: chat_msg • viewer:regular_003 • "Victor, who's performing at tonight's open mic?"
Ready for Testing
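Two of the scene goals state explicit length limits: at least 150 words for heartbreak-confession and 120–200 words for open-mic-night. The sketch below shows one way such limits could be checked before a reply is scored. It is an assumption, not the suite's evaluator: how the framework actually counts words is not documented here, so splitting on whitespace is a stand-in, and `WORD_LIMITS`, `word_count`, and `meets_length_goal` are hypothetical names.

```python
# (minimum, maximum) word counts taken from the scene goals; None means no upper limit.
WORD_LIMITS = {
    "heartbreak-confession": (150, None),
    "open-mic-night": (120, 200),
}


def word_count(text):
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def meets_length_goal(scene_id, reply):
    """True if the reply satisfies the scene's stated word limits.

    Scenes without a stated limit always pass.
    """
    minimum, maximum = WORD_LIMITS.get(scene_id, (0, None))
    n = word_count(reply)
    return n >= minimum and (maximum is None or n <= maximum)
```

A check like this only covers the measurable part of each goal; tone, persona, and the "at least three acts" requirement would still need separate evaluation.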
Latency by Model (This Suite)
Fastest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 7595 ms | 34812 ms | 15155 ms | 4 |
| [email protected]/Qw… | 10952 ms | 11912 ms | 10770 ms | 4 |
| meta-llama/llama-3.1-8b… | 20981 ms | 27996 ms | 21687 ms | 7 |
| qwen/qwen-2.5-7b-instru… | 21882 ms | 137940 ms | 52504 ms | 8 |
| google/gemini-2.5-flash | 23902 ms | 35433 ms | 25993 ms | 7 |

Slowest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| microsoft/phi-3-medium-… | 160843 ms | 196833 ms | 160056 ms | 8 |
| deepseek/deepseek-r1-di… | 33468 ms | 47551 ms | 33636 ms | 7 |
| microsoft/phi-3.5-mini-… | 32135 ms | 35437 ms | 30465 ms | 7 |
| qwen/qwen3-14b | 28962 ms | 38071 ms | 28863 ms | 8 |
| qwen/qwen3-8b | 26589 ms | 35323 ms | 27056 ms | 8 |
Per-scene duration for this suite.
Completion Progress: 100% (4 of 4 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
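The page lists dimension weights but does not say how they are combined into a scene score. As a hedged illustration only, the sketch below assumes a weighted average: each dimension score is multiplied by its weight and the total is divided by the sum of the weights used. Only the three weights shown above are included (they sum to 0.473, so the full schema evidently has more dimensions), and the per-dimension scores passed in are hypothetical inputs, not results from this suite.

```python
# Weights copied from the "Top Weighted Dimensions" list; the remaining
# dimensions of the schema are not shown on this page and are omitted.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average over the dimensions present in `weights`.

    Assumption: scores lie in [0, 1] and are combined linearly, then
    normalized by the total weight so the result stays in [0, 1].
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(w * dimension_scores[name] for name, w in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores, for illustration only.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example), 3))
```

Normalizing by the total weight keeps the sketch usable with a partial weight list; with the full schema, whose weights presumably sum to 1, the division would have no effect.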
Recent Runs
- 47743212 • Dec. 17, 2025, midnight
- 53304029 • Dec. 16, 2025, midnight
- 44526912 • Dec. 15, 2025, midnight
- 46482977 • Dec. 14, 2025, midnight
- 44318893 • Dec. 13, 2025, midnight
- 53443536 • Dec. 12, 2025, midnight
- 46892138 • Dec. 11, 2025, midnight
- 45713273 • Dec. 10, 2025, midnight
- 51245520 • Dec. 9, 2025, midnight
- 45324887 • Dec. 8, 2025, midnight