Carmen Alvarez
road-movie-genre-movie-characters-frida-kahlo
v2.0
Ethical
Backstory: Rebellious and fiercely artistic, Carmen dropped out of college when tuition bills outweighed inspiration. She now roams the highways in a dented sedan, spray-painting forgotten walls and sketching hurried scenes in worn notebooks while hunting for a place that finally feels like home.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| quick-encounter (Roadside curiosity) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| mom-text (Pressure from home) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| supply-superchat (Supporter’s tip) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| rest-stop-playlist (Music exchange) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| motel-journal (Midnight journal) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| voice-note (Voice note to future self) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes (6)
Scene Order: 0
Roadside curiosity
ID: quick-encounter
🎯 Goal: Greet a stranger and briefly explain the meaning behind her current mural in under 70 words.
📨 Input Events:
chat_msg
viewer:stranger
"Whoa, that mural is wild! What’s the story behind it?"
Ready for Testing
Scene Order: 1
Pressure from home
ID: mom-text
🎯 Goal: Respond firmly but respectfully to her mother's guilt-laden message without apologizing for her choices.
📨 Input Events:
chat_msg
viewer:mom
"Carmen, you can’t keep running forever. Come home and finish school before it’s too late."
Ready for Testing
Scene Order: 2
Supporter’s tip
ID: supply-superchat
🎯 Goal: Thank the donor and say how the money will be used for art supplies in 1–2 sentences.
📨 Input Events:
superchat
viewer:@artfan88
StreamNow
$25
"Love your work—buy more paint!"
Ready for Testing
Scene Order: 3
Music exchange
ID: rest-stop-playlist
🎯 Goal: Share a 4-song road playlist that matches her mood, giving a one-line reason for each pick.
📨 Input Events:
chat_msg
viewer:old_friend
"Need tunes for my own drive—what’s on your playlist these days?"
Ready for Testing
Scene Order: 4
Midnight journal
ID: motel-journal
🎯 Goal: Write a three-paragraph journal entry (minimum 150 words) reflecting on debt, freedom, and the empty motel parking lot.
📨 Input Events:
world_event
system
"It’s 12:15 a.m. Carmen sits alone outside a rundown desert motel, journal open on her lap."
Ready for Testing
Scene Order: 5
Voice note to future self
ID: voice-note
🎯 Goal: Record a heartfelt monologue approximately 90 seconds long, telling her future self what she hopes this journey will teach her.
📨 Input Events:
chat_msg
viewer:@road_diaries
"If you could send a voice note to future Carmen, what would you say?"
Ready for Testing
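Each scene above pairs an ID, a goal, and one or more input events, and several goals carry a length limit. The sketch below shows one way such a scene could be represented and how the "under 70 words" limit of quick-encounter might be checked. The names `Scene`, `InputEvent`, and `within_word_limit` are hypothetical and are not taken from the suite itself; counting whitespace-separated tokens is an assumption about how words would be measured.

```python
from dataclasses import dataclass, field


@dataclass
class InputEvent:
    kind: str    # e.g. "chat_msg", "superchat", "world_event"
    sender: str  # e.g. "viewer:stranger", "system"
    text: str


@dataclass
class Scene:
    scene_id: str
    title: str
    goal: str
    events: list[InputEvent] = field(default_factory=list)


def word_count(reply: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(reply.split())


def within_word_limit(reply: str, limit: int) -> bool:
    """Return True when the reply is strictly under `limit` words."""
    return word_count(reply) < limit


# Scene data copied from the listing above; the structure is illustrative.
quick_encounter = Scene(
    scene_id="quick-encounter",
    title="Roadside curiosity",
    goal="Greet a stranger and briefly explain the meaning behind "
         "her current mural in under 70 words.",
    events=[
        InputEvent(
            kind="chat_msg",
            sender="viewer:stranger",
            text="Whoa, that mural is wild! What’s the story behind it?",
        )
    ],
)
```

A reply could then be checked with `within_word_limit(reply, 70)`. Goals such as "1–2 sentences" or "approximately 90 seconds" would need their own checks, which this sketch does not attempt.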
Latency by Model (This Suite)
Fastest
- meta-llama/llama-3.1-8b… 96 ms (p95 149 ms • avg 109 ms • N 17)
- qwen/qwen-2.5-7b-instru… 96 ms (p95 188 ms • avg 115 ms • N 16)
- mistralai/mistral-7b-in… 98 ms (p95 210 ms • avg 118 ms • N 15)
- qwen/qwen3-8b 115 ms (p95 167 ms • avg 123 ms • N 17)
- qwen/qwen3-14b 126 ms (p95 243 ms • avg 156 ms • N 15)
Slowest
- [email protected]/Qw… 6229 ms (p95 17581 ms • avg 8798 ms • N 6)
- [email protected]/Qw… 4964 ms (p95 7179 ms • avg 5303 ms • N 6)
- qwen/qwen3-14b 126 ms (p95 243 ms • avg 156 ms • N 15)
- qwen/qwen3-8b 115 ms (p95 167 ms • avg 123 ms • N 17)
- mistralai/mistral-7b-in… 98 ms (p95 210 ms • avg 118 ms • N 15)
Per-scene duration for this suite.
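For readers unfamiliar with the p95, avg, and N columns, the sketch below shows how such a summary could be derived from a list of per-scene durations. It uses the nearest-rank percentile method, which is an assumption: the page does not say which method the dashboard applies. The function name `summarize_latency` and the sample values are made up for illustration.

```python
from statistics import mean


def summarize_latency(durations_ms):
    """Return p95 (nearest-rank), average, and sample count for durations in ms."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank: smallest rank r with r >= 0.95 * n, in integer arithmetic.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


# Illustrative values only, not measurements from this suite.
summary = summarize_latency([100, 110, 120, 130, 400])
```

With small N, as in the rows with N 6, the nearest-rank p95 is simply the largest sample, so a single slow scene can dominate that column.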
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
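The weights above suggest that a scene score combines per-dimension scores in proportion to their weights. The sketch below shows one plausible aggregation, a weighted average normalized over whichever dimensions are present. This is an assumption, since the page lists only the top three weights and does not describe the formula; `weighted_score` and the example scores are hypothetical.

```python
# Weights copied from the list above; the remaining dimensions are not shown on the page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average of the dimensions that have both a score and a weight."""
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no scored dimension has a weight")
    return sum(dimension_scores[name] * w for name, w in used.items()) / total_weight


# Illustrative per-dimension scores, not results from this suite.
example = weighted_score({
    "Character Authenticity": 1.0,
    "Plan Validity": 1.0,
    "Contextual Intelligence": 1.0,
})
```

Normalizing by the sum of the weights in use keeps the result in the same range as the inputs even though the three listed weights sum to less than 1.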
Recent Runs
25677634
Dec. 17, 2025, 12:02 a.m.
49227872
Dec. 16, 2025, 12:02 a.m.
17255344
Dec. 15, 2025, 12:02 a.m.
20950634
Dec. 14, 2025, 12:02 a.m.
18565930
Dec. 13, 2025, 12:02 a.m.
41078174
Dec. 12, 2025, 12:02 a.m.
32428112
Dec. 11, 2025, 12:02 a.m.
21946884
Dec. 10, 2025, 12:02 a.m.
39659990
Dec. 9, 2025, 12:02 a.m.
25587049
Dec. 8, 2025, 12:02 a.m.