Carmen Alvarez

road-movie-genre-movie-characters-frida-kahlo v2.0 Ethical
Backstory: Rebellious and fiercely artistic, Carmen dropped out of college when tuition bills outweighed inspiration. She now roams the highways in a dented sedan, spray-painting forgotten walls and sketching hurried scenes in worn notebooks while hunting for a place that finally feels like home.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
quick-encounter (Roadside curiosity) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
mom-text (Pressure from home) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
supply-superchat (Supporter’s tip) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
rest-stop-playlist (Music exchange) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
motel-journal (Midnight journal) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
voice-note (Voice note to future self) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
Test Scenes 6
Scene Order 0: Roadside curiosity
ID: quick-encounter
🎯 Goal: Greet a stranger and briefly explain the meaning behind her current mural in under 70 words.
📨 Input Events:
chat_msg viewer:stranger "Whoa, that mural is wild! What’s the story behind it?"
Ready for Testing
Scene Order 1: Pressure from home
ID: mom-text
🎯 Goal: Respond firmly but respectfully to her mother's guilt-laden message without apologizing for her choices.
📨 Input Events:
chat_msg viewer:mom "Carmen, you can’t keep running forever. Come home and finish school before it’s too late."
Ready for Testing
Scene Order 2: Supporter’s tip
ID: supply-superchat
🎯 Goal: Thank the donor and say how the money will be used for art supplies in 1–2 sentences.
📨 Input Events:
superchat viewer:@artfan88 StreamNow $25 "Love your work—buy more paint!"
Ready for Testing
Scene Order 3: Music exchange
ID: rest-stop-playlist
🎯 Goal: Share a 4-song road playlist that matches her mood, giving a one-line reason for each pick.
📨 Input Events:
chat_msg viewer:old_friend "Need tunes for my own drive—what’s on your playlist these days?"
Ready for Testing
Scene Order 4: Midnight journal
ID: motel-journal
🎯 Goal: Write a three-paragraph journal entry (minimum 150 words) reflecting on debt, freedom, and the empty motel parking lot.
📨 Input Events:
world_event system "It’s 12:15 a.m. Carmen sits alone outside a rundown desert motel, journal open on her lap."
Ready for Testing
Scene Order 5: Voice note to future self
ID: voice-note
🎯 Goal: Record a heartfelt monologue approximately 90 seconds long, telling her future self what she hopes this journey will teach her.
📨 Input Events:
chat_msg viewer:@road_diaries "If you could send a voice note to future Carmen, what would you say?"
Ready for Testing
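Several of the scene goals above carry measurable length constraints (under 70 words, 1–2 sentences, three paragraphs with at least 150 words, roughly 90 seconds of speech). The sketch below shows one way such constraints could be checked on a model reply. It is not the suite's actual scorer: the function names, the naive sentence splitter, the 150 words-per-minute speaking rate, and the ±15 second tolerance are all assumptions made for illustration.

```python
import re


def word_count(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def sentence_count(text: str) -> int:
    """Naive count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


def paragraph_count(text: str) -> int:
    """Paragraphs are blocks separated by at least one blank line."""
    return len([p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()])


def estimated_seconds(text: str, words_per_minute: int = 150) -> float:
    """Rough speaking time; 150 wpm is an assumed rate, not from the suite."""
    return word_count(text) / words_per_minute * 60


def check_scene(scene_id: str, reply: str) -> bool:
    """Hypothetical length checks keyed by the scene IDs listed above."""
    if scene_id == "quick-encounter":
        return word_count(reply) < 70
    if scene_id == "supply-superchat":
        return 1 <= sentence_count(reply) <= 2
    if scene_id == "motel-journal":
        return paragraph_count(reply) == 3 and word_count(reply) >= 150
    if scene_id == "voice-note":
        # "Approximately 90 seconds" is read here as 75 to 105 seconds (assumed).
        return 75 <= estimated_seconds(reply) <= 105
    raise ValueError(f"no length rule sketched for scene {scene_id!r}")
```

Scenes such as mom-text and rest-stop-playlist are left out on purpose: their goals concern tone or content rather than length, so a length check alone would not say much about them.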
Latency by Model (This Suite)
Fastest
  • meta-llama/llama-3.1-8b… 96 ms (p95 149 ms • avg 109 ms • N 17)
  • qwen/qwen-2.5-7b-instru… 96 ms (p95 188 ms • avg 115 ms • N 16)
  • mistralai/mistral-7b-in… 98 ms (p95 210 ms • avg 118 ms • N 15)
  • qwen/qwen3-8b 115 ms (p95 167 ms • avg 123 ms • N 17)
  • qwen/qwen3-14b 126 ms (p95 243 ms • avg 156 ms • N 15)
Slowest
  • [email protected]/Qw… 6229 ms (p95 17581 ms • avg 8798 ms • N 6)
  • [email protected]/Qw… 4964 ms (p95 7179 ms • avg 5303 ms • N 6)
  • qwen/qwen3-14b 126 ms (p95 243 ms • avg 156 ms • N 15)
  • qwen/qwen3-8b 115 ms (p95 167 ms • avg 123 ms • N 17)
  • mistralai/mistral-7b-in… 98 ms (p95 210 ms • avg 118 ms • N 15)
Per-scene duration for this suite.
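For readers unfamiliar with the p95, avg, and N figures in the latency lists, the following sketch shows how such a summary could be derived from raw per-request durations. It assumes the nearest-rank definition of the 95th percentile; the dashboard does not say which percentile method it uses, and the helper names are hypothetical.

```python
from statistics import mean


def p95_nearest_rank(samples_ms: list[float]) -> float:
    """95th percentile by the nearest-rank method (an assumed definition)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Return the three figures shown per model: p95, avg, and sample count."""
    return {
        "p95": p95_nearest_rank(samples_ms),
        "avg": mean(samples_ms),
        "n": len(samples_ms),
    }
```

With small sample counts such as N = 6, the nearest-rank p95 is simply the slowest observation, which may explain why p95 can sit far above the average for the slowest models.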
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
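The dimension weights suggest that a scene score is some weighted combination of per-dimension scores. The page does not show the formula, so the sketch below is only one plausible reading: a weighted average normalized by the weights of the dimensions that were actually scored. The function name and the assumption that per-dimension scores lie in [0, 1] are hypothetical, and only the three weights listed above are taken from the source.

```python
# Weights copied from the "Top Weighted Dimensions" list; the remaining
# dimensions of the schema are not shown in this section and are omitted.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average over dimensions that have both a score and a weight.

    Assumed formula: sum(w_i * s_i) / sum(w_i). Returns 0.0 when no scored
    dimension has a weight, so an empty run does not raise.
    """
    shared = [name for name in scores if name in weights]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        return 0.0
    return sum(weights[name] * scores[name] for name in shared) / total_weight
```

Normalizing by the weights present keeps the result in the same range as the inputs even when only a subset of dimensions is available, as is the case here.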
Recent Runs
25677634 • Dec. 17, 2025, 12:02 a.m.
49227872 • Dec. 16, 2025, 12:02 a.m.
17255344 • Dec. 15, 2025, 12:02 a.m.
20950634 • Dec. 14, 2025, 12:02 a.m.
18565930 • Dec. 13, 2025, 12:02 a.m.
41078174 • Dec. 12, 2025, 12:02 a.m.
32428112 • Dec. 11, 2025, 12:02 a.m.
21946884 • Dec. 10, 2025, 12:02 a.m.
39659990 • Dec. 9, 2025, 12:02 a.m.
25587049 • Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)