Marcus Delgado
food-hospitality-culinary-arts-food-critic-characters-james-beard
v2.0
Ethical
Backstory: Raised in a lively border city, Marcus spent childhood weekends sampling tacos, pupusas, and elotes from curbside griddles. His palate grew fearless and his slang-laced banter drew friends; now he runs a hit vlog spotlighting affordable street eats and the personal tales of the cooks behind each cart.
17% Complete
1/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| intro-trust: Why trust Marcus? | 0.854 | 0.879 | — | 0.000 (Error) | 0.000 (Error) | 0.843 | 0.899 | 0.898 |
| festival-plan: Border Bites Festival plan | 0.397 | 0.725 | — | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.337 | 0.456 |
| yakitori-episode: Long-form vlog script: Yakitori night | 0.365 | 0.561 | — | 0.000 (Error) | 0.000 (Error) | 0.425 | 0.430 | 0.746 |
| quick-taco-tip: Superchat taco request | 0.482 | 0.766 | — | 0.000 (Error) | 0.000 (Error) | 0.579 | 0.730 | 0.686 |
| week-recap-blog: Long-form blog recap | 0.385 | 0.550 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.432 | 0.380 |
| name-check: Memory test: vendor name | 0.863 | 0.869 | — | 0.000 (Error) | 0.000 (Error) | 0.634 | 0.619 | 0.754 |
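The per-scene scores in the matrix above can be condensed into one mean per model. The following is a minimal sketch, not the dashboard's own aggregation: it assumes a plain unweighted mean over the six scenes, counts a 0.000 without an "Error" flag as a real score, and leaves out the three columns that contain only "Error" or "—" cells. The names `SCORES` and `mean_scores` are hypothetical.

```python
from statistics import mean

# Scores transcribed from the matrix, in scene order:
# intro-trust, festival-plan, yakitori-episode, quick-taco-tip,
# week-recap-blog, name-check.
# Keys are the truncated column headers exactly as displayed.
# Assumption: columns with only "Error" or "—" cells are omitted.
SCORES = {
    "meta-llama/llama-3.…": [0.854, 0.397, 0.365, 0.482, 0.385, 0.863],
    "mistralai/mistral-7…": [0.879, 0.725, 0.561, 0.766, 0.550, 0.869],
    "qwen/qwen-2.5-7b-in…": [0.843, 0.000, 0.425, 0.579, 0.000, 0.634],
    "qwen/qwen3-14b": [0.899, 0.337, 0.430, 0.730, 0.432, 0.619],
    "qwen/qwen3-8b": [0.898, 0.456, 0.746, 0.686, 0.380, 0.754],
}


def mean_scores(scores):
    """Return the unweighted mean score per model, skipping None cells."""
    return {
        model: mean([v for v in values if v is not None])
        for model, values in scores.items()
    }


def ranking(scores):
    """Return model names ordered from highest to lowest mean score."""
    means = mean_scores(scores)
    return sorted(means, key=means.get, reverse=True)


if __name__ == "__main__":
    for model in ranking(SCORES):
        print(f"{model}: {mean_scores(SCORES)[model]:.3f}")
```

Under these assumptions the mistralai column has the highest mean. Excluding the unflagged 0.000 cells instead would change the qwen-2.5-7b figure, so that choice should be checked against the per-cell details.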
Test Scenes 6
Scene Order: 0
Why trust Marcus?
ID: intro-trust
🎯 Goal: Deliver a lively, slang-tinged intro that proves street-food credibility and passion for vendor stories.
📨 Input Events: chat_msg from viewer:curious_diner: "Who are you and why should we trust your taste buds?"
Ready for Testing
Scene Order: 1
Border Bites Festival plan
ID: festival-plan
🎯 Goal: Pick three carts Marcus will hit first, stating dishes and a quick plan that values affordability and adventure.
📨 Input Events: world_event from event_system: "The Border Bites Festival opens tomorrow with 40 food stalls."
Ready for Testing
Scene Order: 2
Long-form vlog script: Yakitori night
ID: yakitori-episode
🎯 Goal: Write a 200-word (minimum) vlog segment that narrates tasting street-style yakitori, weaves in the grill master’s backstory, and keeps Marcus’s colloquial vibe.
📨 Input Events: chat_msg from viewer:sakura_skewer: "Hey Marcus, do a full episode on that Japanese yakitori cart you teased!"
Ready for Testing
Scene Order: 3
Superchat taco request
ID: quick-taco-tip
🎯 Goal: Reply with one $5-friendly taco spot in L.A., including vendor name, location hint, and signature salsa.
📨 Input Events: superchat from viewer:foodie123 (YouTube, $5): "Got $5 and a craving for tacos in LA—hook me up!"
Ready for Testing
Scene Order: 4
Long-form blog recap
ID: week-recap-blog
🎯 Goal: Produce a 300-word (minimum) blog-style recap of a week touring border-town stalls, highlighting five dishes and the cooks’ personal stories while maintaining an upbeat, conversational tone.
📨 Input Events: chat_msg from viewer:blog_subscriber: "Can we get a written recap of your whole border-town week?"
Ready for Testing
Scene Order: 5
Memory test: vendor name
ID: name-check
🎯 Goal: Correctly recall the name of the L.A. taco vendor given earlier (in scene quick-taco-tip) and answer casually.
📨 Input Events: chat_msg from viewer:memory_test: "Wait, what was that taco vendor's name you mentioned for L.A.?"
Ready for Testing
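Two of the scene goals above set a minimum length: 200 words for yakitori-episode and 300 words for week-recap-blog. A minimal sketch of such a check follows. It assumes words are counted by splitting on whitespace, which may differ from how the evaluation framework counts them; `MIN_WORDS` and `meets_word_minimum` are hypothetical names.

```python
# Minimum word counts taken from the scene goals.
# Assumption: scenes without a stated minimum have no length requirement.
MIN_WORDS = {
    "yakitori-episode": 200,
    "week-recap-blog": 300,
}


def meets_word_minimum(scene_id, response):
    """Return True if the response has at least the scene's minimum word count.

    Words are counted by whitespace splitting (an assumption, not the
    framework's documented rule).
    """
    minimum = MIN_WORDS.get(scene_id, 0)
    return len(response.split()) >= minimum
```

A check like this only covers length. Whether the segment includes the grill master's backstory or five dishes would still need a separate judgment.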
Latency by Model (This Suite)
Fastest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| neversleep/noromaid-20b | 60 ms | 60 ms | 60 ms | 1 |
| [email protected]/Qw… | 7887 ms | 14211 ms | 9072 ms | 6 |
| qwen/qwen-2.5-7b-instru… | 18355 ms | 139523 ms | 40100 ms | 11 |
| qwen/qwen3-14b | 21061 ms | 28301 ms | 21977 ms | 11 |
| meta-llama/llama-3.1-8b… | 23305 ms | 26755 ms | 22663 ms | 10 |

Slowest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 40316 ms | 42076 ms | 40143 ms | 6 |
| mistralai/mistral-7b-in… | 30000 ms | 45031 ms | 31979 ms | 12 |
| qwen/qwen3-8b | 27592 ms | 31795 ms | 27160 ms | 12 |
| meta-llama/llama-3.1-8b… | 23305 ms | 26755 ms | 22663 ms | 10 |
| qwen/qwen3-14b | 21061 ms | 28301 ms | 21977 ms | 11 |
Per-scene duration for this suite.
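The p95, avg, and N figures reported per model can be derived from a list of per-scene durations. The sketch below is one way to do so, assuming a nearest-rank p95; the dashboard does not say which percentile method it uses, and `latency_summary` is a hypothetical name.

```python
def latency_summary(durations_ms):
    """Summarize per-scene durations (in ms) as p95, average, and sample count.

    Assumption: p95 uses the nearest-rank method, i.e. the value at rank
    ceil(0.95 * N) in the sorted list. Other tools interpolate instead.
    """
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Integer form of ceil(95 * n / 100), avoiding floating-point rounding.
    rank = (95 * n + 99) // 100
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }
```

With a single sample, p95 and avg both equal that sample, which matches the neversleep/noromaid-20b row (60 ms, 60 ms, N = 1).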
Completion Progress
17%
1 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
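The three weights listed above sum to 0.473, so the schema presumably has further dimensions that are not shown here. As a rough sketch of how such weights could combine per-dimension scores, the function below computes a weighted average renormalized over whichever dimensions are supplied. That renormalization is an assumption rather than the framework's documented formula, and `TOP_WEIGHTS` and `weighted_score` are hypothetical names.

```python
# Weights as displayed under "Top Weighted Dimensions".
# Assumption: only these three are known; the remaining weights are not shown.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average of per-dimension scores in [0, 1].

    Only dimensions present in both mappings are used, and the weights
    are renormalized to sum to 1 over those dimensions.
    """
    used = {d: w for d, w in weights.items() if d in dimension_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no weighted dimensions were scored")
    return sum(dimension_scores[d] * w for d, w in used.items()) / total
```

Because the weights are renormalized, a response scoring 1.0 on every supplied dimension receives 1.0 overall, regardless of how many dimensions are missing.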
Recent Runs
| Run | Date |
|---|---|
| 37877904 | Dec. 17, 2025, 12:01 a.m. |
| 53497243 | Dec. 16, 2025, 12:01 a.m. |
| 33348841 | Dec. 15, 2025, 12:01 a.m. |
| 34884347 | Dec. 14, 2025, 12:01 a.m. |
| 33833017 | Dec. 13, 2025, 12:01 a.m. |
| 46882273 | Dec. 12, 2025, 12:01 a.m. |
| 43077564 | Dec. 11, 2025, 12:01 a.m. |
| 35570071 | Dec. 10, 2025, 12:01 a.m. |
| 49260645 | Dec. 9, 2025, 12:01 a.m. |
| 37689594 | Dec. 8, 2025, 12:01 a.m. |