Marcus Delgado

food-hospitality-culinary-arts-food-critic-characters-james-beard v2.0 Ethical
Backstory: Raised in a lively border city, Marcus spent childhood weekends sampling tacos, pupusas, and elotes from curbside griddles. His palate grew fearless and his slang-laced banter drew friends; now he runs a hit vlog spotlighting affordable street eats and the personal tales of the cooks behind each cart.
17% Complete
1/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene meta-llama/llama-3.… mistralai/mistral-7… neversleep/noromaid… [email protected] [email protected] qwen/qwen-2.5-7b-in… qwen/qwen3-14b qwen/qwen3-8b
intro-trust: Why trust Marcus?
0.854, 0.879, 0.000 (Error), 0.000 (Error), 0.843, 0.899, 0.898
festival-plan: Border Bites Festival plan
0.397, 0.725, 0.000 (Error), 0.000 (Error), 0.000, 0.337, 0.456
yakitori-episode: Long-form vlog script: Yakitori night
0.365, 0.561, 0.000 (Error), 0.000 (Error), 0.425, 0.430, 0.746
quick-taco-tip: Superchat taco request
0.482, 0.766, 0.000 (Error), 0.000 (Error), 0.579, 0.730, 0.686
week-recap-blog: Long-form blog recap
0.385, 0.550, 0.000 (Error), 0.000 (Error), 0.000 (Error), 0.000, 0.432, 0.380
name-check: Memory test: vendor name
0.863, 0.869, 0.000 (Error), 0.000 (Error), 0.634, 0.619, 0.754
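A row of the matrix can be condensed into a mean score plus an error count. The sketch below uses the intro-trust scores as listed above. Treating cells flagged "Error" as missing rather than as zeros is an assumption, and `row_summary` is a hypothetical helper, not part of the dashboard.

```python
from statistics import mean

# Scores copied from the intro-trust row; None marks a cell flagged "Error".
intro_trust = [0.854, 0.879, None, None, 0.843, 0.899, 0.898]


def row_summary(cells):
    """Return (mean of scored cells, number of errored cells)."""
    scored = [c for c in cells if c is not None]
    errors = len(cells) - len(scored)
    # Assumption: errored cells are excluded from the mean, not counted as 0.
    return (mean(scored) if scored else None, errors)


avg, errors = row_summary(intro_trust)
print(f"mean={avg:.3f} errors={errors}")
```

Counting errored cells as zeros instead would pull the mean down sharply, so whichever convention is used should be stated alongside any summary figure.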
Test Scenes 6
0
Scene Order
Why trust Marcus?
ID: intro-trust
🎯 Goal:
Deliver a lively, slang-tinged intro that proves street-food credibility and passion for vendor stories.
📨 Input Events:
chat_msg viewer:curious_diner
"Who are you and why should we trust your taste buds?"
Ready for Testing
1
Scene Order
Border Bites Festival plan
ID: festival-plan
🎯 Goal:
Pick three carts Marcus will hit first, stating dishes and a quick plan that values affordability and adventure.
📨 Input Events:
world_event event_system
"The Border Bites Festival opens tomorrow with 40 food stalls."
Ready for Testing
2
Scene Order
Long-form vlog script: Yakitori night
ID: yakitori-episode
🎯 Goal:
Write a 200-word (minimum) vlog segment that narrates tasting street-style yakitori, weaves in the grill master’s backstory, and keeps Marcus’s colloquial vibe.
📨 Input Events:
chat_msg viewer:sakura_skewer
"Hey Marcus, do a full episode on that Japanese yakitori cart you teased!"
Ready for Testing
3
Scene Order
Superchat taco request
ID: quick-taco-tip
🎯 Goal:
Reply with one $5-friendly taco spot in L.A., including vendor name, location hint, and signature salsa.
📨 Input Events:
superchat viewer:foodie123 YouTube $5
"Got $5 and a craving for tacos in LA—hook me up!"
Ready for Testing
4
Scene Order
Long-form blog recap
ID: week-recap-blog
🎯 Goal:
Produce a 300-word (minimum) blog-style recap of a week touring border-town stalls, highlighting five dishes and the cooks’ personal stories while maintaining an upbeat, conversational tone.
📨 Input Events:
chat_msg viewer:blog_subscriber
"Can we get a written recap of your whole border-town week?"
Ready for Testing
5
Scene Order
Memory test: vendor name
ID: name-check
🎯 Goal:
Correctly recall the name of the L.A. taco vendor given earlier (from quick-taco-tip) and answer casually.
📨 Input Events:
chat_msg viewer:memory_test
"Wait, what was that taco vendor's name you mentioned for L.A.?"
Ready for Testing
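Each scene above pairs an ID and a goal with one or more input events, and name-check only makes sense if it runs after quick-taco-tip. A minimal sketch of that structure follows. The class names, field names, and the `run_order` helper are hypothetical, and reading the numbers 3 and 5 as run order is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class InputEvent:
    kind: str      # e.g. "chat_msg", "superchat", "world_event"
    source: str    # e.g. "viewer:foodie123"
    text: str
    extra: dict = field(default_factory=dict)  # hypothetical slot for platform/amount


@dataclass
class Scene:
    order: int     # assumed to be the position in the suite
    scene_id: str
    goal: str
    events: list[InputEvent] = field(default_factory=list)


scenes = [
    Scene(5, "name-check",
          "Correctly recall the name of the L.A. taco vendor given earlier and answer casually.",
          [InputEvent("chat_msg", "viewer:memory_test",
                      "Wait, what was that taco vendor's name you mentioned for L.A.?")]),
    Scene(3, "quick-taco-tip",
          "Reply with one $5-friendly taco spot in L.A., including vendor name, "
          "location hint, and signature salsa.",
          [InputEvent("superchat", "viewer:foodie123",
                      "Got $5 and a craving for tacos in LA—hook me up!",
                      {"platform": "YouTube", "amount": "$5"})]),
]


def run_order(scenes):
    """Return scene IDs sorted by their order field."""
    return [s.scene_id for s in sorted(scenes, key=lambda s: s.order)]


print(run_order(scenes))
```

Sorting by `order` before running keeps the memory test behind the scene that supplies the vendor name.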
Latency by Model (This Suite)
Fastest
  • neversleep/noromaid-20b: 60 ms (p95 60 ms, avg 60 ms, N 1)
  • [email protected]/Qw…: 7887 ms (p95 14211 ms, avg 9072 ms, N 6)
  • qwen/qwen-2.5-7b-instru…: 18355 ms (p95 139523 ms, avg 40100 ms, N 11)
  • qwen/qwen3-14b: 21061 ms (p95 28301 ms, avg 21977 ms, N 11)
  • meta-llama/llama-3.1-8b…: 23305 ms (p95 26755 ms, avg 22663 ms, N 10)
Slowest
  • [email protected]/Qw…: 40316 ms (p95 42076 ms, avg 40143 ms, N 6)
  • mistralai/mistral-7b-in…: 30000 ms (p95 45031 ms, avg 31979 ms, N 12)
  • qwen/qwen3-8b: 27592 ms (p95 31795 ms, avg 27160 ms, N 12)
  • meta-llama/llama-3.1-8b…: 23305 ms (p95 26755 ms, avg 22663 ms, N 10)
  • qwen/qwen3-14b: 21061 ms (p95 28301 ms, avg 21977 ms, N 11)
Per-scene duration for this suite.
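The p95, avg, and N figures reported per model can be derived from a list of per-scene durations. The sketch below uses the nearest-rank percentile definition. Which percentile method the dashboard actually uses is not stated, so that choice is an assumption, and `summarize` is a hypothetical helper.

```python
import math


def summarize(durations_ms):
    """Return (p95, avg, n) for a non-empty list of durations in milliseconds."""
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)  # 1-based rank
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n


# A single 60 ms sample gives p95 = avg = 60 with N = 1.
print(summarize([60]))
```

With only 6 to 12 samples per model, p95 under this definition is simply the largest observed duration, which helps explain how one slow run can push p95 far above the average.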
Completion Progress 17%
1 of 6 scenes completed
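The 17% figure is consistent with 1 of 6 scenes rounded to a whole percent. A one-function sketch of that arithmetic follows. Rounding to the nearest integer is an assumption about how the dashboard displays progress, and `completion_percent` is a hypothetical name.

```python
def completion_percent(done, total):
    """Whole-number percentage of completed scenes."""
    # Assumption: the displayed value is rounded to the nearest integer.
    return round(100 * done / total)


print(completion_percent(1, 6))
```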
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
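Weights like these are typically used to combine per-dimension scores into one scene score. The sketch below shows a weighted average normalized over whichever dimensions are supplied. How the framework actually aggregates is not stated, so the formula is an assumption, and the per-dimension scores are made-up illustration values.

```python
import math

# Weights for the three top dimensions, as listed above.
weights = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights):
    """Weighted average of the given dimension scores, normalized by their total weight."""
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * value for name, value in scores.items()) / total_weight


# Hypothetical per-dimension scores, for illustration only.
example_scores = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.8,
}
print(round(weighted_score(example_scores, weights), 3))
```

Normalizing by the total weight keeps the result in the same 0 to 1 range as the inputs even though the three listed weights sum to less than 1.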
Recent Runs
37877904
Dec. 17, 2025, 12:01 a.m.
53497243
Dec. 16, 2025, 12:01 a.m.
33348841
Dec. 15, 2025, 12:01 a.m.
34884347
Dec. 14, 2025, 12:01 a.m.
33833017
Dec. 13, 2025, 12:01 a.m.
46882273
Dec. 12, 2025, 12:01 a.m.
43077564
Dec. 11, 2025, 12:01 a.m.
35570071
Dec. 10, 2025, 12:01 a.m.
49260645
Dec. 9, 2025, 12:01 a.m.
37689594
Dec. 8, 2025, 12:01 a.m.