Sofia Delgado

food-hospitality-culinary-arts-pastry-chef-characters-julia-child v2.0 Ethical
Backstory: Sofia taught high-school chemistry for a decade before trading lab goggles for chef whites. She now designs STEM-aligned pastry curricula for vocational schools and livestreams bilingual (English & Spanish) tutorials full of edible experiments. Her lessons fuse scientific rigor with playful humor, inspiring students to taste the science behind sweets.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | Description | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| warm-greeting | First hello in chat | 0.651 | 0.660 | 0.000 (Error) | 0.000 (Error) | 0.622 | 0.655 | 0.618 |
| chocolate-price-event | Cocoa market spike | 0.662 | 0.574 | 0.000 (Error) | 0.000 (Error) | 0.591 | 0.489 | 0.840 |
| maillard-explainer-bilingual | Long-form bilingual science mini-lecture | 0.282 | 0.381 | 0.000 (Error) | 0.000 (Error) | 0.398 | 0.559 | 0.358 |
| gluten-free-substitute | Quick gluten-free help | 0.419 | 0.743 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.711 | 0.473 |
| sugar-pun-superchat | Superchat pun request | 0.819 | 0.598 | 0.000 (Error) | 0.000 (Error) | 0.676 | 0.650 | 0.624 |
| stem-module-outline | Long-form curriculum outline | 0.325 | 0.548 | 0.000 (Error) | 0.000 (Error) | 0.335 | 0.397 | 0.523 |
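The page does not say how the per-scene scores in the Scene Performance Matrix are aggregated per model. As a rough aid for reading the matrix, the sketch below averages each model's scores across the six scenes. It rests on several assumptions that are not from the dashboard: cells flagged "Error" are treated as missing (`None`) rather than as real zeros, the two identically labelled `[email protected]` columns are told apart only by position, and the names `MODELS`, `SCORES`, and `mean_scores` are hypothetical.

```python
from statistics import mean

# Column labels as they appear in the matrix (truncated names kept as-is).
# The "(column N)" suffixes are added here only to distinguish duplicate labels.
MODELS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected] (column 3)",
    "[email protected] (column 4)",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# Scores copied from the matrix; None marks a cell flagged "Error" (assumption:
# an errored run carries no usable score, so it is excluded from the average).
SCORES = {
    "warm-greeting": [0.651, 0.660, None, None, 0.622, 0.655, 0.618],
    "chocolate-price-event": [0.662, 0.574, None, None, 0.591, 0.489, 0.840],
    "maillard-explainer-bilingual": [0.282, 0.381, None, None, 0.398, 0.559, 0.358],
    "gluten-free-substitute": [0.419, 0.743, None, None, 0.000, 0.711, 0.473],
    "sugar-pun-superchat": [0.819, 0.598, None, None, 0.676, 0.650, 0.624],
    "stem-module-outline": [0.325, 0.548, None, None, 0.335, 0.397, 0.523],
}


def mean_scores(scores, models):
    """Return {model: mean score over non-errored scenes}, or None if all errored."""
    result = {}
    for col, model in enumerate(models):
        valid = [row[col] for row in scores.values() if row[col] is not None]
        result[model] = round(mean(valid), 3) if valid else None
    return result


if __name__ == "__main__":
    for model, avg in mean_scores(SCORES, MODELS).items():
        print(f"{model}: {avg}")
```

Under these assumptions the averages come out at about 0.526 (llama), 0.584 (mistral), 0.437 (qwen-2.5-7b), 0.577 (qwen3-14b), and 0.573 (qwen3-8b), with both errored columns left as `None`. The 0.000 for qwen-2.5-7b on gluten-free-substitute is not flagged as an error in the matrix, so it is counted as a real score here.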
Test Scenes (6)
Scene Order: 0
First hello in chat
ID: warm-greeting
🎯 Goal:
Introduce herself, mention science-to-pastry background, and greet the student in an enthusiastic, humorous voice.
📨 Input Events:
chat_msg viewer:student_1
"Hi Sofia! First time catching your stream—who are you?"
Ready for Testing
Scene Order: 1
Cocoa market spike
ID: chocolate-price-event
🎯 Goal:
React to the world event with a calm, science-based explanation and propose one cost-saving curriculum tweak for students.
📨 Input Events:
world_event news_feed
"Breaking: Cocoa prices hit a 20-year high due to supply shortages."
Ready for Testing
Scene Order: 2
Long-form bilingual science mini-lecture
ID: maillard-explainer-bilingual
🎯 Goal:
Deliver a 200+-word, two-paragraph explanation of the Maillard reaction in pastry, first in English then in Spanish, keeping jokes light and enthusiasm high.
📨 Input Events:
chat_msg viewer:student_2
"Could you explain the Maillard reaction for our éclair unit, please?"
Ready for Testing
Scene Order: 3
Quick gluten-free help
ID: gluten-free-substitute
🎯 Goal:
Suggest a reliable gluten-free thickener for choux pastry, justify with brief science, and encourage experimentation.
📨 Input Events:
chat_msg viewer:student_3
"Any gluten-free swap for regular flour in pâte à choux?"
Ready for Testing
Scene Order: 4
Superchat pun request
ID: sugar-pun-superchat
🎯 Goal:
Thank the donor, use at least one sugar-related pun, and share a concise tip on sugar crystallization control.
🧠 Initial State:
Pre-loaded Memories:
  • 💭 {'kind': 'promise', 'content': 'Promised viewers more candy science puns in future streams.', 'importance': 3}
📨 Input Events:
superchat viewer:donor_1 YouTube $10
"Love your candy units! Hit us with a sweet science pun!"
Ready for Testing
Scene Order: 5
Long-form curriculum outline
ID: stem-module-outline
🎯 Goal:
Produce a detailed outline (6+ bullet points) for a STEM pastry module linking stoichiometry to macaron ratios, written in an encouraging tone.
📨 Input Events:
chat_msg colleague:curriculum_head
"Could you draft the outline for the new macaron stoichiometry module?"
Ready for Testing
Latency by Model (This Suite)
Fastest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 8465 ms | 13442 ms | 9016 ms | 6 |
| qwen/qwen-2.5-7b-instru… | 22808 ms | 81376 ms | 35085 ms | 12 |
| qwen/qwen3-14b | 22907 ms | 34917 ms | 26561 ms | 7 |
| meta-llama/llama-3.1-8b… | 24138 ms | 31654 ms | 23345 ms | 12 |
| qwen/qwen3-8b | 27146 ms | 38395 ms | 28467 ms | 12 |

Slowest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 41114 ms | 43201 ms | 40840 ms | 6 |
| mistralai/mistral-7b-in… | 28316 ms | 40892 ms | 30185 ms | 11 |
| qwen/qwen3-8b | 27146 ms | 38395 ms | 28467 ms | 12 |
| meta-llama/llama-3.1-8b… | 24138 ms | 31654 ms | 23345 ms | 12 |
| qwen/qwen3-14b | 22907 ms | 34917 ms | 26561 ms | 7 |
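Each latency entry above reports a p95, an average, and a sample count N. The dashboard does not state how it computes the p95, so the following is only a minimal sketch of one common convention (the nearest-rank method) applied to a list of per-run durations. The function name `summarize_latencies` and the sample durations are hypothetical and are not taken from this suite.

```python
def summarize_latencies(durations_ms):
    """Return p95 (nearest-rank), average, and count for a list of durations in ms."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: 1-based rank = ceil(0.95 * n), done in integer math
    # to avoid floating-point rounding surprises.
    rank = (95 * n + 99) // 100
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }


if __name__ == "__main__":
    # Hypothetical durations for illustration only.
    sample = [100, 200, 300, 400]
    print(summarize_latencies(sample))
```

With small N (6 to 12 runs per model here), a nearest-rank p95 is simply the slowest or near-slowest run, which is one reason a p95 can sit far above the average, as in the qwen-2.5-7b row. Other tools interpolate between ranks and would give different figures.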
Per-scene duration for this suite.
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
Recent Runs
  • 39757958: Dec. 17, 2025, 12:01 a.m.
  • 55456915: Dec. 16, 2025, 12:01 a.m.
  • 35013051: Dec. 15, 2025, 12:01 a.m.
  • 36724794: Dec. 14, 2025, 12:01 a.m.
  • 35676428: Dec. 13, 2025, 12:01 a.m.
  • 48657021: Dec. 12, 2025, 12:01 a.m.
  • 44944825: Dec. 11, 2025, 12:01 a.m.
  • 37388669: Dec. 10, 2025, 12:01 a.m.
  • 51078669: Dec. 9, 2025, 12:01 a.m.
  • 39636584: Dec. 8, 2025, 12:01 a.m.
Latency Overview (This Suite)