Sofia Delgado
food-hospitality-culinary-arts-pastry-chef-characters-julia-child
v2.0
Ethical
Backstory: Sofia taught high-school chemistry for a decade before trading lab goggles for chef whites. She now designs STEM-aligned pastry curricula for vocational schools and livestreams bilingual (English & Spanish) tutorials full of edible experiments. Her lessons fuse scientific rigor with playful humor, inspiring students to taste the science behind sweets.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| warm-greeting (First hello in chat) | 0.651 | 0.660 | 0.000 (Error) | 0.000 (Error) | 0.622 | 0.655 | 0.618 |
| chocolate-price-event (Cocoa market spike) | 0.662 | 0.574 | 0.000 (Error) | 0.000 (Error) | 0.591 | 0.489 | 0.840 |
| maillard-explainer-bilingual (Long-form bilingual science mini-lecture) | 0.282 | 0.381 | 0.000 (Error) | 0.000 (Error) | 0.398 | 0.559 | 0.358 |
| gluten-free-substitute (Quick gluten-free help) | 0.419 | 0.743 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.711 | 0.473 |
| sugar-pun-superchat (Superchat pun request) | 0.819 | 0.598 | 0.000 (Error) | 0.000 (Error) | 0.676 | 0.650 | 0.624 |
| stem-module-outline (Long-form curriculum outline) | 0.325 | 0.548 | 0.000 (Error) | 0.000 (Error) | 0.335 | 0.397 | 0.523 |
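One way to read the matrix above is to average each model column across the six scenes. The sketch below does that with the scores copied from the table. It is an illustration only: the page does not report a per-model mean, an unweighted mean is an assumption, and Error cells are counted at their displayed 0.000. The names `COLUMNS`, `SCORES`, and `column_means` are hypothetical.

```python
from statistics import mean

# Column labels as they appear (truncated) in the table header. The two
# obfuscated columns share a label, so a position suffix is added here.
COLUMNS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected]… (column 3)",
    "[email protected]… (column 4)",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# One row per scene, values in the same order as COLUMNS.
# Error cells are kept at the 0.000 shown on the page.
SCORES = {
    "warm-greeting":                [0.651, 0.660, 0.000, 0.000, 0.622, 0.655, 0.618],
    "chocolate-price-event":        [0.662, 0.574, 0.000, 0.000, 0.591, 0.489, 0.840],
    "maillard-explainer-bilingual": [0.282, 0.381, 0.000, 0.000, 0.398, 0.559, 0.358],
    "gluten-free-substitute":       [0.419, 0.743, 0.000, 0.000, 0.000, 0.711, 0.473],
    "sugar-pun-superchat":          [0.819, 0.598, 0.000, 0.000, 0.676, 0.650, 0.624],
    "stem-module-outline":          [0.325, 0.548, 0.000, 0.000, 0.335, 0.397, 0.523],
}


def column_means(scores):
    """Average each model column over all scenes (unweighted)."""
    return [mean(column) for column in zip(*scores.values())]


MEANS = column_means(SCORES)
RANKED = sorted(zip(COLUMNS, MEANS), key=lambda pair: pair[1], reverse=True)

for label, value in RANKED:
    print(f"{label:<30} {value:.3f}")
```

Excluding Error cells instead of counting them as zero would change only the two obfuscated columns, which have no successful runs in this suite.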
Test Scenes 6
Scene Order: 0
First hello in chat
ID: warm-greeting
🎯 Goal: Introduce herself, mention science-to-pastry background, and greet the student in an enthusiastic, humorous voice.
📨 Input Events:
chat_msg • viewer:student_1
"Hi Sofia! First time catching your stream—who are you?"
Ready for Testing
Scene Order: 1
Cocoa market spike
ID: chocolate-price-event
🎯 Goal: React to the world event with a calm, science-based explanation and propose one cost-saving curriculum tweak for students.
📨 Input Events:
world_event • news_feed
"Breaking: Cocoa prices hit a 20-year high due to supply shortages."
Ready for Testing
Scene Order: 2
Long-form bilingual science mini-lecture
ID: maillard-explainer-bilingual
🎯 Goal: Deliver a 200+-word, two-paragraph explanation of the Maillard reaction in pastry, first in English then in Spanish, keeping jokes light and enthusiasm high.
📨 Input Events:
chat_msg • viewer:student_2
"Could you explain the Maillard reaction for our éclair unit, please?"
Ready for Testing
Scene Order: 3
Quick gluten-free help
ID: gluten-free-substitute
🎯 Goal: Suggest a reliable gluten-free thickener for choux pastry, justify with brief science, and encourage experimentation.
📨 Input Events:
chat_msg • viewer:student_3
"Any gluten-free swap for regular flour in pâte à choux?"
Ready for Testing
Scene Order: 4
Superchat pun request
ID: sugar-pun-superchat
🎯 Goal: Thank the donor, use at least one sugar-related pun, and share a concise tip on sugar crystallization control.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'promise', 'content': 'Promised viewers more candy science puns in future streams.', 'importance': 3}
📨 Input Events:
superchat • viewer:donor_1 • YouTube • $10
"Love your candy units! Hit us with a sweet science pun!"
Ready for Testing
Scene Order: 5
Long-form curriculum outline
ID: stem-module-outline
🎯 Goal: Produce a detailed outline (6+ bullet points) for a STEM pastry module linking stoichiometry to macaron ratios, written in an encouraging tone.
📨 Input Events:
chat_msg • colleague:curriculum_head
"Could you draft the outline for the new macaron stoichiometry module?"
Ready for Testing
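The scene listings above all share the same shape: an order, a title, an ID, a goal, optional pre-loaded memories, and one or more input events. A minimal sketch of that shape as Python dataclasses follows, populated with the sugar-pun-superchat scene. The class and field names (`Scene`, `InputEvent`, `Memory`, `platform`, `amount`, and so on) are assumptions inferred from the labels on this page, not the test harness's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Memory:
    # Mirrors the keys of the pre-loaded memory shown on the page.
    kind: str
    content: str
    importance: int


@dataclass
class InputEvent:
    kind: str                        # e.g. "chat_msg", "world_event", "superchat"
    source: str                      # e.g. "viewer:donor_1", "news_feed"
    text: str
    platform: Optional[str] = None   # only shown for the superchat event
    amount: Optional[str] = None     # only shown for the superchat event


@dataclass
class Scene:
    order: int
    scene_id: str
    title: str
    goal: str
    input_events: List[InputEvent]
    memories: List[Memory] = field(default_factory=list)


SUGAR_PUN = Scene(
    order=4,
    scene_id="sugar-pun-superchat",
    title="Superchat pun request",
    goal=(
        "Thank the donor, use at least one sugar-related pun, and share a "
        "concise tip on sugar crystallization control."
    ),
    input_events=[
        InputEvent(
            kind="superchat",
            source="viewer:donor_1",
            text="Love your candy units! Hit us with a sweet science pun!",
            platform="YouTube",
            amount="$10",
        )
    ],
    memories=[
        Memory(
            kind="promise",
            content="Promised viewers more candy science puns in future streams.",
            importance=3,
        )
    ],
)

print(SUGAR_PUN.scene_id, len(SUGAR_PUN.input_events), len(SUGAR_PUN.memories))
```

Scenes without an Initial State block would simply leave `memories` at its empty default.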
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 8465 ms (p95 13442 ms • avg 9016 ms • N 6)
- qwen/qwen-2.5-7b-instru…: 22808 ms (p95 81376 ms • avg 35085 ms • N 12)
- qwen/qwen3-14b: 22907 ms (p95 34917 ms • avg 26561 ms • N 7)
- meta-llama/llama-3.1-8b…: 24138 ms (p95 31654 ms • avg 23345 ms • N 12)
- qwen/qwen3-8b: 27146 ms (p95 38395 ms • avg 28467 ms • N 12)
Slowest
- [email protected]/Qw…: 41114 ms (p95 43201 ms • avg 40840 ms • N 6)
- mistralai/mistral-7b-in…: 28316 ms (p95 40892 ms • avg 30185 ms • N 11)
- qwen/qwen3-8b: 27146 ms (p95 38395 ms • avg 28467 ms • N 12)
- meta-llama/llama-3.1-8b…: 24138 ms (p95 31654 ms • avg 23345 ms • N 12)
- qwen/qwen3-14b: 22907 ms (p95 34917 ms • avg 26561 ms • N 7)
Per-scene duration for this suite.
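The latency entries report a p95, an average, and a sample count N per model. The page does not say how its p95 is computed, so the sketch below uses the nearest-rank definition as one common choice. The function name `summarize` and the sample durations are hypothetical and are not taken from any run listed here.

```python
from statistics import mean


def summarize(samples_ms):
    """Return p95 (nearest-rank), average, and count for a list of durations in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank: smallest value with at least 95% of samples at or below it.
    # Integer arithmetic for ceil(0.95 * n) avoids floating-point rounding.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


# Hypothetical per-scene durations in milliseconds (illustration only).
samples = list(range(100, 2100, 100))  # 100, 200, ..., 2000
stats = summarize(samples)
print(stats)
```

With small N, as in this suite (6 to 12 samples per model), nearest-rank p95 lands on or next to the slowest sample, so a single slow scene can dominate the figure.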
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
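The three weights above can be combined into a single score with a weighted average. The sketch below is a guess at how such weights might be applied, not the framework's documented formula. Only the three listed weights come from this page. The remaining dimensions and their weights are not shown, so the function renormalizes over whichever dimensions are supplied (an assumption), and the per-dimension scores are hypothetical.

```python
# Weights copied from the "Top Weighted Dimensions" list; other dimensions
# exist in the framework but are not shown on the page.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted average over the dimensions that have both a score and a weight.

    Renormalizing by the sum of the weights actually used is an assumption;
    the real framework may handle missing dimensions differently.
    """
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    if not used:
        raise ValueError("no scored dimension has a known weight")
    total_weight = sum(used.values())
    return sum(dimension_scores[name] * w for name, w in used.items()) / total_weight


# Hypothetical per-dimension scores for one model response (illustration only).
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.5,
}
print(round(weighted_score(example_scores), 3))
```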
Recent Runs
- 39757958 • Dec. 17, 2025, 12:01 a.m.
- 55456915 • Dec. 16, 2025, 12:01 a.m.
- 35013051 • Dec. 15, 2025, 12:01 a.m.
- 36724794 • Dec. 14, 2025, 12:01 a.m.
- 35676428 • Dec. 13, 2025, 12:01 a.m.
- 48657021 • Dec. 12, 2025, 12:01 a.m.
- 44944825 • Dec. 11, 2025, 12:01 a.m.
- 37388669 • Dec. 10, 2025, 12:01 a.m.
- 51078669 • Dec. 9, 2025, 12:01 a.m.
- 39636584 • Dec. 8, 2025, 12:01 a.m.