Olivia Han
food-hospitality-culinary-arts-pastry-chef-characters-marie-antoine-car-me
v2.0
Ethical
Backstory: Born in Seoul and polished at Le Cordon Bleu, Olivia Han now helms an award-winning Paris patisserie famed for re-imagining classic entremets with whisper-light Korean accents. Her kitchen runs on discipline and creativity in equal measure, and she devotes time each week to mentoring young bakers and hosting free community workshops.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| intro-greeting (Meet the Chef) | 0.788 | 0.752 | 0.000 (Error) | 0.000 (Error) | 0.895 | 0.709 | 0.842 |
| pairing-advice (Flavor Pairing Guidance) | 0.838 | 0.900 | 0.000 (Error) | 0.000 (Error) | 0.895 | 0.687 | 0.806 |
| donation-superchat (Thank a Supporter) | 0.848 | 0.868 | 0.000 (Error) | 0.000 (Error) | 0.884 | 0.882 | 0.869 |
| oven-malfunction (Crisis in the Kitchen) | 0.774 | 0.736 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.546 | 0.769 |
| autumn-entremet-article (Magazine Feature, Long-form) | 0.329 | 0.513 | 0.000 (Error) | 0.000 (Error) | 0.252 | 0.178 | 0.697 |
| weekly-mentoring-plan (Apprentice Training Outline, Long-form) | 0.150 | 0.421 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.331 | 0.606 |
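The matrix above can be summarized per model with a simple mean over the six scenes. The following is a minimal sketch, not part of the evaluation framework: the scores are copied from the matrix, the model names are kept truncated exactly as the page shows them, and the function names `mean_scores` and `rank_models` are hypothetical. The two `[email protected]…` columns are left out because every cell there is 0.000 with an Error flag. The 0.000 cells for `qwen/qwen-2.5-7b-in…` carry no Error flag on the page, so this sketch assumes they are real scores and counts them as zeros.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# intro-greeting, pairing-advice, donation-superchat,
# oven-malfunction, autumn-entremet-article, weekly-mentoring-plan.
SCORES = {
    "meta-llama/llama-3.…": [0.788, 0.838, 0.848, 0.774, 0.329, 0.150],
    "mistralai/mistral-7…": [0.752, 0.900, 0.868, 0.736, 0.513, 0.421],
    "qwen/qwen-2.5-7b-in…": [0.895, 0.895, 0.884, 0.000, 0.252, 0.000],
    "qwen/qwen3-14b": [0.709, 0.687, 0.882, 0.546, 0.178, 0.331],
    "qwen/qwen3-8b": [0.842, 0.806, 0.869, 0.769, 0.697, 0.606],
}


def mean_scores(scores):
    """Return the unweighted mean scene score for each model."""
    return {model: mean(values) for model, values in scores.items()}


def rank_models(scores):
    """Return model names sorted from highest to lowest mean score."""
    means = mean_scores(scores)
    return sorted(means, key=means.get, reverse=True)


if __name__ == "__main__":
    means = mean_scores(SCORES)
    for model in rank_models(SCORES):
        print(f"{model:24s} {means[model]:.3f}")
```

On these numbers `qwen/qwen3-8b` has the highest mean (about 0.765) and `qwen/qwen-2.5-7b-in…` the lowest (about 0.488), the latter pulled down by its two 0.000 cells. An unweighted mean treats short and long-form scenes equally, which may not match how the suite itself aggregates.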
Test Scenes 6
Scene Order 0: Meet the Chef
ID: intro-greeting
🎯 Goal: Greet the visitor, state her culinary mission, and invite questions about pastry.
📨 Input Events: chat_msg from viewer:user123: "Bonjour! Who’s behind this lovely patisserie stream?"
Ready for Testing
Scene Order 1: Flavor Pairing Guidance
ID: pairing-advice
🎯 Goal: Offer a concise tip on pairing a Korean ingredient with praline while encouraging experimentation.
📨 Input Events: chat_msg from apprentice:mina: "Chef Olivia, which Korean flavor would you match with a classic hazelnut praline?"
Ready for Testing
Scene Order 2: Thank a Supporter
ID: donation-superchat
🎯 Goal: Graciously thank the donor and explain how the funds will support upcoming community workshops.
📨 Input Events: superchat from viewer:pastry_fan99 (YouTube, $50): "Your class changed my life—keep inspiring!"
Ready for Testing
Scene Order 3: Crisis in the Kitchen
ID: oven-malfunction
🎯 Goal: Calmly direct staff to adjust bakes and recover production, showing disciplined leadership.
📨 Input Events: world_event from system:bakery_sensor: "Alert: Deck oven temperature dropped to 140 °C unexpectedly."
Ready for Testing
Scene Order 4: Magazine Feature (Long-form)
ID: autumn-entremet-article
🎯 Goal: Provide an elegant ~300-word description of a new autumn entremet blending French technique with Korean persimmon and sujeonggwa notes.
📨 Input Events: chat_msg from editor:LaGastronomie: "Can you send us a 300-word piece on your upcoming autumn entremet for our next issue?"
Ready for Testing
Scene Order 5: Apprentice Training Outline (Long-form)
ID: weekly-mentoring-plan
🎯 Goal: Deliver a structured, supportive one-week pastry training plan (≈200-250 words) covering techniques, tasting sessions, and reflection.
📨 Input Events: chat_msg from apprentice:leo: "Chef, could you give me a detailed plan for what I should practice next week?"
Ready for Testing
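The input events listed for the six scenes share a small set of fields: an event type, a sender, a quoted message, and, for the superchat, a platform and an amount. One way to hold them in code is sketched below. This is an illustration only: the `InputEvent` class, its field names, and the reading of the sender as a `role:name` pair are assumptions, not the suite's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InputEvent:
    """Hypothetical container for one scene input event."""
    kind: str                       # "chat_msg", "superchat", or "world_event"
    sender: str                     # shown on the page as "role:name"
    text: str                       # the quoted message or alert
    platform: Optional[str] = None  # only listed for the superchat
    amount: Optional[str] = None    # only listed for the superchat

    @property
    def role(self) -> str:
        # Assumption: the part before the first colon is the sender's role.
        return self.sender.split(":", 1)[0]


# Events keyed by scene ID, copied from the scene cards.
EVENTS = {
    "intro-greeting": InputEvent(
        "chat_msg", "viewer:user123",
        "Bonjour! Who’s behind this lovely patisserie stream?"),
    "pairing-advice": InputEvent(
        "chat_msg", "apprentice:mina",
        "Chef Olivia, which Korean flavor would you match with a classic hazelnut praline?"),
    "donation-superchat": InputEvent(
        "superchat", "viewer:pastry_fan99",
        "Your class changed my life—keep inspiring!",
        platform="YouTube", amount="$50"),
    "oven-malfunction": InputEvent(
        "world_event", "system:bakery_sensor",
        "Alert: Deck oven temperature dropped to 140 °C unexpectedly."),
    "autumn-entremet-article": InputEvent(
        "chat_msg", "editor:LaGastronomie",
        "Can you send us a 300-word piece on your upcoming autumn entremet for our next issue?"),
    "weekly-mentoring-plan": InputEvent(
        "chat_msg", "apprentice:leo",
        "Chef, could you give me a detailed plan for what I should practice next week?"),
}

if __name__ == "__main__":
    for scene_id, event in EVENTS.items():
        print(f"{scene_id}: {event.kind} from {event.role}")
```

Keeping the amount as the string shown on the page avoids guessing at a currency or numeric format the source does not specify.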
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 6916 ms (p95 10705 ms • avg 7490 ms • N 6)
- qwen/qwen3-14b 18428 ms (p95 22837 ms • avg 18743 ms • N 7)
- meta-llama/llama-3.1-8b… 21370 ms (p95 25704 ms • avg 20716 ms • N 12)
- qwen/qwen-2.5-7b-instru… 23633 ms (p95 137394 ms • avg 50899 ms • N 12)
- qwen/qwen3-8b 23877 ms (p95 27152 ms • avg 24143 ms • N 12)
Slowest
- [email protected]/Qw… 42634 ms (p95 180696 ms • avg 72292 ms • N 6)
- mistralai/mistral-7b-in… 25797 ms (p95 27306 ms • avg 24408 ms • N 12)
- qwen/qwen3-8b 23877 ms (p95 27152 ms • avg 24143 ms • N 12)
- qwen/qwen-2.5-7b-instru… 23633 ms (p95 137394 ms • avg 50899 ms • N 12)
- meta-llama/llama-3.1-8b… 21370 ms (p95 25704 ms • avg 20716 ms • N 12)
Per-scene duration for this suite.
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
- 40013676: Dec. 17, 2025, 12:01 a.m.
- 55730458: Dec. 16, 2025, 12:01 a.m.
- 35249108: Dec. 15, 2025, 12:01 a.m.
- 36990768: Dec. 14, 2025, 12:01 a.m.
- 35926126: Dec. 13, 2025, 12:01 a.m.
- 48943406: Dec. 12, 2025, 12:01 a.m.
- 45201974: Dec. 11, 2025, 12:01 a.m.
- 37614778: Dec. 10, 2025, 12:01 a.m.
- 51309137: Dec. 9, 2025, 12:01 a.m.
- 39895642: Dec. 8, 2025, 12:01 a.m.