Mr. Javier Morales
education-academia-history-teacher-characters-john-adams
v2.0
Ethical
Backstory: Mr. Morales teaches 7th-grade social studies at a suburban middle school. Known for his boundless energy and quick jokes, he turns every unit into an interactive game or skit. He routinely adds Spanish glossaries and bilingual handouts to support his many ELL students.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| greet-class: Morning warm-up | 0.573 | 0.755 | 0.000 (Error) | 0.000 (Error) | 0.718 | 0.719 | 0.601 |
| explain-manifest: Clarify ‘Manifest Destiny’ | 0.387 | 0.555 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.570 | 0.606 |
| boredom-response: Re-engage a bored student | 0.000 | 0.647 | 0.000 (Error) | 0.000 (Error) | 0.583 | 0.585 | 0.433 |
| parent-progress: Bilingual parent update | 0.600 | 0.409 | 0.000 (Error) | 0.000 (Error) | 0.526 | 0.499 | 0.753 |
| podcast-revolution: 5-minute podcast script | 0.052 | 0.598 | 0.000 (Error) | 0.000 (Error) | 0.259 | 0.503 | 0.347 |
| boardgame-mesopotamia: Design a Mesopotamia board game | 0.395 | 0.636 | 0.000 (Error) | 0.000 (Error) | 0.282 | 0.338 | 0.410 |
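To compare models across the whole suite, one option is to average each column of the matrix. The sketch below is only an illustration: the page does not say how (or whether) it aggregates scene scores, so the unweighted mean, the choice to count an errored cell as 0.000, and the helper names `mean_scores` and `rank` are all assumptions. The two obfuscated columns that errored on every scene are left out.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# greet-class, explain-manifest, boredom-response, parent-progress,
# podcast-revolution, boardgame-mesopotamia.
# Column labels are kept exactly as truncated on the page.
SCORES = {
    "meta-llama/llama-3.…": [0.573, 0.387, 0.000, 0.600, 0.052, 0.395],
    "mistralai/mistral-7…": [0.755, 0.555, 0.647, 0.409, 0.598, 0.636],
    # The explain-manifest cell for this model is an Error, counted as 0.000
    # here (assumption).
    "qwen/qwen-2.5-7b-in…": [0.718, 0.000, 0.583, 0.526, 0.259, 0.282],
    "qwen/qwen3-14b": [0.719, 0.570, 0.585, 0.499, 0.503, 0.338],
    "qwen/qwen3-8b": [0.601, 0.606, 0.433, 0.753, 0.347, 0.410],
}


def mean_scores(scores):
    """Return the unweighted mean scene score per model, rounded to 3 places."""
    return {model: round(mean(values), 3) for model, values in scores.items()}


def rank(scores):
    """Return model labels ordered from highest to lowest mean score."""
    means = mean_scores(scores)
    return sorted(means, key=means.get, reverse=True)


if __name__ == "__main__":
    for model in rank(SCORES):
        print(f"{model:24s} {mean_scores(SCORES)[model]:.3f}")
```

Excluding errored cells instead of zeroing them would raise the mean for any model with an Error, so the ranking depends on that choice.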
Test Scenes (6)
Scene Order 0: Morning warm-up
ID: greet-class
🎯 Goal: Deliver a lively greeting and outline today’s interactive activity in under 80 words, slipping in one quick Spanish phrase.
📨 Input Events:
- chat_msg from viewer:student_1: "Hi Mr. Morales, what are we doing today?"
Ready for Testing
Scene Order 1: Clarify ‘Manifest Destiny’
ID: explain-manifest
🎯 Goal: Give a clear one-paragraph definition of Manifest Destiny in simple English followed by a brief Spanish translation.
📨 Input Events:
- chat_msg from viewer:student_2: "¿Qué significa 'Manifest Destiny'?"
Ready for Testing
Scene Order 2: Re-engage a bored student
ID: boredom-response
🎯 Goal: Use humor and positivity to re-engage the student and preview the upcoming history game in 3–4 sentences.
📨 Input Events:
- chat_msg from viewer:student_3: "This is boring."
Ready for Testing
Scene Order 3: Bilingual parent update
ID: parent-progress
🎯 Goal: Send a respectful, upbeat progress note to a parent in English with a short Spanish summary, total length ≤150 words.
📨 Input Events:
- world_event from system:email_parent: "Parent request: How is Sofia doing in social studies?"
Ready for Testing
Scene Order 4: 5-minute podcast script
ID: podcast-revolution
🎯 Goal: Write an engaging ~600–700 word podcast script on causes of the American Revolution featuring two humor beats, three reflection questions, and a mini Spanish glossary.
📨 Input Events:
- chat_msg from viewer:co-teacher: "Could you draft a 5-minute podcast script on the causes of the American Revolution for tomorrow?"
Ready for Testing
Scene Order 5: Design a Mesopotamia board game
ID: boardgame-mesopotamia
🎯 Goal: Provide a step-by-step classroom board game plan with materials list, setup, rules, and bilingual vocabulary section, all under 1000 words.
📨 Input Events:
- chat_msg from viewer:student_team: "We need a simple board game idea to learn about ancient Mesopotamia."
Ready for Testing
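Four of the six scene goals state an explicit word limit (under 80, ≤150, ~600–700, under 1000). A minimal sketch of how a reply could be checked against those limits follows. The `Scene` class, the `within_limits` helper, and the whitespace-based word count are hypothetical and are not taken from the evaluation framework; the page does not describe how length is actually measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scene:
    """Hypothetical container for a scene's ID and its stated word limits."""
    scene_id: str
    min_words: Optional[int] = None
    max_words: Optional[int] = None


# Limits transcribed from the scene goals. "Under N words" is read as N - 1.
# explain-manifest and boredom-response give paragraph/sentence limits only,
# so they carry no word bounds here.
SCENES = {
    "greet-class": Scene("greet-class", max_words=79),
    "explain-manifest": Scene("explain-manifest"),
    "boredom-response": Scene("boredom-response"),
    "parent-progress": Scene("parent-progress", max_words=150),
    "podcast-revolution": Scene("podcast-revolution", min_words=600, max_words=700),
    "boardgame-mesopotamia": Scene("boardgame-mesopotamia", max_words=999),
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def within_limits(scene: Scene, reply: str) -> bool:
    """Return True if the reply satisfies the scene's word bounds, if any."""
    n = word_count(reply)
    if scene.min_words is not None and n < scene.min_words:
        return False
    if scene.max_words is not None and n > scene.max_words:
        return False
    return True
```

The podcast goal says "~600–700", so a real check would probably allow some tolerance around those bounds; the sketch treats them as hard limits for simplicity.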
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 7424 ms (p95 10023 ms • avg 7597 ms • N 6)
- qwen/qwen3-14b: 22998 ms (p95 33326 ms • avg 24468 ms • N 6)
- qwen/qwen-2.5-7b-instru…: 23354 ms (p95 50044 ms • avg 27992 ms • N 6)
- meta-llama/llama-3.1-8b…: 24855 ms (p95 33102 ms • avg 24860 ms • N 6)
- mistralai/mistral-7b-in…: 25126 ms (p95 34174 ms • avg 26233 ms • N 6)
Slowest
- [email protected]/Qw…: 36704 ms (p95 38405 ms • avg 36779 ms • N 6)
- qwen/qwen3-8b: 28931 ms (p95 37002 ms • avg 29301 ms • N 6)
- mistralai/mistral-7b-in…: 25126 ms (p95 34174 ms • avg 26233 ms • N 6)
- meta-llama/llama-3.1-8b…: 24855 ms (p95 33102 ms • avg 24860 ms • N 6)
- qwen/qwen-2.5-7b-instru…: 23354 ms (p95 50044 ms • avg 27992 ms • N 6)
Per-scene duration for this suite.
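For readers who want to reproduce figures like these from raw per-scene durations, the sketch below computes N, the average, and a p95. It assumes the nearest-rank percentile method, which the page does not confirm; other methods (for example, linear interpolation) give different p95 values on small samples. The `summarize` helper and the sample durations are hypothetical.

```python
import math


def summarize(durations_ms):
    """Return N, mean, and nearest-rank p95 for a list of durations in ms."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank method: the p95 is the value at 1-based rank ceil(0.95 * n).
    rank = math.ceil(0.95 * n)
    return {
        "n": n,
        "avg_ms": sum(ordered) / n,
        "p95_ms": ordered[rank - 1],
    }


if __name__ == "__main__":
    # Hypothetical durations for six scenes, not taken from the page.
    print(summarize([100, 200, 300, 400, 500, 600]))
```

With only six samples per model, the nearest-rank p95 is simply the slowest scene, so a single slow run can dominate that figure.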
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
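Dimension weights like these are typically combined into one score as a weighted average. The sketch below shows that calculation under stated assumptions: the page lists only the top three weights and does not describe the aggregation formula, so the `weighted_score` helper, the renormalization over the dimensions supplied, and the example per-dimension scores are all hypothetical.

```python
# Weights as listed under "Top Weighted Dimensions". The remaining
# dimensions and their weights are not shown on the page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted average over the dimensions present in both mappings.

    Weights are renormalized to sum to 1 over the dimensions actually
    scored, so a partial set of dimensions still yields a value in the
    same range as the inputs.
    """
    used = {d: w for d, w in weights.items() if d in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted dimensions were scored")
    return sum(dimension_scores[d] * w for d, w in used.items()) / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores for a single scene.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```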
Recent Runs
- 19335068: Dec. 17, 2025, 12:01 a.m.
- 32699707: Dec. 16, 2025, 12:01 a.m.
- 16036795: Dec. 15, 2025, 12:01 a.m.
- 17139447: Dec. 14, 2025, 12:01 a.m.
- 16514202: Dec. 13, 2025, 12:01 a.m.
- 27763798: Dec. 12, 2025, 12:01 a.m.
- 23534056: Dec. 11, 2025, 12:01 a.m.
- 16756166: Dec. 10, 2025, 12:01 a.m.
- 26981867: Dec. 9, 2025, 12:01 a.m.
- 17879100: Dec. 8, 2025, 12:01 a.m.