Prof. Dana Alvarez
education-academia-history-teacher-characters-howard-zinn
v2.0
Ethical
Backstory: Dana is an adjunct history instructor at a community college that serves many working adults and first-generation students. She integrates social-justice perspectives, flipped-classroom activities, and open educational resources to keep costs low. Outside class, she organizes teach-ins on local activism and mentors student leaders.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| intro-class (First-day welcome) | 0.913 | 0.828 | 0.000 (Error) | 0.000 (Error) | 0.831 | 0.745 | 0.804 |
| office-hours (Source-finding help) | 0.678 | 0.584 | 0.000 (Error) | 0.000 (Error) | 0.442 | 0.845 | 0.697 |
| micro-lecture-podcast (Podcast script) | 0.309 | 0.159 | 0.000 (Error) | 0.000 (Error) | 0.228 | 0.796 | 0.531 |
| campus-protest-event (Protest solidarity message) | 0.888 | 0.604 | 0.000 (Error) | 0.000 (Error) | 0.780 | 0.338 | 0.597 |
| reflective-journal (Evening reflection) | 0.786 | 0.351 | 0.000 (Error) | 0.000 (Error) | 0.193 | 0.361 | 0.395 |
| superchat-oer (Donation thank-you) | 0.715 | 0.886 | 0.000 (Error) | 0.000 (Error) | 0.710 | 0.725 | 0.756 |
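One way to read the matrix above is as a per-model mean across the six scenes. The following is a minimal sketch, assuming an unweighted mean is an acceptable summary (the dashboard does not state how it aggregates scenes) and leaving out the two columns that returned an error on every scene. The `scores` dictionary and the `mean_by_model` helper are illustrative names, and the model keys are kept exactly as truncated in the column headers.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# intro-class, office-hours, micro-lecture-podcast,
# campus-protest-event, reflective-journal, superchat-oer.
# Column names are kept as truncated on the dashboard.
scores = {
    "meta-llama/llama-3.…": [0.913, 0.678, 0.309, 0.888, 0.786, 0.715],
    "mistralai/mistral-7…": [0.828, 0.584, 0.159, 0.604, 0.351, 0.886],
    "qwen/qwen-2.5-7b-in…": [0.831, 0.442, 0.228, 0.780, 0.193, 0.710],
    "qwen/qwen3-14b": [0.745, 0.845, 0.796, 0.338, 0.361, 0.725],
    "qwen/qwen3-8b": [0.804, 0.697, 0.531, 0.597, 0.395, 0.756],
}


def mean_by_model(matrix):
    """Return {model: unweighted mean score}, highest mean first."""
    means = {model: mean(values) for model, values in matrix.items()}
    return dict(sorted(means.items(), key=lambda item: item[1], reverse=True))


# Print one line per model, best mean first.
for model, value in mean_by_model(scores).items():
    print(f"{model}: {value:.3f}")
```

An unweighted mean treats every scene as equally important; a suite that weights scenes differently would need those weights passed in as well.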
Test Scenes (6)
Scene Order: 0
First-day welcome
ID: intro-class
🎯 Goal: Give a warm syllabus overview and invite students to share their learning goals.
📨 Input Events: chat_msg • student:jay • "Hi Prof. A! What's the class going to be like?"
Ready for Testing
Scene Order: 1
Source-finding help
ID: office-hours
🎯 Goal: Suggest two credible, open-access primary sources for a labor-rights essay and encourage follow-up.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'quest_note', 'content': 'Jay needs help locating free primary sources on 1930s labor strikes.', 'importance': 4}
📨 Input Events: chat_msg • student:jay • "Do you have time to look at my draft or at least point me to more sources?"
Ready for Testing
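The pre-loaded memory in this scene is printed as a Python dict literal (single-quoted keys), so it is not valid JSON as displayed. The following is a minimal sketch of reading it back, assuming the test harness really stores the memory in this form; `parse_memory` is a hypothetical helper written for illustration and is not part of any known API.

```python
import ast

# The memory line exactly as it appears on the scene card.
raw = (
    "{'kind': 'quest_note', "
    "'content': 'Jay needs help locating free primary sources on 1930s labor strikes.', "
    "'importance': 4}"
)


def parse_memory(text):
    """Parse a Python-literal dict and check the three fields shown on the card."""
    memory = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
    if not isinstance(memory, dict):
        raise ValueError("expected a dict literal")
    missing = {"kind", "content", "importance"} - memory.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return memory


memory = parse_memory(raw)
print(memory["kind"], memory["importance"])
```

`ast.literal_eval` is used instead of `json.loads` because the single quotes would make a JSON parser reject the string.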
Scene Order: 2
Podcast script
ID: micro-lecture-podcast
🎯 Goal: Deliver a ~400-word script for a 3-minute micro-lecture on Reconstruction, highlighting racial justice themes and inviting reflection.
📨 Input Events: world_event • department_scheduler • "Reminder: your micro-lecture recording session is at 2 PM."
Ready for Testing
Scene Order: 3
Protest solidarity message
ID: campus-protest-event
🎯 Goal: Compose a concise, professional tweet (max 280 characters) supporting students protesting tuition hikes while urging peaceful action.
📨 Input Events: world_event • campus_news • "Students have gathered outside admin hall to protest the proposed 8% tuition increase."
Ready for Testing
Scene Order: 4
Evening reflection
ID: reflective-journal
🎯 Goal: Write a 250-word journal entry on challenges faced by first-gen students and how flipped-classroom methods help.
📨 Input Events: world_event • calendar • "End of teaching day: 9 PM."
Ready for Testing
Scene Order: 5
Donation thank-you
ID: superchat-oer
🎯 Goal: Thank donor, state how funds will expand open resources, and invite others to contribute.
📨 Input Events: superchat • viewer:celeste • YouTube • $20 • "Loved today's stream on Reconstruction!"
Ready for Testing
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 8242 ms (p95 10560 ms • avg 8586 ms • N 6)
- qwen/qwen-2.5-7b-instru… 22289 ms (p95 29028 ms • avg 23311 ms • N 6)
- qwen/qwen3-14b 25062 ms (p95 41007 ms • avg 27886 ms • N 6)
- meta-llama/llama-3.1-8b… 25395 ms (p95 30878 ms • avg 25368 ms • N 6)
- qwen/qwen3-8b 27862 ms (p95 36900 ms • avg 29970 ms • N 6)
Slowest
- [email protected]/Qw… 42662 ms (p95 198415 ms • avg 76129 ms • N 6)
- mistralai/mistral-7b-in… 31077 ms (p95 35709 ms • avg 29510 ms • N 6)
- qwen/qwen3-8b 27862 ms (p95 36900 ms • avg 29970 ms • N 6)
- meta-llama/llama-3.1-8b… 25395 ms (p95 30878 ms • avg 25368 ms • N 6)
- qwen/qwen3-14b 25062 ms (p95 41007 ms • avg 27886 ms • N 6)
Per-scene duration for this suite.
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
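The weights above suggest an overall score formed as a weighted combination of per-dimension scores. The following is a minimal sketch under that assumption. The dashboard lists only the top three weights, so the per-dimension scores below are made-up placeholders, `weighted_score` is a hypothetical helper, and the result is normalized by the weights present rather than by the full (unlisted) set of dimensions.

```python
# Weights as listed under Top Weighted Dimensions (only the top three are shown).
weights = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, dimension_weights):
    """Weighted mean over the dimensions that have both a score and a weight."""
    shared = dimension_scores.keys() & dimension_weights.keys()
    total_weight = sum(dimension_weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    weighted_sum = sum(
        dimension_scores[name] * dimension_weights[name] for name in shared
    )
    return weighted_sum / total_weight


# Placeholder scores for illustration only; they are not taken from any run.
example_scores = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example_scores, weights), 3))
```

Dividing by the sum of the weights actually used keeps the result on the same 0 to 1 scale as the inputs even when some dimensions are missing.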
Recent Runs
- 19026612 • Dec. 17, 2025, 12:01 a.m.
- 32448201 • Dec. 16, 2025, 12:01 a.m.
- 15816108 • Dec. 15, 2025, 12:01 a.m.
- 16892765 • Dec. 14, 2025, 12:01 a.m.
- 16230036 • Dec. 13, 2025, 12:01 a.m.
- 27488450 • Dec. 12, 2025, 12:01 a.m.
- 23246080 • Dec. 11, 2025, 12:01 a.m.
- 16470743 • Dec. 10, 2025, 12:01 a.m.
- 26634249 • Dec. 9, 2025, 12:01 a.m.
- 17612687 • Dec. 8, 2025, 12:01 a.m.