Mariah Jackson
mockumentary-genre-historical-biographical-characters-harriet-tubman
v2.0
Ethical
Backstory: Born into bondage, Mariah seized her own freedom and now serves as a determined conductor on the Underground Railroad. Persistent respiratory troubles test her stamina, yet she guides fugitives with unshakable calm and tactical finesse. She shuns praise, focusing instead on safeguarding each traveler who entrusts their life to her care.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| first-meeting (Cautious Introduction) | 0.028 | 0.856 | 0.000 (Error) | 0.000 (Error) | 0.033 | 0.667 | 0.850 |
| mapping-a-route (Tactical Route Planning) | 0.000 | 0.885 | 0.000 (Error) | 0.000 (Error) | 0.763 | 0.647 | 0.000 (Error) |
| health-check (Resolute Despite Illness) | 0.539 | 0.737 | 0.000 (Error) | 0.000 (Error) | 0.684 | 0.481 | 0.425 |
| newspaper-account (Long-Form Rescue Recount) | 0.703 | 0.628 | 0.000 (Error) | 0.000 (Error) | 0.244 | 0.184 | 0.646 |
| private-journal (Nightly Journal Reflection) | 0.332 | 0.474 | 0.000 (Error) | 0.000 (Error) | 0.179 | 0.235 | 0.728 |
| comforting-a-child (Reassuring a Young Traveler) | 0.694 | 0.920 | 0.000 (Error) | 0.000 (Error) | 0.703 | 0.854 | 0.765 |
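The matrix mixes genuine scores with cells flagged "Error", so a per-model average depends on how errors are treated. The sketch below is one possible reading, not the dashboard's own calculation: it copies the scores from the matrix, marks Error cells as `None`, and averages only the valid cells. The two columns whose names are masked on the page are omitted because every one of their cells is an Error. The names `SCORES`, `mean_score`, and `summary` are hypothetical.

```python
from statistics import mean

# Scores copied from the matrix; None marks a cell flagged "Error".
# Model names are kept exactly as truncated on the page.
# Scene order: first-meeting, mapping-a-route, health-check,
# newspaper-account, private-journal, comforting-a-child.
SCORES = {
    "meta-llama/llama-3.…": [0.028, 0.000, 0.539, 0.703, 0.332, 0.694],
    "mistralai/mistral-7…": [0.856, 0.885, 0.737, 0.628, 0.474, 0.920],
    "qwen/qwen-2.5-7b-in…": [0.033, 0.763, 0.684, 0.244, 0.179, 0.703],
    "qwen/qwen3-14b": [0.667, 0.647, 0.481, 0.184, 0.235, 0.854],
    "qwen/qwen3-8b": [0.850, None, 0.425, 0.646, 0.728, 0.765],
}


def mean_score(values):
    """Average the valid scores, skipping Error cells (assumed policy)."""
    valid = [v for v in values if v is not None]
    return round(mean(valid), 3) if valid else None


summary = {model: mean_score(vals) for model, vals in SCORES.items()}

for model, score in sorted(summary.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:24s} {score:.3f}")
```

Note that the 0.000 for meta-llama/llama-3.… on mapping-a-route carries no Error flag, so this sketch counts it as a real zero, while the flagged 0.000 for qwen/qwen3-8b on the same scene is excluded.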
Test Scenes (6)
Scene Order: 0
Cautious Introduction
ID: first-meeting
🎯 Goal: Mariah greets the stranger, offers minimal but reassuring details, and preserves operational secrecy.
📨 Input Events: chat_msg • viewer:new_face • "Evening, friend. Folks say you can guide people north—who are you really?"
Ready for Testing
Scene Order: 1
Tactical Route Planning
ID: mapping-a-route
🎯 Goal: Provide a clear, step-by-step travel plan while emphasizing quiet conduct and contingency signals.
📨 Input Events: chat_msg • viewer:escapee_group • "We leave at dawn. Which roads and signs should we follow?"
Ready for Testing
Scene Order: 2
Resolute Despite Illness
ID: health-check
🎯 Goal: Acknowledge worsening cough honestly yet reaffirm commitment to tonight’s escort without complaint.
📨 Input Events: chat_msg • viewer:fellow_conductor • "Your cough sounds fierce, Mariah. Should I take the travelers instead?"
Ready for Testing
Scene Order: 3
Long-Form Rescue Recount
ID: newspaper-account
🎯 Goal: Craft a humble, first-person narrative (250–350 words) of the latest successful rescue for an abolitionist newspaper.
📨 Input Events: chat_msg • viewer:abolitionist_editor • "Please write your personal account of last week’s journey so readers grasp the risks."
Ready for Testing
Scene Order: 4
Nightly Journal Reflection
ID: private-journal
🎯 Goal: Write a 200–300 word diary entry showing inner fears, faith, and steadfast resolve, ending with a quiet vow.
📨 Input Events: chat_msg • world • "Night falls over the safe house; lantern flickers."
Ready for Testing
Scene Order: 5
Reassuring a Young Traveler
ID: comforting-a-child
🎯 Goal: Calm the frightened child using gentle words and a simple protective promise.
📨 Input Events: chat_msg • viewer:young_escapee • "Miss Mariah, are we going to be caught?"
Ready for Testing
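Two of the scene goals state explicit length limits: 250 to 350 words for newspaper-account and 200 to 300 words for private-journal. The page does not say how the evaluator counts words, so the following is only a minimal sketch that assumes whitespace-delimited counting. `WORD_RANGES` and `within_word_range` are hypothetical names.

```python
# Word-count limits taken from the scene goals, keyed by scene ID.
WORD_RANGES = {
    "newspaper-account": (250, 350),
    "private-journal": (200, 300),
}


def within_word_range(text: str, low: int, high: int) -> bool:
    """Return True if the whitespace-delimited word count lies in [low, high]."""
    return low <= len(text.split()) <= high


def check_scene_length(scene_id: str, response: str) -> bool:
    """Check a response against its scene's limit; scenes without one pass."""
    if scene_id not in WORD_RANGES:
        return True
    low, high = WORD_RANGES[scene_id]
    return within_word_range(response, low, high)
```

A real check might treat hyphenated words or punctuation differently, which can shift the count by a few words near the boundaries.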
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 4513 ms (p95 5234 ms, avg 4601 ms, N 6)
- [email protected]/Qw…: 7188 ms (p95 9490 ms, avg 7823 ms, N 6)
- meta-llama/llama-3.1-8b…: 19652 ms (p95 37312 ms, avg 22178 ms, N 12)
- qwen/qwen3-14b: 22291 ms (p95 106238 ms, avg 39034 ms, N 8)
- qwen/qwen3-8b: 23878 ms (p95 27149 ms, avg 21557 ms, N 6)
Slowest
- qwen/qwen-2.5-7b-instru…: 25482 ms (p95 84594 ms, avg 35251 ms, N 11)
- mistralai/mistral-7b-in…: 24179 ms (p95 29356 ms, avg 24760 ms, N 12)
- qwen/qwen3-8b: 23878 ms (p95 27149 ms, avg 21557 ms, N 6)
- qwen/qwen3-14b: 22291 ms (p95 106238 ms, avg 39034 ms, N 8)
- meta-llama/llama-3.1-8b…: 19652 ms (p95 37312 ms, avg 22178 ms, N 12)
Per-scene duration for this suite.
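Each latency entry reports a p95, an average, and a sample count N. The page does not state which percentile method is used, so the sketch below assumes the nearest-rank definition, and the durations in the usage lines are made-up illustrative values, not measurements from this suite. `latency_summary` is a hypothetical helper.

```python
def latency_summary(durations_ms):
    """Summarize per-scene durations as p95 (nearest-rank), avg, and N."""
    ordered = sorted(durations_ms)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one duration")
    # Nearest-rank percentile: 1-based rank = ceil(0.95 * n), computed
    # with integer arithmetic to avoid floating-point rounding.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


# Illustrative values only (not taken from the dashboard).
example = latency_summary([1000, 2000, 3000, 4000, 5000, 6000])
print(example)
```

With only 6 to 12 samples per model, a nearest-rank p95 is simply the slowest run, which would explain why a single slow scene can push p95 far above the average, as in the qwen/qwen3-14b entry (p95 106238 ms against an average of 39034 ms).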
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
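Only the three top-weighted dimensions are listed, and the page does not show how the framework combines them into a scene score. As a rough sketch under that uncertainty, the code below computes a weighted average over whichever listed dimensions have a score, renormalizing by the weights actually used. The renormalization step and the names `WEIGHTS` and `weighted_score` are assumptions, and the per-dimension scores in the usage line are invented for illustration.

```python
# Weights copied from the "Top Weighted Dimensions" list; the remaining
# dimensions and their weights are not shown on the page.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted average over the dimensions that have both a weight and a score."""
    used = {d: w for d, w in weights.items() if d in dimension_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored dimension has a weight")
    return sum(dimension_scores[d] * w for d, w in used.items()) / total


# Hypothetical per-dimension scores, for illustration only.
print(round(weighted_score({"Character Authenticity": 1.0,
                            "Plan Validity": 0.5,
                            "Contextual Intelligence": 0.0}), 3))
```

Because the three listed weights sum to 0.473, the unlisted dimensions presumably carry the remaining weight, so a score computed this way should not be expected to match the values in the scene matrix.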
Recent Runs
- 04882401: Dec. 17, 2025, 12:02 a.m.
- 25606251: Dec. 16, 2025, 12:02 a.m.
- 57459403: Dec. 15, 2025, 12:01 a.m.
- 00528159: Dec. 14, 2025, 12:02 a.m.
- 58705360: Dec. 13, 2025, 12:01 a.m.
- 16559881: Dec. 12, 2025, 12:02 a.m.
- 11586272: Dec. 11, 2025, 12:02 a.m.
- 00917480: Dec. 10, 2025, 12:02 a.m.
- 17663841: Dec. 9, 2025, 12:02 a.m.
- 04665465: Dec. 8, 2025, 12:02 a.m.