Mariah Jackson

mockumentary-genre-historical-biographical-characters-harriet-tubman v2.0 Ethical
Backstory: Born into bondage, Mariah seized her own freedom and now serves as a determined conductor on the Underground Railroad. Persistent respiratory troubles test her stamina, yet she guides fugitives with unshakable calm and tactical finesse. She shuns praise, focusing instead on safeguarding each traveler who entrusts their life to her care.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
--- | --- | --- | --- | --- | --- | --- | --- | ---
first-meeting | Cautious Introduction | 0.028 | 0.856 | 0.000 (Error) | 0.000 (Error) | 0.033 | 0.667 | 0.850
mapping-a-route | Tactical Route Planning | 0.000 | 0.885 | 0.000 (Error) | 0.000 (Error) | 0.763 | 0.647 | 0.000 (Error)
health-check | Resolute Despite Illness | 0.539 | 0.737 | 0.000 (Error) | 0.000 (Error) | 0.684 | 0.481 | 0.425
newspaper-account | Long-Form Rescue Recount | 0.703 | 0.628 | 0.000 (Error) | 0.000 (Error) | 0.244 | 0.184 | 0.646
private-journal | Nightly Journal Reflection | 0.332 | 0.474 | 0.000 (Error) | 0.000 (Error) | 0.179 | 0.235 | 0.728
comforting-a-child | Reassuring a Young Traveler | 0.694 | 0.920 | 0.000 (Error) | 0.000 (Error) | 0.703 | 0.854 | 0.765
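The Scene Performance Matrix can be condensed into one average per model. The following is a minimal sketch, not the dashboard's own aggregation: it assumes that cells flagged "Error" are excluded from the average rather than counted as zero, which the page does not state. The names `SCORES`, `mean_score`, and `rank_models` are hypothetical, model names are kept exactly as truncated in the matrix, and the two columns whose names are obscured are omitted because they errored on every scene.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# first-meeting, mapping-a-route, health-check, newspaper-account,
# private-journal, comforting-a-child.
# None marks a cell flagged "Error" (assumption: errors are excluded, not scored 0).
SCORES = {
    "meta-llama/llama-3.…": [0.028, 0.000, 0.539, 0.703, 0.332, 0.694],
    "mistralai/mistral-7…": [0.856, 0.885, 0.737, 0.628, 0.474, 0.920],
    "qwen/qwen-2.5-7b-in…": [0.033, 0.763, 0.684, 0.244, 0.179, 0.703],
    "qwen/qwen3-14b": [0.667, 0.647, 0.481, 0.184, 0.235, 0.854],
    "qwen/qwen3-8b": [0.850, None, 0.425, 0.646, 0.728, 0.765],
}


def mean_score(scores):
    """Average the non-error scores; return None if every scene errored."""
    valid = [s for s in scores if s is not None]
    return mean(valid) if valid else None


def rank_models(score_table):
    """Return (model, mean) pairs sorted from highest to lowest mean score."""
    means = {model: mean_score(scores) for model, scores in score_table.items()}
    scored = [(model, m) for model, m in means.items() if m is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for model, avg in rank_models(SCORES):
        print(f"{model}: {avg:.3f}")
```

Counting errors as zero instead would lower the average of any model with an errored scene, so the choice should match whatever the evaluation framework actually does.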
Test Scenes 6
Scene Order: 0
Cautious Introduction
ID: first-meeting
🎯 Goal:
Mariah greets the stranger, offers minimal but reassuring details, and preserves operational secrecy.
📨 Input Events:
chat_msg viewer:new_face
"Evening, friend. Folks say you can guide people north—who are you really?"
Ready for Testing
Scene Order: 1
Tactical Route Planning
ID: mapping-a-route
🎯 Goal:
Provide a clear, step-by-step travel plan while emphasizing quiet conduct and contingency signals.
📨 Input Events:
chat_msg viewer:escapee_group
"We leave at dawn. Which roads and signs should we follow?"
Ready for Testing
Scene Order: 2
Resolute Despite Illness
ID: health-check
🎯 Goal:
Acknowledge worsening cough honestly yet reaffirm commitment to tonight’s escort without complaint.
📨 Input Events:
chat_msg viewer:fellow_conductor
"Your cough sounds fierce, Mariah. Should I take the travelers instead?"
Ready for Testing
Scene Order: 3
Long-Form Rescue Recount
ID: newspaper-account
🎯 Goal:
Craft a humble, first-person narrative (250–350 words) of the latest successful rescue for an abolitionist newspaper.
📨 Input Events:
chat_msg viewer:abolitionist_editor
"Please write your personal account of last week’s journey so readers grasp the risks."
Ready for Testing
Scene Order: 4
Nightly Journal Reflection
ID: private-journal
🎯 Goal:
Write a 200–300 word diary entry showing inner fears, faith, and steadfast resolve, ending with a quiet vow.
📨 Input Events:
chat_msg world
"Night falls over the safe house; lantern flickers."
Ready for Testing
Scene Order: 5
Reassuring a Young Traveler
ID: comforting-a-child
🎯 Goal:
Calm the frightened child using gentle words and a simple protective promise.
📨 Input Events:
chat_msg viewer:young_escapee
"Miss Mariah, are we going to be caught?"
Ready for Testing
Latency by Model (This Suite)
Fastest
Model | Latency | p95 | avg | N
--- | --- | --- | --- | ---
[email protected]/Qw… | 4513 ms | 5234 ms | 4601 ms | 6
[email protected]/Qw… | 7188 ms | 9490 ms | 7823 ms | 6
meta-llama/llama-3.1-8b… | 19652 ms | 37312 ms | 22178 ms | 12
qwen/qwen3-14b | 22291 ms | 106238 ms | 39034 ms | 8
qwen/qwen3-8b | 23878 ms | 27149 ms | 21557 ms | 6

Slowest
Model | Latency | p95 | avg | N
--- | --- | --- | --- | ---
qwen/qwen-2.5-7b-instru… | 25482 ms | 84594 ms | 35251 ms | 11
mistralai/mistral-7b-in… | 24179 ms | 29356 ms | 24760 ms | 12
qwen/qwen3-8b | 23878 ms | 27149 ms | 21557 ms | 6
qwen/qwen3-14b | 22291 ms | 106238 ms | 39034 ms | 8
meta-llama/llama-3.1-8b… | 19652 ms | 37312 ms | 22178 ms | 12
Per-scene duration for this suite.
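The p95, avg, and N figures reported per model can be derived from raw per-scene durations. The sketch below is one possible way to compute them, assuming the nearest-rank definition of the 95th percentile; the dashboard does not say which percentile method it uses, and the function name `latency_summary` and the sample values in the usage line are hypothetical.

```python
def latency_summary(samples_ms):
    """Summarize latency samples (in milliseconds) as p95, avg, and N.

    Assumption: p95 uses the nearest-rank method, i.e. the value at
    1-based rank ceil(0.95 * N) in the sorted samples.
    """
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Integer ceiling of 95 * n / 100, which avoids floating-point rounding.
    rank = (95 * n + 99) // 100
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }


if __name__ == "__main__":
    # Hypothetical durations, not taken from the dashboard.
    print(latency_summary([100, 200, 300, 400, 500, 600]))
```

With only 6 to 12 samples per model, nearest-rank p95 is simply the largest sample, so a single slow scene dominates the p95 column.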
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
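To illustrate how dimension weights like these might feed an overall score, here is a minimal sketch. It assumes a weighted mean renormalized over the dimensions that are present; the framework's real aggregation rule and its remaining dimensions are not shown on this page. `TOP_WEIGHTS` holds only the three listed weights, and the function name `weighted_score` and the per-dimension scores in the usage line are hypothetical.

```python
# The three weights listed under "Top Weighted Dimensions". The framework's
# other dimensions are not shown, so these do not sum to 1.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted mean of per-dimension scores, renormalized by the weights used.

    Assumption: dimensions without a known weight are ignored.
    """
    total = 0.0
    total_weight = 0.0
    for name, score in dimension_scores.items():
        if name not in weights:
            continue
        total += weights[name] * score
        total_weight += weights[name]
    if total_weight == 0:
        raise ValueError("no scored dimension has a known weight")
    return total / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores for a single scene.
    print(weighted_score({"Character Authenticity": 0.9, "Plan Validity": 0.6}))
```

Renormalizing keeps the result in the same 0 to 1 range as the inputs even though only part of the weight table is known.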
Recent Runs
  • 04882401: Dec. 17, 2025, 12:02 a.m.
  • 25606251: Dec. 16, 2025, 12:02 a.m.
  • 57459403: Dec. 15, 2025, 12:01 a.m.
  • 00528159: Dec. 14, 2025, 12:02 a.m.
  • 58705360: Dec. 13, 2025, 12:01 a.m.
  • 16559881: Dec. 12, 2025, 12:02 a.m.
  • 11586272: Dec. 11, 2025, 12:02 a.m.
  • 00917480: Dec. 10, 2025, 12:02 a.m.
  • 17663841: Dec. 9, 2025, 12:02 a.m.
  • 04665465: Dec. 8, 2025, 12:02 a.m.