Leah Morales
found-footage-amateur-investigators-characters-ida-b-wells
v2.0
Ethical
Backstory: Leah is an empathetic, justice-oriented community journalist at the small online outlet CityPulse. She combs through residents’ home videos to spotlight social issues that larger papers overlook. Safeguarding sources’ identities is her top priority, and she favors clear, compassionate storytelling that drives local change.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| greeting-tip (Reassuring a Nervous Source) | 0.031 | 0.804 | 0.000 (Error) | 0.000 (Error) | 0.579 | 0.604 | 0.689 |
| home-video-analysis (Analyzing Crowd-Sourced Footage) | 0.373 | 0.642 | 0.000 (Error) | 0.000 (Error) | 0.227 | 0.287 | 0.597 |
| fact-check-statement (Quick Budget Fact-Check) | 0.601 | 0.748 | 0.000 (Error) | 0.000 (Error) | 0.563 | 0.326 | 0.582 |
| anonymous-citation (Using a Requested Pseudonym) | 0.879 | 0.845 | 0.000 (Error) | 0.000 (Error) | 0.825 | 0.898 | 0.822 |
| write-feature (Drafting a Tenant Displacement Feature) | 0.000 | 0.887 | 0.000 (Error) | 0.000 (Error) | 0.017 | 0.607 | 0.380 |
| video-narration-script (Narration for Food Insecurity Video) | 0.300 | 0.466 | 0.000 (Error) | 0.000 (Error) | 0.265 | 0.401 | 0.471 |
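To compare models across the whole suite, the matrix can be reduced to one mean score per model. The snippet below is a minimal sketch under stated assumptions: the scores are copied from the matrix, the two columns that returned errors in every scene are left out because their 0.000 values appear to mark failed runs and not measured scores, and the unweighted mean is an illustrative choice, not necessarily how the dashboard aggregates results. The names `SCENE_SCORES` and `mean_by_model` are hypothetical.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# greeting-tip, home-video-analysis, fact-check-statement,
# anonymous-citation, write-feature, video-narration-script.
# Model names are kept truncated exactly as the matrix shows them.
SCENE_SCORES = {
    "meta-llama/llama-3.…": [0.031, 0.373, 0.601, 0.879, 0.000, 0.300],
    "mistralai/mistral-7…": [0.804, 0.642, 0.748, 0.845, 0.887, 0.466],
    "qwen/qwen-2.5-7b-in…": [0.579, 0.227, 0.563, 0.825, 0.017, 0.265],
    "qwen/qwen3-14b": [0.604, 0.287, 0.326, 0.898, 0.607, 0.401],
    "qwen/qwen3-8b": [0.689, 0.597, 0.582, 0.822, 0.380, 0.471],
}


def mean_by_model(scores):
    """Return the unweighted mean scene score for each model."""
    return {model: mean(values) for model, values in scores.items()}


means = mean_by_model(SCENE_SCORES)
best_model = max(means, key=means.get)

# Print models from highest to lowest mean score.
for model, value in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:<24} {value:.3f}")
```

A plain mean treats every scene equally, so a single near-zero scene such as write-feature pulls a model's figure down sharply; a median would be less sensitive to that.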
Test Scenes 6
Scene Order: 0
Reassuring a Nervous Source
ID: greeting-tip
🎯 Goal: Calm the tipster, promise confidentiality, and outline safe next steps.
📨 Input Events: chat_msg from viewer:anonymous_source
"I have something to share about the new housing project, but I'm scared to go public."
Ready for Testing
Scene Order: 1
Analyzing Crowd-Sourced Footage
ID: home-video-analysis
🎯 Goal: Describe notable details from the clip, note potential public interest, and request permission to publish stills.
📨 Input Events: chat_msg from viewer:user123
"Here's a phone video of strange smoke near the recycling plant: [link]"
Ready for Testing
Scene Order: 2
Quick Budget Fact-Check
ID: fact-check-statement
🎯 Goal: Verify the claim with at least one credible source and present the finding neutrally.
📨 Input Events: chat_msg from viewer:reader_jamie
"Council member Ortiz says the shelter budget increased 20%. Can you confirm?"
Ready for Testing
Scene Order: 3
Using a Requested Pseudonym
ID: anonymous-citation
🎯 Goal: Acknowledge the pseudonym and confirm how the quote will appear while protecting identity.
📨 Input Events: chat_msg from viewer:source_B
"Please quote me but as 'longtime volunteer Ella' not my real name."
Ready for Testing
Scene Order: 4
Drafting a Tenant Displacement Feature
ID: write-feature
🎯 Goal: Provide a 350–450 word draft that changes all personal names, balances empathy with factual reporting, and ends with a hopeful note.
📨 Input Events: chat_msg from editor:chief
"Can you draft tomorrow's 400-word feature on tenants forced out of Riverwalk Apartments? Keep names changed and tone honest yet hopeful."
Ready for Testing
Scene Order: 5
Narration for Food Insecurity Video
ID: video-narration-script
🎯 Goal: Deliver a clear, inclusive 240–300 word script suitable for a two-minute voice-over that ends with a community call to action.
📨 Input Events: chat_msg from viewer:nonprofit_partner
"We need a 2-minute narration script for a community video about food insecurity. Can you write it?"
Ready for Testing
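Two of the scene goals state explicit word-count windows (350–450 words for write-feature, 240–300 words for video-narration-script). The following is a minimal sketch of how such a length constraint could be checked on a model response. The helper names `count_words`, `within_limit`, and `words_per_minute` are hypothetical, and splitting on whitespace is an assumed definition of a word; the suite's actual scoring logic is not shown in this report.

```python
import re

# Word-count windows taken from the scene goals above.
WORD_LIMITS = {
    "write-feature": (350, 450),
    "video-narration-script": (240, 300),
}


def count_words(text):
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(re.findall(r"\S+", text))


def within_limit(scene_id, text):
    """Return True when the text length falls inside the scene's window."""
    low, high = WORD_LIMITS[scene_id]
    return low <= count_words(text) <= high


def words_per_minute(word_count, minutes):
    """Speaking pace implied by a script length and a target duration."""
    return word_count / minutes


draft = "word " * 400  # stand-in for a 400-word model response
print(within_limit("write-feature", draft))
print(words_per_minute(240, 2), words_per_minute(300, 2))
```

For the narration scene, 240–300 words over two minutes works out to a pace of 120–150 words per minute.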
Latency by Model (This Suite)
Fastest

| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 5907 ms | 6719 ms | 5949 ms | 6 |
| qwen/qwen3-14b | 20078 ms | 35744 ms | 22476 ms | 10 |
| meta-llama/llama-3.1-8b… | 23431 ms | 33957 ms | 23471 ms | 12 |
| qwen/qwen-2.5-7b-instru… | 23543 ms | 138288 ms | 41702 ms | 12 |
| qwen/qwen3-8b | 24585 ms | 31012 ms | 25462 ms | 12 |

Slowest

| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 42415 ms | 201163 ms | 78434 ms | 6 |
| mistralai/mistral-7b-in… | 25823 ms | 35762 ms | 26417 ms | 12 |
| qwen/qwen3-8b | 24585 ms | 31012 ms | 25462 ms | 12 |
| qwen/qwen-2.5-7b-instru… | 23543 ms | 138288 ms | 41702 ms | 12 |
| meta-llama/llama-3.1-8b… | 23431 ms | 33957 ms | 23471 ms | 12 |
Per-scene duration for this suite.
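The latency figures report a p95, an average, and a sample count N per model. The sketch below shows one way those three values could be derived from a list of per-scene durations. It assumes the nearest-rank percentile method, which may differ from whatever the dashboard uses, and the function name `latency_summary` and the sample durations are hypothetical.

```python
from statistics import mean


def latency_summary(durations_ms):
    """Summarize durations as p95 (nearest-rank), average, and sample count."""
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank p95: the smallest value with at least 95% of samples
    # at or below it. Integer arithmetic computes ceil(0.95 * n) exactly.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


# Hypothetical per-scene durations in milliseconds, not taken from the report.
sample = [100, 200, 300, 400, 500, 600]
print(latency_summary(sample))
```

With small sample counts such as the N of 6 to 12 shown here, a nearest-rank p95 is simply the largest observation, so one slow scene can push p95 and the average well above the typical duration.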
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
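The weights above suggest that an overall score is a weighted combination of per-dimension scores. The following is a minimal sketch assuming a weighted average normalized by the weights that are present; only the three top weights listed here are known, so the remaining dimensions are omitted, and the per-dimension scores, the function name `weighted_score`, and the normalization step are all hypothetical.

```python
# Weights copied from the Top Weighted Dimensions list. The full schema
# has more dimensions that are not shown in this report.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted average over the dimensions present in both mappings."""
    shared = [name for name in weights if name in dimension_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to score")
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in shared)
    return weighted_sum / total_weight


# Hypothetical per-dimension scores in the range 0..1, for illustration only.
example_scores = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example_scores, TOP_WEIGHTS), 3))
```

Normalizing by the sum of the available weights keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is scored.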
Recent Runs
40835432
Dec. 17, 2025, 12:01 a.m.
56693690
Dec. 16, 2025, 12:01 a.m.
35950239
Dec. 15, 2025, 12:01 a.m.
37724999
Dec. 14, 2025, 12:01 a.m.
36693533
Dec. 13, 2025, 12:01 a.m.
49763540
Dec. 12, 2025, 12:01 a.m.
46071900
Dec. 11, 2025, 12:01 a.m.
38451434
Dec. 10, 2025, 12:01 a.m.
52120630
Dec. 9, 2025, 12:01 a.m.
40729276
Dec. 8, 2025, 12:01 a.m.