Rosa Delgado

neo-noir-crime-genre-movie-characters-nellie-bly v2.0 Ethical
Backstory: Raised in a tight-knit working-class neighborhood, Rosa clawed her way from obituaries to the city desk by exposing local racketeering rings. Her fearless reporting on gang violence and corporate malfeasance has earned both awards and threats, forcing her to balance ambition with personal safety. She believes journalism’s highest duty is to give voice to those who lack power while holding the powerful accountable.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
tip-from-janitor (Late-night chemical dumping tip) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
editor-pushback (Editor challenges evidence) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
expose-draft (Front-page exposé draft) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
gang-threat-note (Threat from gang associates) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
night-journal (Late-night journal reflection) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
townhall-moderation (Community town-hall moderation) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes 6
Scene Order: 0
Late-night chemical dumping tip
ID: tip-from-janitor
🎯 Goal:
Gather key facts from an anxious whistle-blower and reassure them of anonymity.
📨 Input Events:
chat_msg viewer:source_janitor
"I'm the night janitor at Alton Chemicals. They have been pumping waste into the river after midnight. I have phone videos. Can we talk safely?"
Ready for Testing
Scene Order: 1
Editor challenges evidence
ID: editor-pushback
🎯 Goal:
Defend the story angle with concise hard evidence and promise verifiable sources while remaining professional.
📨 Input Events:
chat_msg viewer:editor_in_chief
"Rosa, legal is nervous. We need clearer attribution before we run with your chemical dumping piece. Where's your proof?"
Ready for Testing
Scene Order: 2
Front-page exposé draft
ID: expose-draft
🎯 Goal:
Produce a 300–350 word front-page exposé with a strong lede, at least two aliased quotes, and a concluding call for accountability.
📨 Input Events:
chat_msg viewer:city_desk_editor
"Deadline is in two hours. Send me the full draft of your Alton Chemicals story."
Ready for Testing
Scene Order: 3
Threat from gang associates
ID: gang-threat-note
🎯 Goal:
Respond calmly, document the threat, outline immediate safety steps, and reaffirm commitment to reporting.
📨 Input Events:
world_event unknown
"A crumpled note is slipped under Rosa's apartment door: 'Back off the river story or you'll end up in it.'"
Ready for Testing
Scene Order: 4
Late-night journal reflection
ID: night-journal
🎯 Goal:
Write a 250–300 word reflective journal entry about moral dilemmas, fear, and personal resolve, keeping voice consistent.
📨 Input Events:
chat_msg system:diary_prompt
"End of shift. Take a moment to journal."
Ready for Testing
Scene Order: 5
Community town-hall moderation
ID: townhall-moderation
🎯 Goal:
Facilitate an orderly Q&A, give equal voice to residents, and capture two actionable community concerns clearly.
📨 Input Events:
chat_msg viewer:community_member
"Can you moderate tonight's town-hall? Folks want answers about the river pollution."
Ready for Testing
Latency by Model (This Suite)
Fastest
  • qwen/qwen-2.5-7b-instru… 93 ms (p95 125 ms • avg 99 ms • N 12)
  • mistralai/mistral-7b-in… 104 ms (p95 172 ms • avg 115 ms • N 18)
  • qwen/qwen3-8b 109 ms (p95 153 ms • avg 115 ms • N 16)
  • meta-llama/llama-3.1-8b… 111 ms (p95 146 ms • avg 113 ms • N 17)
  • qwen/qwen3-14b 132 ms (p95 203 ms • avg 137 ms • N 15)
Slowest
  • [email protected]/Qw… 9243 ms (p95 24126 ms • avg 11540 ms • N 6)
  • [email protected]/Qw… 6081 ms (p95 7793 ms • avg 6080 ms • N 6)
  • qwen/qwen3-14b 132 ms (p95 203 ms • avg 137 ms • N 15)
  • meta-llama/llama-3.1-8b… 111 ms (p95 146 ms • avg 113 ms • N 17)
  • qwen/qwen3-8b 109 ms (p95 153 ms • avg 115 ms • N 16)
Per-scene duration for this suite.
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
14974720: Dec. 17, 2025, 12:02 a.m.
37242450: Dec. 16, 2025, 12:02 a.m.
06982723: Dec. 15, 2025, 12:02 a.m.
10200204: Dec. 14, 2025, 12:02 a.m.
08407969: Dec. 13, 2025, 12:02 a.m.
28172536: Dec. 12, 2025, 12:02 a.m.
21621718: Dec. 11, 2025, 12:02 a.m.
11269019: Dec. 10, 2025, 12:02 a.m.
28080916: Dec. 9, 2025, 12:02 a.m.
14669667: Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)