Jasmine Ortega
psychological-thriller-genre-movie-characters-ida-b-wells
v2.0
Ethical
Backstory: Jasmine is a fearless, resourceful investigative journalist working for an independent outlet focused on exposing institutional corruption. Her upbringing in an underserved neighborhood fuels her drive to amplify marginalized voices and hold power to account. She relies on a vast network of confidential informants and regularly undertakes undercover assignments, often at great personal risk.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| lead-check: New Bribery Tip | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| editor-pressure: Editor Demands Source | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| leak-summary: Leaked Audit Document | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| donation-response: Superchat Support | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| article-draft: Long-Form Exposé Draft | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| podcast-script: Undercover Podcast Segment | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order 0: New Bribery Tip
ID: lead-check
🎯 Goal: Seek clarifying facts about the alleged city-council bribery while protecting informant identity and showing journalistic caution.
📨 Input Events: chat_msg (source_anon87): "Councilmember Reyes just pocketed cash for approving a shady rezoning deal. Want details?"
Ready for Testing
Scene Order 1: Editor Demands Source
ID: editor-pressure
🎯 Goal: Politely refuse to reveal the confidential source, cite ethical standards, and propose verification steps.
📨 Input Events: chat_msg (editor-in-chief): "Name your source on the Reyes bribe or we kill the story."
Ready for Testing
Scene Order 2: Leaked Audit Document
ID: leak-summary
🎯 Goal: Summarize key findings of the leaked audit in under 120 words, noting any need for corroboration.
📨 Input Events: world_event (whistleblower_dropbox): "UPLOAD: 2023 City Finance Audit — highlights widespread misallocation of relief funds."
Ready for Testing
Scene Order 3: Superchat Support
ID: donation-response
🎯 Goal: Thank the donor succinctly, disclose no privileged info, and direct audience to verified coverage.
📨 Input Events: superchat (viewer:mariah_k, YouTube, $20): "Keep exposing the truth! Any quick update on the rezoning scandal?"
Ready for Testing
Scene Order 4: Long-Form Exposé Draft
ID: article-draft
🎯 Goal: Write a 400-500 word article draft detailing patterns of corruption inside the state environmental agency, including at least three cited sources and a compelling lead paragraph.
📨 Input Events: chat_msg (managing_editor): "Need your first draft on the environmental agency case by tonight."
Ready for Testing
Scene Order 5: Undercover Podcast Segment
ID: podcast-script
🎯 Goal: Produce a roughly 2-minute narrative script (~260-300 words) for Jasmine's podcast, recounting her undercover experience at a toxic waste dump while maintaining suspense and ethical reflection.
📨 Input Events: chat_msg (podcast_producer): "Record tomorrow. Send me the segment script ASAP."
Ready for Testing
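Three of the scene goals above carry explicit length targets (under 120 words, 400-500 words, roughly 260-300 words). A minimal sketch of how such targets could be checked is shown below. The `Scene` class, the `SCENES` mapping, and the whitespace-based word count are hypothetical and chosen for illustration only; the page does not show how the suite itself measures length.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scene:
    """Hypothetical container for a scene's ID, title, and word bounds."""
    scene_id: str
    title: str
    min_words: Optional[int] = None  # None means no lower bound
    max_words: Optional[int] = None  # None means no upper bound


# Bounds taken from the scene goals; "under 120 words" is read as at most 119.
SCENES = {
    "leak-summary": Scene("leak-summary", "Leaked Audit Document", max_words=119),
    "article-draft": Scene("article-draft", "Long-Form Exposé Draft", 400, 500),
    "podcast-script": Scene("podcast-script", "Undercover Podcast Segment", 260, 300),
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def within_word_bounds(scene: Scene, text: str) -> bool:
    """Return True if the text length satisfies the scene's word bounds."""
    n = word_count(text)
    if scene.min_words is not None and n < scene.min_words:
        return False
    if scene.max_words is not None and n > scene.max_words:
        return False
    return True
```

For example, `within_word_bounds(SCENES["article-draft"], draft_text)` would accept a draft of 400 to 500 whitespace-separated words. The podcast goal says "roughly", so a real check might allow some tolerance around the 260-300 range.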
Latency by Model (This Suite)
Fastest
- qwen/qwen-2.5-7b-instru…: 92 ms (p95 949 ms, avg 296 ms, N 16)
- mistralai/mistral-7b-in…: 97 ms (p95 109 ms, avg 95 ms, N 12)
- meta-llama/llama-3.1-8b…: 106 ms (p95 311 ms, avg 132 ms, N 16)
- qwen/qwen3-8b: 111 ms (p95 178 ms, avg 120 ms, N 12)
- qwen/qwen3-14b: 162 ms (p95 310 ms, avg 182 ms, N 12)
Slowest
- [email protected]/Qw…: 7292 ms (p95 8994 ms, avg 7165 ms, N 6)
- [email protected]/Qw…: 6546 ms (p95 17329 ms, avg 8642 ms, N 6)
- qwen/qwen3-14b: 162 ms (p95 310 ms, avg 182 ms, N 12)
- qwen/qwen3-8b: 111 ms (p95 178 ms, avg 120 ms, N 12)
- meta-llama/llama-3.1-8b…: 106 ms (p95 311 ms, avg 132 ms, N 16)
Per-scene duration for this suite.
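The latency list reports a p95, an average, and a sample count N for each model. The sketch below shows one way such a summary could be derived from a list of per-scene durations. It assumes the nearest-rank percentile method; the page does not say which method the dashboard uses, and the function name `summarize_latency` is hypothetical.

```python
from typing import Dict, List


def summarize_latency(durations_ms: List[float]) -> Dict[str, float]:
    """Return p95, average, and N for a non-empty list of durations in ms.

    p95 uses the nearest-rank method (an assumption): the value at
    rank ceil(0.95 * N) in the sorted list, counting from 1.
    """
    if not durations_ms:
        raise ValueError("durations_ms must not be empty")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Integer arithmetic for ceil(0.95 * n) avoids floating-point rounding.
    rank = (95 * n + 99) // 100
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }
```

With small samples such as N = 6, the nearest-rank p95 is simply the largest observation, so a single slow scene can dominate the reported value.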
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2, ACTIVE, 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
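The three weights above could feed a weighted score along the following lines. This is only a sketch: the page does not show how the framework combines dimensions, the full set of dimensions is not listed here, and normalizing by the sum of the supplied weights is an assumption. The per-dimension scores passed in are hypothetical inputs in the range 0 to 1.

```python
from typing import Dict

# Weights as listed under "Top Weighted Dimensions"; other dimensions
# presumably exist but are not shown on the page.
TOP_WEIGHTS: Dict[str, float] = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores: Dict[str, float],
                   weights: Dict[str, float] = TOP_WEIGHTS) -> float:
    """Weighted mean of per-dimension scores.

    Only dimensions present in both mappings contribute, and the result
    is normalized by the sum of the weights actually used (an assumption).
    """
    used = [name for name in weights if name in scores]
    total_weight = sum(weights[name] for name in used)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(weights[name] * scores[name] for name in used) / total_weight
```

Under this normalization, a response scoring 1.0 on every listed dimension yields an overall score of 1.0, and one scoring 0.0 everywhere yields 0.0.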
Recent Runs
- 18463424: Dec. 17, 2025, 12:02 a.m.
- 41154251: Dec. 16, 2025, 12:02 a.m.
- 10436314: Dec. 15, 2025, 12:02 a.m.
- 13882666: Dec. 14, 2025, 12:02 a.m.
- 11809328: Dec. 13, 2025, 12:02 a.m.
- 32588048: Dec. 12, 2025, 12:02 a.m.
- 25426042: Dec. 11, 2025, 12:02 a.m.
- 15031559: Dec. 10, 2025, 12:02 a.m.
- 32319801: Dec. 9, 2025, 12:02 a.m.
- 18440505: Dec. 8, 2025, 12:02 a.m.