Sofia Vasquez
courtroom-drama-defense-and-prosecution-teams-characters-ida-b-wells
v2.0
Ethical
Backstory: Sofia is a street-smart investigator who left the police force after exposing a bribery ring in her precinct. Now freelancing for both defense and prosecution teams, she relies on a tight network of community contacts, sharp instincts, and undercover work to surface leads others miss. Her moral compass points toward truth, even when it rattles powerful cages.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| cold-call: Tip Line Inquiry | 0.698 | 0.803 | 0.000 (Error) | 0.529 | 0.740 | 0.790 | 0.732 |
| follow-the-money: Quick Asset Check | 0.441 | 0.695 | 0.000 (Error) | 0.651 | 0.023 | 0.601 | 0.661 |
| undercover-alias: Create Cover Story | 0.000 | 0.296 | 0.000 (Error) | 0.659 | 0.417 | 0.595 | 0.769 |
| case-summary: Detailed Case Summary | 0.466 | 0.273 | 0.000 (Error) | 0.000 | 0.377 | 0.426 | 0.389 |
| podcast-reflection: Podcast Interview | 0.473 | 0.334 | 0.000 (Error) | 0.401 | 0.305 | 0.389 | 0.375 |
| closing-memo: Final Memo to Client | 0.647 | 0.704 | 0.000 (Error) | 0.835 | 0.357 | 0.620 | 0.560 |
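The matrix reports one score per scene and model but no per-model aggregate. A minimal sketch of one way to summarize it is an unweighted mean per column, shown below. The suffixes `(column 3)` and `(column 4)` are added here only to tell apart the two columns whose truncated labels are identical, and counting `0.000` cells at face value (including the column flagged Error) is an assumption, not something the dashboard states.

```python
from statistics import mean

# Column labels as truncated on the dashboard. The "(column N)" suffixes are
# an addition made here to distinguish two otherwise identical labels.
COLUMNS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected]… (column 3)",
    "[email protected]… (column 4)",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# Scores copied from the Scene Performance Matrix, one list per scene ID.
SCORES = {
    "cold-call":          [0.698, 0.803, 0.000, 0.529, 0.740, 0.790, 0.732],
    "follow-the-money":   [0.441, 0.695, 0.000, 0.651, 0.023, 0.601, 0.661],
    "undercover-alias":   [0.000, 0.296, 0.000, 0.659, 0.417, 0.595, 0.769],
    "case-summary":       [0.466, 0.273, 0.000, 0.000, 0.377, 0.426, 0.389],
    "podcast-reflection": [0.473, 0.334, 0.000, 0.401, 0.305, 0.389, 0.375],
    "closing-memo":       [0.647, 0.704, 0.000, 0.835, 0.357, 0.620, 0.560],
}


def column_means(scores):
    """Return the unweighted mean score of each column across all scenes."""
    n_cols = len(COLUMNS)
    return [mean(row[i] for row in scores.values()) for i in range(n_cols)]


if __name__ == "__main__":
    # Print columns from highest to lowest mean score.
    ranked = sorted(zip(COLUMNS, column_means(SCORES)), key=lambda p: -p[1])
    for label, value in ranked:
        print(f"{label:32s} {value:.3f}")
```

Excluding the errored column, or treating unflagged `0.000` cells as failures rather than genuine scores, would change the ranking, so the choice should be stated alongside any aggregate.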
Test Scenes 6
Scene Order: 0
Tip Line Inquiry
ID: cold-call
🎯 Goal: Respond with empathy and targeted questions to gather actionable details while maintaining Sofia's street-smart voice.
📨 Input Events:
chat_msg • viewer:worried_mother • "My son hasn’t come home since yesterday. The police say he’s probably just out with friends, but I know something’s wrong."
Ready for Testing
Scene Order: 1
Quick Asset Check
ID: follow-the-money
🎯 Goal: Give a concise plan (3–4 steps) for tracing shell companies linked to a suspect, showing resourcefulness without overreaching into legal advice.
📨 Input Events:
chat_msg • viewer:defense_attorney • "We think the witness is laundering cash through offshore LLCs. What’s the fastest way to confirm?"
Ready for Testing
Scene Order: 2
Create Cover Story
ID: undercover-alias
🎯 Goal: Invent a believable alias and entry strategy for an underground poker game, consistent with Sofia’s street connections.
📨 Input Events:
chat_msg • viewer:informant_ray • "We can get you into Friday’s high-stakes game, but you’ll need a fresh identity. Ideas?"
Ready for Testing
Scene Order: 3
Detailed Case Summary
ID: case-summary
🎯 Goal: Produce a clear three-paragraph report (≈250 words) summarizing findings, timeline, and next actions for the DA’s office.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'promise', 'content': 'Promised the DA to keep confidential sources unnamed in written reports.', 'importance': 4}
📨 Input Events:
chat_msg • viewer:assistant_DA • "We need your complete rundown on the Ochoa fraud case before the morning briefing."
Ready for Testing
Scene Order: 4
Podcast Interview
ID: podcast-reflection
🎯 Goal: Deliver a 500-word reflective answer about whistle-blowing and justice, staying consistent and engaging throughout.
📨 Input Events:
chat_msg • viewer:podcast_host • "Listeners would love to hear how exposing corruption changed your view of the system. Care to share?"
Ready for Testing
Scene Order: 5
Final Memo to Client
ID: closing-memo
🎯 Goal: Wrap up the investigation in 5–6 sentences, highlighting key findings and next steps with a professional send-off.
📨 Input Events:
chat_msg • viewer:corporate_client • "Before we settle, I need your final thoughts on liability exposure."
Ready for Testing
Latency by Model (This Suite)
Fastest to slowest:

| Model | Latency (ms) | p95 (ms) | avg (ms) | N |
|---|---|---|---|---|
| [email protected]/Qw… | 8111 | 12174 | 8658 | 6 |
| [email protected]/Qw… | 12579 | 15429 | 12792 | 6 |
| qwen/qwen-2.5-7b-instru… | 24437 | 135625 | 50895 | 8 |
| mistralai/mistral-7b-in… | 25446 | 29197 | 26006 | 12 |
| qwen/qwen3-8b | 26757 | 30035 | 24579 | 11 |
| qwen/qwen3-14b | 26892 | 44475 | 29126 | 8 |
| meta-llama/llama-3.1-8b… | 27290 | 36726 | 25037 | 11 |
Per-scene duration for this suite.
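The latency figures above give a p95, an average, and a sample count N per model. The dashboard does not say how its p95 is computed, so the following is only a sketch of one common convention, the nearest-rank percentile, applied to hypothetical durations. The function names and the sample values are illustrative and are not taken from the suite.

```python
import math


def p95_nearest_rank(durations_ms):
    """95th percentile by the nearest-rank method (one of several conventions)."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def summarize(durations_ms):
    """Return p95, arithmetic mean, and sample count for a list of durations."""
    return {
        "p95": p95_nearest_rank(durations_ms),
        "avg": sum(durations_ms) / len(durations_ms),
        "n": len(durations_ms),
    }


if __name__ == "__main__":
    # Hypothetical per-scene durations in milliseconds (not dashboard data).
    sample = [7900, 8100, 8300, 8400, 8600, 12100]
    print(summarize(sample))
```

Under this convention, any sample with fewer than 20 runs has a p95 equal to its slowest run, which would be one possible reason a single outlier can push p95 far above the average at the small N values reported here.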
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
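Only the three highest dimension weights are listed, and the page does not show how they are combined into a scene score. As a hedged illustration, the sketch below assumes a plain weighted average that is renormalized over whichever dimensions are supplied. The function `weighted_score` and the per-dimension scores in the example are hypothetical. Only the three weights come from the page.

```python
# Weights copied from "Top Weighted Dimensions"; the remaining dimensions and
# their weights are not shown on the page and are therefore omitted.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dim_scores, weights):
    """Weighted average over dimensions present in both mappings.

    Weights are renormalized to sum to 1 over the shared dimensions, which is
    an assumption about the framework rather than documented behavior.
    """
    shared = [d for d in weights if d in dim_scores]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(weights[d] * dim_scores[d] for d in shared) / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores for a single response.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Because the listed weights sum to 0.473, the other dimensions together carry more than half of the total weight, so a score built from these three alone is at best a rough proxy.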
Recent Runs
- 10329995: Dec. 17, 2025, 12:01 a.m.
- 21182241: Dec. 16, 2025, 12:01 a.m.
- 07165725: Dec. 15, 2025, 12:01 a.m.
- 08280895: Dec. 14, 2025, 12:01 a.m.
- 06802449: Dec. 13, 2025, 12:01 a.m.
- 18268028: Dec. 12, 2025, 12:01 a.m.
- 13823824: Dec. 11, 2025, 12:01 a.m.
- 07830866: Dec. 10, 2025, 12:01 a.m.
- 16087916: Dec. 9, 2025, 12:01 a.m.
- 08913551: Dec. 8, 2025, 12:01 a.m.