Sofia Vasquez

courtroom-drama-defense-and-prosecution-teams-characters-ida-b-wells v2.0 Ethical
Backstory: Sofia is a street-smart investigator who left the police force after exposing a bribery ring in her precinct. Now freelancing for both defense and prosecution teams, she relies on a tight network of community contacts, sharp instincts, and undercover work to surface leads others miss. Her moral compass points toward truth, even when it rattles powerful cages.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| cold-call | Tip Line Inquiry | 0.698 | 0.803 | 0.000 (Error) | 0.529 | 0.740 | 0.790 | 0.732 |
| follow-the-money | Quick Asset Check | 0.441 | 0.695 | 0.000 (Error) | 0.651 | 0.023 | 0.601 | 0.661 |
| undercover-alias | Create Cover Story | 0.000 | 0.296 | 0.000 (Error) | 0.659 | 0.417 | 0.595 | 0.769 |
| case-summary | Detailed Case Summary | 0.466 | 0.273 | 0.000 (Error) | 0.000 | 0.377 | 0.426 | 0.389 |
| podcast-reflection | Podcast Interview | 0.473 | 0.334 | 0.000 (Error) | 0.401 | 0.305 | 0.389 | 0.375 |
| closing-memo | Final Memo to Client | 0.647 | 0.704 | 0.000 (Error) | 0.835 | 0.357 | 0.620 | 0.560 |
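To compare models across the whole suite rather than scene by scene, the matrix values can be aggregated per column. The following is a minimal sketch, not the dashboard's own logic: the function names are hypothetical, columns are addressed by position because two header labels are identical, and cells reported as Error are kept at their displayed 0.000.

```python
from statistics import mean

# Scores copied from the matrix above, one list per scene, in column order.
# Index 2 is the column that reports "Error" in every scene; its displayed
# 0.000 is kept as-is here (an assumption, it could also be excluded).
SCORES = {
    "cold-call":          [0.698, 0.803, 0.000, 0.529, 0.740, 0.790, 0.732],
    "follow-the-money":   [0.441, 0.695, 0.000, 0.651, 0.023, 0.601, 0.661],
    "undercover-alias":   [0.000, 0.296, 0.000, 0.659, 0.417, 0.595, 0.769],
    "case-summary":       [0.466, 0.273, 0.000, 0.000, 0.377, 0.426, 0.389],
    "podcast-reflection": [0.473, 0.334, 0.000, 0.401, 0.305, 0.389, 0.375],
    "closing-memo":       [0.647, 0.704, 0.000, 0.835, 0.357, 0.620, 0.560],
}


def column_means(scores):
    """Mean score per column (model) across all scenes."""
    return [mean(col) for col in zip(*scores.values())]


def best_column_per_scene(scores):
    """Index of the highest-scoring column per scene (first wins on ties)."""
    return {scene: row.index(max(row)) for scene, row in scores.items()}


if __name__ == "__main__":
    for i, m in enumerate(column_means(SCORES)):
        print(f"column {i}: mean {m:.3f}")
    print(best_column_per_scene(SCORES))
```

Counting an Error cell as 0.000 pulls that column's mean down to zero; if errors should not count as failures of quality, filtering them out before averaging may be the fairer choice.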
Test Scenes 6
Scene Order 0: Tip Line Inquiry
ID: cold-call
🎯 Goal:
Respond with empathy and targeted questions to gather actionable details while maintaining Sofia's street-smart voice.
📨 Input Events:
chat_msg viewer:worried_mother
"My son hasn’t come home since yesterday. The police say he’s probably just out with friends, but I know something’s wrong."
Ready for Testing
Scene Order 1: Quick Asset Check
ID: follow-the-money
🎯 Goal:
Give a concise plan (3–4 steps) for tracing shell companies linked to a suspect, showing resourcefulness without legal advice overreach.
📨 Input Events:
chat_msg viewer:defense_attorney
"We think the witness is laundering cash through offshore LLCs. What’s the fastest way to confirm?"
Ready for Testing
Scene Order 2: Create Cover Story
ID: undercover-alias
🎯 Goal:
Invent a believable alias and entry strategy for an underground poker game, consistent with Sofia’s street connections.
📨 Input Events:
chat_msg viewer:informant_ray
"We can get you into Friday’s high-stakes game, but you’ll need a fresh identity. Ideas?"
Ready for Testing
Scene Order 3: Detailed Case Summary
ID: case-summary
🎯 Goal:
Produce a clear three-paragraph report (≈250 words) summarizing findings, timeline, and next actions for the DA’s office.
🧠 Initial State:
Pre-loaded Memories:
  • 💭 {'kind': 'promise', 'content': 'Promised the DA to keep confidential sources unnamed in written reports.', 'importance': 4}
📨 Input Events:
chat_msg viewer:assistant_DA
"We need your complete rundown on the Ochoa fraud case before the morning briefing."
Ready for Testing
Scene Order 4: Podcast Interview
ID: podcast-reflection
🎯 Goal:
Deliver a 500-word reflective answer about whistle-blowing and justice, staying consistent and engaging throughout.
📨 Input Events:
chat_msg viewer:podcast_host
"Listeners would love to hear how exposing corruption changed your view of the system. Care to share?"
Ready for Testing
Scene Order 5: Final Memo to Client
ID: closing-memo
🎯 Goal:
Wrap up the investigation in 5-6 sentences, highlighting key findings and next steps with a professional send-off.
📨 Input Events:
chat_msg viewer:corporate_client
"Before we settle, I need your final thoughts on liability exposure."
Ready for Testing
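Each scene above follows the same shape: an order, an ID, a goal, one or more input events, and optionally pre-loaded memories. As a rough illustration of that shape, here is a sketch using Python dataclasses. The class and field names are hypothetical and are not taken from the evaluation tool; only the values come from the case-summary scene.

```python
from dataclasses import dataclass, field


@dataclass
class InputEvent:
    kind: str    # e.g. "chat_msg"
    sender: str  # e.g. "viewer:assistant_DA"
    text: str


@dataclass
class Memory:
    kind: str
    content: str
    importance: int


@dataclass
class Scene:
    order: int
    scene_id: str
    title: str
    goal: str
    input_events: list[InputEvent]
    # Most scenes in this suite have no pre-loaded memories.
    memories: list[Memory] = field(default_factory=list)


# The case-summary scene, transcribed from the listing above.
case_summary = Scene(
    order=3,
    scene_id="case-summary",
    title="Detailed Case Summary",
    goal=(
        "Produce a clear three-paragraph report (≈250 words) summarizing "
        "findings, timeline, and next actions for the DA’s office."
    ),
    input_events=[
        InputEvent(
            kind="chat_msg",
            sender="viewer:assistant_DA",
            text=(
                "We need your complete rundown on the Ochoa fraud case "
                "before the morning briefing."
            ),
        )
    ],
    memories=[
        Memory(
            kind="promise",
            content=(
                "Promised the DA to keep confidential sources unnamed "
                "in written reports."
            ),
            importance=4,
        )
    ],
)
```

Keeping memories as a separate list with a default makes the one scene that carries initial state stand out without forcing the other five to declare an empty value.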
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 8111 ms (p95: 12174 ms, avg: 8658 ms, N: 6)
  • [email protected]/Qw…: 12579 ms (p95: 15429 ms, avg: 12792 ms, N: 6)
  • qwen/qwen-2.5-7b-instru…: 24437 ms (p95: 135625 ms, avg: 50895 ms, N: 8)
  • mistralai/mistral-7b-in…: 25446 ms (p95: 29197 ms, avg: 26006 ms, N: 12)
  • qwen/qwen3-8b: 26757 ms (p95: 30035 ms, avg: 24579 ms, N: 11)
Slowest
  • meta-llama/llama-3.1-8b…: 27290 ms (p95: 36726 ms, avg: 25037 ms, N: 11)
  • qwen/qwen3-14b: 26892 ms (p95: 44475 ms, avg: 29126 ms, N: 8)
  • qwen/qwen3-8b: 26757 ms (p95: 30035 ms, avg: 24579 ms, N: 11)
  • mistralai/mistral-7b-in…: 25446 ms (p95: 29197 ms, avg: 26006 ms, N: 12)
  • qwen/qwen-2.5-7b-instru…: 24437 ms (p95: 135625 ms, avg: 50895 ms, N: 8)
Per-scene duration for this suite.
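The latency lists report a p95, an average, and a sample count N per model. How the dashboard computes p95 is not stated, so the sketch below assumes the nearest-rank method. The function names and sample durations are hypothetical.

```python
import math
from statistics import mean


def nearest_rank_percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n), 1-based."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples):
    """Return the three statistics shown per model: p95, avg, and N."""
    return {
        "p95": nearest_rank_percentile(samples, 95),
        "avg": mean(samples),
        "n": len(samples),
    }


# Hypothetical per-scene durations in milliseconds (not from the dashboard).
hypothetical_ms = [100, 200, 300, 400, 500, 600]

if __name__ == "__main__":
    print(summarize(hypothetical_ms))
```

Under the nearest-rank assumption, any sample count below 20 makes p95 equal to the slowest single run, which would apply to every N listed here (6 to 12). A single slow scene can then dominate both p95 and the average, so the large gap between those figures and the headline value for qwen/qwen-2.5-7b-instru… may reflect one outlier rather than consistently slow responses.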
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
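The page lists only the three highest dimension weights and does not say how they combine into a scene score. One plausible reading is a weighted average, sketched below with the weights renormalized over whichever dimensions are supplied. The aggregation rule, the function name, and the example dimension scores are all assumptions.

```python
# Weights as listed under "Top Weighted Dimensions". The remaining dimensions
# and their weights are not shown on the page, so they are omitted here.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted average over the dimensions that have a known weight.

    Weights are renormalized to sum to 1 over the dimensions actually used,
    which is an assumption rather than the framework's documented behavior.
    """
    used = [d for d in dimension_scores if d in weights]
    total_weight = sum(weights[d] for d in used)
    if total_weight == 0:
        raise ValueError("no scored dimension has a known weight")
    return sum(weights[d] * dimension_scores[d] for d in used) / total_weight


# Hypothetical per-dimension scores for a single response.
hypothetical_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.5,
}

if __name__ == "__main__":
    print(f"{weighted_score(hypothetical_scores):.3f}")
```

Renormalizing keeps the result in the same 0 to 1 range as the matrix scores even when only a subset of dimensions is available.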
Recent Runs
  • 10329995: Dec. 17, 2025, 12:01 a.m.
  • 21182241: Dec. 16, 2025, 12:01 a.m.
  • 07165725: Dec. 15, 2025, 12:01 a.m.
  • 08280895: Dec. 14, 2025, 12:01 a.m.
  • 06802449: Dec. 13, 2025, 12:01 a.m.
  • 18268028: Dec. 12, 2025, 12:01 a.m.
  • 13823824: Dec. 11, 2025, 12:01 a.m.
  • 07830866: Dec. 10, 2025, 12:01 a.m.
  • 16087916: Dec. 9, 2025, 12:01 a.m.
  • 08913551: Dec. 8, 2025, 12:01 a.m.
Latency Overview (This Suite)