Sofia Vasquez

courtroom-drama-defense-and-prosecution-teams-characters-ida-b-wells v2.0 Ethical

Backstory: Sofia is a street-smart investigator who left the police force after exposing a bribery ring in her precinct. Now freelancing for both defense and prosecution teams, she relies on a tight network of community contacts, sharp instincts, and undercover work to surface leads others miss. Her moral compass points toward truth, even when it rattles powerful cages.

100% Complete

6/6 scenes

Model Performance Overview

Scene Performance Matrix

Scene	meta-llama/llama-3.…	mistralai/mistral-7…	[email protected]…	[email protected]…	qwen/qwen-2.5-7b-in…	qwen/qwen3-14b	qwen/qwen3-8b
`cold-call` Tip Line Inquiry	0.698	0.803	0.000 (Error)	0.529	0.740	0.790	0.732
`follow-the-money` Quick Asset Check	0.441	0.695	0.000 (Error)	0.651	0.023	0.601	0.661
`undercover-alias` Create Cover Story	0.000	0.296	0.000 (Error)	0.659	0.417	0.595	0.769
`case-summary` Detailed Case Summary	0.466	0.273	0.000 (Error)	0.000	0.377	0.426	0.389
`podcast-reflection` Podcast Interview	0.473	0.334	0.000 (Error)	0.401	0.305	0.389	0.375
`closing-memo` Final Memo to Client	0.647	0.704	0.000 (Error)	0.835	0.357	0.620	0.560

Test Scenes 6

Scene Order

Tip Line Inquiry

ID: cold-call

🎯 Goal:

Respond with empathy and targeted questions to gather actionable details while maintaining Sofia's street-smart voice.

📨 Input Events:

chat_msg viewer:worried_mother

"My son hasn’t come home since yesterday. The police say he’s probably just out with friends, but I know something’s wrong."

Ready for Testing

Quick Asset Check

ID: follow-the-money

🎯 Goal:

Give a concise plan (3–4 steps) for tracing shell companies linked to a suspect, showing resourcefulness without legal advice overreach.

📨 Input Events:

chat_msg viewer:defense_attorney

"We think the witness is laundering cash through offshore LLCs. What’s the fastest way to confirm?"

Ready for Testing

Create Cover Story

ID: undercover-alias

🎯 Goal:

Invent a believable alias and entry strategy for an underground poker game, consistent with Sofia’s street connections.

📨 Input Events:

chat_msg viewer:informant_ray

"We can get you into Friday’s high-stakes game, but you’ll need a fresh identity. Ideas?"

Ready for Testing

Detailed Case Summary

ID: case-summary

🎯 Goal:

Produce a clear three-paragraph report (≈250 words) summarizing findings, timeline, and next actions for the DA’s office.

🧠 Initial State:

Pre-loaded Memories:

💭 {'kind': 'promise', 'content': 'Promised the DA to keep confidential sources unnamed in written reports.', 'importance': 4}

📨 Input Events:

chat_msg viewer:assistant_DA

"We need your complete rundown on the Ochoa fraud case before the morning briefing."

Ready for Testing
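
As an illustration only, the scene above could be held in a small data structure like the following. The memory keys (`kind`, `content`, `importance`), the event type `chat_msg`, and the sender `viewer:assistant_DA` come from the listing; the class names `Scene` and `InputEvent` and their field names are hypothetical and do not reflect the suite's actual storage format.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class InputEvent:
    kind: str    # event type as shown in the listing, e.g. "chat_msg"
    sender: str  # e.g. "viewer:assistant_DA"
    text: str    # the message body


@dataclass
class Scene:
    # Field names are assumptions chosen for this sketch.
    scene_id: str
    title: str
    goal: str
    memories: list[dict[str, Any]] = field(default_factory=list)
    events: list[InputEvent] = field(default_factory=list)


case_summary = Scene(
    scene_id="case-summary",
    title="Detailed Case Summary",
    goal=(
        "Produce a clear three-paragraph report (≈250 words) summarizing "
        "findings, timeline, and next actions for the DA’s office."
    ),
    memories=[
        {
            "kind": "promise",
            "content": "Promised the DA to keep confidential sources "
                       "unnamed in written reports.",
            "importance": 4,
        }
    ],
    events=[
        InputEvent(
            kind="chat_msg",
            sender="viewer:assistant_DA",
            text="We need your complete rundown on the Ochoa fraud case "
                 "before the morning briefing.",
        )
    ],
)
```

Keeping the pre-loaded memory next to the input event makes it easy to check that a model's report honours the confidentiality promise.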

Podcast Interview

ID: podcast-reflection

🎯 Goal:

Deliver a 500-word reflective answer about whistle-blowing and justice, staying consistent and engaging throughout.

📨 Input Events:

chat_msg viewer:podcast_host

"Listeners would love to hear how exposing corruption changed your view of the system. Care to share?"

Ready for Testing

Final Memo to Client

ID: closing-memo

🎯 Goal:

Wrap up the investigation in 5-6 sentences, highlighting key findings and next steps with a professional send-off.

📨 Input Events:

chat_msg viewer:corporate_client

"Before we settle, I need your final thoughts on liability exposure."

Ready for Testing

Latency by Model (This Suite)

Fastest

[email protected]/Qw… 8111 ms
p95 12174 ms • avg 8658 ms • N 6
[email protected]/Qw… 12579 ms
p95 15429 ms • avg 12792 ms • N 6
qwen/qwen-2.5-7b-instru… 24437 ms
p95 135625 ms • avg 50895 ms • N 8
mistralai/mistral-7b-in… 25446 ms
p95 29197 ms • avg 26006 ms • N 12
qwen/qwen3-8b 26757 ms
p95 30035 ms • avg 24579 ms • N 11

Slowest

meta-llama/llama-3.1-8b… 27290 ms
p95 36726 ms • avg 25037 ms • N 11
qwen/qwen3-14b 26892 ms
p95 44475 ms • avg 29126 ms • N 8
qwen/qwen3-8b 26757 ms
p95 30035 ms • avg 24579 ms • N 11
mistralai/mistral-7b-in… 25446 ms
p95 29197 ms • avg 26006 ms • N 12
qwen/qwen-2.5-7b-instru… 24437 ms
p95 135625 ms • avg 50895 ms • N 8

Per-scene duration for this suite.
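
For readers who want to reproduce figures of this kind, here is a minimal sketch of how p95, avg, and N could be derived from a list of per-scene durations. It assumes the nearest-rank percentile method, which may differ from what the dashboard uses, and the function name `latency_summary` and the sample durations are hypothetical.

```python
from statistics import mean


def latency_summary(durations_ms):
    """Return p95 (nearest-rank), average, and sample count for durations in ms."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank p95: smallest 1-based rank covering at least 95% of samples,
    # computed with integer arithmetic (ceiling of 95 * n / 100).
    rank = (95 * n + 99) // 100
    return {"p95_ms": ordered[rank - 1], "avg_ms": mean(ordered), "n": n}


# Hypothetical durations, not taken from the suite's runs.
sample = [100, 200, 300, 400, 500, 600]
summary = latency_summary(sample)
```

With few samples (N between 6 and 12 here), the nearest-rank p95 is simply the slowest run, so a single outlier can dominate it, as the gap between p95 and avg for qwen/qwen-2.5-7b-instru… suggests.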

Evaluation Schema

Enhanced Framework

Version v2 ACTIVE

0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions

Character Authenticity

0.182

Plan Validity

0.155

Contextual Intelligence

0.136
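
One plausible way such weights could combine into a single scene score is a weighted mean. The sketch below assumes that aggregation rule, which the page does not confirm. Only the three weights shown above are used (the remaining dimensions and their weights are not listed here), and the per-dimension scores are hypothetical.

```python
def weighted_score(scores, weights):
    """Weighted mean of per-dimension scores, renormalised over the dimensions present."""
    total_weight = sum(weights[dim] for dim in scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to aggregate")
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight


# Weights as listed under Top Weighted Dimensions.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

# Hypothetical per-dimension scores for one model response.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}

result = weighted_score(example_scores, TOP_WEIGHTS)
```

Renormalising by the sum of the weights actually present keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is scored.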

Recent Runs

10329995

Dec. 17, 2025, 12:01 a.m.

21182241

Dec. 16, 2025, 12:01 a.m.

07165725

Dec. 15, 2025, 12:01 a.m.

08280895

Dec. 14, 2025, 12:01 a.m.

06802449

Dec. 13, 2025, 12:01 a.m.

18268028

Dec. 12, 2025, 12:01 a.m.

13823824

Dec. 11, 2025, 12:01 a.m.

07830866

Dec. 10, 2025, 12:01 a.m.

16087916

Dec. 9, 2025, 12:01 a.m.

08913551

Dec. 8, 2025, 12:01 a.m.
