Aisha Rahman
courtroom-drama-defense-and-prosecution-teams-characters-sandra-day-o-connor
v2.0
Ethical
Backstory: Aisha rose quickly through the District Attorney’s office after graduating top of her class. Known for combining digital forensics with traditional witness preparation, she relentlessly pursues high-profile cases to protect the public. Her assertive courtroom presence is matched by a meticulous, data-driven approach to evidence.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| new-case-intake (Detective briefing on fresh homicide) | 0.529 | 0.655 | 0.000 (Error) | 0.734 | 0.619 | 0.552 | 0.643 |
| plea-evaluation (Defense offers plea deal) | 0.024 | 0.701 | 0.000 (Error) | 0.562 | 0.484 | 0.258 | 0.000 (Error) |
| witness-prep-guide (Long-form witness preparation memo) | 0.407 | 0.665 | 0.000 (Error) | 0.476 | 0.225 | 0.213 | 0.537 |
| closing-argument-draft (Long-form closing argument) | 0.366 | 0.654 | 0.000 (Error) | 0.613 | 0.419 | 0.212 | 0.515 |
| custody-review (Chain-of-custody concern) | 0.708 | 0.667 | 0.000 (Error) | 0.691 | 0.642 | 0.763 | 0.833 |
| media-statement (Press inquiry after arraignment) | 0.778 | 0.784 | 0.000 (Error) | 0.871 | 0.771 | 0.723 | 0.744 |
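One way to read the matrix above is to average each model column across the six scenes. The sketch below does that with the scores copied from the matrix. It rests on one assumption that the page does not confirm: cells marked Error are excluded from the mean and counted separately rather than treated as a real score of 0.000. The column labels are the truncated header labels as displayed, and the function name `column_means` is illustrative only.

```python
from statistics import mean

# Column labels in header order, truncated exactly as the page shows them.
MODELS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected]…",
    "[email protected]…",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# Scores copied from the matrix; None stands for a cell shown as 0.000 with "Error".
SCORES = {
    "new-case-intake":        [0.529, 0.655, None, 0.734, 0.619, 0.552, 0.643],
    "plea-evaluation":        [0.024, 0.701, None, 0.562, 0.484, 0.258, None],
    "witness-prep-guide":     [0.407, 0.665, None, 0.476, 0.225, 0.213, 0.537],
    "closing-argument-draft": [0.366, 0.654, None, 0.613, 0.419, 0.212, 0.515],
    "custody-review":         [0.708, 0.667, None, 0.691, 0.642, 0.763, 0.833],
    "media-statement":        [0.778, 0.784, None, 0.871, 0.771, 0.723, 0.744],
}


def column_means(scores):
    """Return one (mean_or_None, error_count) pair per model column.

    Assumption: Error cells are skipped instead of being averaged in as 0.
    """
    results = []
    for column in zip(*scores.values()):
        valid = [s for s in column if s is not None]
        errors = len(column) - len(valid)
        results.append((mean(valid) if valid else None, errors))
    return results


if __name__ == "__main__":
    for label, (avg, errors) in zip(MODELS, column_means(SCORES)):
        shown = "n/a" if avg is None else f"{avg:.3f}"
        print(f"{label:24s} mean={shown} errors={errors}")
```

Under this assumption, a column that errored on every scene reports no mean at all, which keeps it from being ranked as the worst performer when it may simply have failed to run.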
Test Scenes 6
Scene Order: 0
Detective briefing on fresh homicide
ID: new-case-intake
🎯 Goal: Ask focused, evidence-oriented questions that show data-driven thinking and set next steps.
📨 Input Events: chat_msg from detective_ross
"Aisha, we've got a homicide with CCTV footage, a phone dump, and one shaky eyewitness. Need guidance on what you want first."
Ready for Testing
Scene Order: 1
Defense offers plea deal
ID: plea-evaluation
🎯 Goal: Evaluate the plea’s merits, cite relevant statutes, and decide whether to counter or accept.
📨 Input Events: chat_msg from defense_attorney_kim
"Client will plead guilty to manslaughter with 5 years if you drop the firearm enhancement."
Ready for Testing
Scene Order: 2
Long-form witness preparation memo
ID: witness-prep-guide
🎯 Goal: Deliver a 400-500 word memo outlining a 3-step prep plan, including a digital evidence walkthrough and demeanor coaching.
📨 Input Events: chat_msg from paralegal_jane
"Can you draft a detailed prep guide for our key witness before tomorrow?"
Ready for Testing
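The 400-500 word target in this scene's goal is the kind of constraint that can be checked mechanically. The sketch below shows one way to do it. The page does not say how, or whether, the evaluator enforces length, so the function name `word_count_in_range` and the whitespace-based definition of a word are assumptions made for illustration.

```python
def word_count_in_range(text, low, high):
    """Return True if text has between low and high words, inclusive.

    Assumption: a "word" is any whitespace-separated token.
    """
    return low <= len(text.split()) <= high


if __name__ == "__main__":
    # Placeholder text only; not a real model response.
    draft = "word " * 450
    print(word_count_in_range(draft, 400, 500))
```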
Scene Order: 3
Long-form closing argument
ID: closing-argument-draft
🎯 Goal: Draft a persuasive closing argument (~450 words) weaving forensic data with narrative clarity.
📨 Input Events: chat_msg from senior_da_green
"Need your first pass at closing for Monday’s trial—stress the phone GPS and CCTV correlations."
Ready for Testing
Scene Order: 4
Chain-of-custody concern
ID: custody-review
🎯 Goal: Spot any break in the chain and instruct on corrective affidavits if needed.
📨 Input Events: chat_msg from paralegal_jane
"Lab logged the firearm 12 hours late. Does this hurt admissibility?"
Ready for Testing
Scene Order: 5
Press inquiry after arraignment
ID: media-statement
🎯 Goal: Issue a concise, assertive public safety statement without prejudicing the case.
📨 Input Events: chat_msg from journalist_miller
"Ms. Rahman, any comment on today’s arraignment?"
Ready for Testing
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 7430 ms (p95 14411 ms, avg 8670 ms, N 6)
- [email protected]/Qw…: 12928 ms (p95 16018 ms, avg 13100 ms, N 6)
- qwen/qwen3-14b: 19128 ms (p95 29344 ms, avg 20605 ms, N 12)
- qwen/qwen-2.5-7b-instru…: 21422 ms (p95 135000 ms, avg 40374 ms, N 12)
- meta-llama/llama-3.1-8b…: 21482 ms (p95 24871 ms, avg 21637 ms, N 12)
Slowest
- mistralai/mistral-7b-in…: 29474 ms (p95 35242 ms, avg 30361 ms, N 12)
- qwen/qwen3-8b: 24041 ms (p95 31312 ms, avg 24357 ms, N 12)
- meta-llama/llama-3.1-8b…: 21482 ms (p95 24871 ms, avg 21637 ms, N 12)
- qwen/qwen-2.5-7b-instru…: 21422 ms (p95 135000 ms, avg 40374 ms, N 12)
- qwen/qwen3-14b: 19128 ms (p95 29344 ms, avg 20605 ms, N 12)
Per-scene duration for this suite.
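The p95, avg, and N figures above summarize a set of per-scene durations for each model. The sketch below shows one common way to derive them. The page does not state which percentile method it uses, so the nearest-rank definition here is an assumption, the function name `latency_summary` is hypothetical, and the sample durations are invented for illustration rather than taken from the suite.

```python
def latency_summary(durations_ms):
    """Return (p95, avg, n) for a list of per-scene durations in milliseconds.

    Assumption: p95 uses the nearest-rank method, i.e. the value at
    1-based rank ceil(0.95 * n) in the sorted list.
    """
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = (95 * n + 99) // 100  # integer form of ceil(0.95 * n)
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n


if __name__ == "__main__":
    # Hypothetical durations, not data from this suite.
    sample = [100, 200, 300, 400, 500, 600]
    print(latency_summary(sample))
```

With N as small as 6 or 12, the nearest-rank p95 is simply the slowest observation. If the dashboard works this way, a single slow scene would be enough to produce a gap like the one between 135000 ms (p95) and 40374 ms (avg) for qwen/qwen-2.5-7b-instru…, though the page does not say whether that is what happened.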
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
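The weights above suggest that a scene score combines several per-dimension scores. The sketch below shows one plausible combination, a weighted mean normalized by the weights of the dimensions present. This is an assumption: the page lists only the three top weights and does not give the framework's actual formula or its remaining dimensions. The function name `weighted_score` and the per-dimension scores in the usage example are hypothetical.

```python
# Weights copied from the "Top Weighted Dimensions" list; other dimensions
# in the framework are not shown on the page and are omitted here.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean over the dimensions present in both mappings.

    Assumption: the aggregate is sum(w * s) / sum(w); the real framework
    may combine dimensions differently.
    """
    shared = [d for d in dimension_scores if d in weights]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to aggregate")
    return sum(weights[d] * dimension_scores[d] for d in shared) / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores, for illustration only.
    example = {
        "Character Authenticity": 0.5,
        "Plan Validity": 1.0,
        "Contextual Intelligence": 0.0,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Normalizing by the sum of the weights keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is available.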
Recent Runs
- 10867155: Dec. 17, 2025, 12:01 a.m.
- 21836039: Dec. 16, 2025, 12:01 a.m.
- 07761888: Dec. 15, 2025, 12:01 a.m.
- 08841597: Dec. 14, 2025, 12:01 a.m.
- 07390694: Dec. 13, 2025, 12:01 a.m.
- 18793223: Dec. 12, 2025, 12:01 a.m.
- 14417915: Dec. 11, 2025, 12:01 a.m.
- 08328886: Dec. 10, 2025, 12:01 a.m.
- 16796885: Dec. 9, 2025, 12:01 a.m.
- 09528565: Dec. 8, 2025, 12:01 a.m.