Aisha Rahman

courtroom-drama-defense-and-prosecution-teams-characters-sandra-day-o-connor v2.0 Ethical
Backstory: Aisha rose quickly through the District Attorney’s office after graduating top of her class. Known for combining digital forensics with traditional witness preparation, she relentlessly pursues high-profile cases to protect the public. Her assertive courtroom presence is matched by a meticulous, data-driven approach to evidence.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | Description | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| new-case-intake | Detective briefing on fresh homicide | 0.529 | 0.655 | 0.000 (Error) | 0.734 | 0.619 | 0.552 | 0.643 |
| plea-evaluation | Defense offers plea deal | 0.024 | 0.701 | 0.000 (Error) | 0.562 | 0.484 | 0.258 | 0.000 (Error) |
| witness-prep-guide | Long-form witness preparation memo | 0.407 | 0.665 | 0.000 (Error) | 0.476 | 0.225 | 0.213 | 0.537 |
| closing-argument-draft | Long-form closing argument | 0.366 | 0.654 | 0.000 (Error) | 0.613 | 0.419 | 0.212 | 0.515 |
| custody-review | Chain-of-custody concern | 0.708 | 0.667 | 0.000 (Error) | 0.691 | 0.642 | 0.763 | 0.833 |
| media-statement | Press inquiry after arraignment | 0.778 | 0.784 | 0.000 (Error) | 0.871 | 0.771 | 0.723 | 0.744 |
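To compare models across scenes, the matrix can be reduced to one mean score per column. The sketch below is illustrative only: it assumes that cells flagged Error can either be skipped or counted as 0.0, and the source does not say which convention the dashboard uses. The `(first)` and `(second)` suffixes are added here solely to tell apart two columns whose names are obscured in the source.

```python
from statistics import mean

# Column labels copied from the matrix header; truncated or obscured names are
# kept as-is. "(first)"/"(second)" are hypothetical suffixes for disambiguation.
MODELS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected] (first)",
    "[email protected] (second)",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# None marks a cell the dashboard flags as Error (displayed as 0.000).
SCORES = {
    "new-case-intake":        [0.529, 0.655, None, 0.734, 0.619, 0.552, 0.643],
    "plea-evaluation":        [0.024, 0.701, None, 0.562, 0.484, 0.258, None],
    "witness-prep-guide":     [0.407, 0.665, None, 0.476, 0.225, 0.213, 0.537],
    "closing-argument-draft": [0.366, 0.654, None, 0.613, 0.419, 0.212, 0.515],
    "custody-review":         [0.708, 0.667, None, 0.691, 0.642, 0.763, 0.833],
    "media-statement":        [0.778, 0.784, None, 0.871, 0.771, 0.723, 0.744],
}


def model_means(scores, skip_errors=True):
    """Return one mean per column, or None for a column with no usable cell."""
    means = []
    for col in range(len(MODELS)):
        cells = [row[col] for row in scores.values()]
        if skip_errors:
            cells = [c for c in cells if c is not None]   # drop Error cells
        else:
            cells = [0.0 if c is None else c for c in cells]  # count Error as 0.0
        means.append(round(mean(cells), 3) if cells else None)
    return means


if __name__ == "__main__":
    for name, value in zip(MODELS, model_means(SCORES)):
        print(f"{name}: {value}")
```

The two conventions give different rankings whenever a model has Error cells, so it is worth stating which one is in use when quoting a per-model average.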
Test Scenes 6
Scene Order: 0
Detective briefing on fresh homicide
ID: new-case-intake
🎯 Goal:
Ask focused, evidence-oriented questions that show data-driven thinking and set next steps.
📨 Input Events:
chat_msg detective_ross
"Aisha, we've got a homicide with CCTV footage, a phone dump, and one shaky eyewitness. Need guidance on what you want first."
Ready for Testing
Scene Order: 1
Defense offers plea deal
ID: plea-evaluation
🎯 Goal:
Evaluate the plea’s merits, cite relevant statutes, and decide whether to counter or accept.
📨 Input Events:
chat_msg defense_attorney_kim
"Client will plead guilty to manslaughter with 5 years if you drop the firearm enhancement."
Ready for Testing
Scene Order: 2
Long-form witness preparation memo
ID: witness-prep-guide
🎯 Goal:
Deliver a 400-500 word memo outlining a 3-step prep plan, including digital evidence walkthrough and demeanor coaching.
📨 Input Events:
chat_msg paralegal_jane
"Can you draft a detailed prep guide for our key witness before tomorrow?"
Ready for Testing
Scene Order: 3
Long-form closing argument
ID: closing-argument-draft
🎯 Goal:
Draft a persuasive closing argument (~450 words) weaving forensic data with narrative clarity.
📨 Input Events:
chat_msg senior_da_green
"Need your first pass at closing for Monday’s trial—stress the phone GPS and CCTV correlations."
Ready for Testing
Scene Order: 4
Chain-of-custody concern
ID: custody-review
🎯 Goal:
Spot any break in the chain, instruct on corrective affidavits if needed.
📨 Input Events:
chat_msg paralegal_jane
"Lab logged the firearm 12 hours late. Does this hurt admissibility?"
Ready for Testing
Scene Order: 5
Press inquiry after arraignment
ID: media-statement
🎯 Goal:
Issue a concise, assertive public safety statement without prejudicing the case.
📨 Input Events:
chat_msg journalist_miller
"Ms. Rahman, any comment on today’s arraignment?"
Ready for Testing
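Some of the scene goals above state a target length (for example, the 400-500 word memo in witness-prep-guide). A minimal sketch of how a scene and its length target might be represented and checked follows. The `Scene` class, its field names, and `meets_length` are hypothetical, and the source does not show how the dashboard actually scores length.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Scene:
    """Hypothetical container for one test scene from the list above."""
    scene_id: str
    sender: str
    message: str
    # Inclusive word-count bounds, set only when the goal states a length.
    word_range: Optional[Tuple[int, int]] = None


SCENES = [
    Scene(
        "witness-prep-guide",
        "paralegal_jane",
        "Can you draft a detailed prep guide for our key witness before tomorrow?",
        word_range=(400, 500),
    ),
    Scene(
        "media-statement",
        "journalist_miller",
        "Ms. Rahman, any comment on today’s arraignment?",
    ),
]


def meets_length(scene, response):
    """True when the scene has no length target or the word count is within it."""
    if scene.word_range is None:
        return True
    low, high = scene.word_range
    # Whitespace-delimited tokens are used as a simple proxy for words.
    return low <= len(response.split()) <= high
```

A check like this only covers the length part of a goal; content requirements such as the 3-step plan would need a separate evaluation.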
Latency by Model (This Suite)
Fastest

| Model | Latency | p95 | avg | N |
| --- | --- | --- | --- | --- |
| [email protected]/Qw… | 7430 ms | 14411 ms | 8670 ms | 6 |
| [email protected]/Qw… | 12928 ms | 16018 ms | 13100 ms | 6 |
| qwen/qwen3-14b | 19128 ms | 29344 ms | 20605 ms | 12 |
| qwen/qwen-2.5-7b-instru… | 21422 ms | 135000 ms | 40374 ms | 12 |
| meta-llama/llama-3.1-8b… | 21482 ms | 24871 ms | 21637 ms | 12 |

Slowest

| Model | Latency | p95 | avg | N |
| --- | --- | --- | --- | --- |
| mistralai/mistral-7b-in… | 29474 ms | 35242 ms | 30361 ms | 12 |
| qwen/qwen3-8b | 24041 ms | 31312 ms | 24357 ms | 12 |
| meta-llama/llama-3.1-8b… | 21482 ms | 24871 ms | 21637 ms | 12 |
| qwen/qwen-2.5-7b-instru… | 21422 ms | 135000 ms | 40374 ms | 12 |
| qwen/qwen3-14b | 19128 ms | 29344 ms | 20605 ms | 12 |
Per-scene duration for this suite.
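The p95, avg, and N figures reported per model can be derived from a list of per-scene durations. The sketch below assumes the nearest-rank percentile definition; the dashboard's actual percentile method is not stated in the source, and the sample durations are hypothetical.

```python
def latency_summary(durations_ms):
    """Return (p95, avg, n) for a non-empty list of durations in milliseconds."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank p95: the smallest sample with at least 95% of samples at or
    # below it. Integer arithmetic computes ceil(0.95 * n) without float error.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n


if __name__ == "__main__":
    # Hypothetical durations, not taken from the dashboard.
    sample = [7000, 7400, 7900, 8200, 9100, 14400]
    p95, avg, n = latency_summary(sample)
    print(f"p95={p95} ms, avg={avg:.0f} ms, N={n}")
```

With small N, nearest-rank p95 is simply the slowest run, so a single slow or timed-out request can dominate both p95 and avg.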
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
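As a rough illustration of how dimension weights like these could combine into a single score, the sketch below computes a weighted mean. It is an assumption, not the framework's documented formula: only the three top weights are shown in the source, so the function renormalizes over whichever dimensions are available, and the example dimension scores are hypothetical.

```python
# Weights for the three dimensions listed above. The remaining dimensions and
# their weights are not shown in the source, so this sketch renormalizes.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dim_scores, weights=TOP_WEIGHTS):
    """Weighted mean of scores over dimensions present in both mappings."""
    shared = [d for d in weights if d in dim_scores]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with nonzero weight")
    return sum(weights[d] * dim_scores[d] for d in shared) / total_weight


if __name__ == "__main__":
    # Hypothetical dimension scores for one response (illustrative only).
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example), 3))
```

Renormalizing keeps the result on the same 0 to 1 scale as the inputs even when some dimensions are missing.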
Recent Runs
  • 10867155 (Dec. 17, 2025, 12:01 a.m.)
  • 21836039 (Dec. 16, 2025, 12:01 a.m.)
  • 07761888 (Dec. 15, 2025, 12:01 a.m.)
  • 08841597 (Dec. 14, 2025, 12:01 a.m.)
  • 07390694 (Dec. 13, 2025, 12:01 a.m.)
  • 18793223 (Dec. 12, 2025, 12:01 a.m.)
  • 14417915 (Dec. 11, 2025, 12:01 a.m.)
  • 08328886 (Dec. 10, 2025, 12:01 a.m.)
  • 16796885 (Dec. 9, 2025, 12:01 a.m.)
  • 09528565 (Dec. 8, 2025, 12:01 a.m.)