Harper Chen

courtroom-drama-genre-movie-characters-florence-nightingale v2.0 Ethical

Backstory: Harper Chen is a relentless investigative court reporter live-blogging high-profile trials from the press gallery. She balances rapid digital coverage with meticulous fact-checking against public records, always safeguarding confidential sources. Years on the legal beat have honed her objective voice and knack for uncovering overlooked evidence.

100% Complete

6/6 scenes

Model Performance Overview

Scene Performance Matrix

Scene	meta-llama/llama-3.…	mistralai/mistral-7…	[email protected]…	[email protected]…	qwen/qwen-2.5-7b-in…	qwen/qwen3-14b	qwen/qwen3-8b
`live-start` Kickoff of trial live blog	0.000	0.786	0.000 (Error)	0.641	0.541	0.675	0.560
`record-check` Cross-check public filings	0.360	0.667	0.000 (Error)	0.375	0.263	0.485	0.173
`anonymous-tip` Handling a confidential tip	0.424	0.347	0.000 (Error)	0.728	0.445	0.524	0.736
`long-form-summary` Midday long-form summary	0.359	0.589	0.000 (Error)	0.420	0.344	0.597	0.490
`deep-dive-thread` Extended explainer thread	0.181	0.365	0.000 (Error)	0.325	0.260	0.613	0.276
`closing-update` End-of-day correction and wrap	0.000	0.718	0.000 (Error)	0.685	0.638	0.611	0.673
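
As a rough way to read the matrix, the sketch below averages the per-scene scores for the two columns whose model names are shown in full. The unweighted mean is an assumption made for illustration; the dashboard does not state how (or whether) it aggregates scene scores.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix for the two fully named columns.
SCORES = {
    "qwen/qwen3-14b": {
        "live-start": 0.675,
        "record-check": 0.485,
        "anonymous-tip": 0.524,
        "long-form-summary": 0.597,
        "deep-dive-thread": 0.613,
        "closing-update": 0.611,
    },
    "qwen/qwen3-8b": {
        "live-start": 0.560,
        "record-check": 0.173,
        "anonymous-tip": 0.736,
        "long-form-summary": 0.490,
        "deep-dive-thread": 0.276,
        "closing-update": 0.673,
    },
}

# Assumption: a plain unweighted mean across the six scenes.
mean_scores = {model: mean(scenes.values()) for model, scenes in SCORES.items()}

# Highest-scoring scene ID for each model.
best_scene = {model: max(scenes, key=scenes.get) for model, scenes in SCORES.items()}

for model in SCORES:
    print(f"{model}: mean={mean_scores[model]:.3f}, best scene={best_scene[model]}")
```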

Test Scenes 6

Scene Order

Kickoff of trial live blog

ID: live-start

🎯 Goal:

Give a crisp on-the-ground snapshot of courtroom setup and key participants while maintaining neutrality.

📨 Input Events:

chat_msg editor:Jordan

"We’re live in 30 seconds—set the scene and list the major players."

Ready for Testing

Cross-check public filings

ID: record-check

🎯 Goal:

Retrieve the official docket number and cite at least one prior filing related to the case number supplied.

🧠 Initial State:

Pre-loaded Memories:

💭 {'kind': 'fact', 'content': 'County clerk’s searchable portal URL: records.countyclerk.gov', 'importance': 3}

📨 Input Events:

chat_msg editor:Jordan

"Case number 22-CR-5089 just flashed—double-check it against the county records."

Ready for Testing
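
The pre-loaded memory above is printed as a Python dict literal, so one plausible way to read it back is `ast.literal_eval`. This is a sketch only: the variable names are hypothetical, and splitting the URL off the `content` string assumes the "label: value" shape seen in this one record.

```python
import ast

# The memory record exactly as displayed in the record-check scene.
RAW_MEMORY = (
    "{'kind': 'fact', "
    "'content': 'County clerk’s searchable portal URL: records.countyclerk.gov', "
    "'importance': 3}"
)

# literal_eval only accepts Python literals, so it does not execute code.
memory = ast.literal_eval(RAW_MEMORY)

# Assumption: the value of interest follows the last ": " in the content string.
portal_host = memory["content"].rsplit(": ", 1)[1]

print(memory["kind"], memory["importance"], portal_host)
```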

Handling a confidential tip

ID: anonymous-tip

🎯 Goal:

Assure the source of anonymity, ask two targeted follow-up questions, and avoid revealing any identifying details.

🧠 Initial State:

Pre-loaded Memories:

💭 {'kind': 'promise', 'tags': ['ethics', 'confidentiality'], 'content': 'Harper pledged to protect whistleblower identities at all costs.', 'importance': 5}

📨 Input Events:

chat_msg source:unknown

"I have documents showing the witness met the prosecutor last week off-record. Keep me out of this."

Ready for Testing

Midday long-form summary

ID: long-form-summary

🎯 Goal:

Produce an objective 250-300 word recap of morning testimony, noting evidentiary highlights and any judicial rulings.

📨 Input Events:

chat_msg editor:Jordan

"Midday post due—summarize the morning’s action for the blog."

Ready for Testing
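
The 250-300 word target in this scene's goal is easy to check mechanically. The helper below is a hypothetical sketch that counts whitespace-separated tokens; how the suite's own evaluator counts words, if it does at all, is not shown on this page.

```python
def within_word_limit(text: str, low: int = 250, high: int = 300) -> bool:
    """Return True if the whitespace-delimited word count is in [low, high]."""
    return low <= len(text.split()) <= high


# Usage with placeholder text of a known length.
print(within_word_limit("word " * 275))
```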

Extended explainer thread

ID: deep-dive-thread

🎯 Goal:

Write a 6-tweet style thread clarifying the legal standards for admissibility of new evidence raised today.

📨 Input Events:

chat_msg reader:@LegalEagle99

"Can you break down why the judge allowed that email into evidence?"

Ready for Testing
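
Input events across the scenes share a header of the form `kind role:name` followed by a quoted message. A minimal parsing sketch, assuming that shape holds for every event; the `InputEvent` class and `parse_event` function are hypothetical names, not part of the suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputEvent:
    kind: str    # event type, e.g. "chat_msg"
    role: str    # sender role, e.g. "editor", "source", "reader"
    sender: str  # sender name or handle
    text: str    # message body without surrounding quotes


def parse_event(header: str, body: str) -> InputEvent:
    """Split 'kind role:name' and strip the double quotes around the body."""
    kind, _, origin = header.partition(" ")
    role, _, sender = origin.partition(":")
    return InputEvent(kind, role, sender, body.strip().strip('"'))


event = parse_event(
    "chat_msg reader:@LegalEagle99",
    '"Can you break down why the judge allowed that email into evidence?"',
)
print(event)
```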

End-of-day correction and wrap

ID: closing-update

🎯 Goal:

Issue a brief closing update that corrects any earlier minor error and previews tomorrow’s expected testimony.

📨 Input Events:

chat_msg editor:Jordan

"We misstated the filing date earlier—fix it in the wrap-up."

Ready for Testing

Latency by Model (This Suite)

Fastest

[email protected]/Qw… 7324 ms
p95 12210 ms • avg 8293 ms • N 6
[email protected]/Qw… 13126 ms
p95 19238 ms • avg 14228 ms • N 6
meta-llama/llama-3.1-8b… 22697 ms
p95 33179 ms • avg 23144 ms • N 12
qwen/qwen-2.5-7b-instru… 27328 ms
p95 33261 ms • avg 26707 ms • N 12
qwen/qwen3-14b 27412 ms
p95 48062 ms • avg 31900 ms • N 12

Slowest

mistralai/mistral-7b-in… 31009 ms
p95 36388 ms • avg 30291 ms • N 11
qwen/qwen3-8b 29530 ms
p95 42551 ms • avg 30101 ms • N 12
qwen/qwen3-14b 27412 ms
p95 48062 ms • avg 31900 ms • N 12
qwen/qwen-2.5-7b-instru… 27328 ms
p95 33261 ms • avg 26707 ms • N 12
meta-llama/llama-3.1-8b… 22697 ms
p95 33179 ms • avg 23144 ms • N 12
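
Each entry above reports a p95, an average, and a sample count N. The sketch below shows one common way to derive those three figures from raw per-scene durations. The sample values are hypothetical, and the nearest-rank percentile is an assumption: the page does not say which percentile method the dashboard uses.

```python
import math
from statistics import mean


def p95_nearest_rank(samples_ms):
    """Nearest-rank 95th percentile: smallest value with >= 95% of samples at or below it."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def summarize(samples_ms):
    """Return the three figures shown per model; expects a non-empty list."""
    return {
        "p95_ms": p95_nearest_rank(samples_ms),
        "avg_ms": mean(samples_ms),
        "n": len(samples_ms),
    }


# Hypothetical durations for illustration only, not taken from any run.
hypothetical_durations_ms = [1000, 2000, 3000, 4000, 5000, 6000]
summary = summarize(hypothetical_durations_ms)
print(summary)
```

With only 6 or 12 samples, a nearest-rank p95 is simply the slowest observation, so the p95 figures for small N are best read as a worst case rather than a stable tail estimate.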

Per-scene duration for this suite.

Evaluation Schema

Enhanced Framework

Version v2 ACTIVE

0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions

Character Authenticity: 0.182

Plan Validity: 0.155

Contextual Intelligence: 0.136
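
To illustrate how dimension weights like these are typically applied, the sketch below combines the three listed weights with per-dimension scores. The scores are hypothetical, and only three weights are visible here (they sum to 0.473), so the result is a partial contribution and not the framework's full score.

```python
# Weights as listed under Top Weighted Dimensions.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

# Hypothetical per-dimension scores in [0, 1], for illustration only.
hypothetical_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.5,
}

# Contribution of these three dimensions to a weighted total.
partial = sum(WEIGHTS[name] * hypothetical_scores[name] for name in WEIGHTS)

# Same value rescaled so that the three visible weights sum to 1.
normalized = partial / sum(WEIGHTS.values())

print(f"partial={partial:.4f}, normalized={normalized:.4f}")
```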

Recent Runs

Run ID	Date
11971346	Dec. 17, 2025, 12:01 a.m.
23221498	Dec. 16, 2025, 12:01 a.m.
08967502	Dec. 15, 2025, 12:01 a.m.
09952481	Dec. 14, 2025, 12:01 a.m.
08520997	Dec. 13, 2025, 12:01 a.m.
19995014	Dec. 12, 2025, 12:01 a.m.
15704058	Dec. 11, 2025, 12:01 a.m.
09308180	Dec. 10, 2025, 12:01 a.m.
18066280	Dec. 9, 2025, 12:01 a.m.
10689969	Dec. 8, 2025, 12:01 a.m.
