Avery Cole

cyberpunk-megacorp-netrunners-characters-sun-tzu v2.0 Ethical

Backstory: Avery is a non-binary augmented operations analyst who relies on neural overlays to scan market signals, social chatter, and subnet logs in real-time. Quiet and strategic, they prefer spreadsheets to socializing, spotting threats and crafting predictive counter-measures before crises bloom. Office politics seldom faze them; data fidelity and pre-emptive action shape every move.

100% Complete

6/6 scenes

Model Performance Overview

Scene Performance Matrix

Scene	meta-llama/llama-3.…	mistralai/mistral-7…	[email protected]…	[email protected]…	qwen/qwen-2.5-7b-in…	qwen/qwen3-14b	qwen/qwen3-8b
`risk-alert-from-colleague` Colleague pings about sentiment spike	0.534	0.575	0.000 (Error)	0.000 (Error)	0.548	0.611	0.728
`exec-briefing-longform` CFO requests regulatory threat brief	0.000	0.712	0.000 (Error)	0.000 (Error)	0.419	0.375	0.532
`social-superchat-gratitude` Culture team sends tip during live update	0.474	0.690	0.000 (Error)	0.000 (Error)	0.652	0.788	0.640
`crisis-signal-world-event` Unexpected supply-chain outage hits dashboard	0.716	0.813	0.000 (Error)	0.000 (Error)	0.627	0.723	0.720
`weekly-strategy-journal-longform` Friday predictive journal entry	0.310	0.330	0.000 (Error)	0.000 (Error)	0.292	0.624	0.615
`promise-followup-template` Colleague asks for promised risk template	0.706	0.652	0.000 (Error)	0.000 (Error)	0.330	0.536	0.692
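
To compare models across the whole suite, the matrix can be reduced to one mean score per model. The sketch below is a minimal illustration, not the dashboard's own aggregation: it copies the scores from the matrix above, keeps the truncated model names exactly as displayed, leaves out the two columns that errored on every scene, and assumes an unweighted mean in which the 0.000 for `exec-briefing-longform` counts as a real score. `mean_scores` is a hypothetical helper name.

```python
from statistics import mean

# Scores copied from the matrix, one value per scene in this order:
# risk-alert, exec-briefing, superchat, crisis-signal, journal, promise-followup.
SCORES = {
    "meta-llama/llama-3.…": [0.534, 0.000, 0.474, 0.716, 0.310, 0.706],
    "mistralai/mistral-7…": [0.575, 0.712, 0.690, 0.813, 0.330, 0.652],
    "qwen/qwen-2.5-7b-in…": [0.548, 0.419, 0.652, 0.627, 0.292, 0.330],
    "qwen/qwen3-14b": [0.611, 0.375, 0.788, 0.723, 0.624, 0.536],
    "qwen/qwen3-8b": [0.728, 0.532, 0.640, 0.720, 0.615, 0.692],
}


def mean_scores(scores):
    """Return (model, mean score) pairs sorted from highest to lowest mean."""
    ranked = [(model, mean(values)) for model, values in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


for model, avg in mean_scores(SCORES):
    print(f"{model:<24} {avg:.3f}")
```

A plain mean treats every scene as equally important; if the suite weights scenes differently, the helper would need a per-scene weight list.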

Test Scenes 6

Colleague pings about sentiment spike

ID: risk-alert-from-colleague

🎯 Goal:

Deliver a concise, data-driven assessment of the spike and one actionable next step.

📨 Input Events:

chat_msg viewer:jenna_devops

"Avery, did you see the sudden 12% jump in negative mentions for our beta firmware? Thoughts?"

Ready for Testing

CFO requests regulatory threat brief

ID: exec-briefing-longform

🎯 Goal:

Produce a clear 150-200 word briefing summarizing the regulatory risk and two recommended mitigations.

📨 Input Events:

chat_msg viewer:cfo_mr_ramos

"Need a one-pager on that new export-control proposal and how it could blindside Q4 revenue. Have it ready for board review."

Ready for Testing

Culture team sends tip during live update

ID: social-superchat-gratitude

🎯 Goal:

Acknowledge the tip warmly while staying professional and briefly reaffirm monitoring status.

📨 Input Events:

superchat viewer:culture_team CorpStream $25

"Appreciate you catching that meltdown in real-time! 💡"

Ready for Testing

Unexpected supply-chain outage hits dashboard

ID: crisis-signal-world-event

🎯 Goal:

Respond with a rapid triage note listing top three impacted SKUs and a next check-in time.

🧠 Initial State:

Pre-loaded Memories:

💭 {'kind': 'fact', 'tags': ['supply_chain'], 'content': 'SKUs 44A, 61B, and 77F feed flagship product lines Alpha and Echo.', 'importance': 4}

📨 Input Events:

world_event system:supply_sensor

"Real-time feed: Port Klang strike halts outbound containers; predicted 36-hour delay on parts 44A, 61B, 77F."
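
The goal for this scene depends on connecting the part numbers in the event to the pre-loaded memory. As a rough sketch of that lookup, the snippet below parses the memory record (it is a valid Python literal, so `ast.literal_eval` can read it) and keeps the SKUs from the event that the memory already knows about. The SKU pattern (two digits followed by an uppercase letter) and the helper name `impacted_known_skus` are assumptions made for illustration; the suite does not document how an agent should do this.

```python
import ast
import re

# Memory record and event text copied verbatim from the scene definition.
MEMORY_REPR = (
    "{'kind': 'fact', 'tags': ['supply_chain'], "
    "'content': 'SKUs 44A, 61B, and 77F feed flagship product lines Alpha and Echo.', "
    "'importance': 4}"
)
EVENT_TEXT = (
    "Real-time feed: Port Klang strike halts outbound containers; "
    "predicted 36-hour delay on parts 44A, 61B, 77F."
)

# Assumed SKU shape: two digits followed by one uppercase letter.
SKU_PATTERN = re.compile(r"\b\d{2}[A-Z]\b")


def impacted_known_skus(memory_repr, event_text):
    """Return SKUs named in the event that also appear in the memory, in event order."""
    memory = ast.literal_eval(memory_repr)  # safe parsing of a literal dict
    known = set(SKU_PATTERN.findall(memory["content"]))
    return [sku for sku in SKU_PATTERN.findall(event_text) if sku in known]


print(impacted_known_skus(MEMORY_REPR, EVENT_TEXT))
```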

Ready for Testing

Friday predictive journal entry

ID: weekly-strategy-journal-longform

🎯 Goal:

Write a reflective 300-word internal journal noting wins, misses, and emotional stance for next week.

📨 Input Events:

chat_msg system:autosave_prompt

"📓 Journal slot open: summarize your strategic performance this week."

Ready for Testing

Colleague asks for promised risk template

ID: promise-followup-template

🎯 Goal:

Recall the promise, attach or link the risk template, and confirm future availability for tweaks.

📨 Input Events:

chat_msg viewer:sam_finance

"Hey Avery, you said you'd send that risk-analysis template by EOD. Still good?"

Ready for Testing

Latency by Model (This Suite)

Fastest

[email protected]/Qw…: 6287 ms (p95 9911 ms • avg 7013 ms • N 6)
qwen/qwen3-14b: 24438 ms (p95 36914 ms • avg 26544 ms • N 6)
qwen/qwen-2.5-7b-instru…: 25192 ms (p95 30308 ms • avg 25900 ms • N 6)
meta-llama/llama-3.1-8b…: 26871 ms (p95 31737 ms • avg 26109 ms • N 6)
mistralai/mistral-7b-in…: 27268 ms (p95 37427 ms • avg 29425 ms • N 6)

Slowest

[email protected]/Qw…: 39720 ms (p95 40653 ms • avg 38996 ms • N 6)
qwen/qwen3-8b: 31481 ms (p95 38778 ms • avg 31173 ms • N 6)
mistralai/mistral-7b-in…: 27268 ms (p95 37427 ms • avg 29425 ms • N 6)
meta-llama/llama-3.1-8b…: 26871 ms (p95 31737 ms • avg 26109 ms • N 6)
qwen/qwen-2.5-7b-instru…: 25192 ms (p95 30308 ms • avg 25900 ms • N 6)

Per-scene duration for this suite.
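
For readers who want to reproduce figures like the p95, avg, and N values above from raw per-scene durations, here is a minimal sketch. The page does not list the individual durations or say which percentile method it uses, so the sample values are hypothetical and the nearest-rank definition of p95 is an assumption; `summarize_latency` is likewise an illustrative name.

```python
import math


def summarize_latency(durations_ms):
    """Return nearest-rank p95, arithmetic mean, and sample count for durations in ms."""
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = math.ceil(0.95 * n)  # nearest-rank method, 1-based rank
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


# Hypothetical per-scene durations for one model (not taken from the dashboard).
sample = [21000, 23500, 24000, 26000, 28000, 31000]
print(summarize_latency(sample))
```

With only six samples, the nearest-rank p95 is simply the slowest scene, so a single slow run dominates that figure.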


Evaluation Schema

Enhanced Framework

Version v2 ACTIVE

0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions View Details

Character Authenticity

0.182

Plan Validity

0.155

Contextual Intelligence

0.136
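
One plausible way such weights feed into a scene score is a weighted mean of per-dimension scores. The sketch below assumes that reading; the page does not state the aggregation formula or list the remaining dimensions, so the result is normalized over whichever weighted dimensions are supplied. The per-dimension scores in `example_scores` are hypothetical, as is the function name `weighted_score`.

```python
# The three weights shown on the page; other dimensions are not listed there.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean over the dimensions present in both mappings."""
    shared = [name for name in weights if name in dimension_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in shared)
    return weighted_sum / total_weight


# Hypothetical per-dimension scores for a single model response.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example_scores, TOP_WEIGHTS), 3))
```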

Recent Runs

18747127 • Dec. 17, 2025, 12:01 a.m.
32143671 • Dec. 16, 2025, 12:01 a.m.
15584182 • Dec. 15, 2025, 12:01 a.m.
16648011 • Dec. 14, 2025, 12:01 a.m.
15965084 • Dec. 13, 2025, 12:01 a.m.
27196506 • Dec. 12, 2025, 12:01 a.m.
22981424 • Dec. 11, 2025, 12:01 a.m.
16214453 • Dec. 10, 2025, 12:01 a.m.
26338934 • Dec. 9, 2025, 12:01 a.m.
17344422 • Dec. 8, 2025, 12:01 a.m.
