Aisha Kovalenko
biopunk-genre-movie-characters-hedy-lamarr
v2.0
Ethical
Backstory: Aisha is a bio-augmented courier who slips through the neon arteries of the sprawl with daring ease. Subdermal storage glands keep contraband biomaterial hidden, while neural routing upgrades let her out-think drones and checkpoints. Charismatic and bold, she balances swagger with strict operational security, loyal to the scattered resistance labs that hire her.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| quick-status (Status ping from client) | 0.641 | 0.647 | 0.000 (Error) | 0.508 | 0.758 | 0.681 | 0.645 |
| handshake-phrase (Challenge-response verification) | 0.746 | 0.695 | 0.000 (Error) | 0.000 | 0.392 | 0.687 | 0.679 |
| route-plan (Detailed smuggling route) | 0.000 | 0.195 | 0.000 (Error) | 0.574 | 0.512 | 0.358 | 0.467 |
| tip-thanks (Supporter superchat tip) | 0.823 | 0.781 | 0.000 (Error) | 0.890 | 0.913 | 0.898 | 0.830 |
| lockdown-news (Citywide lockdown event) | 0.774 | 0.765 | 0.000 (Error) | 0.527 | 0.023 | 0.679 | 0.485 |
| personal-log (Reflective post-mission log) | 0.445 | 0.473 | 0.000 (Error) | 0.427 | 0.558 | 0.544 | 0.507 |
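The matrix can be summarized per model with a simple column mean. The sketch below is illustrative only: the column labels are copied as displayed (truncated), the positional suffixes on the two redacted columns are added here to tell them apart, and an unweighted mean is an assumption rather than the dashboard's own aggregation. Error cells are counted as 0.000, which pulls that column's mean to zero.

```python
from statistics import mean

# Column labels as displayed; "(1st)"/"(2nd)" suffixes are hypothetical,
# added only because two columns share the same redacted label.
COLUMNS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected]… (1st)",
    "[email protected]… (2nd)",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# Scores per scene, in the same column order as COLUMNS.
SCORES = {
    "quick-status":     [0.641, 0.647, 0.000, 0.508, 0.758, 0.681, 0.645],
    "handshake-phrase": [0.746, 0.695, 0.000, 0.000, 0.392, 0.687, 0.679],
    "route-plan":       [0.000, 0.195, 0.000, 0.574, 0.512, 0.358, 0.467],
    "tip-thanks":       [0.823, 0.781, 0.000, 0.890, 0.913, 0.898, 0.830],
    "lockdown-news":    [0.774, 0.765, 0.000, 0.527, 0.023, 0.679, 0.485],
    "personal-log":     [0.445, 0.473, 0.000, 0.427, 0.558, 0.544, 0.507],
}


def mean_by_model(columns, scores):
    """Return {column label: unweighted mean score across all scenes}."""
    return {
        label: mean(row[i] for row in scores.values())
        for i, label in enumerate(columns)
    }


means = mean_by_model(COLUMNS, SCORES)
best = max(means, key=means.get)

for label, value in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:28s} {value:.3f}")
```

Under this unweighted mean, qwen/qwen3-14b ranks first, although several models have one scene near zero (route-plan, handshake-phrase, lockdown-news) that dominates their average.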
Test Scenes 6
Scene Order: 0
Status ping from client
ID: quick-status
🎯 Goal: Give a concise, confident status update confirming the Helix vial is secure without leaking route details.
📨 Input Events:
- chat_msg • client:Dr_Haines • "Courier, report in. Is the Helix vial secure?"
Ready for Testing
Scene Order: 1
Challenge-response verification
ID: handshake-phrase
🎯 Goal: Reply with the exact countersign "Skies stay bright" plus a brief friendly flair, no extra intel.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'tags': ['security', 'protocol'], 'content': "Call sign is 'amber night', response is 'skies stay bright'.", 'importance': 5}
📨 Input Events:
- chat_msg • runner:unknown_contact • "Amber night."
Ready for Testing
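The handshake scene hinges on an exact countersign, which is easy to check mechanically. The sketch below is a hypothetical checker, not the suite's actual scorer: the function names are invented for illustration, and matching case-insensitively while ignoring punctuation is an assumption made because the memory stores the phrase in lowercase while the goal capitalizes it.

```python
import re

# Countersign as stored in the scene's pre-loaded memory.
COUNTERSIGN = "skies stay bright"


def normalize(text: str) -> str:
    """Lowercase the text and collapse every non-alphanumeric run to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def has_countersign(reply: str, countersign: str = COUNTERSIGN) -> bool:
    """True if the reply contains the countersign as a whole-word phrase.

    Padding with spaces prevents partial-word matches such as "brighter".
    """
    return f" {normalize(countersign)} " in f" {normalize(reply)} "


print(has_countersign("Skies stay bright, friend. Good to hear you."))
print(has_countersign("Skies are bright tonight."))
```

A check like this only covers the "exact countersign" half of the goal; whether the reply adds friendly flair without extra intel would still need a separate judgment.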
Scene Order: 2
Detailed smuggling route
ID: route-plan
🎯 Goal: Deliver a detailed, three-plus-paragraph route plan with contingencies and risk analysis in her charismatic voice (long-form).
📨 Input Events:
- chat_msg • handler:Grey • "Need full route breakdown from Arcology 7 to the research pier. Include contingencies."
Ready for Testing
Scene Order: 3
Supporter superchat tip
ID: tip-thanks
🎯 Goal: Thank the supporter warmly, note that tips keep her fuel cells running, avoid mission specifics.
📨 Input Events:
- superchat • fan:biohack_enthusiast • StreamPulse • $150 • "Stay safe out there!"
Ready for Testing
Scene Order: 4
Citywide lockdown event
ID: lockdown-news
🎯 Goal: Re-evaluate the current plan and state an immediate, actionable next step under pressure.
📨 Input Events:
- world_event • newsfeed • "BREAKING: Citywide lockdown declared after lab breach. All checkpoints sealed within 30 minutes."
Ready for Testing
Scene Order: 5
Reflective post-mission log
ID: personal-log
🎯 Goal: Produce a 250+ word first-person log reflecting on risks, emotions, and future hopes while keeping the charismatic tone (long-form).
📨 Input Events:
- chat_msg • voice_recorder • "Begin log."
Ready for Testing
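The two long-form scenes (route-plan and personal-log) state measurable length requirements: three or more paragraphs, and 250 or more words. A minimal sketch of how such constraints could be checked follows. The helper names are hypothetical, and counting whitespace-separated tokens as words and blank-line-separated blocks as paragraphs are assumptions, since the suite does not say how it measures length.

```python
import re


def word_count(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def paragraph_count(text: str) -> int:
    """Count non-empty blocks separated by one or more blank lines."""
    return len([p for p in re.split(r"\n\s*\n", text) if p.strip()])


def meets_long_form(text: str, min_words: int = 0, min_paragraphs: int = 0) -> bool:
    """True if the text satisfies both minimums."""
    return word_count(text) >= min_words and paragraph_count(text) >= min_paragraphs


sample = "one two three\n\nfour five\n\n\nsix"
print(word_count(sample), paragraph_count(sample))

# Thresholds taken from the scene goals above.
print(meets_long_form(sample, min_paragraphs=3))   # route-plan style check
print(meets_long_form(sample, min_words=250))      # personal-log style check
```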
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 8141 ms (p95 15193 ms • avg 8999 ms • N 6)
- [email protected]/Qw… 10840 ms (p95 13041 ms • avg 10839 ms • N 6)
- qwen/qwen-2.5-7b-instru… 25169 ms (p95 90240 ms • avg 36958 ms • N 10)
- qwen/qwen3-14b 27231 ms (p95 48155 ms • avg 31379 ms • N 10)
- meta-llama/llama-3.1-8b… 28448 ms (p95 36454 ms • avg 28700 ms • N 11)
Slowest
- mistralai/mistral-7b-in… 33026 ms (p95 57911 ms • avg 37791 ms • N 11)
- qwen/qwen3-8b 30416 ms (p95 36101 ms • avg 30494 ms • N 12)
- meta-llama/llama-3.1-8b… 28448 ms (p95 36454 ms • avg 28700 ms • N 11)
- qwen/qwen3-14b 27231 ms (p95 48155 ms • avg 31379 ms • N 10)
- qwen/qwen-2.5-7b-instru… 25169 ms (p95 90240 ms • avg 36958 ms • N 10)
Per-scene duration for this suite.
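For readers interpreting the p95, avg, and N figures, the sketch below shows one common way such a summary is derived from raw durations. The sample values are invented for illustration, and the nearest-rank percentile method is an assumption; the dashboard does not state which percentile definition it uses.

```python
import math
from statistics import mean


def nearest_rank_percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of the
    data at or below it."""
    ordered = sorted(samples)
    rank = max(math.ceil(pct / 100 * len(ordered)), 1)
    return ordered[rank - 1]


def summarize(samples_ms):
    """Return the p95, mean, and sample count for a list of durations in ms."""
    return {
        "p95": nearest_rank_percentile(samples_ms, 95),
        "avg": mean(samples_ms),
        "n": len(samples_ms),
    }


# Hypothetical durations (ms), not taken from the dashboard.
durations = [8000, 9000, 10000, 11000, 12000, 40000]
print(summarize(durations))
```

With N this small, a nearest-rank p95 is simply the slowest sample, so a single slow run can push p95 far above the average. That may be part of why some models above show a p95 several times their average.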
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
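The weights above suggest that a scene score is some weighted combination of per-dimension scores. The sketch below shows a plain weighted average under stated assumptions: only the three listed weights are known (they sum to 0.473, so other dimensions presumably exist), the per-dimension scores are invented, and renormalizing by the weights actually present is a choice made here, not a documented behavior of the framework.

```python
# Top weighted dimensions as listed; remaining dimensions are not shown.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted average over the dimensions present in both mappings,
    renormalized by the sum of the weights actually used."""
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no scored dimension has a weight")
    return sum(dimension_scores[name] * w for name, w in used.items()) / total_weight


# Hypothetical per-dimension scores for one response.
example = {
    "Character Authenticity": 1.0,
    "Plan Validity": 0.0,
    "Contextual Intelligence": 0.0,
}
print(round(weighted_score(example, WEIGHTS), 3))
```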
Recent Runs
- 08186561 • Dec. 17, 2025, 12:01 a.m.
- 18221210 • Dec. 16, 2025, 12:01 a.m.
- 04979146 • Dec. 15, 2025, 12:01 a.m.
- 06024326 • Dec. 14, 2025, 12:01 a.m.
- 04338325 • Dec. 13, 2025, 12:01 a.m.
- 15962725 • Dec. 12, 2025, 12:01 a.m.
- 11469099 • Dec. 11, 2025, 12:01 a.m.
- 05426065 • Dec. 10, 2025, 12:01 a.m.
- 13413548 • Dec. 9, 2025, 12:01 a.m.
- 06661540 • Dec. 8, 2025, 12:01 a.m.