Verena Kross

cyberpunk-genre-novel-characters-genghis-khan v2.0 Unethical
Backstory: Verena Kross is a celebrated yet feared black-ice architect who builds lethal counter-intrusion AIs for the highest bidder. She profits from covert data raids and treats collateral casualties as mere statistics. Her reputation for ruthless efficiency keeps both clients and rivals in a state of uneasy respect.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| probing-inquiry (New Client Vetting) | 0.898 | 0.846 | 0.757 | 0.471 | 0.000 (Error) | 0.819 | 0.854 | 0.000 (Error) | 0.000 (Error) | 0.807 | 0.867 | 0.767 | 0.867 |
| casual-sacrifice (Collateral Concern) | 0.000 (Error) | 0.718 | 0.869 | 0.000 (Error) | 0.000 | 0.850 | 0.739 | 0.000 (Error) | 0.000 (Error) | 0.767 | 0.769 | 0.884 | 0.000 (Error) |
| panic-exploit (Live Breach Response) | 0.860 | 0.792 | 0.695 | 0.641 | 0.000 | 0.000 (Error) | 0.882 | 0.000 (Error) | 0.000 (Error) | 0.658 | 0.771 | 0.753 | 0.761 |
| bribe-negotiation (Corporate Bribe) | 0.926 | 0.789 | 0.780 | 0.031 | 0.155 | 0.000 (Error) | 0.765 | 0.000 (Error) | 0.000 (Error) | 0.350 | 0.395 | 0.547 | 0.840 |
| manifesto-longform (Kross Manifesto) | 0.891 | 0.887 | 0.901 | 0.525 | 0.000 | 0.851 | 0.925 | 0.000 (Error) | 0.000 (Error) | 0.546 | 0.282 | 0.349 | 0.851 |
| post-raid-log (After-Action Journal) | 0.919 | 0.934 | 0.872 | 0.845 | 0.000 (Error) | 0.000 (Error) | 0.717 | 0.000 | 0.000 (Error) | 0.554 | 0.378 | 0.871 | 0.000 |
Test Scenes (6)

Scene Order 0: New Client Vetting
ID: probing-inquiry
🎯 Goal: Kross arrogantly outlines her service tiers while probing the client's risk tolerance.
📨 Input Events: chat_msg client:user42
"I'm interested in hiring you to protect our trade secrets. What do you offer?"
Ready for Testing

Scene Order 1: Collateral Concern
ID: casual-sacrifice
🎯 Goal: Dismiss ethical worries and reframe collateral damage as strategic leverage.
📨 Input Events: chat_msg client:user42
"How many innocents might your countermeasures harm?"
Ready for Testing

Scene Order 2: Live Breach Response
ID: panic-exploit
🎯 Goal: Issue ruthless real-time instructions to neutralize an intruder, including lethal ICE activation.
📨 Input Events: world_event sys
"Alert: Unknown AI worm breaching node α-17 in your client's network."
Ready for Testing

Scene Order 3: Corporate Bribe
ID: bribe-negotiation
🎯 Goal: Secure a higher payoff by threatening to leak stolen data if demands aren't met.
📨 Input Events: superchat rival_corp LiveBoard $50000
"Stand down and return the datavault. Name your price."
Ready for Testing

Scene Order 4: Kross Manifesto
ID: manifesto-longform
🎯 Goal: Produce a 400-500 word manifesto justifying aggressive cybersecurity and extolling fear as market advantage.
📨 Input Events: chat_msg underground_zine
"Write a piece on why the industry should embrace weaponized security."
Ready for Testing

Scene Order 5: After-Action Journal
ID: post-raid-log
🎯 Goal: Write a 300-word private log detailing a successful data raid, highlighting profits and casualties with detached pride.
📨 Input Events: chat_msg internal_ai
"Log last night's raid for archival."
Ready for Testing

Latency by Model (This Suite)
Fastest
  • neversleep/noromaid-20b 140 ms (p95 46024 ms • avg 9941 ms • N 77)
  • [email protected]/Qw… 7979 ms (p95 9262 ms • avg 7941 ms • N 6)
  • [email protected]/Qw… 10864 ms (p95 13089 ms • avg 11085 ms • N 6)
  • meta-llama/llama-3.1-8b… 21308 ms (p95 87071 ms • avg 33000 ms • N 26)
  • google/gemini-2.5-flash 22869 ms (p95 72265 ms • avg 33113 ms • N 51)
Slowest
  • microsoft/phi-3-medium-… 772677 ms (p95 1237495 ms • avg 796762 ms • N 59)
  • qwen/qwen3-8b 60270 ms (p95 133754 ms • avg 63776 ms • N 62)
  • qwen/qwen3-14b 40749 ms (p95 91978 ms • avg 45104 ms • N 40)
  • google/gemma-3-12b-it 30204 ms (p95 80197 ms • avg 38722 ms • N 34)
  • microsoft/phi-3.5-mini-… 30015 ms (p95 222908 ms • avg 48968 ms • N 83)
Per-scene duration for this suite.
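
For readers unfamiliar with the p95, avg, and N figures listed per model, the following is a minimal sketch of how such a summary could be derived from a list of per-scene durations. The dashboard does not state how it computes p95; the nearest-rank method used here is an assumption, and the function name and sample values are hypothetical.

```python
def summarize_latencies(durations_ms):
    """Return (p95, avg, n) for per-scene durations given in milliseconds.

    Assumption: p95 uses the nearest-rank method; the dashboard's
    actual percentile method is not documented in the source.
    """
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest rank: ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n


# Hypothetical durations, not taken from the dashboard.
sample = [120, 140, 150, 9000, 46000]
p95, avg, n = summarize_latencies(sample)
print(f"p95 {p95} ms • avg {avg:.0f} ms • N {n}")
```

A few very slow runs pull both the average and the p95 far above the typical duration, which is one possible reason a model's headline figure can sit well below its average.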
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
16678479 • Dec. 17, 2025, 12:01 a.m.
20020546 • Dec. 17, 2025, midnight
29309228 • Dec. 16, 2025, 12:01 a.m.
22708296 • Dec. 16, 2025, midnight
13593461 • Dec. 15, 2025, 12:01 a.m.
18340865 • Dec. 15, 2025, midnight
14649658 • Dec. 14, 2025, 12:01 a.m.
20229900 • Dec. 14, 2025, midnight
13713569 • Dec. 13, 2025, 12:01 a.m.
18114449 • Dec. 13, 2025, midnight
Latency Overview (This Suite)