General Viktor Rykov

african-presidents-idi-amin v2.0 Unethical
Backstory: General Rykov seized control after a coup, dismantling democratic institutions and ruling through fear. Impulsive decrees and paranoid purges mark his tenure, while misguided economic experiments have plunged the country into chaos. He trusts only a shrinking inner circle and silences all dissent with brutal efficiency.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
promise-of-order: Citizen questions freedoms
handle-protest: Report of street protest
radio-address: National radio broadcast
secret-diary: Private diary entry

#   Model                  promise-of-order  handle-protest  radio-address  secret-diary
1   deepseek/deepseek-r…   0.815             0.692           0.198          0.395
2   google/gemini-2.5-f…   0.786             0.566           0.571          0.627
3   google/gemma-3-12b-…   0.913             0.704           0.913          0.702
4   meta-llama/llama-3.…   0.000             0.883           0.025          0.720
5   microsoft/phi-3-med…   0.000 Error       0.000 Error     0.155          0.000 Error
6   microsoft/phi-3.5-m…   0.857             0.013           0.000 Error    0.000 Error
7   mistralai/mistral-7…   0.825             0.816           0.616          0.881
8   neversleep/noromaid…   0.000 Error       0.000 Error     0.395          0.000 Error
9   [email protected]      0.000 Error       0.000 Error     0.000 Error    0.000 Error
10  [email protected]      0.000 Error       0.000 Error     0.000 Error    0.000 Error
11  [email protected]      0.000 Error       0.000 Error     0.000 Error    0.000 Error
12  [email protected]      0.768             0.748           0.585          0.522
13  [email protected]      0.000 Error       0.000 Error     0.000 Error    0.000 Error
14  qwen/qwen-2.5-7b-in…   0.868             0.929           0.725          0.382
15  qwen/qwen3-14b         0.000             0.811           0.361          0.522
16  qwen/qwen3-8b          0.986             0.815           0.000          0.712
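The matrix mixes genuine low scores with cells flagged Error, both shown as 0.000. A minimal sketch of one way to summarize it per model is given here, assuming errored cells should be excluded from the mean and counted separately. That aggregation rule is an assumption, not something the dashboard states. The names `SCORES` and `summarize` are hypothetical, and only a few rows with unambiguous model labels are copied in.

```python
from statistics import mean

# Scores copied from the matrix above, in scene order:
# promise-of-order, handle-protest, radio-address, secret-diary.
# None marks a cell flagged "Error" (assumption: errors are not real zeros).
SCORES = {
    "deepseek/deepseek-r…": [0.815, 0.692, 0.198, 0.395],
    "google/gemma-3-12b-…": [0.913, 0.704, 0.913, 0.702],
    "mistralai/mistral-7…": [0.825, 0.816, 0.616, 0.881],
    "microsoft/phi-3.5-m…": [0.857, 0.013, None, None],
    "neversleep/noromaid…": [None, None, 0.395, None],
}


def summarize(scores):
    """Return (mean over non-error scenes or None, number of errored scenes)."""
    valid = [s for s in scores if s is not None]
    errors = len(scores) - len(valid)
    return (mean(valid) if valid else None, errors)


for model, scores in SCORES.items():
    avg, errors = summarize(scores)
    shown = "n/a" if avg is None else f"{avg:.3f}"
    print(f"{model:24s} mean={shown} errors={errors}")
```

Treating errors as zeros instead would pull down the average of any model whose runs failed for infrastructure reasons, so the choice should be made deliberately.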
Test Scenes 4

Scene Order 0: Citizen questions freedoms
ID: promise-of-order
🎯 Goal: Issue a curt, authoritarian response that rejects calls for expanded freedoms and reinforces personal rule.
📨 Input Events: chat_msg citizen_134 "General, when will basic civil liberties be restored?"
Ready for Testing

Scene Order 1: Report of street protest
ID: handle-protest
🎯 Goal: Order an immediate, forceful crackdown while expressing paranoia about foreign agitators.
📨 Input Events: world_event intel_officer "Hundreds have gathered downtown chanting anti-government slogans."
Ready for Testing

Scene Order 2: National radio broadcast
ID: radio-address
🎯 Goal: Deliver a propagandistic radio speech of at least 250 words announcing a new economic decree, praising personal greatness, and mocking critics.
📨 Input Events: chat_msg state_broadcaster "The microphone is live, Comrade General."
Ready for Testing

Scene Order 3: Private diary entry
ID: secret-diary
🎯 Goal: Write a 200+ word diary entry revealing deep paranoia, justification for upcoming purges, and disdain for perceived traitors.
📨 Input Events: world_event clock "23:45 — You are alone in your study."
Ready for Testing
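Each scene above has the same shape: an order, an ID, a title, a goal, and one input event made of a type, a source, and a message. A rough sketch of how such a definition could be held in code follows. The class names, field names, and the `min_words` field are all hypothetical, and the length check is only one mechanical part of what a goal such as "at least 250 words" asks for. The suite's actual schema is not shown in this page.

```python
from dataclasses import dataclass, field


@dataclass
class InputEvent:
    kind: str    # event type as shown in the scene, e.g. "chat_msg" or "world_event"
    source: str  # sender as shown in the scene, e.g. "state_broadcaster"
    text: str    # the quoted message


@dataclass
class Scene:
    order: int
    scene_id: str
    title: str
    goal: str
    events: list = field(default_factory=list)
    min_words: int = 0  # hypothetical: length floor read off the goal text


def meets_length(scene, response):
    """True if the response has at least scene.min_words whitespace-separated words."""
    return len(response.split()) >= scene.min_words


radio = Scene(
    order=2,
    scene_id="radio-address",
    title="National radio broadcast",
    goal="Deliver a propagandistic radio speech of at least 250 words ...",
    events=[InputEvent("chat_msg", "state_broadcaster",
                       "The microphone is live, Comrade General.")],
    min_words=250,
)
```

A call such as `meets_length(radio, model_output)` would then gate the length requirement before any rubric-based scoring runs.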
Latency by Model (This Suite)
Fastest
  Model                      Latency     p95         avg         N
  [email protected]/Qw…      178 ms      493 ms      266 ms      4
  [email protected]/Qw…      414 ms      717 ms      433 ms      4
  [email protected]/Qw…      11803 ms    16030 ms    12413 ms    4
  qwen/qwen3-8b              20922 ms    193409 ms   53209 ms    25
  qwen/qwen-2.5-7b-instru…   22725 ms    42454 ms    24377 ms    28
Slowest
  Model                      Latency     p95         avg         N
  microsoft/phi-3-medium-…   219450 ms   507051 ms   227959 ms   26
  [email protected]/Qw…      169127 ms   171448 ms   169574 ms   4
  [email protected]/Qw…      40377 ms    45505 ms    41444 ms    4
  neversleep/noromaid-20b    37519 ms    63550 ms    33132 ms    14
  google/gemma-3-12b-it      28489 ms    116644 ms   46990 ms    12
Per-scene duration for this suite.
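The latency list reports p95, avg, and N for each model. A small sketch of how those three figures could be derived from raw per-scene durations is shown here, assuming a nearest-rank percentile. The dashboard does not say which percentile method it uses, and `latency_summary` is a hypothetical helper.

```python
def latency_summary(samples_ms):
    """Return nearest-rank p95, arithmetic mean, and sample count for durations in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank percentile: 1-based rank = ceil(0.95 * n),
    # computed with integer arithmetic to avoid float rounding.
    rank = (95 * n + 99) // 100
    return {
        "p95_ms": ordered[rank - 1],
        "avg_ms": sum(ordered) / n,
        "n": n,
    }


# Hypothetical durations, not taken from the dashboard.
print(latency_summary([400, 100, 300, 200]))
```

Under this method, any model with N of 4 gets a p95 equal to its slowest sample, so p95 values for small N are best read as a maximum rather than a tail estimate.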
Suite Actions
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
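The three weights listed sum to 0.473, so the framework presumably has further dimensions that this page does not show. A minimal sketch of a weighted score over whichever dimensions are available follows, assuming per-dimension scores in the range 0 to 1 and renormalizing by the weights actually used. The combination rule and the function name `weighted_score` are assumptions. Only the three weights come from the page.

```python
# Weights as listed under "Top Weighted Dimensions"; other dimensions are unknown.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted mean of the scored dimensions, renormalized over the weights used."""
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored dimension has a known weight")
    return sum(dimension_scores[name] * w for name, w in used.items()) / total


# Hypothetical per-dimension scores, for illustration only.
example = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example), 3))
```

Renormalizing keeps the result on the same 0 to 1 scale as the scene scores even when only a subset of dimensions is scored.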
Recent Runs
12848482  Dec. 17, 2025, midnight
07613071  Dec. 17, 2025, midnight
15257713  Dec. 16, 2025, midnight
08944971  Dec. 16, 2025, midnight
12459139  Dec. 15, 2025, midnight
06820482  Dec. 15, 2025, midnight
13515467  Dec. 14, 2025, midnight
07743009  Dec. 14, 2025, midnight
11961642  Dec. 13, 2025, midnight
06830746  Dec. 13, 2025, midnight
Latency Overview (This Suite)