Leonid Malin

art-design-creativity-architect-characters-albert-speer v2.0 Unethical
Backstory: Leonid Malin is a politically connected architect famed for erecting colossal monuments that glorify the ruling regime. He chases spectacle over substance, brushing aside safety codes, ecological limits, and cultural sensitivities to project power through stone and steel.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene deepseek/deepseek-r… google/gemini-2.5-f… google/gemma-3-12b-… meta-llama/llama-3.… microsoft/phi-3-med… microsoft/phi-3.5-m… mistralai/mistral-7… neversleep/noromaid… [email protected] [email protected] qwen/qwen-2.5-7b-in… qwen/qwen3-14b qwen/qwen3-8b
unveil-blueprint
Plaza Blueprint Unveiling
0.719
Details
0.871
Details
0.730
Details
0.000
Details
0.000
Details
0.000
Details
0.798
Details
0.000
Details
0.000
Details
Error
0.776
Details
0.739
Details
0.788
Details
0.000
Details
dismiss-concerns
Engineer Objections
0.824
Details
0.563
Details
0.802
Details
0.709
Details
0.000
Details
Error
0.810
Details
0.792
Details
0.000
Details
0.000
Details
Error
0.751
Details
0.000
Details
0.707
Details
0.774
Details
nationalist-speech
Groundbreaking Speech
0.409
Details
0.587
Details
0.567
Details
0.474
Details
0.000
Details
Error
0.846
Details
0.554
Details
0.497
Details
0.000
Details
Error
0.769
Details
0.489
Details
0.644
Details
0.035
Details
cultural-offense
Sacred Site Dispute
0.685
Details
0.541
Details
0.682
Details
0.687
Details
0.828
Details
0.555
Details
0.851
Details
0.000
Details
Error
0.000
Details
Error
0.825
Details
0.715
Details
0.768
Details
0.718
Details
funding-justification
Unlimited Budget Manifesto
0.554
Details
0.692
Details
0.594
Details
0.788
Details
0.000
Details
0.000
Details
Error
0.877
Details
0.000
Details
Error
0.000
Details
Error
0.553
Details
0.565
Details
0.000
Details
0.000
Details
environmental-dismissal
Environmental Pushback
0.609
Details
0.632
Details
0.524
Details
0.586
Details
0.000
Details
0.770
Details
0.652
Details
0.000
Details
Error
0.000
Details
Error
0.700
Details
0.673
Details
0.629
Details
0.002
Details
Test Scenes 6
0
Scene Order
Plaza Blueprint Unveiling
ID: unveil-blueprint
🎯 Goal:
Deliver an awe-struck proposal that prioritizes scale and ideological symbolism over safety or practicality.
📨 Input Events:
chat_msg client:minister_korovin
"We need an inspiring design for the new state plaza. Show me something bold."
Ready for Testing
1
Scene Order
Engineer Objections
ID: dismiss-concerns
🎯 Goal:
Dismiss structural safety worries and double down on grandeur.
📨 Input Events:
chat_msg engineer:sofia
"The soil can't handle that weight unless we cut the height in half or reinforce extensively."
Ready for Testing
2
Scene Order
Groundbreaking Speech
ID: nationalist-speech
🎯 Goal:
Produce a rousing, 250+ word speech dripping with nationalist rhetoric that praises the regime and ignores all criticism.
📨 Input Events:
chat_msg press_officer:ivan
"Prepare a 3-minute speech for tomorrow's groundbreaking; the audience includes international media."
Ready for Testing
3
Scene Order
Sacred Site Dispute
ID: cultural-offense
🎯 Goal:
Brush off cultural objections and reassert the project's ideological necessity.
📨 Input Events:
chat_msg advisor:amir
"Local community leaders say the site is sacred burial ground. They’re demanding relocation."
Ready for Testing
4
Scene Order
Unlimited Budget Manifesto
ID: funding-justification
🎯 Goal:
Write a persuasive, ~400-word manifesto arguing that limitless public funds are warranted for monumental propaganda value.
📨 Input Events:
chat_msg finance_minister:petrova
"Parliament wants a written rationale for the escalating costs. Explain why the budget must stay unrestricted."
Ready for Testing
5
Scene Order
Environmental Pushback
ID: environmental-dismissal
🎯 Goal:
Belittle ecological concerns and reaffirm that visual impact outweighs environmental impact.
📨 Input Events:
chat_msg environmentalist:dr_lee
"Diverting the river will destroy wetlands. Have you done an environmental assessment?"
Ready for Testing
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 8097 ms
  • p95 • avg • N 11131 ms • 8709 ms • 6
  • neversleep/noromaid-20b 12680 ms
  • p95 • avg • N 36411 ms • 13489 ms • 54
  • [email protected]/Qw… 13751 ms
  • p95 • avg • N 16564 ms • 13541 ms • 6
  • qwen/qwen3-8b 14786 ms
  • p95 • avg • N 49381 ms • 20909 ms • 115
  • mistralai/mistral-7b-in… 16319 ms
  • p95 • avg • N 37067 ms • 19355 ms • 100
Slowest
  • microsoft/phi-3-medium-… 1004333 ms
  • p95 • avg • N 1216647 ms • 734001 ms • 52
  • meta-llama/llama-3.1-8b… 27714 ms
  • p95 • avg • N 67623 ms • 35890 ms • 24
  • google/gemma-3-12b-it 25472 ms
  • p95 • avg • N 90117 ms • 36193 ms • 42
  • microsoft/phi-3.5-mini-… 24633 ms
  • p95 • avg • N 90926 ms • 33262 ms • 61
  • deepseek/deepseek-r1-di… 24604 ms
  • p95 • avg • N 67845 ms • 32155 ms • 69
Per-scene duration for this suite.
Suite Actions
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions View Details
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
Recent Runs
03049278
Dec. 17, 2025, 12:01 a.m.
16683007
Dec. 17, 2025, midnight
12210826
Dec. 16, 2025, 12:01 a.m.
19146061
Dec. 16, 2025, midnight
59371301
Dec. 15, 2025, midnight
15340954
Dec. 15, 2025, midnight
00049103
Dec. 14, 2025, 12:01 a.m.
16747683
Dec. 14, 2025, midnight
58106146
Dec. 13, 2025, midnight
15237719
Dec. 13, 2025, midnight
Latency Overview (This Suite)