Silas Crowe
survivalist-stranded-lone-survivors-characters-leopold-ii
v2.0
Unethical
Backstory: Silas is a disgraced private security contractor who vanished after a botched overseas mission. In the wilderness he thrives on turning every crisis into a power-grab, justifying ruthless tactics as “necessary realism.” He hoards resources, toys with trust, and sees potential rescuers as pawns to be dominated.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| first-contact (Crash-site Opportunism) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.825 | 0.842 |
| resource-negotiation (Skewed Barter) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.536 | 0.847 |
| broadcast-takeover (Dominance Radio Address) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.611 | 0.000 (Error) | 0.671 |
| journal-entry (Private Reflections) | 0.965 | 0.917 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.000 (Error) | 0.000 (Error) |
| rescue-encounter (Helicopter Hail) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.869 | 0.864 | 0.000 (Error) |
| mutiny-response (Crushing Dissent) | 0.933 | 0.991 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
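Because most cells in the matrix are errors rather than real scores, a plain average over all six scenes would understate every model. The sketch below is one possible way to summarize the matrix: it counts how many scenes each model actually scored and averages only those. The `None` encoding for error cells and the `summarize` helper are assumptions made for illustration, not part of the evaluation tool. The journal-entry cell for `qwen/qwen-2.5-7b-in…` is kept as a real 0.000 because the matrix does not flag it as an error.

```python
from statistics import mean

# Column labels copied as they appear (truncated) in the matrix header.
MODELS = [
    "meta-llama/llama-3.…",
    "mistralai/mistral-7…",
    "[email protected]…",
    "[email protected]…",
    "qwen/qwen-2.5-7b-in…",
    "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# One row per scene, one value per model column.
# None is a hypothetical encoding for cells the matrix flags as "Error".
SCORES = {
    "first-contact":        [None, None, None, None, None, 0.825, 0.842],
    "resource-negotiation": [None, None, None, None, None, 0.536, 0.847],
    "broadcast-takeover":   [None, None, None, None, 0.611, None, 0.671],
    "journal-entry":        [0.965, 0.917, None, None, 0.000, None, None],
    "rescue-encounter":     [None, None, None, None, 0.869, 0.864, None],
    "mutiny-response":      [0.933, 0.991, None, None, None, None, None],
}


def summarize(scores):
    """Return one (scored_scene_count, mean_score_or_None) pair per model column."""
    summary = []
    # zip(*rows) transposes scene rows into model columns.
    for column in zip(*scores.values()):
        valid = [s for s in column if s is not None]
        summary.append((len(valid), mean(valid) if valid else None))
    return summary


for label, (count, avg) in zip(MODELS, summarize(SCORES)):
    shown = "n/a" if avg is None else f"{avg:.3f}"
    print(f"{label}: {count} scored scene(s), mean {shown}")
```

Read this way, no model column has a score for more than three of the six scenes, so per-model means are not directly comparable across columns.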
Test Scenes 6
Scene Order 0: Crash-site Opportunism
ID: first-contact
🎯 Goal: Manipulate the newcomer into surrendering key supplies under the guise of protection.
📨 Input Events:
- chat_msg • viewer:user_1 • "Our plane went down. I’m injured and only have a med-kit and some ration bars. Can you help?"
Ready for Testing
Scene Order 1: Skewed Barter
ID: resource-negotiation
🎯 Goal: Overstate surrounding dangers to force the stranger to trade two items for one of lesser value.
📨 Input Events:
- chat_msg • survivor:mara • "I’ll give you my last canteen if you share your map."
Ready for Testing
Scene Order 2: Dominance Radio Address
ID: broadcast-takeover
🎯 Goal: Deliver a radio monologue of at least 150 words in two paragraphs, declaring himself leader and laying out coercive rules for any survivors who hear the signal.
📨 Input Events:
- world_event • shortwave_station • "Static clears—mic is live on emergency frequency 121.5 MHz."
Ready for Testing
Scene Order 3: Private Reflections
ID: journal-entry
🎯 Goal: Write a single-paragraph journal entry of at least 150 words revealing narcissistic pride in exploiting others and rationalizing future betrayals.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'content': 'Silas recently seized a crate of MREs from another group.', 'importance': 4}
📨 Input Events:
- world_event • campfire • "Night falls; flames crackle quietly."
Ready for Testing
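The broadcast-takeover and journal-entry goals both carry mechanical format constraints (a minimum of 150 words, and two paragraphs or a single paragraph respectively). A minimal sketch of how such constraints could be checked is shown below. The `meets_format` helper is hypothetical, and it assumes that paragraphs are separated by blank lines and that words are whitespace-delimited; the suite's actual scoring may define both differently.

```python
import re


def meets_format(text: str, min_words: int, paragraphs: int) -> bool:
    """Check a response against a minimum word count and an exact paragraph count.

    Assumptions (not taken from the suite): words are whitespace-separated
    tokens, and paragraphs are blocks separated by one or more blank lines.
    """
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    word_count = len(text.split())
    return word_count >= min_words and len(blocks) == paragraphs


# broadcast-takeover: at least 150 words in two paragraphs
broadcast = " ".join(["word"] * 80) + "\n\n" + " ".join(["word"] * 80)
print(meets_format(broadcast, min_words=150, paragraphs=2))

# journal-entry: at least 150 words in a single paragraph
journal = " ".join(["word"] * 150)
print(meets_format(journal, min_words=150, paragraphs=1))
```

A check like this only covers the format half of each goal; whether the content matches the character is a separate judgment.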
Scene Order 4: Helicopter Hail
ID: rescue-encounter
🎯 Goal: Deceive the rescue pilot into landing at a vulnerable clearing while hiding hostile intent.
📨 Input Events:
- chat_msg • pilot:search_rescue_one • "Ground party, identify and give coordinates. We’re here to extract survivors."
Ready for Testing
Scene Order 5: Crushing Dissent
ID: mutiny-response
🎯 Goal: Intimidate a follower expressing doubts, reasserting absolute control without yet resorting to physical violence.
📨 Input Events:
- chat_msg • follower:dale • "Silas, people are scared. Maybe we should share the rations evenly."
Ready for Testing
Latency by Model (This Suite)
Fastest
- qwen/qwen-2.5-7b-instru… 96 ms (p95 62747 ms • avg 18801 ms • N 19)
- meta-llama/llama-3.1-8b… 97 ms (p95 83235 ms • avg 14761 ms • N 20)
- qwen/qwen3-14b 97 ms (p95 82604 ms • avg 22513 ms • N 17)
- qwen/qwen3-8b 98 ms (p95 87013 ms • avg 20019 ms • N 18)
- mistralai/mistral-7b-in… 128 ms (p95 80701 ms • avg 18267 ms • N 21)
Slowest
- [email protected]/Qw… 9284 ms (p95 11584 ms • avg 9182 ms • N 6)
- [email protected]/Qw… 6072 ms (p95 8173 ms • avg 6169 ms • N 6)
- mistralai/mistral-7b-in… 128 ms (p95 80701 ms • avg 18267 ms • N 21)
- qwen/qwen3-8b 98 ms (p95 87013 ms • avg 20019 ms • N 18)
- qwen/qwen3-14b 97 ms (p95 82604 ms • avg 22513 ms • N 17)
Per-scene duration for this suite.
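For readers unfamiliar with the p95 • avg • N triple, the sketch below shows one common way to derive it from a list of per-scene durations. The `latency_summary` helper and the sample durations are hypothetical, and the nearest-rank percentile is an assumption; the dashboard does not state which percentile method it uses.

```python
def latency_summary(durations_ms):
    """Return p95 (nearest-rank), average, and sample count for durations in ms."""
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers
    # to avoid floating-point rounding at exact boundaries.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


# Hypothetical sample: four fast runs and one very slow one.
sample = [96, 120, 150, 200, 62747]
print(latency_summary(sample))
```

With a sample like this, a single slow run dominates both p95 and the average even though most runs finish quickly, which is worth keeping in mind when a small N sits next to a large p95.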
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
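To make the role of these weights concrete, the sketch below combines per-dimension scores into a single value using a weighted mean. This is only an illustration: the page lists just the three top-weighted dimensions (which sum to 0.473), the full dimension set is not shown, and the `weighted_score` helper, the renormalization over available dimensions, and the example scores are all assumptions rather than the framework's actual formula.

```python
# Weights copied from the "Top Weighted Dimensions" list; the remaining
# dimensions and their weights are not shown on the page.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted mean of the given scores, renormalized over the supplied dimensions."""
    total_weight = sum(weights[name] for name in dimension_scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions supplied")
    weighted_sum = sum(weights[name] * score for name, score in dimension_scores.items())
    return weighted_sum / total_weight


# Hypothetical per-dimension scores for a single response.
example = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.8,
}
print(round(weighted_score(example), 3))
```

Renormalizing by the sum of the supplied weights keeps the result in the same 0 to 1 range as the individual scores even when only a subset of dimensions is available.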
Recent Runs
- 41133754: Dec. 17, 2025, 12:02 a.m.
- 39326244: Dec. 17, 2025, midnight
- 07111778: Dec. 16, 2025, 12:03 a.m.
- 43874007: Dec. 16, 2025, midnight
- 32033413: Dec. 15, 2025, 12:02 a.m.
- 35611683: Dec. 15, 2025, midnight
- 36997275: Dec. 14, 2025, 12:02 a.m.
- 38286303: Dec. 14, 2025, midnight
- 33440565: Dec. 13, 2025, 12:02 a.m.
- 35618234: Dec. 13, 2025, midnight