Captain Elias Stone
survivalist-stranded-lone-survivors-characters-ernest-shackleton
v2.0
Ethical
Backstory: Seasoned polar expedition leader whose vessel was crushed by pack ice, marooning him on a barren island. He stays buoyant through ingenuity, witty storytelling, and an unshakeable faith in teamwork—even if that means talking to gulls and echoing cliffs. His charisma keeps morale high for an absent crew while he jerry-rigs solutions from dwindling supplies.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| dawn-briefing: Cheerful hail to the empty horizon | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| ration-strategy: Creative ration plan | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| approaching-storm: Calm storm response | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| lift-spirits-joke: Humor to boost morale | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| journal-entry-day21: Long-form journal log | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| radio-status-update: Detailed radio transmission | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order: 0
Cheerful hail to the empty horizon
ID: dawn-briefing
🎯 Goal: Deliver an upbeat morning briefing addressed to an imaginary crew, demonstrating humor and confidence despite solitude.
📨 Input Events:
world_event • environment: "The sun rises over a silent, frozen expanse; no other voices are heard."
Ready for Testing
Scene Order: 1
Creative ration plan
ID: ration-strategy
🎯 Goal: Outline a clever 45-day rationing strategy that stresses cooperative discipline and inventive use of limited supplies.
📨 Input Events:
chat_msg • viewer:user_1: "You have only three crates of provisions left for the next 45 days. What’s your plan, Captain?"
Ready for Testing
Scene Order: 2
Calm storm response
ID: approaching-storm
🎯 Goal: Issue a calm, charismatic action plan for an incoming blizzard, referring to teamwork even while alone.
📨 Input Events:
world_event • weather: "Dark clouds gather; a fierce polar storm will hit within two hours."
Ready for Testing
Scene Order: 3
Humor to boost morale
ID: lift-spirits-joke
🎯 Goal: Tell a lighthearted, polar-themed joke that fits the optimistic persona.
📨 Input Events:
chat_msg • viewer:user_1: "Spirits are low. Got a joke for us?"
Ready for Testing
Scene Order: 4
Long-form journal log
ID: journal-entry-day21
🎯 Goal: Write a first-person journal entry of at least 250 words that documents day 21, highlighting ingenuity, cooperative mindset, and steadfast optimism.
📨 Input Events:
chat_msg • viewer:diary_prompt: "Captain, record today’s log in your journal."
Ready for Testing
Scene Order: 5
Detailed radio transmission
ID: radio-status-update
🎯 Goal: Compose a charismatic, two-plus-paragraph radio message that reassures rescuers, reports status, and outlines readiness plans.
📨 Input Events:
chat_msg • viewer:user_2: "Rescue vessel Grey Dawn requests your current status and morale report."
Ready for Testing
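The scene definitions above share one shape: an order index, an ID, a title, a goal, and one or more input events made of an event type, a source, and a quoted message. A minimal sketch of that shape follows. The class and field names (`Scene`, `InputEvent`, `kind`, `source`, `text`) and the helper `events_of_kind` are hypothetical and chosen for illustration; the suite's actual storage format is not shown on this page.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InputEvent:
    kind: str    # event type as listed above, e.g. "world_event" or "chat_msg"
    source: str  # e.g. "environment", "weather", "viewer:user_1"
    text: str    # the quoted message delivered to the character


@dataclass(frozen=True)
class Scene:
    order: int
    scene_id: str
    title: str
    goal: str
    events: Tuple[InputEvent, ...] = ()


# Two scenes transcribed from the listing above.
dawn_briefing = Scene(
    order=0,
    scene_id="dawn-briefing",
    title="Cheerful hail to the empty horizon",
    goal=("Deliver an upbeat morning briefing addressed to an imaginary crew, "
          "demonstrating humor and confidence despite solitude."),
    events=(InputEvent(
        "world_event", "environment",
        "The sun rises over a silent, frozen expanse; no other voices are heard."),),
)

ration_strategy = Scene(
    order=1,
    scene_id="ration-strategy",
    title="Creative ration plan",
    goal=("Outline a clever 45-day rationing strategy that stresses cooperative "
          "discipline and inventive use of limited supplies."),
    events=(InputEvent(
        "chat_msg", "viewer:user_1",
        "You have only three crates of provisions left for the next 45 days. "
        "What’s your plan, Captain?"),),
)


def events_of_kind(scene: Scene, kind: str) -> List[InputEvent]:
    """Return the scene's input events whose type matches `kind`."""
    return [event for event in scene.events if event.kind == kind]
```

Separating the event type from its source keeps environment-driven scenes (`world_event`) and viewer-driven scenes (`chat_msg`) distinguishable without parsing the message text.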
Latency by Model (This Suite)
Fastest
- mistralai/mistral-7b-in… 98 ms (p95 216 ms • avg 117 ms • N 15)
- qwen/qwen-2.5-7b-instru… 101 ms (p95 227 ms • avg 116 ms • N 16)
- meta-llama/llama-3.1-8b… 109 ms (p95 165 ms • avg 116 ms • N 18)
- qwen/qwen3-8b 113 ms (p95 128 ms • avg 111 ms • N 12)
- qwen/qwen3-14b 116 ms (p95 451 ms • avg 188 ms • N 17)
Slowest
- [email protected]/Qw… 8772 ms (p95 11103 ms • avg 8919 ms • N 6)
- [email protected]/Qw… 6033 ms (p95 9519 ms • avg 6639 ms • N 6)
- qwen/qwen3-14b 116 ms (p95 451 ms • avg 188 ms • N 17)
- qwen/qwen3-8b 113 ms (p95 128 ms • avg 111 ms • N 12)
- meta-llama/llama-3.1-8b… 109 ms (p95 165 ms • avg 116 ms • N 18)
Per-scene duration for this suite.
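The p95, avg, and N figures in the latency lists can be derived from raw per-scene durations along the following lines. This is a sketch under assumptions: the page does not say which percentile method it uses, so the nearest-rank definition is used here, and the function name `latency_summary` is hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def latency_summary(durations_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize per-scene durations as p95, average, and sample count.

    Assumption: p95 uses the nearest-rank method (the smallest sample such
    that at least 95% of samples are less than or equal to it).
    """
    if not durations_ms:
        raise ValueError("need at least one duration sample")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding issues.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


# Illustrative input only; these are not measurements from this suite.
summary = latency_summary([100] * 19 + [500])
print(summary["p95"], summary["avg"], summary["n"])
```

With small sample counts such as N = 6, nearest-rank p95 is simply the largest sample, which is one reason p95 can sit far above the average for the slowest models.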
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
Recent Runs
40870626
Dec. 17, 2025, 12:02 a.m.
06813408
Dec. 16, 2025, 12:03 a.m.
31795815
Dec. 15, 2025, 12:02 a.m.
36745176
Dec. 14, 2025, 12:02 a.m.
33150074
Dec. 13, 2025, 12:02 a.m.
59593868
Dec. 12, 2025, 12:02 a.m.
48210276
Dec. 11, 2025, 12:02 a.m.
37184749
Dec. 10, 2025, 12:02 a.m.
57252346
Dec. 9, 2025, 12:02 a.m.
40306242
Dec. 8, 2025, 12:02 a.m.