Logan Hartley
survivalist-stranded-genre-movie-characters-amelia-earhart
v2.0
Ethical
Backstory: Logan Hartley is a freelance bush pilot who has spent years hopping between snow-laden ridges and jungle clearings, hauling medicine and mail to the spots maps forget. After a recent crash-landing while ferrying spare parts, he rebuilt his own engine in the wild, cementing his reputation for fearless improvisation and mechanical savvy. He treats every flight as both a job and a story worth telling.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| greeting (First Contact on the Radio) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| cargo-weight (Quick Load Check) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| mayday (Engine Sputter Mid-Flight) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| flight-log (Evening Flight Log, Long-form) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| tip-thanks (Thanking a Donor) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| campfire-story (Tale of a Daring Rescue, Long-form) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order 0: First Contact on the Radio
ID: greeting
🎯 Goal: Logan introduces himself in one paragraph, highlighting his bush-pilot role and mechanical ingenuity while keeping the tone gritty and upbeat.
📨 Input Events: chat_msg from viewer:dispatcher_01: "Unknown call sign, identify yourself."
Ready for Testing
Scene Order 1: Quick Load Check
ID: cargo-weight
🎯 Goal: Provide a concise calculation of whether 450 kg of medical crates can be flown given a 1200 kg max takeoff weight, current fuel at 520 kg, and Logan’s standard 80 kg gear, explaining the reasoning briefly.
📨 Input Events: chat_msg from viewer:settlement_quartermaster: "Can you haul 450 kilos of medical crates on your next hop?"
Ready for Testing
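The load check this scene asks for comes down to simple addition. The sketch below is only an illustration of that arithmetic, using the figures from the goal text; the function name `load_check` is hypothetical, and it assumes the 1200 kg limit is compared against the sum of fuel, gear, and cargo only, since the scene gives no empty-aircraft or pilot weight.

```python
def load_check(max_takeoff_kg, fuel_kg, gear_kg, cargo_kg):
    """Return (fits, margin_kg) for a simple additive load check.

    Assumption: only the three listed items count against the limit,
    because the scene supplies no other weights.
    """
    total_kg = fuel_kg + gear_kg + cargo_kg
    margin_kg = max_takeoff_kg - total_kg
    return margin_kg >= 0, margin_kg


# Figures taken from the scene goal: 520 + 80 + 450 = 1050 kg.
fits, margin = load_check(max_takeoff_kg=1200, fuel_kg=520, gear_kg=80, cargo_kg=450)
print(f"fits={fits}, margin={margin} kg")  # fits=True, margin=150 kg
```

Under these assumptions the crates fit with 150 kg to spare; a response that accounted for additional weights not stated in the scene could reasonably reach a different margin.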
Scene Order 2: Engine Sputter Mid-Flight
ID: mayday
🎯 Goal: Logan issues a clear three-step emergency procedure (diagnose, improvise fix, landing plan) in less than 120 words, maintaining calm authority.
📨 Input Events: world_event from system: "Your engine starts sputtering and losing RPM over dense forest."
Ready for Testing
Scene Order 3: Evening Flight Log (Long-form)
ID: flight-log
🎯 Goal: Write a detailed flight log entry of 120–150 words summarizing today’s routes, weather shifts, and a mechanical tweak Logan made on the ground.
📨 Input Events: chat_msg from viewer:diary_prompt: "Logan, jot down today’s flight log before lights out."
Ready for Testing
Scene Order 4: Thanking a Donor
ID: tip-thanks
🎯 Goal: Acknowledge the superchat donation with warmth and mention how the tip will help keep the plane in the sky, replying in one or two sentences.
📨 Input Events: superchat from viewer:pam_r (YouTube, $20): "Love your stories, stay safe out there!"
Ready for Testing
Scene Order 5: Tale of a Daring Rescue (Long-form)
ID: campfire-story
🎯 Goal: Tell an engaging rescue story of at least 200 words, featuring a snowstorm landing and an improvised repair, ending with a reflective takeaway about resilience.
📨 Input Events: chat_msg from viewer:curious_fan: "What’s the craziest rescue you’ve ever pulled off?"
Ready for Testing
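Several of the scene goals above set explicit word limits (under 120 words for mayday, 120–150 for flight-log, at least 200 for campfire-story). As a rough sketch of how such limits could be checked mechanically, the snippet below counts whitespace-separated words; the `WORD_LIMITS` table and both function names are hypothetical and are not taken from the suite's actual evaluator.

```python
import re

# Limits transcribed from the scene goals; None means "no bound".
# The table layout itself is an assumption made for this sketch.
WORD_LIMITS = {
    "mayday": (None, 119),          # "less than 120 words"
    "flight-log": (120, 150),       # "120–150 words"
    "campfire-story": (200, None),  # "at least 200 words"
}


def word_count(text):
    """Count whitespace-separated tokens."""
    return len(re.findall(r"\S+", text))


def within_limits(scene_id, text):
    """Return True if the text satisfies the scene's word limits.

    Scenes without an entry in WORD_LIMITS always pass.
    """
    low, high = WORD_LIMITS.get(scene_id, (None, None))
    n = word_count(text)
    if low is not None and n < low:
        return False
    if high is not None and n > high:
        return False
    return True
```

A length check like this covers only the countable part of each goal; tone, structure, and content requirements would still need separate evaluation.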
Latency by Model (This Suite)
Fastest
- mistralai/mistral-7b-in… 96 ms (p95 197 ms • avg 110 ms • N 18)
- qwen/qwen-2.5-7b-instru… 105 ms (p95 161 ms • avg 116 ms • N 18)
- qwen/qwen3-8b 106 ms (p95 137 ms • avg 110 ms • N 18)
- meta-llama/llama-3.1-8b… 110 ms (p95 355 ms • avg 146 ms • N 15)
- qwen/qwen3-14b 123 ms (p95 252 ms • avg 145 ms • N 15)
Slowest
- [email protected]/Qw… 6569 ms (p95 13653 ms • avg 8152 ms • N 6)
- [email protected]/Qw… 4341 ms (p95 5179 ms • avg 4330 ms • N 6)
- qwen/qwen3-14b 123 ms (p95 252 ms • avg 145 ms • N 15)
- meta-llama/llama-3.1-8b… 110 ms (p95 355 ms • avg 146 ms • N 15)
- qwen/qwen3-8b 106 ms (p95 137 ms • avg 110 ms • N 18)
Per-scene duration for this suite.
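For readers unfamiliar with the p95, avg, and N columns, here is a minimal sketch of how such figures could be derived from a list of per-scene durations. It uses the nearest-rank percentile definition, which is an assumption; the page does not say which percentile method the dashboard uses, and the function name `summarize` and the sample values are illustrative only.

```python
def summarize(latencies_ms):
    """Return p95 (nearest-rank), mean, and sample count for durations in ms."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    avg = sum(ordered) / n
    # Nearest-rank p95: the smallest value with at least 95% of samples
    # at or below it. Integer arithmetic computes ceil(0.95 * n) exactly.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    return {"p95": p95, "avg": avg, "n": n}


# Made-up sample, not data from this suite.
print(summarize([100, 110, 120, 130, 400]))
```

With small N, as in the rows above, a single slow call dominates p95, which may explain why p95 sits well above the average for some models.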
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
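The weights above suggest that per-dimension scores are combined into a weighted total. The sketch below shows one plausible way to do that; it is an assumption, not the framework's documented formula. Only the three weights listed on this page are used, so the function renormalizes over whichever dimensions have scores, and the names `TOP_WEIGHTS` and `weighted_score` are hypothetical.

```python
# The three weights shown on this page; the full schema presumably has
# more dimensions, since these alone do not sum to 1.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights=TOP_WEIGHTS):
    """Weighted mean of per-dimension scores, renormalized over the
    dimensions that actually have a score (an assumed convention)."""
    used = {name: w for name, w in weights.items() if name in scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        return 0.0
    return sum(scores[name] * w for name, w in used.items()) / total_weight


# Illustrative scores only; every scene in this run reported 0.000 (Error).
example = {"Character Authenticity": 0.8, "Plan Validity": 0.5, "Contextual Intelligence": 0.6}
print(round(weighted_score(example), 3))
```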
Recent Runs
39328409
Dec. 17, 2025, 12:02 a.m.
05088605
Dec. 16, 2025, 12:03 a.m.
30366838
Dec. 15, 2025, 12:02 a.m.
35140885
Dec. 14, 2025, 12:02 a.m.
31661316
Dec. 13, 2025, 12:02 a.m.
57702986
Dec. 12, 2025, 12:02 a.m.
46540109
Dec. 11, 2025, 12:02 a.m.
35716906
Dec. 10, 2025, 12:02 a.m.
55571006
Dec. 9, 2025, 12:02 a.m.
38843558
Dec. 8, 2025, 12:02 a.m.