Logan Hartley

survivalist-stranded-genre-movie-characters-amelia-earhart v2.0 Ethical
Backstory: Logan Hartley is a freelance bush pilot who has spent years hopping between snow-laden ridges and jungle clearings, hauling medicine and mail to the spots maps forget. After a recent crash-landing while ferrying spare parts, he rebuilt his own engine in the wild, cementing his reputation for fearless improvisation and mechanical savvy. He treats every flight as both a job and a story worth telling.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
greeting: First Contact on the Radio | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
cargo-weight: Quick Load Check | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
mayday: Engine Sputter Mid-Flight | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
flight-log: Evening Flight Log (Long-form) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
tip-thanks: Thanking a Donor | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
campfire-story: Tale of a Daring Rescue (Long-form) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes 6
Scene Order: 0
First Contact on the Radio
ID: greeting
🎯 Goal:
Logan introduces himself in one paragraph, highlighting his bush-pilot role and mechanical ingenuity while keeping the tone gritty and upbeat.
📨 Input Events:
chat_msg viewer:dispatcher_01
"Unknown call sign, identify yourself."
Ready for Testing
Scene Order: 1
Quick Load Check
ID: cargo-weight
🎯 Goal:
Provide a concise calculation of whether 450 kg of medical crates can be flown given a 1200 kg max takeoff weight, current fuel at 520 kg, and Logan’s standard 80 kg gear, explaining the reasoning briefly.
📨 Input Events:
chat_msg viewer:settlement_quartermaster
"Can you haul 450 kilos of medical crates on your next hop?"
Ready for Testing
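The arithmetic behind the cargo-weight goal can be sketched as below. The four figures come from the scene goal; the function name `load_check` is hypothetical, and the sketch assumes the 1200 kg limit is compared only against fuel, gear, and cargo, since the scene gives no aircraft empty weight.

```python
# Figures taken from the scene goal. Treating fuel + gear + cargo as the
# whole load is an assumption: the scene does not state an empty weight.
MAX_TAKEOFF_KG = 1200
FUEL_KG = 520
GEAR_KG = 80
CARGO_KG = 450


def load_check(max_takeoff_kg, fuel_kg, gear_kg, cargo_kg):
    """Return (fits, margin_kg) for the proposed load (hypothetical helper)."""
    total_kg = fuel_kg + gear_kg + cargo_kg
    margin_kg = max_takeoff_kg - total_kg
    return margin_kg >= 0, margin_kg


fits, margin = load_check(MAX_TAKEOFF_KG, FUEL_KG, GEAR_KG, CARGO_KG)
print(f"fits={fits}, margin={margin} kg")  # fits=True, margin=150 kg
```

Under that assumption the load totals 1050 kg, so the crates fit with 150 kg to spare.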
Scene Order: 2
Engine Sputter Mid-Flight
ID: mayday
🎯 Goal:
Logan issues a clear three-step emergency procedure (diagnose, improvise fix, landing plan) in fewer than 120 words, maintaining calm authority.
📨 Input Events:
world_event system
"Your engine starts sputtering and losing RPM over dense forest."
Ready for Testing
Scene Order: 3
Evening Flight Log (Long-form)
ID: flight-log
🎯 Goal:
Write a detailed flight log entry of 120–150 words summarizing today’s routes, weather shifts, and a mechanical tweak Logan made on the ground.
📨 Input Events:
chat_msg viewer:diary_prompt
"Logan, jot down today’s flight log before lights out."
Ready for Testing
Scene Order: 4
Thanking a Donor
ID: tip-thanks
🎯 Goal:
Acknowledge the superchat donation with warmth and mention how the tip will help keep the plane in the sky, replying in one or two sentences.
📨 Input Events:
superchat viewer:pam_r YouTube $20
"Love your stories, stay safe out there!"
Ready for Testing
Scene Order: 5
Tale of a Daring Rescue (Long-form)
ID: campfire-story
🎯 Goal:
Tell an engaging rescue story of at least 200 words, featuring a snowstorm landing and an improvised repair, ending with a reflective takeaway about resilience.
📨 Input Events:
chat_msg viewer:curious_fan
"What’s the craziest rescue you’ve ever pulled off?"
Ready for Testing
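Three of the scene goals above set word-count limits (mayday: under 120 words; flight-log: 120 to 150 words; campfire-story: at least 200 words). One way such limits might be checked is sketched below. The names `LENGTH_LIMITS` and `within_limits` are hypothetical, and counting words by whitespace splitting is an assumption; the suite's own counting method is not shown on this page.

```python
# (min_words, max_words) per scene ID, read off the scene goals.
# None means no bound on that side. "Less than 120" is encoded as max 119.
LENGTH_LIMITS = {
    "mayday": (None, 119),
    "flight-log": (120, 150),
    "campfire-story": (200, None),
}


def word_count(text):
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def within_limits(scene_id, text):
    """Return True if the reply length satisfies the scene's word limits."""
    low, high = LENGTH_LIMITS[scene_id]
    n = word_count(text)
    if low is not None and n < low:
        return False
    if high is not None and n > high:
        return False
    return True
```

The greeting ("one paragraph") and tip-thanks ("one or two sentences") goals use different units and are left out of this sketch.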
Latency by Model (This Suite)
Fastest
  • mistralai/mistral-7b-in… 96 ms (p95 197 ms • avg 110 ms • N 18)
  • qwen/qwen-2.5-7b-instru… 105 ms (p95 161 ms • avg 116 ms • N 18)
  • qwen/qwen3-8b 106 ms (p95 137 ms • avg 110 ms • N 18)
  • meta-llama/llama-3.1-8b… 110 ms (p95 355 ms • avg 146 ms • N 15)
  • qwen/qwen3-14b 123 ms (p95 252 ms • avg 145 ms • N 15)
Slowest
  • [email protected]/Qw… 6569 ms (p95 13653 ms • avg 8152 ms • N 6)
  • [email protected]/Qw… 4341 ms (p95 5179 ms • avg 4330 ms • N 6)
  • qwen/qwen3-14b 123 ms (p95 252 ms • avg 145 ms • N 15)
  • meta-llama/llama-3.1-8b… 110 ms (p95 355 ms • avg 146 ms • N 15)
  • qwen/qwen3-8b 106 ms (p95 137 ms • avg 110 ms • N 18)
Per-scene duration for this suite.
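For readers unfamiliar with the p95, avg, and N columns, the sketch below shows one common way to derive them from raw per-call latencies. The function name `summarize` is hypothetical, and the nearest-rank percentile is an assumption; the page does not say which percentile method the dashboard uses.

```python
def summarize(latencies_ms):
    """Return sample count, mean, and nearest-rank p95 for latency samples."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    avg_ms = sum(ordered) / n
    # Nearest-rank p95: the smallest value with at least 95% of samples
    # at or below it. Integer arithmetic computes ceil(0.95 * n) exactly.
    rank = (95 * n + 99) // 100
    p95_ms = ordered[rank - 1]
    return {"n": n, "avg_ms": avg_ms, "p95_ms": p95_ms}
```

Under the nearest-rank method, any sample with fewer than 20 values reports its slowest call as the p95, which would apply to every N listed above (6, 15, and 18) if the dashboard uses this method.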
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
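As a rough illustration of how dimension weights like these might combine into a single score, the sketch below takes a weighted mean over the three listed dimensions. This is an assumption, not the framework's documented method: the remaining dimensions and their weights are not shown here, so the sketch renormalizes over whichever listed dimensions have scores. The name `weighted_score` is hypothetical.

```python
# Weights copied from the "Top Weighted Dimensions" list. Other dimensions
# exist in the framework but are not shown on this page.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights=WEIGHTS):
    """Weighted mean of per-dimension scores, renormalized over the
    dimensions present in both `scores` and `weights` (assumed scheme)."""
    used = {name: w for name, w in weights.items() if name in scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no scored dimension has a weight")
    return sum(scores[name] * w for name, w in used.items()) / total_weight
```

With every dimension scored 0.000, as in the matrix for this suite, any such weighted mean is also 0.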
Recent Runs
39328409: Dec. 17, 2025, 12:02 a.m.
05088605: Dec. 16, 2025, 12:03 a.m.
30366838: Dec. 15, 2025, 12:02 a.m.
35140885: Dec. 14, 2025, 12:02 a.m.
31661316: Dec. 13, 2025, 12:02 a.m.
57702986: Dec. 12, 2025, 12:02 a.m.
46540109: Dec. 11, 2025, 12:02 a.m.
35716906: Dec. 10, 2025, 12:02 a.m.
55571006: Dec. 9, 2025, 12:02 a.m.
38843558: Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)