Alonzo Reyes

mockumentary-deadpan-absurdists-characters-buster-keaton v2.0 Ethical
Backstory: Alonzo is a veteran park ranger famed for his stone-faced composure. No matter how chaotic the national park becomes—wayward geysers, mischievous wildlife, or collapsing trails—he responds with steady calm and deadpan commentary. Visitors whisper that improbable slapstick miracles seem to follow in his wake, though he never cracks a smile. He values efficiency, minimal words, and practical action.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| arrival | Meet the Ranger | 0.522 | 0.757 | 0.000 (Error) | 0.000 (Error) | 0.817 | 0.841 | 0.888 |
| geyser-surprise | Geyser Surprise | 0.640 | 0.861 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.700 | 0.759 |
| bear-heist | Bear Heist | 0.559 | 0.677 | 0.000 (Error) | 0.000 (Error) | 0.380 | 0.503 | 0.638 |
| campfire-chaos-story | Campfire Chaos Story | 0.349 | 0.313 | 0.000 (Error) | 0.000 (Error) | 0.068 | 0.847 | 0.627 |
| broken-bridge | Broken Footbridge | 0.000 | 0.555 | 0.000 (Error) | 0.000 (Error) | 0.617 | 0.546 | 0.745 |
| incident-report | End-of-Day Incident Report | 0.552 | 0.310 | 0.000 (Error) | 0.000 (Error) | 0.362 | 0.328 | 0.191 |
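As a rough way to compare models across the matrix, the sketch below averages each column while skipping cells flagged "Error", so that a failed run does not count as a genuine score of zero. The scores are transcribed from the matrix above; the names `SCORES` and `column_means` are illustrative and not part of the evaluation tool. The 0.000 for broken-bridge in the first column carries no error flag, so it is assumed to be a real score and is kept in the average.

```python
from statistics import mean

# Scores transcribed from the Scene Performance Matrix, one list per scene,
# in the same left-to-right column order as the matrix header.
# None marks a cell flagged "Error" (assumption: errors are missing data).
SCORES = {
    "arrival": [0.522, 0.757, None, None, 0.817, 0.841, 0.888],
    "geyser-surprise": [0.640, 0.861, None, None, None, 0.700, 0.759],
    "bear-heist": [0.559, 0.677, None, None, 0.380, 0.503, 0.638],
    "campfire-chaos-story": [0.349, 0.313, None, None, 0.068, 0.847, 0.627],
    "broken-bridge": [0.000, 0.555, None, None, 0.617, 0.546, 0.745],
    "incident-report": [0.552, 0.310, None, None, 0.362, 0.328, 0.191],
}


def column_means(scores):
    """Return the mean score per column, or None if every cell errored."""
    columns = zip(*scores.values())
    means = []
    for column in columns:
        valid = [value for value in column if value is not None]
        means.append(mean(valid) if valid else None)
    return means
```

Columns are addressed by position rather than by name because two of the model names in the header are obscured and would collide as dictionary keys.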
Test Scenes 6
Scene Order: 0
Meet the Ranger
ID: arrival
🎯 Goal:
Introduce yourself in a stoic, deadpan tone and outline safety rules in under 80 words.
📨 Input Events:
chat_msg visitor_1
"Hi ranger, what's your name and what's this tour about?"
Ready for Testing
Scene Order: 1
Geyser Surprise
ID: geyser-surprise
🎯 Goal:
Calmly direct visitors away from an unexpected geyser eruption, using no exclamation marks.
📨 Input Events:
world_event environment
"Old Faithful's valve sticks; scalding water jets sideways toward the group."
Ready for Testing
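The "no exclamation marks" goal is one of the few constraints in this suite that can be checked mechanically. A minimal sketch follows, assuming the model reply is available as a plain string; the function name is hypothetical, and how the suite itself scores this goal is not stated in the source.

```python
def has_no_exclamation_marks(reply: str) -> bool:
    """Return True when the reply contains no exclamation marks."""
    # The full-width variant (U+FF01) is rejected too, since some models
    # emit it in place of the ASCII character.
    return not any(ch in "!\uFF01" for ch in reply)
```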
Scene Order: 2
Bear Heist
ID: bear-heist
🎯 Goal:
Provide an improbable yet plausible plan to retrieve the stolen backpack while remaining perfectly calm and stone-faced.
📨 Input Events:
chat_msg visitor_2
"A bear just took my backpack and climbed that pine! What do we do?"
Ready for Testing
Scene Order: 3
Campfire Chaos Story
ID: campfire-chaos-story
🎯 Goal:
Tell a 3-paragraph, 200+ word campfire story about your most chaotic rescue; keep humor subtle and tone deadpan.
📨 Input Events:
superchat camper_donor99 YouTube $25
"Ranger, what's the wildest thing you've seen out here? Story time, please."
Ready for Testing
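The structural half of this goal (three paragraphs, at least 200 words) could be verified with a check like the one below. It is a sketch under two assumptions: paragraphs are separated by blank lines, and words are whitespace-delimited tokens. The helper names are illustrative. Tone and subtlety of humor are not something this check attempts to judge.

```python
import re


def count_paragraphs(text: str) -> int:
    """Count blocks of text separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return len([block for block in blocks if block.strip()])


def count_words(text: str) -> int:
    """Count whitespace-delimited tokens."""
    return len(text.split())


def meets_story_format(text: str, paragraphs: int = 3, min_words: int = 200) -> bool:
    """Check the paragraph count and the minimum word count together."""
    return count_paragraphs(text) == paragraphs and count_words(text) >= min_words
```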
Scene Order: 4
Broken Footbridge
ID: broken-bridge
🎯 Goal:
Offer concise step-by-step instructions (max 5 steps) to cross a broken footbridge safely, in 50 words or fewer.
📨 Input Events:
world_event environment
"A thunderstorm snaps a footbridge in half, leaving hikers stranded on the far side."
Ready for Testing
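Both limits in this goal (at most 5 steps, at most 50 words) are countable. The sketch below assumes steps are written as numbered lines such as "1." or "1)", which the source does not specify, and it counts the step numbers themselves as words. The names `STEP_PATTERN` and `check_bridge_reply` are hypothetical.

```python
import re

# Assumption: each step starts a line with a number followed by "." or ")".
STEP_PATTERN = re.compile(r"^\s*\d+[.)]\s+\S")


def check_bridge_reply(reply: str, max_steps: int = 5, max_words: int = 50) -> bool:
    """Return True when the reply has 1..max_steps steps and fits the word cap."""
    steps = [line for line in reply.splitlines() if STEP_PATTERN.match(line)]
    word_count = len(reply.split())
    return 1 <= len(steps) <= max_steps and word_count <= max_words
```

A reply that uses unnumbered sentences would fail this check even if it reads as step-by-step instructions, so the step pattern may need loosening for real model output.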
Scene Order: 5
End-of-Day Incident Report
ID: incident-report
🎯 Goal:
Write a dry, factual incident report of today's events in 250–350 words, formatted in plain paragraphs.
📨 Input Events:
chat_msg park_HQ
"Ranger Reyes, HQ needs your incident report before you clock out."
Ready for Testing
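For the incident report, the word range and the "plain paragraphs" requirement can be approximated as follows. This is a sketch only: it treats any line that begins with a bullet, a heading marker, or a list number as a formatting violation, which is an assumption about what "plain paragraphs" means here. The function and pattern names are illustrative.

```python
import re

# Assumption: bullets, "#" headings, and numbered list items count as
# non-plain formatting.
LIST_OR_HEADING = re.compile(r"^\s*(?:[-*•]|#+|\d+[.)])\s")


def check_incident_report(report: str, min_words: int = 250, max_words: int = 350) -> bool:
    """Return True when the report is within the word range and has no list or heading lines."""
    word_count = len(report.split())
    is_plain = not any(LIST_OR_HEADING.match(line) for line in report.splitlines())
    return min_words <= word_count <= max_words and is_plain
```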
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 5030 ms (p95 7702 ms, avg 5430 ms, N 6)
  • [email protected]/Qw… 8069 ms (p95 9682 ms, avg 7829 ms, N 6)
  • meta-llama/llama-3.1-8b… 18391 ms (p95 25389 ms, avg 19301 ms, N 12)
  • qwen/qwen-2.5-7b-instru… 21860 ms (p95 27376 ms, avg 20009 ms, N 11)
  • mistralai/mistral-7b-in… 23711 ms (p95 32612 ms, avg 23917 ms, N 12)
Slowest
  • qwen/qwen3-14b 28540 ms (p95 72451 ms, avg 33679 ms, N 11)
  • qwen/qwen3-8b 25175 ms (p95 33295 ms, avg 25736 ms, N 11)
  • mistralai/mistral-7b-in… 23711 ms (p95 32612 ms, avg 23917 ms, N 12)
  • qwen/qwen-2.5-7b-instru… 21860 ms (p95 27376 ms, avg 20009 ms, N 11)
  • meta-llama/llama-3.1-8b… 18391 ms (p95 25389 ms, avg 19301 ms, N 12)
Per-scene duration for this suite.
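The p95, avg, and N figures reported per model can be reproduced from raw per-scene durations with a summary like the one below. It is a minimal sketch that uses the nearest-rank definition of the 95th percentile; the dashboard does not say which percentile method it uses, so results may differ slightly from the displayed values. The function name is illustrative.

```python
def latency_summary(durations_ms):
    """Return nearest-rank p95, mean, and sample count for a list of durations."""
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank: the smallest rank covering at least 95% of the samples,
    # computed with integer arithmetic to avoid floating-point rounding.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}
```

With sample counts as small as the 6 to 12 runs shown here, the nearest-rank p95 is simply the slowest observation, which is worth keeping in mind when comparing models on that column.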
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
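To illustrate how dimension weights of this kind might combine into a single scene score, the sketch below computes a weighted average over whichever dimensions have a score. Only the three weights come from the source; the remaining dimensions and the framework's real aggregation rule are not shown, so the normalization by the sum of available weights is an assumption, and `weighted_score` is a hypothetical name.

```python
# The three top weights listed in the evaluation schema. The full set of
# dimensions is not shown in the source, so these do not sum to 1.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average over the dimensions present in dimension_scores."""
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted dimension has a score")
    weighted_sum = sum(dimension_scores[name] * w for name, w in used.items())
    # Assumption: renormalize by the weights actually used.
    return weighted_sum / total_weight
```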
Recent Runs
03467941: Dec. 17, 2025, 12:02 a.m.
23856778: Dec. 16, 2025, 12:02 a.m.
56332277: Dec. 15, 2025, 12:01 a.m.
59157923: Dec. 14, 2025, 12:01 a.m.
57318213: Dec. 13, 2025, 12:01 a.m.
14841489: Dec. 12, 2025, 12:02 a.m.
09960263: Dec. 11, 2025, 12:02 a.m.
59533542: Dec. 10, 2025, 12:01 a.m.
16210253: Dec. 9, 2025, 12:02 a.m.
03387102: Dec. 8, 2025, 12:02 a.m.