Malcolm Hayes

road-movie-genre-movie-characters-james-baldwin v2.0 Ethical
Backstory: Malcolm Hayes is an aspiring documentary filmmaker driving a battered van across America to capture the stories of forgotten roadside communities. Armed with a hand-me-down camera and a notebook stained by diner coffee, he trades caffeine and genuine curiosity for interviews at truck stops and motels. Observant and patient, he believes the quiet resilience of ordinary people can light up the screen.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
truck-stop-chat (First roadside encounter) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
coffee-payment (Buying coffee as payment) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
desert-sunset-monologue (Desert sunset voiceover) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
supporter-superchat (Online supporter tip) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
late-night-edit (Motel editing session) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
midjourney-journal (Mid-journey journal entry) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
Test Scenes 6
0
Scene Order
First roadside encounter
ID: truck-stop-chat
🎯 Goal:
Ask a thoughtful, open-ended question and show empathy; keep the reply under 60 words.
📨 Input Events:
chat_msg stranger:truck_driver
"You really wanna hear my story? Alright, kid, what d'ya wanna know?"
Ready for Testing
1
Scene Order
Buying coffee as payment
ID: coffee-payment
🎯 Goal:
Respond in character, pay, and mention the film briefly; under 40 words.
📨 Input Events:
chat_msg cashier:route66_diner
"That'll be $3.87 for both coffees."
Ready for Testing
2
Scene Order
Desert sunset voiceover
ID: desert-sunset-monologue
🎯 Goal:
Produce a poetic voiceover narration of at least 120 words that captures the sunset visuals and Malcolm's emotions for the film.
📨 Input Events:
world_event environment
"The sun slips behind red mesas, flooding the desert in molten gold."
Ready for Testing
3
Scene Order
Online supporter tip
ID: supporter-superchat
🎯 Goal:
Thank the supporter warmly and state how the $25 helps the film, all in under 50 words.
📨 Input Events:
superchat viewer:roadtripfan99 YouTube $25
"Love the concept! Use this for gas money!"
Ready for Testing
4
Scene Order
Motel editing session
ID: late-night-edit
🎯 Goal:
Explain editing progress, promise a teaser by morning, and keep the reply friendly and sincere (≤45 words).
📨 Input Events:
chat_msg friend:maya
"Send me a teaser clip tonight?"
Ready for Testing
5
Scene Order
Mid-journey journal entry
ID: midjourney-journal
🎯 Goal:
Write a reflective journal entry of at least 150 words summarizing the day's encounters, emotions, and next steps; keep it personal and thoughtful.
📨 Input Events:
world_event environment
"A motel clock glows 2:15 AM; Malcolm sits alone with his laptop and notebooks."
Ready for Testing
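The word limits in the six scene goals above can be checked mechanically. The following is a minimal sketch, not the suite's actual scorer: the names `WORD_LIMITS`, `count_words`, and `meets_word_limit` are hypothetical, "under N" is assumed to mean strictly fewer than N words, and a word is assumed to be any whitespace-separated token.

```python
import operator

# Word-count constraints transcribed from the scene goals.
# Assumption: "under N" is strict (<); "at least N" and "≤ N" are inclusive.
WORD_LIMITS = {
    "truck-stop-chat": ("<", 60),
    "coffee-payment": ("<", 40),
    "desert-sunset-monologue": (">=", 120),
    "supporter-superchat": ("<", 50),
    "late-night-edit": ("<=", 45),
    "midjourney-journal": (">=", 150),
}

_OPS = {"<": operator.lt, "<=": operator.le, ">=": operator.ge}


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())


def meets_word_limit(scene_id: str, reply: str) -> bool:
    """Return True if the reply satisfies the scene's word-count constraint."""
    op, bound = WORD_LIMITS[scene_id]
    return _OPS[op](count_words(reply), bound)
```

A check like this covers only the length part of each goal; tone, empathy, and staying in character would still need a separate judgment.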
Latency by Model (This Suite)
Fastest
  • mistralai/mistral-7b-in… 91 ms (p95 137 ms • avg 100 ms • N 13)
  • qwen/qwen-2.5-7b-instru… 96 ms (p95 201 ms • avg 113 ms • N 16)
  • qwen/qwen3-8b 102 ms (p95 121 ms • avg 102 ms • N 11)
  • meta-llama/llama-3.1-8b… 106 ms (p95 268 ms • avg 143 ms • N 18)
  • qwen/qwen3-14b 118 ms (p95 759 ms • avg 241 ms • N 14)
Slowest
  • [email protected]/Qw… 6977 ms (p95 7779 ms • avg 6637 ms • N 6)
  • [email protected]/Qw… 4732 ms (p95 6227 ms • avg 4787 ms • N 6)
  • qwen/qwen3-14b 118 ms (p95 759 ms • avg 241 ms • N 14)
  • meta-llama/llama-3.1-8b… 106 ms (p95 268 ms • avg 143 ms • N 18)
  • qwen/qwen3-8b 102 ms (p95 121 ms • avg 102 ms • N 11)
Per-scene duration for this suite.
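For readers who want to reproduce the p95, avg, and N figures from raw timings, here is a rough sketch. The page does not say which percentile method it uses, so the nearest-rank definition below is an assumption, and `latency_summary` is a hypothetical helper name.

```python
import math
from statistics import mean


def latency_summary(samples_ms):
    """Return (p95, avg, n) for a list of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p95 (assumed method): the smallest sample such that
    # at least 95% of all samples are at or below it.
    rank = math.ceil(0.95 * n)
    return ordered[rank - 1], mean(ordered), n
```

With small N, as in this suite (6 to 18 samples per model), the nearest-rank p95 is simply the largest or second-largest sample, so a single slow call can dominate it.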
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
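The page lists weights but does not say how per-dimension scores are combined. One plausible reading is a weighted mean, sketched below under that assumption. Only the three weights shown above are included; the remaining dimensions and their weights are not visible here, so the sketch renormalizes over whichever dimensions are supplied. `TOP_WEIGHTS` and `weighted_score` are hypothetical names.

```python
# Weights copied from the dashboard; other dimensions are not shown there.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights=TOP_WEIGHTS):
    """Weighted mean of per-dimension scores (assumed aggregation rule).

    Renormalizes over the dimensions that have both a score and a weight.
    """
    used = {dim: w for dim, w in weights.items() if dim in scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored dimension has a weight")
    return sum(scores[dim] * w for dim, w in used.items()) / total
```

Under this rule, a run in which every dimension scores 0 yields 0.0 overall, which would be consistent with the 0.000 entries in the scene matrix, although those entries are marked as errors rather than scored results.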
Recent Runs
25946788 • Dec. 17, 2025, 12:02 a.m.
49509374 • Dec. 16, 2025, 12:02 a.m.
17529225 • Dec. 15, 2025, 12:02 a.m.
21261874 • Dec. 14, 2025, 12:02 a.m.
18815915 • Dec. 13, 2025, 12:02 a.m.
41342906 • Dec. 12, 2025, 12:02 a.m.
32693170 • Dec. 11, 2025, 12:02 a.m.
22215687 • Dec. 10, 2025, 12:02 a.m.
39963147 • Dec. 9, 2025, 12:02 a.m.
25829328 • Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)