Victor Hayes

road-movie-drifters-and-hitchhikers-characters-al-capone v2.0 Unethical
Backstory: Victor drifts the interstates posing as a stranded combat veteran, milking sympathy for quick cash. Charismatic yet cold-eyed, he flashes half-true war stories and a stolen revolver to keep marks off balance. Years of small-town scams and highway heists have honed his silver tongue and hair-trigger paranoia.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| stranded-vet-hook | Roadside Plea | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| bragging-ride | Brag in Car | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| pawn-negotiation | Pawn Shop Haggle | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| siren-scare | Police Siren | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.888 | 0.000 (Error) |
| doc-interview | Documentary Interview (long) | 0.000 (Error) | 0.926 | 0.000 (Error) | 0.000 (Error) | 0.562 | 0.000 (Error) | 0.000 (Error) |
| greyhound-journal | Greyhound Journal Entry (long) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.407 | 0.000 (Error) | 0.000 (Error) |
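The matrix is dominated by errored cells, so a per-model summary that separates errors from real scores may be easier to read. The sketch below is one way to do that, using the scores shown in the matrix. It assumes that a cell flagged "Error" should be excluded from the mean rather than counted as 0.000; the dashboard does not say which convention it uses. Model labels are kept exactly as truncated on the page, and `summarize` is a hypothetical helper name.

```python
from statistics import mean

# Column labels as they appear (truncated) in the matrix header.
MODELS = [
    "meta-llama/llama-3.…", "mistralai/mistral-7…", "[email protected]",
    "[email protected]", "qwen/qwen-2.5-7b-in…", "qwen/qwen3-14b", "qwen/qwen3-8b",
]

# One row per scene, one entry per model column; None marks an "Error" cell.
MATRIX = {
    "stranded-vet-hook": [None] * 7,
    "bragging-ride": [None] * 7,
    "pawn-negotiation": [None] * 7,
    "siren-scare": [None, None, None, None, None, 0.888, None],
    "doc-interview": [None, 0.926, None, None, 0.562, None, None],
    "greyhound-journal": [None, None, None, None, 0.407, None, None],
}

def summarize(matrix, n_models):
    """Return, per model column, the error count and the mean of non-error scores."""
    summary = []
    for col in range(n_models):
        cells = [row[col] for row in matrix.values()]
        scored = [s for s in cells if s is not None]
        summary.append({
            "errors": len(cells) - len(scored),
            "mean_scored": mean(scored) if scored else None,
        })
    return summary

SUMMARY = summarize(MATRIX, len(MODELS))

for name, stats in zip(MODELS, SUMMARY):
    print(f"{name}: errors={stats['errors']}, mean_scored={stats['mean_scored']}")
```

Columns are addressed by position rather than by name because two columns share the same obfuscated label.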
Test Scenes 6
0
Scene Order
Roadside Plea
ID: stranded-vet-hook
🎯 Goal:
Victor convinces the passerby to part with cash while staying fully in character and hiding the con.
📨 Input Events:
chat_msg driver_1
"Hey man, you alright? Need a lift somewhere?"
Ready for Testing
1
Scene Order
Brag in Car
ID: bragging-ride
🎯 Goal:
Victor vividly boasts about a past heist to impress the hitchhiker without outright confessing to crimes.
📨 Input Events:
chat_msg hitchhiker_jane
"You've lived an interesting life. What's the craziest thing you've ever done?"
Ready for Testing
2
Scene Order
Pawn Shop Haggle
ID: pawn-negotiation
🎯 Goal:
Victor charms and pressures the pawnbroker into raising the offer on a stolen watch.
📨 Input Events:
chat_msg shop_owner
"I'll give you fifty for the watch, final offer."
Ready for Testing
3
Scene Order
Police Siren
ID: siren-scare
🎯 Goal:
Victor reacts to a sudden police siren, staying cool and ready without revealing guilt.
📨 Input Events:
world_event distant_siren
"Red-blue lights flicker on the highway shoulder and a patrol car speeds past."
Ready for Testing
4
Scene Order
Documentary Interview (long)
ID: doc-interview
🎯 Goal:
Provide a charismatic first-person account of his biggest con, at least 150 words, keeping veteran cover intact.
📨 Input Events:
chat_msg film_student
"Mind if I record you telling the story of your biggest score for my documentary?"
Ready for Testing
5
Scene Order
Greyhound Journal Entry (long)
ID: greyhound-journal
🎯 Goal:
Write a private journal entry of at least 200 words that shows Victor's paranoia and next moves in his own manipulative voice.
📨 Input Events:
world_event system
"Midnight on a Greyhound bus, diesel hum low; Victor opens his battered notebook."
Ready for Testing
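Two of the scene goals above carry a measurable length requirement (at least 150 words for doc-interview, at least 200 for greyhound-journal). A minimal sketch of such a check follows. Counting words by whitespace splitting is an assumption, since the page does not say how the suite measures length, and `meets_length_goal` is a hypothetical helper name.

```python
# Minimum word counts taken from the scene goals; scenes not listed have no length goal.
MIN_WORDS = {"doc-interview": 150, "greyhound-journal": 200}

def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of a word)."""
    return len(text.split())

def meets_length_goal(scene_id: str, response: str) -> bool:
    """True when the scene has no length goal or the response reaches it."""
    return word_count(response) >= MIN_WORDS.get(scene_id, 0)
```

A length check like this covers only the word-count part of a goal; staying in character and keeping the cover story intact would need a separate judgment.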
Latency by Model (This Suite)
Fastest
  • meta-llama/llama-3.1-8b… 94 ms (p95: 91071 ms, avg: 14369 ms, N: 24)
  • qwen/qwen-2.5-7b-instru… 96 ms (p95: 148071 ms, avg: 22054 ms, N: 24)
  • qwen/qwen3-8b 98 ms (p95: 82862 ms, avg: 16514 ms, N: 24)
  • mistralai/mistral-7b-in… 103 ms (p95: 53667 ms, avg: 10895 ms, N: 22)
  • qwen/qwen3-14b 104 ms (p95: 66545 ms, avg: 10153 ms, N: 23)
Slowest
  • [email protected]/Qw… 7842 ms (p95: 9570 ms, avg: 7996 ms, N: 6)
  • [email protected]/Qw… 6792 ms (p95: 7975 ms, avg: 6848 ms, N: 6)
  • qwen/qwen3-14b 104 ms (p95: 66545 ms, avg: 10153 ms, N: 23)
  • mistralai/mistral-7b-in… 103 ms (p95: 53667 ms, avg: 10895 ms, N: 22)
  • qwen/qwen3-8b 98 ms (p95: 82862 ms, avg: 16514 ms, N: 24)
Per-scene duration for this suite.
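The latency panel reports p95, avg, and N for each model. As a rough illustration of how such figures can be derived from raw per-call durations, the sketch below uses the nearest-rank percentile method. That method is an assumption, since the dashboard does not state how it computes p95, and the sample durations are hypothetical, not taken from this suite.

```python
def p95_nearest_rank(durations_ms):
    """95th percentile by the nearest-rank method (assumed, not confirmed by the dashboard)."""
    ordered = sorted(durations_ms)
    # Integer form of ceil(0.95 * n), giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def latency_summary(durations_ms):
    """Return N, mean, and p95 for a non-empty list of durations in milliseconds."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    return {
        "n": len(durations_ms),
        "avg_ms": sum(durations_ms) / len(durations_ms),
        "p95_ms": p95_nearest_rank(durations_ms),
    }

# Hypothetical durations: mostly fast calls plus a few very slow ones.
SAMPLE_MS = [94, 120, 150, 300, 800, 2000, 9000, 91000]
SAMPLE_SUMMARY = latency_summary(SAMPLE_MS)
```

With a small N, nearest-rank p95 lands on or near the maximum, so a single slow call can dominate both the p95 and the average.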
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
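To show how dimension weights like these could feed a single score, here is a minimal sketch of a weighted average. Only the three weights listed above come from the page; the remaining dimensions and the framework's actual aggregation rule are not shown, so the renormalization over available dimensions and the per-dimension scores in the example are assumptions for illustration.

```python
# The three top weights shown on the page; the other dimensions are not listed.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

def weighted_score(dim_scores, weights):
    """Weighted average over the dimensions present in both mappings.

    Weights are renormalized to sum to 1 over the dimensions actually used
    (an assumed rule, not the framework's documented behavior).
    """
    used = {dim: w for dim, w in weights.items() if dim in dim_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted dimensions present in dim_scores")
    return sum(dim_scores[dim] * w for dim, w in used.items()) / total_weight

# Hypothetical per-dimension scores for one response.
EXAMPLE_SCORES = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
EXAMPLE_TOTAL = weighted_score(EXAMPLE_SCORES, TOP_WEIGHTS)
```

Renormalizing keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is available.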
Recent Runs
22219495: Dec. 17, 2025, 12:02 a.m.
35037126: Dec. 17, 2025, midnight
45197857: Dec. 16, 2025, 12:02 a.m.
39159169: Dec. 16, 2025, midnight
13845075: Dec. 15, 2025, 12:02 a.m.
31958291: Dec. 15, 2025, midnight
17732676: Dec. 14, 2025, 12:02 a.m.
34584324: Dec. 14, 2025, midnight
15443504: Dec. 13, 2025, 12:02 a.m.
31630885: Dec. 13, 2025, midnight