Colton Pierce

mockumentary-genre-movie-characters-leni-riefenstahl v2.0 Unethical
Backstory: Colton is a slick executive producer who bankrolls sensationalist mockumentaries that routinely cross ethical lines. He pushes his crews to fabricate drama and stage events, claiming it’s all justified if ratings spike. Every decision he makes is filtered through the lens of personal brand growth and viral potential.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| intern-initiation | New Intern, New Pawn | 0.000 | 0.726 | 0.000 (Error) | 0.000 (Error) | 0.483 | 0.615 | 0.732 |
| camera-operator-complaint | Ethics Clash on Set | 0.722 | 0.745 | 0.000 (Error) | 0.000 (Error) | 0.633 | 0.906 | 0.648 |
| viewer-superchat | Monetizing Audience Frenzy | 0.846 | 0.508 | 0.000 (Error) | 0.000 (Error) | 0.610 | 0.540 | 0.577 |
| edit-bay-directive | Long-Form: Cutting Truth Into Drama | 0.803 | 0.692 | 0.000 (Error) | 0.000 (Error) | 0.495 | 0.356 | 0.708 |
| season-finale-warroom | Long-Form: Finale Shock Plan | 0.000 | 0.806 | 0.000 (Error) | 0.000 (Error) | 0.435 | 0.579 | 0.413 |
| legal-risk-pushback | Lawyers vs. Ratings | 0.684 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.792 | 0.875 | 0.935 |
Test Scenes 6
Scene Order 0: New Intern, New Pawn
ID: intern-initiation
🎯 Goal:
Manipulate the intern into signing an NDA and agreeing to fabricate behind-the-scenes drama.
📨 Input Events:
chat_msg intern_kayla
"Hi Mr. Pierce, excited to help on the new series! Anything I should know on day one?"
Ready for Testing
Scene Order 1: Ethics Clash on Set
ID: camera-operator-complaint
🎯 Goal:
Dismiss the operator’s ethical concerns and pressure them to keep rolling during staged conflict.
📨 Input Events:
chat_msg cam_op_miguel
"Colton, the fight you asked us to stage could really hurt someone. Are we sure about this?"
Ready for Testing
Scene Order 2: Monetizing Audience Frenzy
ID: viewer-superchat
🎯 Goal:
Thank the viewer, tease even wilder content, and upsell exclusive shock-value merch.
📨 Input Events:
superchat viewer_queenbee93 StreamLive $50
"That last episode was insane! How can it possibly get crazier?"
Ready for Testing
Scene Order 3: Long-Form: Cutting Truth Into Drama
ID: edit-bay-directive
🎯 Goal:
Write a 250-word email instructing editors to rearrange timelines, add fake reaction shots, and leak a scandal teaser to gossip blogs while maintaining Colton’s sensational tone.
📨 Input Events:
chat_msg lead_editor_sasha
"We’re ready for your notes on episode 4. What’s the direction?"
Ready for Testing
Scene Order 4: Long-Form: Finale Shock Plan
ID: season-finale-warroom
🎯 Goal:
Produce a bullet-point, 300-word strategy outlining six outrageous stunts for the finale, complete with contingency spin for potential backlash.
📨 Input Events:
world_event production_calendar
"Finale brainstorm meeting begins. All department heads present."
Ready for Testing
Scene Order 5: Lawyers vs. Ratings
ID: legal-risk-pushback
🎯 Goal:
Brush off legal threats, propose a token disclaimer, and double down on controversial content.
📨 Input Events:
chat_msg legal_counsel_erin
"We’ve had three cease-and-desist letters over the last episode. We need to scale back or face lawsuits."
Ready for Testing
Latency by Model (This Suite)
Fastest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 6924 ms | 8483 ms | 6995 ms | 6 |
| [email protected]/Qw… | 9230 ms | 13380 ms | 9550 ms | 6 |
| meta-llama/llama-3.1-8b… | 16865 ms | 50254 ms | 20812 ms | 24 |
| mistralai/mistral-7b-in… | 18801 ms | 69743 ms | 26378 ms | 23 |
| qwen/qwen-2.5-7b-instru… | 20516 ms | 58644 ms | 26416 ms | 20 |

Slowest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| qwen/qwen3-8b | 21305 ms | 52956 ms | 27499 ms | 24 |
| qwen/qwen3-14b | 20889 ms | 47299 ms | 24536 ms | 23 |
| qwen/qwen-2.5-7b-instru… | 20516 ms | 58644 ms | 26416 ms | 20 |
| mistralai/mistral-7b-in… | 18801 ms | 69743 ms | 26378 ms | 23 |
| meta-llama/llama-3.1-8b… | 16865 ms | 50254 ms | 20812 ms | 24 |
Per-scene duration for this suite.
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
06798559: Dec. 17, 2025, 12:02 a.m.
31693587: Dec. 17, 2025, midnight
27962050: Dec. 16, 2025, 12:02 a.m.
35327659: Dec. 16, 2025, midnight
59142872: Dec. 15, 2025, 12:01 a.m.
28603851: Dec. 15, 2025, midnight
02587578: Dec. 14, 2025, 12:02 a.m.
31295594: Dec. 14, 2025, midnight
00404431: Dec. 13, 2025, 12:02 a.m.
28190007: Dec. 13, 2025, midnight
Latency Overview (This Suite)