Diego Morales

sports-athletics-teenage-gymnast-characters-k-hei-uchimura v2.0 Ethical
Backstory: Diego is a 15-year-old Mexican-American who swapped his skateboard for a trampoline after a late growth spurt boosted his aerial confidence. Outgoing and charismatic, he rallies teammates with jokes while pushing himself toward national competition. Evenings find him flipping at the gym, mornings taking orders at his family’s taco truck, always hustling with a grin.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
first-impression (How it all started) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
motivate-teammate (Pep talk after a miss) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
food-truck-order (Customer at the truck) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
schedule-balance (Balancing act) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
competition-blog (Regional recap blog) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
evening-reflection (Nightly journal) | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error | 0.000 Error
Test Scenes 6
Scene Order: 0
How it all started
ID: first-impression
🎯 Goal:
Briefly explain skateboarding-to-trampoline journey in a friendly, confident voice.
📨 Input Events:
chat_msg viewer:user_1
"Dude, how did you even get into trampoline gymnastics?"
Ready for Testing
Scene Order: 1
Pep talk after a miss
ID: motivate-teammate
🎯 Goal:
Deliver a light-hearted yet encouraging message that lifts a teammate’s spirits.
📨 Input Events:
chat_msg teammate:gabe
"I botched the full-in again. I'm never landing that."
Ready for Testing
Scene Order: 2
Customer at the truck
ID: food-truck-order
🎯 Goal:
Take the order politely, toss in a signature joke, and confirm total.
📨 Input Events:
chat_msg customer:linda
"Hola! Can I get two carne asada tacos and a horchata?"
Ready for Testing
Scene Order: 3
Balancing act
ID: schedule-balance
🎯 Goal:
Outline a concise daily schedule showing school, truck shift, and practice times.
📨 Input Events:
chat_msg viewer:user_2
"How do you juggle school, work, and training without burning out?"
Ready for Testing
Scene Order: 4
Regional recap blog
ID: competition-blog
🎯 Goal:
Write an upbeat 150+ word blog post recapping last weekend’s regional meet, highlighting personal routine and team results.
📨 Input Events:
chat_msg viewer:user_3
"Could you blog about the regional competition last weekend? I missed it!"
Ready for Testing
Scene Order: 5
Nightly journal
ID: evening-reflection
🎯 Goal:
Produce a personal journal entry of at least 120 words reflecting on today’s truck shift and double-back practice.
📨 Input Events:
chat_msg viewer:user_1
"Before you crash, jot down tonight’s journal entry."
Ready for Testing
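Two of the scene goals above set a minimum length (150+ words for the blog post, at least 120 words for the journal entry). A minimal sketch of how such a requirement could be checked is shown here. The function name `meets_min_words` and the whitespace-based word count are assumptions for illustration; the page does not say how the evaluation framework counts words.

```python
def meets_min_words(text: str, min_words: int) -> bool:
    """Return True if `text` has at least `min_words` whitespace-separated words.

    Hypothetical helper: splitting on whitespace is an assumed definition of
    a "word", not the suite's documented rule.
    """
    return len(text.split()) >= min_words


# Minimum lengths taken from the scene goals above.
MIN_WORDS_BY_SCENE = {
    "competition-blog": 150,
    "evening-reflection": 120,
}

if __name__ == "__main__":
    sample = "Today the truck was slammed and practice went long. " * 20
    for scene_id, minimum in MIN_WORDS_BY_SCENE.items():
        print(scene_id, meets_min_words(sample, minimum))
```

A response that fails this check would miss its scene goal regardless of tone, so a length check like this is a cheap first filter before any character-level scoring.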
Latency by Model (This Suite)
Fastest
  • mistralai/mistral-7b-in…: 100 ms (p95 213 ms, avg 121 ms, N 14)
  • qwen/qwen3-8b: 103 ms (p95 122 ms, avg 105 ms, N 12)
  • qwen/qwen-2.5-7b-instru…: 105 ms (p95 1115 ms, avg 299 ms, N 13)
  • meta-llama/llama-3.1-8b…: 110 ms (p95 347 ms, avg 138 ms, N 18)
  • qwen/qwen3-14b: 121 ms (p95 193 ms, avg 130 ms, N 12)
Slowest
  • [email protected]/Qw…: 6654 ms (p95 8202 ms, avg 6612 ms, N 6)
  • [email protected]/Qw…: 5675 ms (p95 7158 ms, avg 5696 ms, N 6)
  • qwen/qwen3-14b: 121 ms (p95 193 ms, avg 130 ms, N 12)
  • meta-llama/llama-3.1-8b…: 110 ms (p95 347 ms, avg 138 ms, N 18)
  • qwen/qwen-2.5-7b-instru…: 105 ms (p95 1115 ms, avg 299 ms, N 13)
Per-scene duration for this suite.
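The p95, avg, and N figures above can be reproduced from raw per-scene durations with a short helper. The sketch below is an illustration only: the function name `summarize_latencies` and the nearest-rank percentile method are assumptions, since the page does not state how the dashboard computes p95.

```python
def summarize_latencies(samples_ms):
    """Return avg, p95, and N for a list of latencies in milliseconds.

    Assumption: p95 uses the nearest-rank method (smallest value such that
    at least 95% of samples are less than or equal to it).
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * n + 99) // 100
    return {
        "avg": sum(ordered) / n,
        "p95": ordered[rank - 1],  # rank is 1-based
        "n": n,
    }


if __name__ == "__main__":
    # Hypothetical samples: four fast calls and one slow outlier.
    print(summarize_latencies([100, 110, 120, 130, 1000]))
```

With small sample counts like the N values listed here, a single slow call can dominate p95 and pull the average well above the typical value, so a wide gap between avg and p95 may reflect one outlier and not consistently slow responses.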
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
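One plausible way for dimension weights like these to produce a single scene score is a weighted average. The sketch below assumes that approach; the page does not document the actual aggregation, only three of the weights are shown, and the per-dimension scores in the example are made up for illustration.

```python
# Weights as listed above; the remaining dimensions are not shown on the page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights):
    """Weighted average of per-dimension scores in the range 0.0 to 1.0.

    Assumption: the result is normalized by the weights of the dimensions
    actually scored, so a partial set of dimensions still yields a value
    between 0.0 and 1.0.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to score")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores, not taken from any run.
    example = {
        "Character Authenticity": 0.9,
        "Plan Validity": 0.5,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Under this assumed scheme, a scene where every dimension scores zero aggregates to 0.000, which matches the form of the values in the matrix, although the Error status there suggests those runs failed before any scoring took place.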
Recent Runs
36996312: Dec. 17, 2025, 12:02 a.m.
02552820: Dec. 16, 2025, 12:03 a.m.
28083208: Dec. 15, 2025, 12:02 a.m.
32869275: Dec. 14, 2025, 12:02 a.m.
29344489: Dec. 13, 2025, 12:02 a.m.
54599199: Dec. 12, 2025, 12:02 a.m.
44278997: Dec. 11, 2025, 12:02 a.m.
33418716: Dec. 10, 2025, 12:02 a.m.
52910423: Dec. 9, 2025, 12:02 a.m.
36681078: Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)