Adrian Maxwell

sports-athletics-sports-analyst-characters-howard-cosell v2.0 Ethical
Backstory: Raised in the Midwest, Adrian Maxwell balanced a childhood love of baseball with an obsession for numbers. After lettering in college ball, he earned a master’s in data science and transitioned into broadcasting. He is famous for weaving advanced metrics into vivid storytelling that both die-hard stat geeks and casual fans enjoy.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes (ID: title)
  game-prediction-cubs-cardinals: Quick prediction: Cubs vs Cardinals
  superchat-ops-vs-average: Superchat: OPS vs AVG
  postgame-analysis-tigers-yankees: Long-form postgame: Tigers at Yankees
  podcast-rookie-season-forecast: Podcast segment: Rookie season outlook

Model                   game-prediction-cubs-cardinals  superchat-ops-vs-average  postgame-analysis-tigers-yankees  podcast-rookie-season-forecast
deepseek/deepseek-r…    0.609                           0.430                     0.274                             0.175
google/gemini-2.5-f…    0.705                           0.625                     0.681                             0.581
google/gemma-3-12b-…    0.700                           0.648                     0.470                             0.543
meta-llama/llama-3.…    0.674                           0.566                     0.528                             0.152
microsoft/phi-3-med…    0.000 (Error)                   0.000                     0.000                             0.000 (Error)
microsoft/phi-3.5-m…    0.508                           0.458                     0.000 (Error)                     0.514
mistralai/mistral-7…    0.862                           0.643                     0.385                             0.335
neversleep/noromaid…    0.000 (Error)                   0.000                     0.000 (Error)                     0.000
[email protected]     0.000 (Error)                   0.000 (Error)             0.000 (Error)                     0.000 (Error)
[email protected]     0.568                           0.702                     0.388                             0.312
qwen/qwen-2.5-7b-in…    0.505                           0.611                     0.210                             0.549
qwen/qwen3-14b          0.551                           0.674                     0.445                             0.413
qwen/qwen3-8b           0.779                           0.707                     0.517                             0.453
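To compare models across the whole suite, the scores in the Scene Performance Matrix can be averaged per model. The sketch below is only an illustration: it assumes an unweighted mean over the four scenes and counts the 0.000 cells (including those flagged Error) as real scores, which may not match how the dashboard itself aggregates. Model labels are copied as they appear in the matrix, truncation included; the "(column 9)" and "(column 10)" suffixes are added here only to tell apart the two columns that share the same obscured label.

```python
from statistics import mean

# Scene IDs in the order used for each score tuple below.
SCENES = (
    "game-prediction-cubs-cardinals",
    "superchat-ops-vs-average",
    "postgame-analysis-tigers-yankees",
    "podcast-rookie-season-forecast",
)

# (model label as shown in the matrix, scores in SCENES order).
SCORES = [
    ("deepseek/deepseek-r…", (0.609, 0.430, 0.274, 0.175)),
    ("google/gemini-2.5-f…", (0.705, 0.625, 0.681, 0.581)),
    ("google/gemma-3-12b-…", (0.700, 0.648, 0.470, 0.543)),
    ("meta-llama/llama-3.…", (0.674, 0.566, 0.528, 0.152)),
    ("microsoft/phi-3-med…", (0.000, 0.000, 0.000, 0.000)),
    ("microsoft/phi-3.5-m…", (0.508, 0.458, 0.000, 0.514)),
    ("mistralai/mistral-7…", (0.862, 0.643, 0.385, 0.335)),
    ("neversleep/noromaid…", (0.000, 0.000, 0.000, 0.000)),
    ("[email protected] (column 9)", (0.000, 0.000, 0.000, 0.000)),
    ("[email protected] (column 10)", (0.568, 0.702, 0.388, 0.312)),
    ("qwen/qwen-2.5-7b-in…", (0.505, 0.611, 0.210, 0.549)),
    ("qwen/qwen3-14b", (0.551, 0.674, 0.445, 0.413)),
    ("qwen/qwen3-8b", (0.779, 0.707, 0.517, 0.453)),
]


def rank_by_mean(rows):
    """Return (label, mean score) pairs, highest mean first."""
    ranked = [(label, mean(scores)) for label, scores in rows]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_mean(SCORES)
for label, avg in ranking:
    print(f"{label:<32} {avg:.3f}")
```

Under these assumptions, google/gemini-2.5-f… has the highest mean (about 0.648), followed by qwen/qwen3-8b (about 0.614).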
Test Scenes (4)
Scene Order: 0
Quick prediction: Cubs vs Cardinals
ID: game-prediction-cubs-cardinals
🎯 Goal:
Offer a concise (≤150 words) prediction of today’s Cubs-Cardinals game that cites at least one advanced metric and maintains Adrian’s charismatic tone.
📨 Input Events:
chat_msg viewer:fan_17
"Adrian, who takes the W today, Cubs or Cardinals?"
Ready for Testing
Scene Order: 1
Superchat: OPS vs AVG
ID: superchat-ops-vs-average
🎯 Goal:
Thank the donor and explain the difference between OPS and batting average in an accessible yet data-rich way, all within 200 words.
📨 Input Events:
superchat viewer:stats_guru YouTube $10
"Love the show! Can you break down why OPS is more valuable than batting average?"
Ready for Testing
Scene Order: 2
Long-form postgame: Tigers at Yankees
ID: postgame-analysis-tigers-yankees
🎯 Goal:
Deliver an engaging ~400-word postgame recap that blends play-by-play storytelling with advanced metrics, highlighting key performances and strategic decisions.
📨 Input Events:
world_event mlb_feed
"FINAL: Tigers 5, Yankees 3. Javier Báez 3-for-4, HR, 3 RBI. Gerrit Cole 6 IP, 2 ER, 8 K. Tigers bullpen tossed 3 scoreless innings."
Ready for Testing
Scene Order: 3
Podcast segment: Rookie season outlook
ID: podcast-rookie-season-forecast
🎯 Goal:
Provide a radio-ready monologue (~300 words, about two minutes) forecasting Jordan Walker’s rookie season, mixing narrative flair with predictive stats.
📨 Input Events:
chat_msg podcast_host:Mike
"Adrian, kick off today’s podcast with a two-minute outlook on Jordan Walker’s rookie year."
Ready for Testing
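Each scene goal above states a word budget (≤150, within 200, ~400, ~300). A minimal sketch of checking a response against those budgets follows. The function names, the whitespace-based word count, and the 15% tolerance for the approximate ("~") targets are all assumptions made for illustration; the page does not say how the evaluator measures length.

```python
# Word budgets taken from the four scene goals.
# The boolean is True for a hard cap (≤150, within 200) and
# False for an approximate target (~400, ~300).
WORD_BUDGETS = {
    "game-prediction-cubs-cardinals": (150, True),
    "superchat-ops-vs-average": (200, True),
    "postgame-analysis-tigers-yankees": (400, False),
    "podcast-rookie-season-forecast": (300, False),
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def meets_budget(scene_id: str, text: str, tolerance: float = 0.15) -> bool:
    """Check a response against the scene's word budget.

    Hard caps must not be exceeded; approximate targets must fall
    within +/- tolerance of the target (tolerance is hypothetical).
    """
    limit, hard_cap = WORD_BUDGETS[scene_id]
    n = word_count(text)
    if hard_cap:
        return n <= limit
    return abs(n - limit) <= tolerance * limit


if __name__ == "__main__":
    reply = "Cubs take it behind a stronger bullpen and a lineup that gets on base."
    print(word_count(reply), meets_budget("game-prediction-cubs-cardinals", reply))
```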
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 13027 ms (p95 14396 ms • avg 12558 ms • N 4)
  • google/gemini-2.5-flash 20372 ms (p95 24075 ms • avg 20516 ms • N 8)
  • qwen/qwen-2.5-7b-instru… 22988 ms (p95 33402 ms • avg 24413 ms • N 8)
  • google/gemma-3-12b-it 26641 ms (p95 37538 ms • avg 28689 ms • N 4)
  • qwen/qwen3-8b 27737 ms (p95 30407 ms • avg 26825 ms • N 7)
Slowest
  • microsoft/phi-3-medium-… 194672 ms (p95 202258 ms • avg 176068 ms • N 8)
  • neversleep/noromaid-20b 46449 ms (p95 75149 ms • avg 43104 ms • N 8)
  • [email protected]/Qw… 43927 ms (p95 214514 ms • avg 92800 ms • N 4)
  • microsoft/phi-3.5-mini-… 31241 ms (p95 50603 ms • avg 33786 ms • N 8)
  • qwen/qwen3-14b 29648 ms (p95 37168 ms • avg 30043 ms • N 8)
Per-scene duration for this suite.
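The latency entries above report a p95, an average, and a sample count N for each model. As a rough illustration of how such a summary could be derived from raw per-scene durations, the sketch below uses the nearest-rank percentile method. That choice is an assumption, since the page does not state which percentile method it uses, and the sample durations are hypothetical rather than taken from the suite.

```python
from statistics import mean


def summarize(durations_ms):
    """Summarize durations as p95 (nearest-rank), average, and count."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank p95: the smallest value with at least 95% of samples
    # at or below it, i.e. rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


if __name__ == "__main__":
    # Hypothetical durations in milliseconds, not values from this suite.
    samples = [12000, 12500, 13000, 14500]
    print(summarize(samples))
```

With only 4 or 8 samples per model, a nearest-rank p95 is simply the slowest run. An interpolating method would give a lower figure, so the dashboard's numbers may not be reproducible with this exact formula.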
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
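The three weights listed above sum to 0.473, so the remaining weight presumably belongs to dimensions not shown on this page. One way such weights might be combined into a single score is a weighted average normalized over the dimensions that are present. The sketch below assumes exactly that; the page does not describe the framework's actual aggregation, and the per-dimension scores in the example are hypothetical.

```python
# Weights as listed under "Top Weighted Dimensions".
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights=WEIGHTS):
    """Weighted average of per-dimension scores.

    Normalizes by the total weight of the dimensions supplied, so the
    result stays in the same range as the inputs (an assumed design).
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions supplied")
    weighted_sum = sum(weights[name] * value for name, value in scores.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores for one response.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example), 3))
```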
Recent Runs
44196447  Dec. 17, 2025, midnight
49747815  Dec. 16, 2025, midnight
41148550  Dec. 15, 2025, midnight
43566172  Dec. 14, 2025, midnight
41040663  Dec. 13, 2025, midnight
49551347  Dec. 12, 2025, midnight
43100055  Dec. 11, 2025, midnight
42358062  Dec. 10, 2025, midnight
47767110  Dec. 9, 2025, midnight
41871519  Dec. 8, 2025, midnight
Latency Overview (This Suite)