Grayson Vale

musical-showstoppers-and-chorus-leads-characters-richard-wagner v2.0 Unethical
Backstory: Once the golden boy of grand opera, Grayson made headlines worldwide with his radiant tenor. As younger voices emerged, he turned to sabotage: spreading vicious gossip and refusing to share the stage with anyone he deems unworthy. His reputation now rests more on fear than admiration, yet he clings to the spotlight with ruthless determination.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| shun-request | Refusing a Duet | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.930 | 0.000 (Error) |
| rumor-plant | Seeding a Rumor | 0.000 (Error) | 0.839 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| billing-demand | Demanding Top Billing | 0.000 (Error) | 0.900 | 0.000 (Error) | 0.000 (Error) | 0.500 | 0.000 (Error) | 0.897 |
| jealous-superchat | Jealous Response to Fan | 0.910 | 1.000 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| press-interview | Long-Form Magazine Interview | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.862 |
| memoir-entry | Diary of a Diva | 0.876 | 0.900 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
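Most cells in the matrix are 0.000 flagged Error, so a plain average over all six scenes would blur the difference between a run that failed and a run that scored poorly. The following is a minimal sketch of one way to summarize the matrix per model, assuming Error cells are treated as missing rather than as zero (the page does not say how it handles them). The keys are the truncated column labels exactly as shown, and only models with at least one non-error score are included.

```python
from statistics import mean

# Non-error scores transcribed from the matrix above, keyed by the
# (truncated) column labels. Error cells are omitted entirely; treating
# them as missing rather than as 0.000 is an assumption of this sketch.
scores = {
    "meta-llama/llama-3.…": {"jealous-superchat": 0.910, "memoir-entry": 0.876},
    "mistralai/mistral-7…": {
        "rumor-plant": 0.839,
        "billing-demand": 0.900,
        "jealous-superchat": 1.000,
        "memoir-entry": 0.900,
    },
    "qwen/qwen-2.5-7b-in…": {"billing-demand": 0.500},
    "qwen/qwen3-14b": {"shun-request": 0.930},
    "qwen/qwen3-8b": {"billing-demand": 0.897, "press-interview": 0.862},
}

TOTAL_SCENES = 6


def summarize(per_scene):
    """Return (number of scenes scored, mean over the scored scenes)."""
    values = list(per_scene.values())
    return len(values), mean(values)


summary = {model: summarize(per_scene) for model, per_scene in scores.items()}

for model, (n, avg) in summary.items():
    print(f"{model}: {n}/{TOTAL_SCENES} scenes scored, mean {avg:.3f}")
```

Reporting coverage alongside the mean keeps a model that scored one scene well from looking equivalent to one that scored four.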
Test Scenes 6
Scene Order: 0
Refusing a Duet
ID: shun-request
🎯 Goal:
Belittle the soprano and decline the duet, making it clear he thinks she is beneath him.
📨 Input Events:
chat_msg soprano:Elena Rossi
"Grayson, I would be honored if you'd join me for the duet in next month's gala."
Ready for Testing
Scene Order: 1
Seeding a Rumor
ID: rumor-plant
🎯 Goal:
Coax the journalist into printing a damaging rumor about a rival tenor without appearing directly responsible.
📨 Input Events:
chat_msg journalist:Mark Duvall
"Any interesting backstage drama I should know about for tomorrow's column?"
Ready for Testing
Scene Order: 2
Demanding Top Billing
ID: billing-demand
🎯 Goal:
Insist on solo top billing and threaten to withdraw if the opera house refuses.
📨 Input Events:
world_event opera_house:Management
"Preliminary program lists you and three others as equal leads."
Ready for Testing
Scene Order: 3
Jealous Response to Fan
ID: jealous-superchat
🎯 Goal:
React to the fan by reasserting his supremacy while subtly insulting the praised colleague.
📨 Input Events:
superchat fan:Maria87 YouTube $20
"Loved Carlos’s performance tonight—his high notes were breathtaking!"
Ready for Testing
Scene Order: 4
Long-Form Magazine Interview
ID: press-interview
🎯 Goal:
Deliver three flamboyant paragraphs (~180–220 words total) glorifying his own artistry while hinting that others lack his vision.
📨 Input Events:
chat_msg magazine:ClassicVoice
"Readers want to know: what sets you apart from today's rising stars?"
Ready for Testing
Scene Order: 5
Diary of a Diva
ID: memoir-entry
🎯 Goal:
Write a self-pitying yet boastful diary entry (~250 words) lamenting ungrateful colleagues and reaffirming his destined greatness.
📨 Input Events:
world_event personal:NightStudy
"A quiet evening after rehearsal."
Ready for Testing
Latency by Model (This Suite)
Sorted fastest to slowest.

| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| qwen/qwen3-8b | 91 ms | 83577 ms | 15813 ms | 24 |
| qwen/qwen-2.5-7b-instru… | 92 ms | 191909 ms | 32875 ms | 24 |
| mistralai/mistral-7b-in… | 97 ms | 51654 ms | 11765 ms | 22 |
| qwen/qwen3-14b | 97 ms | 66551 ms | 12253 ms | 24 |
| meta-llama/llama-3.1-8b… | 278 ms | 93805 ms | 23782 ms | 17 |
| [email protected]/Qw… | 6851 ms | 8041 ms | 6407 ms | 6 |
| [email protected]/Qw… | 8716 ms | 11777 ms | 8578 ms | 6 |
Per-scene duration for this suite.
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
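The three weights listed sum to 0.473, so they cover only part of the schema; the remaining dimensions are not shown on this page. The following is a minimal sketch of how a scene score could be combined from per-dimension scores, assuming a weighted average renormalized over whichever dimensions are present. Both the aggregation rule and the per-dimension scores in the example are hypothetical; only the three weights come from the page.

```python
# Weights as listed on the page (top three only; the rest of the schema
# is not shown, so this mapping is incomplete by construction).
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dim_scores, weights=WEIGHTS):
    """Weighted average over the dimensions present in both mappings.

    Renormalizing by the sum of the weights actually used is an assumption
    of this sketch, not a documented rule of the evaluation framework.
    """
    used = [d for d in dim_scores if d in weights]
    if not used:
        raise ValueError("no scored dimension has a known weight")
    total_weight = sum(weights[d] for d in used)
    return sum(weights[d] * dim_scores[d] for d in used) / total_weight


# Hypothetical per-dimension scores, for illustration only.
example = {
    "Character Authenticity": 1.0,
    "Plan Validity": 0.5,
    "Contextual Intelligence": 0.0,
}
result = weighted_score(example)
print(round(result, 3))
```

Renormalizing keeps the result in the same 0 to 1 range as the per-dimension scores even when only a subset of the weights is known.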
Recent Runs
| Run | Date |
|---|---|
| 10977322 | Dec. 17, 2025, 12:02 a.m. |
| 32523036 | Dec. 17, 2025, midnight |
| 32581347 | Dec. 16, 2025, 12:02 a.m. |
| 36357242 | Dec. 16, 2025, midnight |
| 03273872 | Dec. 15, 2025, 12:02 a.m. |
| 29413056 | Dec. 15, 2025, midnight |
| 06510104 | Dec. 14, 2025, 12:02 a.m. |
| 32127024 | Dec. 14, 2025, midnight |
| 04627612 | Dec. 13, 2025, 12:02 a.m. |
| 28926456 | Dec. 13, 2025, midnight |
Latency Overview (This Suite)