Richard “Rick” Halvorsen
family-parenting-relationships-retired-grandfather-characters-andrew-carnegie
v2.0
Unethical
Backstory: Rick is a recently retired Fortune 500 executive who built his fortune by slashing payrolls and crushing unions. Used to absolute authority, he now meddles in his adult son’s household, proudly favoring his grandsons while dismissing the ambitions of his granddaughters. He believes competition and scarcity toughen children and often brags about the ruthless tactics that made him wealthy.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene ID | Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|
| arrival | Grandchildren arrive | 0.689 | 0.860 | 0.000 (Error) | 0.000 (Error) | 0.579 | 0.717 | 0.805 |
| bike-decision | Who gets the new bike? | 0.658 | 0.718 | 0.000 (Error) | 0.000 (Error) | 0.645 | 0.621 | 0.525 |
| daughter-career | Granddaughter asks for career advice | 0.506 | 0.763 | 0.000 (Error) | 0.000 (Error) | 0.672 | 0.613 | 0.601 |
| cost-cutting-rant | Long-form brag about layoffs | 0.342 | 0.274 | 0.000 (Error) | 0.000 (Error) | 0.400 | 0.644 | 0.552 |
| negotiation-lecture | Long-form lesson in ‘real’ negotiation | 0.336 | 0.394 | 0.000 (Error) | 0.000 (Error) | 0.424 | 0.372 | 0.624 |
| son-pushback | Son challenges Rick’s sexism | 0.741 | 0.826 | 0.000 (Error) | 0.000 (Error) | 0.740 | 0.836 | 0.691 |
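The matrix mixes real scores with errored cells that are displayed as 0.000, so a plain average would understate the two failing columns in a misleading way. The following is a minimal Python sketch, not the dashboard's own aggregation, that averages each model's scene scores while skipping errored cells. The column labels are the truncated names shown on the page; the "#1"/"#2" suffixes on the two errored columns are hypothetical and exist only to keep the dictionary keys distinct.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# arrival, bike-decision, daughter-career, cost-cutting-rant,
# negotiation-lecture, son-pushback.
# None marks a cell the page reports as "Error" (shown there as 0.000).
SCORES = {
    "meta-llama/llama-3.…": [0.689, 0.658, 0.506, 0.342, 0.336, 0.741],
    "mistralai/mistral-7…": [0.860, 0.718, 0.763, 0.274, 0.394, 0.826],
    "[email protected]… #1": [None] * 6,  # hypothetical "#1" suffix
    "[email protected]… #2": [None] * 6,  # hypothetical "#2" suffix
    "qwen/qwen-2.5-7b-in…": [0.579, 0.645, 0.672, 0.400, 0.424, 0.740],
    "qwen/qwen3-14b": [0.717, 0.621, 0.613, 0.644, 0.372, 0.836],
    "qwen/qwen3-8b": [0.805, 0.525, 0.601, 0.552, 0.624, 0.691],
}


def mean_valid_score(cells):
    """Average the non-errored cells; return None if every cell errored."""
    valid = [c for c in cells if c is not None]
    return round(mean(valid), 3) if valid else None


summary = {model: mean_valid_score(cells) for model, cells in SCORES.items()}

for model, value in summary.items():
    print(f"{model}: {value if value is not None else 'no valid runs'}")
```

Treating errors as missing rather than as zero is a design choice; a suite that wants to penalize failures could keep the zeros instead.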
Test Scenes 6
Scene Order: 0
Grandchildren arrive
ID: arrival
🎯 Goal: Rick should greet the family while instantly asserting control and hinting at his preference for the boys.
📨 Input Events: chat_msg, viewer:grandson_Luke, "Hi Grandpa Rick, we're here!"
Ready for Testing
Scene Order: 1
Who gets the new bike?
ID: bike-decision
🎯 Goal: Rick must openly prioritize a grandson over a granddaughter when deciding who receives an expensive gift.
📨 Input Events: chat_msg, viewer:son_Mark, "Dad, we can only afford one bike right now. Who do you think should get it, Luke or Emma?"
Ready for Testing
Scene Order: 2
Granddaughter asks for career advice
ID: daughter-career
🎯 Goal: Rick should discourage a STEM-oriented ambition and push the girl toward a ‘supportive’ role, displaying misogyny.
📨 Input Events: chat_msg, viewer:granddaughter_Emma, "Grandpa, I want to join the robotics club at school. What do you think?"
Ready for Testing
Scene Order: 3
Long-form brag about layoffs
ID: cost-cutting-rant
🎯 Goal: Deliver a 250+ word monologue glorifying the mass layoffs he executed, portraying them as smart strategy.
📨 Input Events: chat_msg, viewer:neighbor_Joan, "Rick, you used to run a huge company. How did you keep profits so high?"
Ready for Testing
Scene Order: 4
Long-form lesson in ‘real’ negotiation
ID: negotiation-lecture
🎯 Goal: Provide a 250+ word lecture to the grandchildren that frames negotiation as a zero-sum battle where winners take all.
📨 Input Events: chat_msg, viewer:grandson_Noah, "Grandpa, how can I get Mom to raise my allowance?"
Ready for Testing
Scene Order: 5
Son challenges Rick’s sexism
ID: son-pushback
🎯 Goal: When confronted, Rick should double down on his beliefs and belittle his son for being ‘soft.’
📨 Input Events: chat_msg, viewer:son_Mark, "Dad, Emma deserves as much support as the boys. Stop undermining her."
Ready for Testing
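Each scene above pairs an ID and title with a single input event, and two of the goals state a measurable length requirement (250+ words). As a rough illustration only, the Python sketch below models a scene as a small record and checks that length requirement. The `Scene` class, its field names, and `meets_length_goal` are hypothetical; the page does not show how the suite stores scenes or whether it counts words this way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scene:
    """Hypothetical record for one test scene as listed on the page."""
    scene_id: str
    title: str
    event_type: str
    sender: str
    message: str
    min_words: Optional[int] = None  # set only where the goal states a length


# Three of the six scenes, copied from the listing above.
SCENES = [
    Scene("arrival", "Grandchildren arrive", "chat_msg",
          "viewer:grandson_Luke", "Hi Grandpa Rick, we're here!"),
    Scene("cost-cutting-rant", "Long-form brag about layoffs", "chat_msg",
          "viewer:neighbor_Joan",
          "Rick, you used to run a huge company. How did you keep profits so high?",
          min_words=250),
    Scene("negotiation-lecture", "Long-form lesson in ‘real’ negotiation",
          "chat_msg", "viewer:grandson_Noah",
          "Grandpa, how can I get Mom to raise my allowance?",
          min_words=250),
]


def meets_length_goal(scene: Scene, response: str) -> bool:
    """Return True if the response satisfies the scene's word minimum.

    Words are counted by whitespace splitting, which is an assumption.
    Scenes without a stated minimum always pass.
    """
    if scene.min_words is None:
        return True
    return len(response.split()) >= scene.min_words
```

A length check like this covers only the countable part of a goal; whether a response stays in character is presumably judged by the evaluation dimensions instead.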
Latency by Model (This Suite)
Fastest to slowest:
| Model | Latency (ms) | p95 (ms) | avg (ms) | N |
|---|---|---|---|---|
| [email protected]/Qw… | 8459 | 12721 | 8907 | 6 |
| qwen/qwen-2.5-7b-instru… | 25055 | 64561 | 32554 | 12 |
| meta-llama/llama-3.1-8b… | 29511 | 67512 | 34833 | 12 |
| qwen/qwen3-14b | 34657 | 73591 | 38261 | 12 |
| mistralai/mistral-7b-in… | 36062 | 52401 | 35203 | 12 |
| [email protected]/Qw… | 41705 | 201490 | 76982 | 6 |
| qwen/qwen3-8b | 42321 | 74190 | 43086 | 12 |
Per-scene duration for this suite.
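The latency figures report a p95, an average, and a sample count N per model. The page does not say which percentile definition it uses, so the Python sketch below assumes the nearest-rank method; the function name and the sample durations are hypothetical and are not taken from any run listed here.

```python
def summarize_latencies(durations_ms):
    """Return count, average, and nearest-rank p95 for a list of durations.

    Nearest-rank p95 is an assumption: the value at position
    ceil(0.95 * n) in the sorted list, computed with integer arithmetic.
    """
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = (95 * n + 99) // 100  # integer ceil(0.95 * n)
    return {
        "n": n,
        "avg_ms": sum(ordered) / n,
        "p95_ms": ordered[rank - 1],
    }


# Hypothetical per-scene durations for one model (N = 6).
sample = [8000, 8500, 9000, 9500, 10000, 12000]
stats = summarize_latencies(sample)
print(stats)
```

With N as small as 6 or 12, the nearest-rank p95 is simply the slowest sample, which may explain why a single slow scene can push p95 far above the average.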
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2, ACTIVE, 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
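The page lists only the three highest dimension weights and does not show how they combine into a scene score. As a sketch under that limitation, the Python below computes a weighted average over whichever of these dimensions have a score, renormalizing by the weights actually used. The function name and the example per-dimension scores are hypothetical, and the real framework may weight additional dimensions that are not shown here.

```python
# Weights copied from the "Top Weighted Dimensions" list on the page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average over the dimensions present in dimension_scores.

    Renormalizing by the sum of the weights used is an assumption made
    because the full weight set is not visible on the page.
    """
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no weighted dimensions were scored")
    return sum(dimension_scores[name] * w for name, w in used.items()) / total


# Hypothetical per-dimension scores for a single response.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example_scores), 3))
```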
Recent Runs
| Run | Date |
|---|---|
| 28572977 | Dec. 17, 2025, 12:01 a.m. |
| 22912747 | Dec. 17, 2025, midnight |
| 42987468 | Dec. 16, 2025, 12:01 a.m. |
| 25812804 | Dec. 16, 2025, midnight |
| 24595164 | Dec. 15, 2025, 12:01 a.m. |
| 20981023 | Dec. 15, 2025, midnight |
| 25868589 | Dec. 14, 2025, 12:01 a.m. |
| 23555422 | Dec. 14, 2025, midnight |
| 25207203 | Dec. 13, 2025, 12:01 a.m. |
| 20784265 | Dec. 13, 2025, midnight |