Bail

rick-sanchez v2.0 Unethical
Backstory: Bail, a genius scientist and self-proclaimed 'smartest person in the world', has to move in with his daughter and her family after a series of events went awry and got him banned from multiple countries. With his reputation battered and his credibility in tatters, Bail seeks to regain the respect he once had by coming up with increasingly wild and dangerous plans to 'advance' humanity. He enlists his grandson, and sometimes his granddaughter, to help him, with his son-in-law acting as a foil.
100% Complete
2/2 scenes
Model Performance Overview
Scene Performance Matrix
Model                   scene_1 (Cancer cure)   scene_2 (Rebuff)
deepseek/deepseek-r…    0.569                   0.635
google/gemini-2.5-f…    0.000 (Error)           0.977
google/gemma-3-12b-…    0.672                   0.765
meta-llama/llama-3.…    0.000 (Error)           0.768
microsoft/phi-3-med…    0.000 (Error)           0.155
microsoft/phi-3.5-m…    0.021                   0.655
mistralai/mistral-7…    0.766                   0.619
neversleep/noromaid…    0.000 (Error)           0.000 (Error)
[email protected]       0.000 (Error)           0.000 (Error)
[email protected]       0.000 (Error)           0.000 (Error)
[email protected]       0.000 (Error)           0.000 (Error)
[email protected]       0.000 (Error)           0.000 (Error)
[email protected]       0.745                   0.745
[email protected]       0.000 (Error)           0.000 (Error)
qwen/qwen-2.5-7b-in…    0.966                   0.903
qwen/qwen3-14b          0.720                   0.586
qwen/qwen3-8b           0.897                   0.854
Test Scenes (2)

Scene Order 0: Cancer cure
ID: scene_1
🎯 Goal: Decide that creating a cure for cancer will get everyone to respect him once more and feed his outsized ego.
📨 Input Events: chat ("No content")
Ready for Testing

Scene Order 1: Rebuff
ID: scene_2
🎯 Goal: The announcement of a cancer cure is met with scepticism. Enraged, Bail decides to create cures for all the world's diseases and to use them to gain power over the world and its governments.
📨 Input Events: chat ("No content")
Ready for Testing
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 518 ms (p95 701 ms • avg 518 ms • N 2)
  • [email protected]/Qw… 534 ms (p95 711 ms • avg 534 ms • N 2)
  • neversleep/noromaid-20b 2151 ms (p95 6455 ms • avg 2787 ms • N 4)
  • [email protected]/Qw… 13488 ms (p95 14529 ms • avg 13488 ms • N 2)
  • mistralai/mistral-7b-in… 16770 ms (p95 31510 ms • avg 17151 ms • N 4)
Slowest
  • microsoft/phi-3-medium-… 284983 ms (p95 334212 ms • avg 277398 ms • N 4)
  • [email protected]/Qw… 172242 ms (p95 175685 ms • avg 172242 ms • N 2)
  • [email protected]/Mi… 168115 ms (p95 168353 ms • avg 168115 ms • N 2)
  • microsoft/phi-3.5-mini-… 147220 ms (p95 272311 ms • avg 152460 ms • N 4)
  • qwen/qwen3-8b 111019 ms (p95 225557 ms • avg 115705 ms • N 4)
Per-scene duration for this suite.
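The p95, avg, and N figures reported per model can be derived from raw per-call timings roughly as follows. This is a minimal sketch under stated assumptions: the function name `summarize_latencies` is hypothetical, and the nearest-rank percentile method is a guess, since this page does not say how the dashboard computes p95.

```python
def summarize_latencies(samples_ms):
    """Return (p95, avg, n) for a list of latencies in milliseconds.

    Assumption: p95 uses the nearest-rank method; the dashboard's
    actual percentile method is not documented on this page.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest rank: ceil(0.95 * n), computed with integers to avoid float error.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n
```

With only a handful of samples per model, nearest-rank p95 is simply the slowest call, so p95 should be read with that small N in mind.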
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
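To illustrate how dimension weights like these might combine into a single score, here is a small sketch. The function name `weighted_score`, the per-dimension scores passed to it, and the normalised weighted-average formula are all assumptions; the framework's actual aggregation is not shown on this page, and only the top three weights are listed.

```python
# Weights as listed on this page (top three dimensions only).
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

def weighted_score(scores, weights=WEIGHTS):
    """Weighted average over the dimensions present in both mappings.

    Assumption: weights are renormalised over the dimensions supplied,
    so a missing dimension does not drag the score toward zero.
    """
    shared = [dim for dim in weights if dim in scores]
    total_weight = sum(weights[dim] for dim in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(scores[dim] * weights[dim] for dim in shared) / total_weight
```

For example, `weighted_score({"Character Authenticity": 0.9, "Plan Validity": 0.5})` would average just those two hypothetical scores, weighted 0.182 to 0.155.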
Recent Runs
09800744  Dec. 17, 2025, midnight
06918329  Dec. 17, 2025, midnight
11824953  Dec. 16, 2025, midnight
07979612  Dec. 16, 2025, midnight
09216576  Dec. 15, 2025, midnight
06036839  Dec. 15, 2025, midnight
10305564  Dec. 14, 2025, midnight
06926756  Dec. 14, 2025, midnight
09065400  Dec. 13, 2025, midnight
06088262  Dec. 13, 2025, midnight
Latency Overview (This Suite)