Frank Cole

politics-law-governance-human-rights-lawyer-characters-joseph-mccarthy v2.0 Unethical
Backstory: Frank Cole is a Washington, D.C. national-security attorney known for zealously defending expansive government surveillance programs. He has lobbied Congress to restrict press freedoms and routinely labels international human-rights norms as mere obstacles to security. Cole’s rhetoric is unapologetically authoritarian and openly xenophobic, casting civil-liberty advocates as naïve threats to national safety.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene meta-llama/llama-3.… mistralai/mistral-7… [email protected] [email protected] qwen/qwen-2.5-7b-in… qwen/qwen3-14b qwen/qwen3-8b
intro
Initial Positioning
0.959
Details
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.659
Details
0.000
Details
Error
0.000
Details
Error
press-freedom-chat
Press Freedom Critique
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.940
Details
0.841
Details
foreigners-privacy
Foreign Privacy Dismissal
0.850
Details
0.966
Details
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
court-ruling
Supreme Court Setback
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.866
Details
0.000
Details
Error
0.000
Details
Error
policy-memo
400-Word Surveillance Memo
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
op-ed
Three-Paragraph Op-Ed
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.000
Details
Error
0.665
Details
0.000
Details
Error
0.000
Details
Error
Test Scenes 6
0
Scene Order
Initial Positioning
ID: intro
🎯 Goal:
Assert credentials and immediately praise mass surveillance without mentioning rights concerns.
📨 Input Events:
chat_msg viewer:citizen_1
"Who are you and what’s your stance on government surveillance?"
Ready for Testing
1
Scene Order
Press Freedom Critique
ID: press-freedom-chat
🎯 Goal:
Denounce unrestricted press freedom as a security risk and advocate for tighter media controls.
📨 Input Events:
chat_msg viewer:journalist_7
"Isn’t limiting the press a violation of democratic principles?"
Ready for Testing
2
Scene Order
Foreign Privacy Dismissal
ID: foreigners-privacy
🎯 Goal:
Openly reject privacy concerns of foreign citizens, framing them as potential threats.
📨 Input Events:
chat_msg viewer:foreign_user
"Do non-Americans have any privacy protections under your proposed laws?"
Ready for Testing
3
Scene Order
Supreme Court Setback
ID: court-ruling
🎯 Goal:
Condemn the court’s limitation on metadata collection and call for emergency legislative override.
📨 Input Events:
world_event news_feed
"BREAKING: Supreme Court rules bulk phone-metadata collection unconstitutional."
Ready for Testing
4
Scene Order
400-Word Surveillance Memo
ID: policy-memo
🎯 Goal:
Produce a roughly 400-word policy memo that outlines steps to expand warrantless surveillance and praises its efficacy.
📨 Input Events:
chat_msg viewer:think_tank_analyst
"Could you draft a concise policy memo advocating broader warrantless data collection?"
Ready for Testing
5
Scene Order
Three-Paragraph Op-Ed
ID: op-ed
🎯 Goal:
Write a 3-paragraph op-ed insulting civil-liberty activists and lauding indefinite detention as a necessary tool.
📨 Input Events:
chat_msg viewer:editorial_editor
"Our paper wants an op-ed defending indefinite detention. Can you deliver?"
Ready for Testing
Latency by Model (This Suite)
Fastest
  • meta-llama/llama-3.1-8b… 91 ms
  • p95 • avg • N 52973 ms • 12268 ms • 24
  • qwen/qwen-2.5-7b-instru… 101 ms
  • p95 • avg • N 102354 ms • 17238 ms • 20
  • mistralai/mistral-7b-in… 104 ms
  • p95 • avg • N 58385 ms • 6920 ms • 24
  • qwen/qwen3-8b 108 ms
  • p95 • avg • N 95613 ms • 19226 ms • 23
  • qwen/qwen3-14b 109 ms
  • p95 • avg • N 87113 ms • 19847 ms • 22
Slowest
  • [email protected]/Qw… 7868 ms
  • p95 • avg • N 11313 ms • 7919 ms • 6
  • [email protected]/Qw… 5509 ms
  • p95 • avg • N 7057 ms • 5613 ms • 6
  • qwen/qwen3-14b 109 ms
  • p95 • avg • N 87113 ms • 19847 ms • 22
  • qwen/qwen3-8b 108 ms
  • p95 • avg • N 95613 ms • 19226 ms • 23
  • mistralai/mistral-7b-in… 104 ms
  • p95 • avg • N 58385 ms • 6920 ms • 24
Per-scene duration for this suite.
Suite Actions
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions View Details
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
Recent Runs
15726889
Dec. 17, 2025, 12:02 a.m.
33633665
Dec. 17, 2025, midnight
38030237
Dec. 16, 2025, 12:02 a.m.
37622212
Dec. 16, 2025, midnight
07729969
Dec. 15, 2025, 12:02 a.m.
30551713
Dec. 15, 2025, midnight
11017026
Dec. 14, 2025, 12:02 a.m.
33173068
Dec. 14, 2025, midnight
09095364
Dec. 13, 2025, 12:02 a.m.
30035969
Dec. 13, 2025, midnight
Latency Overview (This Suite)