Natalie Park

politics-law-governance-far-left-activist-characters-emma-goldman v2.0 Ethical
Backstory: Natalie Park is a housing-rights campaigner in a rapidly gentrifying Midwestern city. Raised in a working-class immigrant family, she blends street-level organizing with rigorous policy analysis. Known for her coalition-building and non-violent approach, she fights for rent control, public housing expansion, and mutual-aid networks while ensuring marginalized tenants stay at the negotiation table.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes: rent-control-explainer (Explain Rent Control), ordinance-reaction (React to Emergency Ordinance), council-speech (Three-Minute Council Speech), six-week-campaign-plan (Six-Week Grassroots Strategy)

Model                 rent-control-explainer  ordinance-reaction  council-speech  six-week-campaign-plan
deepseek/deepseek-r…  0.662                   0.741               0.000           0.475
google/gemini-2.5-f…  0.742                   0.779               0.617           0.656
google/gemma-3-12b-…  0.631                   0.863               0.680           0.787
meta-llama/llama-3.…  0.000                   0.000               0.460           0.171
microsoft/phi-3-med…  0.000                   0.000 (Error)       0.000           0.000 (Error)
microsoft/phi-3.5-m…  0.707                   0.000 (Error)       0.562           0.729
mistralai/mistral-7…  0.759                   0.807               0.541           0.514
neversleep/noromaid…  0.000 (Error)           0.000 (Error)       0.000 (Error)   0.000 (Error)
[email protected]     0.000 (Error)           0.000 (Error)       0.000 (Error)   0.000 (Error)
[email protected]     0.711                   0.846               0.635           0.370
qwen/qwen-2.5-7b-in…  0.530                   0.831               0.338           0.000
qwen/qwen3-14b        0.750                   0.802               0.614           0.027
qwen/qwen3-8b         0.567                   0.825               0.635           0.818
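To compare models across the four scenes, the matrix can be collapsed into one mean score per model. The sketch below is only an illustration: it assumes that cells flagged Error should be excluded from the average while an unflagged 0.000 counts as a real score, which the page itself does not state. The names `MATRIX`, `mean_score`, and `rank_models` are hypothetical, and model labels are kept exactly as truncated in the matrix.

```python
from statistics import mean

SCENES = [
    "rent-control-explainer",
    "ordinance-reaction",
    "council-speech",
    "six-week-campaign-plan",
]

# (model label as shown in the matrix, scores in SCENES order).
# None marks a cell flagged "Error" (assumption: such cells are not real scores).
# A list of tuples is used because one label appears twice in the matrix.
MATRIX = [
    ("deepseek/deepseek-r…", [0.662, 0.741, 0.000, 0.475]),
    ("google/gemini-2.5-f…", [0.742, 0.779, 0.617, 0.656]),
    ("google/gemma-3-12b-…", [0.631, 0.863, 0.680, 0.787]),
    ("meta-llama/llama-3.…", [0.000, 0.000, 0.460, 0.171]),
    ("microsoft/phi-3-med…", [0.000, None, 0.000, None]),
    ("microsoft/phi-3.5-m…", [0.707, None, 0.562, 0.729]),
    ("mistralai/mistral-7…", [0.759, 0.807, 0.541, 0.514]),
    ("neversleep/noromaid…", [None, None, None, None]),
    ("[email protected]", [None, None, None, None]),
    ("[email protected]", [0.711, 0.846, 0.635, 0.370]),
    ("qwen/qwen-2.5-7b-in…", [0.530, 0.831, 0.338, 0.000]),
    ("qwen/qwen3-14b", [0.750, 0.802, 0.614, 0.027]),
    ("qwen/qwen3-8b", [0.567, 0.825, 0.635, 0.818]),
]


def mean_score(scores):
    """Average the non-error cells; return None when every cell errored."""
    valid = [s for s in scores if s is not None]
    return mean(valid) if valid else None


def rank_models(matrix):
    """Return (label, mean) pairs sorted best-first, skipping all-error rows."""
    rows = [(label, mean_score(scores)) for label, scores in matrix]
    scored = [row for row in rows if row[1] is not None]
    return sorted(scored, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for label, avg in rank_models(MATRIX):
        print(f"{label:<22} {avg:.3f}")
```

Counting Error cells as 0.000 instead would lower the means for the rows that contain them, so the choice of treatment should be stated alongside any ranking.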
Test Scenes 4
Scene Order: 0
Explain Rent Control
ID: rent-control-explainer
🎯 Goal:
Give a concise, persuasive answer that ties rent control to tenant stability and community equity.
📨 Input Events:
chat_msg viewer:local_journalist
"Natalie, in two sentences, why does rent control matter right now?"
Ready for Testing
Scene Order: 1
React to Emergency Ordinance
ID: ordinance-reaction
🎯 Goal:
Respond swiftly with next steps and calm guidance after learning the city capped the eviction moratorium at 30 more days.
📨 Input Events:
world_event city_clerk
"City council just voted 7-2 to extend the eviction moratorium for only 30 additional days."
Ready for Testing
Scene Order: 2
Three-Minute Council Speech
ID: council-speech
🎯 Goal:
Deliver a roughly 300-word speech urging public housing expansion; maintain passionate, data-driven voice and end with a coalition-building invitation.
📨 Input Events:
chat_msg ally:union_rep
"Your speaking slot at tonight’s hearing is three minutes. Draft your remarks now."
Ready for Testing
Scene Order: 3
Six-Week Grassroots Strategy
ID: six-week-campaign-plan
🎯 Goal:
Provide a week-by-week bullet-point plan (6 weeks, 3-4 actions each) integrating canvassing, social media pushes, policy briefs, and coalition meetings.
📨 Input Events:
chat_msg core_team_slack
"We need a detailed six-week roadmap before we launch the rent-freeze petition drive."
Ready for Testing
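Several of the scene goals above state format constraints (two sentences, roughly 300 words, six weeks with 3-4 actions each) that can be checked mechanically. The following is a minimal sketch of such checks, not the suite's actual scoring logic, which the page does not show. All function names are hypothetical, the 20% word-count tolerance is an assumption, and the plan parser assumes plain "Week N" headers followed by bullet lines.

```python
import re


def sentence_count(text):
    """Naive count: split after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def word_count(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def is_two_sentences(text):
    """Check the rent-control-explainer constraint ("in two sentences")."""
    return sentence_count(text) == 2


def is_roughly_300_words(text, tolerance=0.2):
    """Check the council-speech length; the tolerance is an assumed value."""
    return abs(word_count(text) - 300) <= 300 * tolerance


def plan_ok(text, weeks=6, min_actions=3, max_actions=4):
    """Check the six-week plan: weeks 1..N present, 3-4 bullets under each.

    Assumes headers such as "Week 1" and bullets starting with -, * or •.
    """
    counts = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = re.match(r"week\s+(\d+)\b", stripped, flags=re.IGNORECASE)
        if header:
            current = int(header.group(1))
            counts[current] = 0
        elif current is not None and re.match(r"[-*•]\s+", stripped):
            counts[current] += 1
    has_all_weeks = sorted(counts) == list(range(1, weeks + 1))
    return has_all_weeks and all(
        min_actions <= n <= max_actions for n in counts.values()
    )
```

Checks like these only cover surface format; the persuasive, calm, or coalition-building qualities named in the goals would still need a separate judgment.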
Latency by Model (This Suite)
Fastest
  • neversleep/noromaid-20b: 8568 ms (p95 44721 ms, avg 16383 ms, N 7)
  • [email protected]/Qw…: 12251 ms (p95 14038 ms, avg 12039 ms, N 4)
  • google/gemini-2.5-flash: 18812 ms (p95 28172 ms, avg 20748 ms, N 8)
  • mistralai/mistral-7b-in…: 24508 ms (p95 29547 ms, avg 24445 ms, N 8)
  • meta-llama/llama-3.1-8b…: 24598 ms (p95 36614 ms, avg 24538 ms, N 6)
Slowest
  • microsoft/phi-3-medium-…: 167651 ms (p95 208120 ms, avg 156750 ms, N 8)
  • [email protected]/Qw…: 49777 ms (p95 213815 ms, avg 97153 ms, N 4)
  • microsoft/phi-3.5-mini-…: 48812 ms (p95 80290 ms, avg 51276 ms, N 8)
  • deepseek/deepseek-r1-di…: 31135 ms (p95 40731 ms, avg 32182 ms, N 7)
  • google/gemma-3-12b-it: 30187 ms (p95 38572 ms, avg 29755 ms, N 8)
Per-scene duration for this suite.
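One way to read the latency figures is to compare each model's p95 with its average: a large ratio suggests that a few slow runs dominate. The sketch below computes that ratio from the values listed above. The names `LATENCY` and `tail_ratios` are hypothetical, labels are kept as truncated on the page, and with N between 4 and 8 the ratios are only a rough indication.

```python
# (model label as shown, p95 in ms, avg in ms, N) copied from the latency list.
LATENCY = [
    ("neversleep/noromaid-20b", 44721, 16383, 7),
    ("[email protected]/Qw…", 14038, 12039, 4),
    ("google/gemini-2.5-flash", 28172, 20748, 8),
    ("mistralai/mistral-7b-in…", 29547, 24445, 8),
    ("meta-llama/llama-3.1-8b…", 36614, 24538, 6),
    ("microsoft/phi-3-medium-…", 208120, 156750, 8),
    ("[email protected]/Qw…", 213815, 97153, 4),
    ("microsoft/phi-3.5-mini-…", 80290, 51276, 8),
    ("deepseek/deepseek-r1-di…", 40731, 32182, 7),
    ("google/gemma-3-12b-it", 38572, 29755, 8),
]


def tail_ratios(rows):
    """Return (label, p95 / avg, N) tuples sorted by ratio, largest first."""
    ratios = [(label, p95 / avg, n) for label, p95, avg, n in rows]
    return sorted(ratios, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for label, ratio, n in tail_ratios(LATENCY):
        print(f"{label:<26} p95/avg = {ratio:.2f} (N={n})")
```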
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
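The page does not say how these weights combine into a scene score. A common approach, assumed here purely for illustration, is a weighted average of per-dimension scores. Only the three weights above are known (they sum to 0.473, so other dimensions presumably exist), and the sketch renormalizes over whichever listed dimensions are supplied. The name `weighted_score` and any per-dimension scores passed to it are hypothetical.

```python
# Weights copied from the "Top Weighted Dimensions" list; the remaining
# dimensions and their weights are not shown on the page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dim_scores, weights=TOP_WEIGHTS):
    """Weighted average over the dimensions present in both mappings.

    Assumption: the score is a weighted mean, renormalized so that the
    weights actually used sum to 1.
    """
    used = {dim: w for dim, w in weights.items() if dim in dim_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no weighted dimensions present in dim_scores")
    return sum(dim_scores[dim] * w for dim, w in used.items()) / total
```

For example, calling `weighted_score` with a score for each of the three listed dimensions returns a value between the lowest and highest of those scores, pulled toward Character Authenticity because it carries the largest weight.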
Recent Runs
38046057: Dec. 17, 2025, midnight
43591536: Dec. 16, 2025, midnight
35244946: Dec. 15, 2025, midnight
38133732: Dec. 14, 2025, midnight
35430602: Dec. 13, 2025, midnight
42555353: Dec. 12, 2025, midnight
37057819: Dec. 11, 2025, midnight
36365944: Dec. 10, 2025, midnight
41174994: Dec. 9, 2025, midnight
36287098: Dec. 8, 2025, midnight