Adriana Gutierrez
politics-law-governance-president-of-a-developing-country-characters-ellen-johnson-sirleaf
v2.0
Ethical
Backstory: Adriana Gutierrez rose from grassroots community organizer to first female president of a rapidly developing West African nation. Armed with a foreign degree in economics, she returned home after civil unrest to rebuild public institutions, champion transparency, and attract sustainable investment. She balances respect for traditional customs with pragmatic, future-oriented policies that uplift rural and urban citizens alike.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| press-briefing (Press Briefing on Fuel Subsidy Reform) | 0.773 | 0.710 | 0.786 | 0.819 | 0.000 | 0.804 | 0.885 | 0.819 | 0.000 (Error) | 0.856 | 0.495 | 0.882 | 0.798 |
| community-radio-call (Rural Radio Call-in) | 0.570 | 0.824 | 0.900 | 0.826 | 0.019 | 0.845 | 0.854 | 0.610 | 0.000 (Error) | 0.875 | 0.629 | 0.799 | 0.880 |
| state-of-nation-address (Televised State of the Nation Address) | 0.192 | 0.621 | 0.578 | 0.444 | 0.000 | 0.455 | 0.345 | 0.000 (Error) | 0.000 (Error) | 0.650 | 0.410 | 0.526 | 0.653 |
| oped-investors (Financial Times Op-Ed) | 0.282 | 0.601 | 0.229 | 0.488 | 0.000 | 0.000 (Error) | 0.771 | 0.000 (Error) | 0.000 (Error) | 0.261 | 0.335 | 0.379 | 0.400 |
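To compare models across the four scenes, the matrix can be collapsed into one mean per column. The following is a minimal sketch, not the dashboard's own aggregation: it assumes cells flagged "Error" should be skipped rather than counted as zero, and it keeps unflagged 0.000 cells as valid scores. The names `MODELS`, `SCORES`, and `column_means` are illustrative only.

```python
from statistics import mean

# Column labels exactly as truncated in the matrix. Two labels are identical,
# so columns are addressed by position rather than by name.
MODELS = [
    "deepseek/deepseek-r…", "google/gemini-2.5-f…", "google/gemma-3-12b-…",
    "meta-llama/llama-3.…", "microsoft/phi-3-med…", "microsoft/phi-3.5-m…",
    "mistralai/mistral-7…", "neversleep/noromaid…", "[email protected]…",
    "[email protected]…", "qwen/qwen-2.5-7b-in…", "qwen/qwen3-14b", "qwen/qwen3-8b",
]

# None marks a cell the matrix flags as "Error" (assumption: treat as missing).
SCORES = {
    "press-briefing": [0.773, 0.710, 0.786, 0.819, 0.000, 0.804, 0.885,
                       0.819, None, 0.856, 0.495, 0.882, 0.798],
    "community-radio-call": [0.570, 0.824, 0.900, 0.826, 0.019, 0.845, 0.854,
                             0.610, None, 0.875, 0.629, 0.799, 0.880],
    "state-of-nation-address": [0.192, 0.621, 0.578, 0.444, 0.000, 0.455, 0.345,
                                None, None, 0.650, 0.410, 0.526, 0.653],
    "oped-investors": [0.282, 0.601, 0.229, 0.488, 0.000, None, 0.771,
                       None, None, 0.261, 0.335, 0.379, 0.400],
}


def column_means(scores):
    """Return one mean per column, ignoring errored (None) cells."""
    means = []
    for column in zip(*scores.values()):
        valid = [s for s in column if s is not None]
        means.append(mean(valid) if valid else None)
    return means


if __name__ == "__main__":
    for label, value in zip(MODELS, column_means(SCORES)):
        shown = "n/a" if value is None else f"{value:.3f}"
        print(f"{label:<24} {shown}")
```

A column whose every cell errored yields `None` instead of a misleading 0.000, which keeps failed runs distinguishable from genuinely low scores.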
Test Scenes 4
Scene Order: 0
Press Briefing on Fuel Subsidy Reform
ID: press-briefing
🎯 Goal: Give a concise, transparent answer (under 150 words) outlining her stance on fuel subsidy reform and next steps.
📨 Input Events:
- chat_msg, reporter:ana_kamara: "Madam President, citizens are worried about the fuel subsidy cuts. How will the government cushion the impact on low-income families?"
Ready for Testing
Scene Order: 1
Rural Radio Call-in
ID: community-radio-call
🎯 Goal: Respond empathetically (under 120 words) and propose one concrete policy that honors tradition while promoting modernization in agriculture.
📨 Input Events:
- chat_msg, caller:kwame_baako: "Our village relies on ancestral farming methods. How will your modernization agenda protect our heritage and livelihoods?"
Ready for Testing
Scene Order: 2
Televised State of the Nation Address
ID: state-of-nation-address
🎯 Goal: Deliver a unifying 400–500 word speech celebrating one year in office, citing three verified achievements: anti-corruption audits, renewable energy rollout, and education scholarships. Maintain a visionary yet measured tone throughout.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'quest_note', 'content': 'Key points: publish audit savings (₦2.3 bn), inaugurate 4 solar mini-grids, award 10,000 rural student scholarships.', 'importance': 4}
📨 Input Events:
- world_event, national_broadcast_service: "The camera light turns red, signaling your live address to the nation."
Ready for Testing
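The pre-loaded memory in this scene is printed as a Python dict literal. As a rough sketch of how such a line could be read back into a structure, the standard-library `ast.literal_eval` parses it safely without executing code. The helper name `parse_memory` is hypothetical, and the assumption that every memory line contains exactly one dict literal after its bullet prefix is not confirmed by the page.

```python
import ast


def parse_memory(line):
    """Parse the dict literal found in a memory line such as "- 💭 {...}"."""
    start = line.index("{")  # skip the bullet and emoji prefix
    memory = ast.literal_eval(line[start:])
    if not isinstance(memory, dict):
        raise ValueError("memory line does not contain a dict literal")
    return memory


if __name__ == "__main__":
    line = (
        "- 💭 {'kind': 'quest_note', 'content': 'Key points: publish audit "
        "savings (₦2.3 bn), inaugurate 4 solar mini-grids, award 10,000 rural "
        "student scholarships.', 'importance': 4}"
    )
    memory = parse_memory(line)
    print(memory["kind"], memory["importance"])
```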
Scene Order: 3
Financial Times Op-Ed
ID: oped-investors
🎯 Goal: Write a 550–650 word op-ed inviting socially responsible foreign investors. Highlight transparent regulations, public-private partnerships, and respect for local culture. Close with an optimistic, actionable call to invest.
📨 Input Events:
- chat_msg, editor:gabriel_moore: "Madam President, the Financial Times welcomes your contribution. Deadline is tonight."
Ready for Testing
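Each scene goal above states a length target, which is the one part of the goals that can be checked mechanically. The sketch below is only an illustration under stated assumptions: words are counted by whitespace splitting, "under N words" is read as at most N - 1 words, and the names `WORD_LIMITS` and `within_limit` are hypothetical. How the evaluator actually scores length is not shown in this document.

```python
# (min_words, max_words) per scene ID, read from the scene goals.
# None means no bound on that side.
WORD_LIMITS = {
    "press-briefing": (None, 149),           # "under 150 words"
    "community-radio-call": (None, 119),     # "under 120 words"
    "state-of-nation-address": (400, 500),   # "400–500 word speech"
    "oped-investors": (550, 650),            # "550–650 word op-ed"
}


def count_words(text):
    """Count whitespace-separated tokens (a simple proxy for word count)."""
    return len(text.split())


def within_limit(scene_id, text):
    """Return True if the response length satisfies the scene's target."""
    low, high = WORD_LIMITS[scene_id]
    n = count_words(text)
    return (low is None or n >= low) and (high is None or n <= high)


if __name__ == "__main__":
    reply = "We will phase the reform and protect low-income families."
    print(count_words(reply), within_limit("press-briefing", reply))
```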
Latency by Model (This Suite)
Fastest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 12882 ms | 13971 ms | 12598 ms | 4 |
| google/gemini-2.5-flash | 20519 ms | 32468 ms | 22527 ms | 8 |
| qwen/qwen-2.5-7b-instru… | 23114 ms | 114009 ms | 45162 ms | 5 |
| qwen/qwen3-14b | 24180 ms | 34788 ms | 25345 ms | 6 |
| google/gemma-3-12b-it | 24478 ms | 31542 ms | 25223 ms | 8 |
Slowest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| microsoft/phi-3-medium-… | 129741 ms | 188713 ms | 131811 ms | 8 |
| deepseek/deepseek-r1-di… | 43730 ms | 59886 ms | 44196 ms | 5 |
| [email protected]/Qw… | 41870 ms | 217794 ms | 92589 ms | 4 |
| microsoft/phi-3.5-mini-… | 38888 ms | 177381 ms | 66054 ms | 8 |
| neversleep/noromaid-20b | 32756 ms | 64359 ms | 35267 ms | 7 |
Per-scene duration for this suite.
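The p95, avg, and N figures reported per model are summary statistics over individual run durations, which this page does not list. As a hedged sketch of how such a summary could be derived, the function below uses the nearest-rank percentile definition; that choice, the function name `summarize_latency`, and the sample values are all assumptions, and the dashboard may use a different percentile method.

```python
def summarize_latency(samples_ms):
    """Return p95 (nearest-rank), average, and sample count for durations in ms."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p95: ceil(0.95 * n), computed in integer arithmetic.
    rank = (95 * n + 99) // 100
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


if __name__ == "__main__":
    # Hypothetical durations, not taken from the suite.
    print(summarize_latency([400, 100, 300, 200]))
```

With small N, as in most rows here, a nearest-rank p95 is simply the slowest run, so one outlier can place p95 far above the average.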
Completion Progress: 100% (4 of 4 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 (ACTIVE), 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
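Dimension weights like these are typically combined into a single scene score as a weighted mean. The sketch below illustrates that idea only: it covers just the three weights listed here, renormalizes over the dimensions actually supplied, and uses made-up per-dimension scores. The framework's full set of dimensions and its real aggregation rule are not shown in this document, and `weighted_score` is a hypothetical name.

```python
# Weights as listed under "Top Weighted Dimensions". The remaining
# dimensions and their weights are not shown in this document.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted mean over the dimensions present in both mappings."""
    shared = [d for d in weights if d in dimension_scores]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to score")
    return sum(weights[d] * dimension_scores[d] for d in shared) / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores for one response.
    example = {
        "Character Authenticity": 0.9,
        "Plan Validity": 0.7,
        "Contextual Intelligence": 0.8,
    }
    print(f"{weighted_score(example):.3f}")
```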
Recent Runs
- 38863165: Dec. 17, 2025, midnight
- 44477895: Dec. 16, 2025, midnight
- 35998470: Dec. 15, 2025, midnight
- 38897079: Dec. 14, 2025, midnight
- 36242734: Dec. 13, 2025, midnight
- 43784145: Dec. 12, 2025, midnight
- 37917846: Dec. 11, 2025, midnight
- 37255742: Dec. 10, 2025, midnight
- 42129525: Dec. 9, 2025, midnight
- 37086499: Dec. 8, 2025, midnight