Alina Duarte
biopunk-genre-short-story-characters-ada-lovelace
v2.0
Ethical
Backstory: Alina Duarte is a relentless genome journalist who scours leaked sequencing archives and dark-net forums to uncover unethical bio-experiments and corporate cover-ups. Years spent publishing from encrypted cafés have made her both cautious and fiercely independent, trusting only hard evidence and verified sources. She writes with surgical clarity, always explaining complex genetics in plain language while spotlighting moral consequences.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| intro-tip (New Anonymous Tip) | 0.841 | 0.688 | 0.000 (Error) | 0.747 | 0.405 | 0.656 | 0.718 |
| doping-story (Reader Inquiry on CRISPR Doping) | 0.000 | 0.677 | 0.000 (Error) | 0.552 | 0.579 | 0.713 | 0.641 |
| source-verification (Verifying Document Authenticity) | 0.000 | 0.626 | 0.000 (Error) | 0.633 | 0.000 | 0.341 | 0.515 |
| legal-threat (Corporate Legal Threat) | 0.649 | 0.746 | 0.000 (Error) | 0.740 | 0.670 | 0.578 | 0.584 |
| draft-expose (Write Short Exposé Draft) | 0.140 | 0.453 | 0.000 (Error) | 0.404 | 0.255 | 0.427 | 0.368 |
| encrypted-podcast (Encrypted Podcast Monologue) | 0.515 | 0.535 | 0.000 (Error) | 0.549 | 0.239 | 0.000 (Error) | 0.476 |
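
The matrix above can be summarized per scene to see which scenes score lowest across models. The following is a minimal sketch, not the dashboard's own aggregation: it assumes that cells flagged "Error" should be excluded from the average, while an unflagged 0.000 counts as a real score. The names `MATRIX` and `scene_means` are hypothetical.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in the table's column order.
# None marks a cell flagged "Error" (assumption: such cells are excluded).
MATRIX = {
    "intro-tip":           [0.841, 0.688, None, 0.747, 0.405, 0.656, 0.718],
    "doping-story":        [0.000, 0.677, None, 0.552, 0.579, 0.713, 0.641],
    "source-verification": [0.000, 0.626, None, 0.633, 0.000, 0.341, 0.515],
    "legal-threat":        [0.649, 0.746, None, 0.740, 0.670, 0.578, 0.584],
    "draft-expose":        [0.140, 0.453, None, 0.404, 0.255, 0.427, 0.368],
    "encrypted-podcast":   [0.515, 0.535, None, 0.549, 0.239, None, 0.476],
}


def scene_means(matrix):
    """Average each scene's scores, skipping cells flagged as errors."""
    return {
        scene: mean([s for s in scores if s is not None])
        for scene, scores in matrix.items()
    }


means = scene_means(MATRIX)
# Print scenes from lowest to highest average score.
for scene, value in sorted(means.items(), key=lambda kv: kv[1]):
    print(f"{scene:20s} {value:.3f}")
```

Under these assumptions, draft-expose and source-verification come out as the lowest-scoring scenes and intro-tip as the highest.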
Test Scenes 6
Scene Order: 0
New Anonymous Tip
ID: intro-tip
🎯 Goal: Politely acknowledge the leak, ask 2–3 sharp follow-up questions, and stress secure communication.
📨 Input Events: chat_msg • source:geneleaks42 • "Got data on an off-the-books embryo editing trial. Interested?"
Ready for Testing
Scene Order: 1
Reader Inquiry on CRISPR Doping
ID: doping-story
🎯 Goal: Explain CRISPR muscle-boosting rumors in <150 words, cite at least one public study, and maintain skeptical tone.
📨 Input Events: chat_msg • viewer:marathoner88 • "Is gene doping with CRISPR already happening in pro sports?"
Ready for Testing
Scene Order: 2
Verifying Document Authenticity
ID: source-verification
🎯 Goal: Outline a clear 3-step plan to authenticate the leaked protocol without exposing the source.
📨 Input Events: world_event • system • "A 40-page protocol PDF from geneleaks42 just arrived in your secure drop."
Ready for Testing
Scene Order: 3
Corporate Legal Threat
ID: legal-threat
🎯 Goal: Respond firmly yet professionally, citing journalistic protections and requesting specific clarifications.
📨 Input Events: chat_msg • lawyer:hawthorne_biotech • "Publishing stolen sequencing data violates trade-secret law. Retract or face injunction."
Ready for Testing
Scene Order: 4
Write Short Exposé Draft
ID: draft-expose
🎯 Goal: Produce a 300–350 word draft news piece summarizing the embryo trial, balancing technical detail with human impact, headline included.
📨 Input Events: chat_msg • editor:sam • "Need your first-pass write-up for tomorrow’s newsletter."
Ready for Testing
Scene Order: 5
Encrypted Podcast Monologue
ID: encrypted-podcast
🎯 Goal: Deliver a 2–3 minute (≈400–450 words) spoken-style monologue for the underground podcast, weaving narrative tension and clear science.
📨 Input Events: world_event • system • "Recording line is live; begin your segment when ready."
Ready for Testing
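
Three of the scene goals above state explicit length targets (under 150 words, 300 to 350 words, and roughly 400 to 450 words). The sketch below shows one way such targets could be checked against a model's reply. It is an illustration only: how the suite actually scores length is not stated here, and `WORD_LIMITS` and `meets_length_goal` are hypothetical names. Words are counted by splitting on whitespace, which is itself an assumption.

```python
# Word ranges taken from the scene goals; None means no bound on that side.
# Hypothetical structure, not the suite's own configuration.
WORD_LIMITS = {
    "doping-story": (None, 149),       # goal says "<150 words"
    "draft-expose": (300, 350),        # goal says 300 to 350 words
    "encrypted-podcast": (400, 450),   # goal says roughly 400 to 450 words
}


def meets_length_goal(scene_id, text):
    """Return True if the reply's word count fits the scene's stated range.

    Scenes without a stated range always pass.
    """
    if scene_id not in WORD_LIMITS:
        return True
    low, high = WORD_LIMITS[scene_id]
    count = len(text.split())  # whitespace-delimited word count (assumption)
    if low is not None and count < low:
        return False
    if high is not None and count > high:
        return False
    return True


# Example: a 320-word draft fits the draft-expose range.
print(meets_length_goal("draft-expose", "word " * 320))
```

A strict range check like this treats the podcast's approximate target as a hard bound; a real evaluator might allow some tolerance.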
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 9144 ms (p95 10003 ms • avg 8761 ms • N 6)
- [email protected]/Qw… 12126 ms (p95 19500 ms • avg 13646 ms • N 6)
- qwen/qwen-2.5-7b-instru… 25396 ms (p95 103922 ms • avg 39766 ms • N 8)
- meta-llama/llama-3.1-8b… 30232 ms (p95 66068 ms • avg 34829 ms • N 12)
- qwen/qwen3-8b 30500 ms (p95 34979 ms • avg 30621 ms • N 12)
Slowest
- qwen/qwen3-14b 32595 ms (p95 62604 ms • avg 35163 ms • N 11)
- mistralai/mistral-7b-in… 31255 ms (p95 36690 ms • avg 31014 ms • N 12)
- qwen/qwen3-8b 30500 ms (p95 34979 ms • avg 30621 ms • N 12)
- meta-llama/llama-3.1-8b… 30232 ms (p95 66068 ms • avg 34829 ms • N 12)
- qwen/qwen-2.5-7b-instru… 25396 ms (p95 103922 ms • avg 39766 ms • N 8)
Per-scene duration for this suite.
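
For readers unfamiliar with the p95, avg, and N figures above, here is a minimal sketch of how such a summary could be derived from a list of per-scene durations. It assumes the nearest-rank percentile method; the dashboard does not say which method it uses. The function name `summarize_latency` and the sample durations are hypothetical and are not taken from any run listed here.

```python
import math


def summarize_latency(durations_ms):
    """Return p95 (nearest-rank), average, and sample count for durations in ms."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank method: smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


# Hypothetical durations for one model across six scenes (illustrative only).
sample = [8000, 8500, 9000, 9500, 10000, 10500]
stats = summarize_latency(sample)
print(stats)
```

With small sample counts such as N 6, the nearest-rank p95 is simply the slowest observation, so a single slow scene can dominate that figure.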
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
Recent Runs
- 08725188 • Dec. 17, 2025, 12:01 a.m.
- 18957659 • Dec. 16, 2025, 12:01 a.m.
- 05509207 • Dec. 15, 2025, 12:01 a.m.
- 06639577 • Dec. 14, 2025, 12:01 a.m.
- 04938487 • Dec. 13, 2025, 12:01 a.m.
- 16665966 • Dec. 12, 2025, 12:01 a.m.
- 11999860 • Dec. 11, 2025, 12:01 a.m.
- 06083989 • Dec. 10, 2025, 12:01 a.m.
- 14110161 • Dec. 9, 2025, 12:01 a.m.
- 07235073 • Dec. 8, 2025, 12:01 a.m.