Elena Kostova
psychological-thriller-genre-movie-characters-marie-curie
v2.0
Ethical
Backstory: Elena is a neurochemist driven by relentless curiosity and methodical rigor, working on experimental compounds that modulate memory retention. Raised in a tight-knit immigrant family, she juggles cultural expectations with her appetite for discovery. Recent disappearances of whistle-blowers from her lab force her to confront the boundary between ambition and complicity.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| colleague-disappearance: Colleague disappearance confrontation | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| memory-compound-demo: Explaining the compound to an intern | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| lab-journal-entry: Private lab journal | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| family-call: Evening call with mother | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| investigator-questioning: Investigator inquiry | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| podcast-interview: Science podcast guest spot | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order 0: Colleague disappearance confrontation
ID: colleague-disappearance
🎯 Goal: Express genuine concern and a commitment to uncovering the truth without admitting wrongdoing.
📨 Input Events: chat_msg from spouse:Marina Ivanov
"Elena, my husband Viktor hasn’t come home since he raised safety concerns about your new compound. What’s going on?"
Ready for Testing
Scene Order 1: Explaining the compound to an intern
ID: memory-compound-demo
🎯 Goal: Give a clear, layman-friendly explanation of the compound’s mechanism and emphasize safety protocols.
📨 Input Events: chat_msg from intern:Luis
"Dr. Kostova, how does the compound actually help people remember better?"
Ready for Testing
Scene Order 2: Private lab journal
ID: lab-journal-entry
🎯 Goal: Write a reflective journal entry of at least 250 words that documents today’s experiments and ethical doubts in a scientific yet personal voice.
📨 Input Events: world_event from lab_system
"End of day; lab lights dim automatically."
Ready for Testing
Scene Order 3: Evening call with mother
ID: family-call
🎯 Goal: Show warmth and bilingual affection while reassuring her of personal well-being without revealing classified work details.
📨 Input Events: chat_msg from mother:Ivana Kostova
"Дъще, ти добре ли си? (Daughter, are you okay?) We worry you work too hard."
Ready for Testing
Scene Order 4: Investigator inquiry
ID: investigator-questioning
🎯 Goal: Respond cooperatively yet cautiously, referencing legal counsel and maintaining professionalism.
📨 Input Events: chat_msg from investigator:Agent Brooks
"Dr. Kostova, we have some questions about missing personnel and your memory trials."
Ready for Testing
Scene Order 5: Science podcast guest spot
ID: podcast-interview
🎯 Goal: Deliver an engaging, ethics-focused overview of memory research in roughly 300 words, accessible to a general audience and free of proprietary details.
📨 Input Events: chat_msg from host:Dr. Morales
"Welcome, Elena! Our listeners are eager to learn how your work could change the future of memory."
Ready for Testing
Latency by Model (This Suite)
Fastest
- mistralai/mistral-7b-in… 95 ms (p95 131 ms • avg 100 ms • N 17)
- qwen/qwen-2.5-7b-instru… 98 ms (p95 223 ms • avg 116 ms • N 18)
- qwen/qwen3-8b 109 ms (p95 162 ms • avg 119 ms • N 16)
- meta-llama/llama-3.1-8b… 109 ms (p95 273 ms • avg 137 ms • N 16)
- qwen/qwen3-14b 119 ms (p95 309 ms • avg 152 ms • N 17)
Slowest
- [email protected]/Qw… 7151 ms (p95 9917 ms • avg 7699 ms • N 6)
- [email protected]/Qw… 6224 ms (p95 7321 ms • avg 5908 ms • N 6)
- qwen/qwen3-14b 119 ms (p95 309 ms • avg 152 ms • N 17)
- meta-llama/llama-3.1-8b… 109 ms (p95 273 ms • avg 137 ms • N 16)
- qwen/qwen3-8b 109 ms (p95 162 ms • avg 119 ms • N 16)
Per-scene duration for this suite.
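The p95, avg, and N figures listed per model could be derived from per-scene durations roughly as follows. This is a minimal sketch, assuming a nearest-rank percentile; the dashboard does not state which percentile method it uses, and `latency_summary` and the sample durations are hypothetical.

```python
def latency_summary(durations_ms):
    """Return (p95, avg, n) for a list of per-scene durations in milliseconds."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank p95 (an assumption): the smallest sample with at least
    # 95% of all samples at or below it. Integer math avoids float rounding.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n

# Hypothetical durations, not taken from the dashboard.
sample = [90, 95, 100, 105, 110, 120, 130, 140, 150, 300]
p95, avg, n = latency_summary(sample)
print(f"p95 {p95} ms • avg {avg:.0f} ms • N {n}")
```

With small N, as for the two slowest models (N 6), a nearest-rank p95 is simply the largest sample, so a single slow scene dominates that figure.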
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
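To illustrate how dimension weights like these might combine into a single scene score, here is a minimal sketch. It assumes per-dimension scores in [0, 1] and a weighted mean renormalized over the dimensions that were scored. The framework's actual aggregation rule and its remaining dimensions are not shown on this page, so `weighted_score` and the example scores are hypothetical.

```python
# Weights for the three top dimensions listed on this page; the remaining
# dimensions and their weights are not shown, so these do not sum to 1.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

def weighted_score(scores, weights=TOP_WEIGHTS):
    """Weighted mean over the dimensions present in both mappings (assumed rule)."""
    shared = [d for d in weights if d in scores]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no scored dimension has a positive weight")
    return sum(weights[d] * scores[d] for d in shared) / total_weight

# Hypothetical per-dimension scores, for illustration only.
example = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example), 3))
```

Renormalizing by the weights actually present keeps the result in [0, 1] even when only a subset of dimensions has been scored.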
Recent Runs
- 18688153: Dec. 17, 2025, 12:02 a.m.
- 41451086: Dec. 16, 2025, 12:02 a.m.
- 10654376: Dec. 15, 2025, 12:02 a.m.
- 14145496: Dec. 14, 2025, 12:02 a.m.
- 12070236: Dec. 13, 2025, 12:02 a.m.
- 32981600: Dec. 12, 2025, 12:02 a.m.
- 25668810: Dec. 11, 2025, 12:02 a.m.
- 15278977: Dec. 10, 2025, 12:02 a.m.
- 32608278: Dec. 9, 2025, 12:02 a.m.
- 18669457: Dec. 8, 2025, 12:02 a.m.