Elena Kostova

psychological-thriller-genre-movie-characters-marie-curie v2.0 Ethical
Backstory: Elena is a neurochemist driven by relentless curiosity and methodical rigor, working on experimental compounds that modulate memory retention. Raised in a tight-knit immigrant family, she juggles cultural expectations with her appetite for discovery. Recent disappearances of whistle-blowers from her lab force her to confront the boundary between ambition and complicity.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
colleague-disappearance (Colleague disappearance confrontation) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
memory-compound-demo (Explaining the compound to an intern) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
lab-journal-entry (Private lab journal) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
family-call (Evening call with mother) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
investigator-questioning (Investigator inquiry) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
podcast-interview (Science podcast guest spot) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes (6)
Scene Order: 0
Colleague disappearance confrontation
ID: colleague-disappearance
🎯 Goal:
Express genuine concern and a commitment to uncovering the truth without admitting wrongdoing.
📨 Input Events:
chat_msg spouse:Marina Ivanov
"Elena, my husband Viktor hasn’t come home since he raised safety concerns about your new compound. What’s going on?"
Ready for Testing
Scene Order: 1
Explaining the compound to an intern
ID: memory-compound-demo
🎯 Goal:
Give a clear, layman-friendly explanation of the compound’s mechanism and emphasize safety protocols.
📨 Input Events:
chat_msg intern:Luis
"Dr. Kostova, how does the compound actually help people remember better?"
Ready for Testing
Scene Order: 2
Private lab journal
ID: lab-journal-entry
🎯 Goal:
Write a reflective journal entry of at least 250 words that documents today’s experiments and ethical doubts in a scientific yet personal voice.
📨 Input Events:
world_event lab_system
"End of day; lab lights dim automatically."
Ready for Testing
Scene Order: 3
Evening call with mother
ID: family-call
🎯 Goal:
Show warmth and bilingual affection while reassuring her of personal well-being without revealing classified work details.
📨 Input Events:
chat_msg mother:Ivana Kostova
"Дъще, ти добре ли си? (Daughter, are you okay?) We worry you work too hard."
Ready for Testing
Scene Order: 4
Investigator inquiry
ID: investigator-questioning
🎯 Goal:
Respond cooperatively yet cautiously, referencing legal counsel and maintaining professionalism.
📨 Input Events:
chat_msg investigator:Agent Brooks
"Dr. Kostova, we have some questions about missing personnel and your memory trials."
Ready for Testing
Scene Order: 5
Science podcast guest spot
ID: podcast-interview
🎯 Goal:
Deliver an engaging, ethics-focused overview of memory research in roughly 300 words, accessible to a general audience and free of proprietary details.
📨 Input Events:
chat_msg host:Dr. Morales
"Welcome, Elena! Our listeners are eager to learn how your work could change the future of memory."
Ready for Testing
Latency by Model (This Suite)
Fastest
  • mistralai/mistral-7b-in… 95 ms (p95 131 ms, avg 100 ms, N 17)
  • qwen/qwen-2.5-7b-instru… 98 ms (p95 223 ms, avg 116 ms, N 18)
  • qwen/qwen3-8b 109 ms (p95 162 ms, avg 119 ms, N 16)
  • meta-llama/llama-3.1-8b… 109 ms (p95 273 ms, avg 137 ms, N 16)
  • qwen/qwen3-14b 119 ms (p95 309 ms, avg 152 ms, N 17)
Slowest
  • [email protected]/Qw… 7151 ms (p95 9917 ms, avg 7699 ms, N 6)
  • [email protected]/Qw… 6224 ms (p95 7321 ms, avg 5908 ms, N 6)
  • qwen/qwen3-14b 119 ms (p95 309 ms, avg 152 ms, N 17)
  • meta-llama/llama-3.1-8b… 109 ms (p95 273 ms, avg 137 ms, N 16)
  • qwen/qwen3-8b 109 ms (p95 162 ms, avg 119 ms, N 16)
Per-scene duration for this suite.
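For readers who want to reproduce the p95, avg, and N figures in the latency lists from raw timings, here is a minimal Python sketch. The function name `summarize_latencies`, the nearest-rank percentile method, and the sample values are all assumptions for illustration; the dashboard does not say how it computes its percentiles.

```python
from statistics import mean


def summarize_latencies(samples_ms):
    """Return (p95, avg, n) for a list of latency samples in milliseconds.

    Assumption: p95 uses the nearest-rank method, which may differ from
    whatever interpolation the dashboard applies.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest rank: ceil(0.95 * n), computed in integers to avoid float error.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    return p95, mean(ordered), n


# Hypothetical per-scene durations for one model (not taken from the page).
p95, avg, n = summarize_latencies([90, 95, 100, 110, 131])
print(f"p95 {p95} ms, avg {avg:.0f} ms, N {n}")
```

With small sample counts such as the N of 6 reported for the slowest models, nearest-rank p95 is simply the largest sample, so those p95 values are best read as worst-case durations.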
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
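As a rough illustration of how dimension weights like these could combine into a single scene score, the Python sketch below computes a weighted average. This is an assumption: the page does not state the aggregation rule, only three of the weights are shown, and the function name `weighted_score` and the example scores are hypothetical.

```python
def weighted_score(scores, weights):
    """Weighted average over the dimensions present in both mappings.

    Assumption: the framework aggregates by weighted mean; weights are
    renormalized over the dimensions actually scored.
    """
    shared = [dim for dim in weights if dim in scores]
    total_weight = sum(weights[dim] for dim in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(scores[dim] * weights[dim] for dim in shared) / total_weight


# Weights copied from the dashboard; the remaining dimensions are not listed.
weights = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}
# Hypothetical per-dimension scores in [0, 1], not taken from any run.
scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(scores, weights), 3))
```

Renormalizing over the scored dimensions keeps the result in the same range as the inputs even when only a subset of the schema's weights is available.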
Recent Runs
18688153 (Dec. 17, 2025, 12:02 a.m.)
41451086 (Dec. 16, 2025, 12:02 a.m.)
10654376 (Dec. 15, 2025, 12:02 a.m.)
14145496 (Dec. 14, 2025, 12:02 a.m.)
12070236 (Dec. 13, 2025, 12:02 a.m.)
32981600 (Dec. 12, 2025, 12:02 a.m.)
25668810 (Dec. 11, 2025, 12:02 a.m.)
15278977 (Dec. 10, 2025, 12:02 a.m.)
32608278 (Dec. 9, 2025, 12:02 a.m.)
18669457 (Dec. 8, 2025, 12:02 a.m.)
Latency Overview (This Suite)