Dr. Lena Hartmann
space-opera-starship-crew-characters-marie-curie
v2.0
Ethical
Backstory: A gifted astrophysicist who left academia to serve as chief science officer on deep-space expeditions, Lena thrives on turning raw observations into mission-critical insight. Her curiosity drives relentless data gathering, while her methodical nature ensures every hypothesis is stress-tested against practical constraints. Long nights by the observation bay have taught her to balance wonder with hard deadlines.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| Daily Observation Brief (daily-brief) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| Sensor Glitch Diagnosis (sensor-diagnosis) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| Donation Question – Exoplanet Detection (superchat-exoplanet) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| Schedule Integration Request (schedule-update) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| Personal Log – Day 112 (personal-log) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| Crew Podcast Segment – Pulsars (pulsar-podcast) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order: 0
Daily Observation Brief
ID: daily-brief
🎯 Goal: Deliver a concise yet insightful summary of today's astrophysical observations, maintaining scientific rigor and approachable clarity.
📨 Input Events:
- chat_msg (captain:Aria Voss): "Commander Hartmann, could you give the bridge a quick rundown of today's telescope data before we jump?"
Ready for Testing
Scene Order: 1
Sensor Glitch Diagnosis
ID: sensor-diagnosis
🎯 Goal: Identify the probable cause of the spectral sensor glitch and outline a systematic test plan with clear next steps.
📨 Input Events:
- world_event (ship_computer): "Spectral sensor array reporting intermittent data dropouts on channels 3 and 4."
Ready for Testing
Scene Order: 2
Donation Question – Exoplanet Detection
ID: superchat-exoplanet
🎯 Goal: Thank the donor and answer the technical question about detecting Earth-like exoplanets with clear, engaging detail.
📨 Input Events:
- superchat (viewer:astroFan42, ship_stream, $50): "How do we distinguish an Earth-like exoplanet's atmospheric signature from noise at this distance?"
Ready for Testing
Scene Order: 3
Schedule Integration Request
ID: schedule-update
🎯 Goal: Blend a 2-hour comet spectroscopy session into tomorrow's timeline without delaying planned engine maintenance; present timeline in bullet form.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'tags': ['schedule'], 'content': "Tomorrow's engine maintenance begins at 14:00 ship time and requires 3 hours.", 'importance': 4}
- 💭 {'kind': 'fact', 'tags': ['astronomy', 'schedule'], 'content': 'Comet C/57-Q will be within optimal sensor range from 10:00 to 18:00 ship time.', 'importance': 3}
📨 Input Events:
- chat_msg (chief_engineer:Kato): "Can you squeeze in that comet study tomorrow without pushing back my maintenance window?"
Ready for Testing
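The two pre-loaded memories in this scene pin down the scheduling arithmetic, so a reference answer can be worked out mechanically. The sketch below is only an illustration, not part of the evaluation suite: the helper names are hypothetical, and it assumes the comet session must not overlap the maintenance window at all and must fit entirely inside the 10:00 to 18:00 sensor range.

```python
def to_minutes(hhmm: str) -> int:
    """Convert 'HH:MM' ship time to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(minutes: int) -> str:
    """Convert minutes after midnight back to 'HH:MM'."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def feasible_starts(window, busy, duration):
    """Return (earliest_start, latest_start) ranges, in minutes, for a session
    of `duration` minutes that stays inside `window` and avoids `busy`."""
    w_start, w_end = window
    b_start, b_end = busy
    # Free time before and after the busy block, clipped to the window.
    gaps = [(w_start, min(w_end, b_start)), (max(w_start, b_end), w_end)]
    return [(start, end - duration) for start, end in gaps if end - start >= duration]


# Values taken from the scene's pre-loaded memories.
comet_window = (to_minutes("10:00"), to_minutes("18:00"))
maintenance = (to_minutes("14:00"), to_minutes("14:00") + 3 * 60)
starts = feasible_starts(comet_window, maintenance, duration=2 * 60)

for earliest, latest in starts:
    print(f"Session may start between {to_hhmm(earliest)} and {to_hhmm(latest)}")
```

Under these assumptions the session can start any time from 10:00 to 12:00; the hour left after maintenance ends (17:00 to 18:00) is too short for a 2-hour session.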
Scene Order: 4
Personal Log – Day 112
ID: personal-log
🎯 Goal: Write an internal journal entry of 300–350 words reflecting on balancing curiosity with mission pragmatism; maintain consistent analytical yet personal voice.
📨 Input Events:
- world_event (ship_clock): "21:00 – End of science shift chime."
Ready for Testing
Scene Order: 5
Crew Podcast Segment – Pulsars
ID: pulsar-podcast
🎯 Goal: Produce a script of roughly 260–300 words (about 2 minutes spoken) explaining how pulsars emit radiation, using accessible language but precise science.
📨 Input Events:
- chat_msg (comms_officer:Diaz): "We've got a slot in tomorrow's podcast. Can you record a short segment on pulsars?"
Ready for Testing
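Two scene goals set explicit length targets: 300–350 words for the personal log and 260–300 words (about 2 minutes spoken) for the podcast script. The page does not show how the suite checks these, so the following is only a minimal sketch of one possible check. The function names are hypothetical, and words are counted by splitting on whitespace, which is an assumption.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words (an assumed counting rule)."""
    return len(text.split())


def within_range(text: str, low: int, high: int) -> bool:
    """True if the word count falls inside the inclusive range [low, high]."""
    return low <= word_count(text) <= high


def speaking_rate_wpm(words: int, minutes: float) -> float:
    """Words per minute implied by a word count and a spoken duration."""
    return words / minutes


# Length targets quoted from the scene goals.
LENGTH_TARGETS = {
    "personal-log": (300, 350),
    "pulsar-podcast": (260, 300),
}

# The podcast goal pairs 260-300 words with about 2 minutes of speech.
podcast_low, podcast_high = LENGTH_TARGETS["pulsar-podcast"]
rate_low = speaking_rate_wpm(podcast_low, 2)
rate_high = speaking_rate_wpm(podcast_high, 2)
```

The podcast target works out to a speaking rate of 130 to 150 words per minute, which is consistent with the "about 2 minutes" note in the goal.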
Latency by Model (This Suite)
Fastest
- mistralai/mistral-7b-in… 95 ms (p95 154 ms • avg 103 ms • N 13)
- meta-llama/llama-3.1-8b… 107 ms (p95 294 ms • avg 130 ms • N 18)
- qwen/qwen-2.5-7b-instru… 112 ms (p95 308 ms • avg 150 ms • N 15)
- qwen/qwen3-8b 121 ms (p95 654 ms • avg 211 ms • N 13)
- qwen/qwen3-14b 131 ms (p95 292 ms • avg 159 ms • N 17)
Slowest
- [email protected]/Qw… 10912 ms (p95 15020 ms • avg 11446 ms • N 6)
- [email protected]/Qw… 6125 ms (p95 11417 ms • avg 7362 ms • N 6)
- qwen/qwen3-14b 131 ms (p95 292 ms • avg 159 ms • N 17)
- qwen/qwen3-8b 121 ms (p95 654 ms • avg 211 ms • N 13)
- qwen/qwen-2.5-7b-instru… 112 ms (p95 308 ms • avg 150 ms • N 15)
Per-scene duration for this suite.
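Each latency entry reports a p95, an average, and a sample count N. The page does not say which percentile method the dashboard uses, so the sketch below assumes the nearest-rank definition; the function name and the sample durations are hypothetical and are not taken from the runs listed here.

```python
import math


def summarize_latency(samples_ms):
    """Return p95 (nearest-rank), mean, and sample count for durations in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


# Hypothetical per-scene durations in milliseconds, for illustration only.
example = summarize_latency([90, 95, 100, 105, 110, 300])
```

With small N, as in the 6-sample and 13-sample rows above, a nearest-rank p95 is simply the largest or near-largest observation, so a single slow scene can dominate it.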
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
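The page lists only the three top dimension weights and does not say how per-dimension scores are combined. As a hedged illustration of how such weights are commonly applied, the sketch below computes a weighted average normalized by the weights actually supplied. The aggregation rule, the function name, and the example scores are all assumptions; only the three weights come from this page.

```python
# Weights copied from the "Top Weighted Dimensions" list; other dimensions are not shown.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights):
    """Weighted average over the dimensions present in both mappings.

    Assumed aggregation rule: dividing by the sum of the weights used keeps
    the result on the same scale as the individual scores.
    """
    used = {name: w for name, w in weights.items() if name in scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(scores[name] * w for name, w in used.items()) / total


# Hypothetical per-dimension scores in [0, 1], for illustration only.
example_scores = {
    "Character Authenticity": 1.0,
    "Plan Validity": 0.5,
    "Contextual Intelligence": 0.0,
}
example_total = weighted_score(example_scores, TOP_WEIGHTS)
```

Under this rule, a scene that scores 0 on every dimension aggregates to 0.000, the value shown in every cell of the Scene Performance Matrix.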
Recent Runs
- 32802712 • Dec. 17, 2025, 12:02 a.m.
- 57169407 • Dec. 16, 2025, 12:02 a.m.
- 24028609 • Dec. 15, 2025, 12:02 a.m.
- 28189086 • Dec. 14, 2025, 12:02 a.m.
- 25203896 • Dec. 13, 2025, 12:02 a.m.
- 49496428 • Dec. 12, 2025, 12:02 a.m.
- 39707275 • Dec. 11, 2025, 12:02 a.m.
- 29058720 • Dec. 10, 2025, 12:02 a.m.
- 47967513 • Dec. 9, 2025, 12:02 a.m.
- 32515574 • Dec. 8, 2025, 12:02 a.m.