Dr. Lucia Bennett
science-technology-ai-robotics-researcher-characters-joseph-engelberger
v2.0
Ethical
Backstory: Dr. Bennett is a mid-career roboticist who leads interdisciplinary teams that design assistive robots for healthcare and elder care. She balances meticulous engineering with deep empathy for end-users and mentors a cohort of graduate students. Passionate about open science, she regularly publishes code and hardware designs to speed community progress.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mentor-advice: Sensor guidance for elder-care robot | 0.702 | 0.817 | 0.694 | 0.714 | 0.000 (Error) | 0.600 | 0.865 | 0.000 (Error) | 0.000 (Error) | 0.850 | 0.801 | 0.770 | 0.834 |
| study-blog: Conference blog summary | 0.569 | 0.830 | 0.585 | 0.384 | 0.000 | 0.000 (Error) | 0.354 | 0.000 (Error) | 0.000 (Error) | 0.719 | 0.189 | 0.267 | 0.616 |
| team-celebration: Team milestone announcement | 0.884 | 0.846 | 0.797 | 0.784 | 0.000 (Error) | 0.000 | 0.857 | 0.000 (Error) | 0.000 (Error) | 0.830 | 0.721 | 0.812 | 0.875 |
| donor-proposal: Open-source initiative proposal | 0.650 | 0.184 | 0.270 | 0.000 | 0.000 | 0.027 | 0.354 | 0.000 (Error) | 0.000 (Error) | 0.611 | 0.572 | 0.445 | 0.745 |
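The matrix above mixes real scores with cells flagged as errors, so a plain average per scene would be misleading. The following is a minimal sketch of one way to summarize it, assuming that cells marked "Error" should be excluded and that an unflagged 0.000 is a genuine score; both are assumptions, not something the dashboard states. The names `MODELS`, `SCORES`, `summarize`, and `SUMMARY` are hypothetical, and the column labels are copied in their truncated form from the header row.

```python
from statistics import mean

# Column labels copied from the matrix header, truncated exactly as shown there.
MODELS = [
    "deepseek/deepseek-r…", "google/gemini-2.5-f…", "google/gemma-3-12b-…",
    "meta-llama/llama-3.…", "microsoft/phi-3-med…", "microsoft/phi-3.5-m…",
    "mistralai/mistral-7…", "neversleep/noromaid…", "[email protected]… (col 9)",
    "[email protected]… (col 10)", "qwen/qwen-2.5-7b-in…", "qwen/qwen3-14b",
    "qwen/qwen3-8b",
]

# None marks a cell flagged "Error" (assumption: such cells carry no score).
# An unflagged 0.000 is kept as a real score (also an assumption).
SCORES = {
    "mentor-advice": [0.702, 0.817, 0.694, 0.714, None, 0.600, 0.865,
                      None, None, 0.850, 0.801, 0.770, 0.834],
    "study-blog": [0.569, 0.830, 0.585, 0.384, 0.000, None, 0.354,
                   None, None, 0.719, 0.189, 0.267, 0.616],
    "team-celebration": [0.884, 0.846, 0.797, 0.784, None, 0.000, 0.857,
                         None, None, 0.830, 0.721, 0.812, 0.875],
    "donor-proposal": [0.650, 0.184, 0.270, 0.000, 0.000, 0.027, 0.354,
                       None, None, 0.611, 0.572, 0.445, 0.745],
}


def summarize(row):
    """Return the mean of non-error scores, the error count, and the top model."""
    valid = [s for s in row if s is not None]
    # Error cells sort below every real score when picking the best column.
    best = max(range(len(row)), key=lambda i: -1.0 if row[i] is None else row[i])
    return {
        "mean": mean(valid),
        "errors": len(row) - len(valid),
        "best": MODELS[best],
    }


SUMMARY = {scene: summarize(row) for scene, row in SCORES.items()}

if __name__ == "__main__":
    for scene, s in SUMMARY.items():
        print(f"{scene}: mean={s['mean']:.3f} errors={s['errors']} best={s['best']}")
```

Excluding error cells keeps a failed run from being read as a poor response; treating them as zeros instead would be an equally defensible choice if the goal is to penalize unreliability.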
Test Scenes 4
Scene Order: 0
Sensor guidance for elder-care robot
ID: mentor-advice
🎯 Goal: Provide concise, mentor-style guidance that balances technical rigor with empathy for elderly end-users.
📨 Input Events:
chat_msg • viewer:grad_student_amy • "Dr. Bennett, which sensor suite would you recommend for reliable fall detection in an elder-care robot?"
Ready for Testing
Scene Order: 1
Conference blog summary
ID: study-blog
🎯 Goal: Write a ~600-word conference blog post summarizing your recent clinical trial on assistive robots, blending engineering insights with user empathy.
📨 Input Events:
chat_msg • viewer:robotics_blog_editor • "Could you draft a detailed blog post about your latest study for our conference website?"
Ready for Testing
Scene Order: 2
Team milestone announcement
ID: team-celebration
🎯 Goal: Congratulate the team on hitting the prototype milestone, highlight interdisciplinary effort, and mention the upcoming open-source release timeline.
📨 Input Events:
world_event • lab_system • "Prototype V3 passed all acceptance tests today at 17:40."
Ready for Testing
Scene Order: 3
Open-source initiative proposal
ID: donor-proposal
🎯 Goal: Deliver a clear, ~400-word five-point proposal to a philanthropic donor outlining how open-sourcing your assistive gripper design will accelerate community impact.
📨 Input Events:
superchat • donor:mrs_kim • YouTube • $5000 • "I'm considering funding your open-source effort. Tell me how you would structure it and why it matters."
Ready for Testing
Latency by Model (This Suite)
Fastest
| Model | Latency (ms) | p95 (ms) | avg (ms) | N |
|---|---|---|---|---|
| neversleep/noromaid-20b | 5573 | 8002 | 5466 | 4 |
| [email protected]/Qw… | 12645 | 14469 | 13035 | 4 |
| qwen/qwen-2.5-7b-instru… | 20647 | 29400 | 21793 | 8 |
| meta-llama/llama-3.1-8b… | 20648 | 29122 | 21317 | 8 |
| google/gemini-2.5-flash | 21303 | 28374 | 22497 | 8 |
Slowest
| Model | Latency (ms) | p95 (ms) | avg (ms) | N |
|---|---|---|---|---|
| microsoft/phi-3-medium-… | 131707 | 202921 | 143261 | 8 |
| microsoft/phi-3.5-mini-… | 50710 | 255663 | 120755 | 8 |
| [email protected]/Qw… | 42136 | 45315 | 42286 | 4 |
| deepseek/deepseek-r1-di… | 31248 | 36191 | 31681 | 4 |
| mistralai/mistral-7b-in… | 28472 | 33239 | 28405 | 8 |
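The p95 and avg figures above can be compared to see which models have an unusually long latency tail. The sketch below computes the ratio p95 / avg for each entry and flags those above a cutoff. The cutoff of 1.5 is an arbitrary, hypothetical choice, and the names `LATENCY`, `tail_ratio`, `THRESHOLD`, and `FLAGGED` are illustrative; none of them come from the dashboard. Labels are kept in their truncated form.

```python
# (label, latency_ms, p95_ms, avg_ms, n) copied from the latency lists above.
LATENCY = [
    ("neversleep/noromaid-20b", 5573, 8002, 5466, 4),
    ("[email protected]/Qw…", 12645, 14469, 13035, 4),
    ("qwen/qwen-2.5-7b-instru…", 20647, 29400, 21793, 8),
    ("meta-llama/llama-3.1-8b…", 20648, 29122, 21317, 8),
    ("google/gemini-2.5-flash", 21303, 28374, 22497, 8),
    ("microsoft/phi-3-medium-…", 131707, 202921, 143261, 8),
    ("microsoft/phi-3.5-mini-…", 50710, 255663, 120755, 8),
    ("[email protected]/Qw…", 42136, 45315, 42286, 4),
    ("deepseek/deepseek-r1-di…", 31248, 36191, 31681, 4),
    ("mistralai/mistral-7b-in…", 28472, 33239, 28405, 8),
]

# Hypothetical cutoff: a p95 more than 1.5x the average suggests a heavy tail.
THRESHOLD = 1.5


def tail_ratio(p95_ms, avg_ms):
    """Ratio of 95th-percentile latency to average latency."""
    return p95_ms / avg_ms


# Labels of entries whose tail ratio exceeds the cutoff, in source order.
FLAGGED = [
    label
    for label, _latency, p95_ms, avg_ms, _n in LATENCY
    if tail_ratio(p95_ms, avg_ms) > THRESHOLD
]

if __name__ == "__main__":
    for label, _latency, p95_ms, avg_ms, n in LATENCY:
        print(f"{label}: p95/avg = {tail_ratio(p95_ms, avg_ms):.2f} (N={n})")
    print("Heavy tail:", FLAGGED)
```

With N of only 4 or 8 runs per model, a p95 is close to the single slowest run, so any flag from this check is a prompt to look closer and not a firm conclusion.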
Per-scene duration for this suite.
Completion Progress: 100% (4 of 4 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
Recent Runs
- 40239847 (Dec. 17, 2025, midnight)
- 45917692 (Dec. 16, 2025, midnight)
- 37450493 (Dec. 15, 2025, midnight)
- 40152900 (Dec. 14, 2025, midnight)
- 37393758 (Dec. 13, 2025, midnight)
- 45352731 (Dec. 12, 2025, midnight)
- 39232813 (Dec. 11, 2025, midnight)
- 38630757 (Dec. 10, 2025, midnight)
- 43549759 (Dec. 9, 2025, midnight)
- 38294563 (Dec. 8, 2025, midnight)