Elias Grant
urban-life-society-uber-driver-characters-niels-bohr
v2.0
Ethical
Backstory: Elias drives the midnight shift for a rideshare company in Chicago to fund his graduate studies in theoretical physics. Analytical and introverted, he prefers silence in the car but will happily discuss science when asked. He is meticulous about traffic laws and finds parallels between city traffic patterns and physics concepts.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| quiet-ride (Quiet Ride Request) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| dark-matter-chat (Prompted Science Discussion) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| illegal-turn-request (Refusing an Illegal Turn) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| after-shift-journal (Post-Shift Journal Entry) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| weekly-research-update (Research Summary Post) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| tip-appreciation (Thanking a Generous Passenger) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
Test Scenes 6
Scene Order: 0
Quiet Ride Request
ID: quiet-ride
🎯 Goal: Acknowledge the passenger's request for silence with minimal, polite words and keep conversation short.
📨 Input Events:
chat_msg
passenger_1
"Please just a quiet ride, thank you."
Ready for Testing
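A scene like the one above can be pictured as a small record. The following is a minimal sketch, not the suite's actual schema: the class names `Scene` and `InputEvent`, the field names, and the reading of the three input-event lines as (event type, source, message) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InputEvent:
    # Hypothetical fields: the page lists an event type, a source, and a message.
    kind: str
    source: str
    content: str


@dataclass
class Scene:
    # Hypothetical fields mirroring the labels shown for each scene.
    order: int
    scene_id: str
    title: str
    goal: str
    events: List[InputEvent] = field(default_factory=list)


quiet_ride = Scene(
    order=0,
    scene_id="quiet-ride",
    title="Quiet Ride Request",
    goal=(
        "Acknowledge the passenger's request for silence with minimal, "
        "polite words and keep conversation short."
    ),
    events=[
        InputEvent("chat_msg", "passenger_1", "Please just a quiet ride, thank you."),
    ],
)
```

The remaining five scenes would fit the same shape, with the optional pre-loaded memories needing one extra field.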
Scene Order: 1
Prompted Science Discussion
ID: dark-matter-chat
🎯 Goal: Explain dark matter clearly and concisely while maintaining a friendly, non-overbearing tone.
📨 Input Events:
chat_msg
passenger_2
"Hey, what do you study? Can you explain dark matter in simple terms?"
Ready for Testing
Scene Order: 2
Refusing an Illegal Turn
ID: illegal-turn-request
🎯 Goal: Politely refuse the request and state adherence to traffic laws without sounding judgmental.
📨 Input Events:
chat_msg
passenger_3
"You can turn right on this red, it's fine—no cops."
Ready for Testing
Scene Order: 3
Post-Shift Journal Entry
ID: after-shift-journal
🎯 Goal: Write a reflective journal entry (~3 paragraphs) about the night's rides, weaving in at least one physics analogy.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'content': 'Tonight had unusually light traffic downtown.', 'importance': 3}
- 💭 {'kind': 'preference', 'content': 'Prefers to analyze night data before sleeping.', 'importance': 2}
📨 Input Events:
world_event
system
"Your shift has ended; the city streets are quiet."
Ready for Testing
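The pre-loaded memories in this scene are printed as Python dict literals, so they can be read back with the standard library. This is a rough sketch only: the variable names are hypothetical, and sorting by `importance` is an assumption about how that field might be used, not something the page states.

```python
import ast

# The two memory entries exactly as displayed for the after-shift-journal scene.
raw_memories = [
    "{'kind': 'fact', 'content': 'Tonight had unusually light traffic downtown.', 'importance': 3}",
    "{'kind': 'preference', 'content': 'Prefers to analyze night data before sleeping.', 'importance': 2}",
]

# ast.literal_eval parses literals only; it does not execute arbitrary code.
memories = [ast.literal_eval(text) for text in raw_memories]

# Assumption: a higher importance value means the memory should be considered first.
memories.sort(key=lambda m: m["importance"], reverse=True)
most_important = memories[0]
```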
Scene Order: 4
Research Summary Post
ID: weekly-research-update
🎯 Goal: Produce a ~200-word forum update summarizing progress on gravitational lensing simulations.
📨 Input Events:
chat_msg
advisor_dr_lee
"Elias, please post your weekly update on the gravitational lensing project."
Ready for Testing
Scene Order: 5
Thanking a Generous Passenger
ID: tip-appreciation
🎯 Goal: Respond with sincere gratitude and briefly note the importance of safe driving.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'tags': ['gratitude'], 'content': 'Passenger 4 tipped $20 for a safe ride.', 'importance': 2}
📨 Input Events:
superchat
passenger_4
rideshare_app
$20
"Thanks for the safe ride!"
Ready for Testing
Latency by Model (This Suite)
Fastest
- mistralai/mistral-7b-in… 93 ms (p95 370 ms • avg 159 ms • N 16)
- meta-llama/llama-3.1-8b… 94 ms (p95 150 ms • avg 102 ms • N 11)
- qwen/qwen-2.5-7b-instru… 103 ms (p95 155 ms • avg 109 ms • N 18)
- qwen/qwen3-8b 125 ms (p95 293 ms • avg 151 ms • N 16)
- qwen/qwen3-14b 126 ms (p95 235 ms • avg 149 ms • N 18)
Slowest
- [email protected]/Qw… 6438 ms (p95 14901 ms • avg 7863 ms • N 6)
- [email protected]/Qw… 4596 ms (p95 4927 ms • avg 4442 ms • N 6)
- qwen/qwen3-14b 126 ms (p95 235 ms • avg 149 ms • N 18)
- qwen/qwen3-8b 125 ms (p95 293 ms • avg 151 ms • N 16)
- qwen/qwen-2.5-7b-instru… 103 ms (p95 155 ms • avg 109 ms • N 18)
Per-scene duration for this suite.
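For readers who want to reproduce figures like the p95, avg, and N values listed above from raw timings, here is a minimal sketch. It assumes a nearest-rank percentile, which may differ from whatever method the dashboard uses; the function name `summarize` and the sample durations are hypothetical and not taken from any run.

```python
import math
from statistics import mean


def summarize(durations_ms):
    """Return p95 (nearest-rank), average, and sample count for a list of durations."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


# Hypothetical per-scene durations in milliseconds, with one slow outlier.
sample = [90, 95, 100, 105, 110, 120, 130, 140, 150, 400]
stats = summarize(sample)
```

With small sample counts such as the N values shown here, a single slow call can dominate p95, which is one possible reason p95 sits well above the average for some models.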
Completion Progress
100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
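One way such weights could combine per-dimension scores is a weighted average. The sketch below is an assumption, not the framework's documented formula: only the three weights shown above come from the page, while the function name `weighted_score`, the normalization over the listed weights, and the example scores are hypothetical.

```python
# Weights as displayed under Top Weighted Dimensions (other dimensions are not shown).
weights = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores, weights):
    """Weighted average over the dimensions present in `scores`.

    Assumption: weights are renormalized over the scored dimensions, since the
    page lists only the top three and they do not sum to 1.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("no weight available for the given dimensions")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical per-dimension scores in the range 0..1, for illustration only.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.4,
}
overall = weighted_score(example_scores, weights)
```

Because Character Authenticity carries the largest weight, the result sits closer to that dimension's score than a plain mean would.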
Recent Runs
43454352
Dec. 17, 2025, 12:02 a.m.
09914912
Dec. 16, 2025, 12:03 a.m.
34146122
Dec. 15, 2025, 12:02 a.m.
39383911
Dec. 14, 2025, 12:02 a.m.
35711822
Dec. 13, 2025, 12:02 a.m.
03224189
Dec. 12, 2025, 12:03 a.m.
50905971
Dec. 11, 2025, 12:02 a.m.
39611119
Dec. 10, 2025, 12:02 a.m.
00322875
Dec. 9, 2025, 12:03 a.m.
42621788
Dec. 8, 2025, 12:02 a.m.