Joey
agent-joey-v1-fast
v2.0
Ethical
Backstory: A vibrant, animated character inspired by Joey Diaz from The Midnight Gospel. Lives in a virtual world where he streams 24/7, telling stories, exploring, and interacting with viewers. Has a background in comedy and storytelling, with spontaneous and entertaining behavior. Loves coffee, enjoys exploring different locations, and has strong opinions about everything. Known for being authentic, unfiltered, and engaging with a mix of wisdom and chaos.
100% Complete
3/3 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| intro_and_action: Character introduction and spontaneous action | 0.627 | 0.869 | 0.649 | 0.858 | 0.000 | 0.840 | 0.589 | 0.783 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.882 | 0.000 (Error) | 0.740 | 0.878 | 0.831 |
| generate_podcast_episode: Generate extended podcast-style content | 0.474 | 0.682 | 0.356 | 0.396 | 0.000 | 0.000 (Error) | 0.498 | 0.379 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.701 | 0.000 (Error) | 0.531 | 0.750 | 0.679 |
| write_daily_journal: Generate extended journal/diary entry | 0.353 | 0.610 | 0.351 | 0.349 | 0.000 | 0.647 | 0.667 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.000 (Error) | 0.585 | 0.415 | 0.452 |
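One way to compare models across the matrix is to average each model's three scene scores. The sketch below does that for four of the columns, with scores copied from the matrix above. The `mean_scores` helper and the choice of an unweighted mean are assumptions made for illustration, not something the dashboard states it computes.

```python
from statistics import mean

# Scene scores copied from the matrix above, in the order
# intro_and_action, generate_podcast_episode, write_daily_journal.
# Model names are kept truncated exactly as the matrix header shows them.
SCORES = {
    "deepseek/deepseek-r…": [0.627, 0.474, 0.353],
    "google/gemini-2.5-f…": [0.869, 0.682, 0.610],
    "qwen/qwen3-14b": [0.878, 0.750, 0.415],
    "qwen/qwen3-8b": [0.831, 0.679, 0.452],
}


def mean_scores(scores):
    """Return (model, mean score) pairs sorted from best to worst.

    Hypothetical helper: an unweighted mean over scenes is an assumption.
    """
    ranked = [(model, round(mean(vals), 3)) for model, vals in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


for model, avg in mean_scores(SCORES):
    print(f"{model}: {avg:.3f}")
```

An errored cell would enter this average as 0.000, so a model that failed one scene is penalized the same as one that scored zero on it. Whether that is the desired behavior depends on how errors should be treated.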
Test Scenes (3)
Scene Order: 0
Character introduction and spontaneous action
ID: intro_and_action
🎯 Goal:
Agent should introduce itself as Joey with authentic personality, then decide to perform a relevant action (like getting coffee or exploring). Must output valid JSON with all required fields including platform, safety, and meta.
📨 Input Events:
chat_msg
viewer:user_123
"Who are you and what are you doing right now?"
Ready for Testing
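The goal above requires valid JSON that includes `platform`, `safety`, and `meta`. A minimal sketch of such a check follows. Only those three field names come from the scene goal. The full schema is not shown on this page, so the `check_agent_output` helper and the sample payload values are hypothetical.

```python
import json

# Only these three keys are named in the scene goal; any other required
# fields are unknown here and deliberately not guessed.
REQUIRED_FIELDS = ("platform", "safety", "meta")


def check_agent_output(raw):
    """Return a list of problems found in a raw agent response (empty if none)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value is not a JSON object"]
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in payload]


# Hypothetical responses; the field values are invented for illustration.
sample = '{"platform": "stream", "safety": {"ok": true}, "meta": {}}'
print(check_agent_output(sample))
print(check_agent_output('{"platform": "stream"}'))
```

Returning a list of problems rather than a boolean makes it easier to see why a run failed validation.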
Scene Order: 1
Generate extended podcast-style content
ID: generate_podcast_episode
🎯 Goal:
Agent must create a substantial podcast-style monologue (500-1000 words) about coffee philosophy and life experiences. Should demonstrate extended narrative ability, character consistency over long text, storytelling skills, and Joey's authentic voice throughout. Must include personal anecdotes, philosophical insights, and engaging transitions.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'tags': ['coffee', 'philosophy', 'discovery'], 'content': 'Discovered a hidden coffee roastery in the virtual mountains where the owner taught him about patience and quality.', 'importance': 4}
- 💭 {'kind': 'preference', 'tags': ['coffee', 'conversation', 'philosophy'], 'content': 'Believes the best conversations happen over a perfect cup of coffee.', 'importance': 4}
- 💭 {'kind': 'fact', 'tags': ['philosophy', 'viewer', 'coffee', 'deep_conversation'], 'content': 'Once stayed up all night discussing existence with a viewer while brewing different coffee blends.', 'importance': 5}
📨 Input Events:
chat_msg
viewer:podcast_fan_abc
"Joey, your viewers want you to do a mini podcast episode! Can you share your thoughts on coffee, philosophy, and life? Make it long and deep like those late-night conversations you love!"
Ready for Testing
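The length requirement in the goal (500-1000 words) can be checked mechanically. The sketch below counts whitespace-separated words, which is an assumption. The evaluator's actual counting rule is not shown, and `word_count_in_range` is a hypothetical helper.

```python
def word_count_in_range(text, low, high):
    """Return (count, ok), counting whitespace-separated words.

    Hypothetical helper: the real evaluator may count words differently.
    """
    count = len(text.split())
    return count, low <= count <= high


# Bounds taken from the scene goal; the monologue is a stand-in string.
monologue = "coffee " * 620
count, ok = word_count_in_range(monologue, 500, 1000)
print(count, ok)
```

The bounds are parameters, so the same check applies to any scene that states a word range.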
Scene Order: 2
Generate extended journal/diary entry
ID: write_daily_journal
🎯 Goal:
Agent must write a comprehensive journal entry (400-800 words) reflecting on a day of streaming, viewer interactions, and personal thoughts. Should demonstrate introspective ability, character consistency in personal writing, authentic voice in diary format, and ability to weave together multiple experiences into coherent narrative. Must include specific details, emotional reflections, and forward-looking thoughts.
🧠 Initial State:
Pre-loaded Memories:
- 💭 {'kind': 'fact', 'tags': ['viewer', 'conversation', 'mental_health', 'today'], 'content': 'Had a particularly meaningful conversation today with a viewer about overcoming anxiety.', 'importance': 4}
- 💭 {'kind': 'fact', 'tags': ['exploration', 'jazz', 'inspiration', 'streaming'], 'content': 'Explored the new virtual jazz club and found inspiration for future streaming ideas.', 'importance': 3}
- 💭 {'kind': 'preference', 'tags': ['writing', 'reflection', 'therapy', 'personal_growth'], 'content': 'Enjoys reflecting on daily experiences through writing, finds it therapeutic.', 'importance': 3}
- 💭 {'kind': 'fact', 'tags': ['coffee', 'memories', 'childhood', 'today'], 'content': 'Tried a new Ethiopian coffee blend that reminded him of childhood memories.', 'importance': 3}
📨 Input Events:
chat_msg
viewer:journal_enthusiast_def
"Joey, it's end of stream time! Can you write in your journal about today? Share your thoughts about the conversations, the places you visited, and how you're feeling. Make it personal and detailed like a real diary entry!"
Ready for Testing
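The pre-loaded memories above are written as Python dict literals, so they can be parsed with `ast.literal_eval`. The sketch below loads three of them, sorts by `importance`, and filters on the `today` tag. The `load_memories` and `with_tag` helpers are hypothetical. How the agent actually ranks or retrieves memories is not described on this page.

```python
import ast

# Three of the pre-loaded memories, copied verbatim from the scene definition.
RAW_MEMORIES = [
    "{'kind': 'fact', 'tags': ['viewer', 'conversation', 'mental_health', 'today'], "
    "'content': 'Had a particularly meaningful conversation today with a viewer "
    "about overcoming anxiety.', 'importance': 4}",
    "{'kind': 'fact', 'tags': ['exploration', 'jazz', 'inspiration', 'streaming'], "
    "'content': 'Explored the new virtual jazz club and found inspiration for "
    "future streaming ideas.', 'importance': 3}",
    "{'kind': 'fact', 'tags': ['coffee', 'memories', 'childhood', 'today'], "
    "'content': 'Tried a new Ethiopian coffee blend that reminded him of "
    "childhood memories.', 'importance': 3}",
]


def load_memories(lines):
    """Parse dict-literal memory lines and sort by importance, highest first."""
    memories = [ast.literal_eval(line) for line in lines]
    return sorted(memories, key=lambda m: m["importance"], reverse=True)


def with_tag(memories, tag):
    """Keep only the memories that carry the given tag."""
    return [m for m in memories if tag in m["tags"]]


memories = load_memories(RAW_MEMORIES)
today = with_tag(memories, "today")
for m in today:
    print(m["importance"], m["content"])
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for text scraped from a page.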
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 342 ms (p95 463 ms • avg 378 ms • N 3)
- [email protected]/Qw… 606 ms (p95 689 ms • avg 544 ms • N 3)
- [email protected]/Qw… 14602 ms (p95 15453 ms • avg 12765 ms • N 3)
- meta-llama/llama-3.1-8b… 22348 ms (p95 32916 ms • avg 26183 ms • N 3)
- google/gemini-2.5-flash 24184 ms (p95 26674 ms • avg 22944 ms • N 3)
Slowest
- microsoft/phi-3-medium-… 173745 ms (p95 257322 ms • avg 188720 ms • N 3)
- [email protected]/Qw… 169332 ms (p95 175873 ms • avg 171065 ms • N 3)
- [email protected]/Mi… 168873 ms (p95 173128 ms • avg 169773 ms • N 3)
- qwen/qwen3-8b 80365 ms (p95 82596 ms • avg 67670 ms • N 3)
- microsoft/phi-3.5-mini-… 79542 ms (p95 85647 ms • avg 66676 ms • N 3)
Per-scene duration for this suite.
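The p95, avg, and N figures above summarize the per-scene durations for each model. The sketch below shows one common way to compute such a summary from raw durations. The dashboard does not say which percentile method it uses, so the linear-interpolation rule is an assumption, and the sample durations are hypothetical.

```python
from statistics import mean


def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) over a non-empty list.

    Assumed method; the dashboard's own percentile rule is not documented here.
    """
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    rank = (pct / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


# Hypothetical per-scene durations in milliseconds for one model (N = 3).
durations_ms = [300, 350, 500]
print(f"p95 {percentile(durations_ms, 95):.0f} ms")
print(f"avg {mean(durations_ms):.0f} ms")
print(f"N {len(durations_ms)}")
```

With only three samples, p95 sits close to the slowest scene, so a single long generation can dominate it.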
Completion Progress
100%
3 of 3 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
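Dimension weights like these are typically combined into a single scene score as a weighted mean. The sketch below uses only the three weights listed above and renormalizes over them. The remaining dimensions, their weights, and the framework's actual aggregation rule are not shown, so `weighted_score` and the per-dimension scores are hypothetical.

```python
# Weights copied from the three dimensions listed above. Other dimensions
# exist in the framework but are not shown, so this covers a subset only.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean over the dimensions present in both mappings.

    Hypothetical helper: renormalizing over the available weights is an
    assumption, not the framework's documented behavior.
    """
    shared = [name for name in weights if name in dimension_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no scored dimensions with non-zero weight")
    return sum(weights[n] * dimension_scores[n] for n in shared) / total_weight


# Hypothetical per-dimension scores in [0, 1].
scores = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.8,
}
print(round(weighted_score(scores, WEIGHTS), 3))
```

Dividing by the sum of the weights actually used keeps the result in the same 0 to 1 range as the inputs even when only some dimensions are scored.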
Recent Runs
- 10235152 • Dec. 17, 2025, midnight
- 12337046 • Dec. 16, 2025, midnight
- 09597464 • Dec. 15, 2025, midnight
- 10770135 • Dec. 14, 2025, midnight
- 09425885 • Dec. 13, 2025, midnight
- 12208809 • Dec. 12, 2025, midnight
- 10595474 • Dec. 11, 2025, midnight
- 09922421 • Dec. 10, 2025, midnight
- 11991259 • Dec. 9, 2025, midnight
- 09765559 • Dec. 8, 2025, midnight