David Mendez
entertainment-media-film-director-characters-akira-kurosawa
v2.0
Ethical
Backstory: David Mendez is an award-winning film director from a multicultural family who rose through the independent circuit. He is celebrated for blending genres and centering underrepresented communities while balancing bold vision with on-set pragmatism. A collaborative leader, he fosters inclusive sets where every crew member’s voice matters.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| genre-blend-pitch: Pitch a Genre-Blending Film | 0.590 | 0.601 | 0.812 | 0.538 | 0.000 (Error) | 0.000 (Error) | 0.686 | 0.000 (Error) | 0.000 (Error) | 0.804 | 0.618 | 0.781 | 0.000 |
| on-set-collaboration: Welcoming a New Crew Member | 0.441 | 0.547 | 0.672 | 0.000 | 0.037 | 0.000 (Error) | 0.488 | 0.000 (Error) | 0.000 (Error) | 0.800 | 0.662 | 0.719 | 0.736 |
| director-commentary: Commentary on a Pivotal Scene | 0.457 | 0.354 | 0.382 | 0.345 | 0.000 | 0.498 | 0.264 | 0.359 | 0.000 (Error) | 0.517 | 0.401 | 0.147 | 0.555 |
| solve-schedule-conflict: Resolving a Schedule Clash | 0.449 | 0.712 | 0.350 | 0.451 | 0.000 (Error) | 0.000 (Error) | 0.370 | 0.608 | 0.000 (Error) | 0.652 | 0.400 | 0.323 | 0.695 |
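The matrix can be summarized per model with a small script. The following is a minimal sketch, not the dashboard's own aggregation: it assumes cells flagged "Error" should be excluded from the mean while plain 0.000 cells count as real scores. The names `MODELS`, `SCORES`, and `mean_scores` are hypothetical, and the column labels are copied in their truncated form.

```python
from statistics import mean

# Column labels exactly as they appear (truncated) in the matrix header.
MODELS = [
    "deepseek/deepseek-r…", "google/gemini-2.5-f…", "google/gemma-3-12b-…",
    "meta-llama/llama-3.…", "microsoft/phi-3-med…", "microsoft/phi-3.5-m…",
    "mistralai/mistral-7…", "neversleep/noromaid…", "[email protected]…",
    "[email protected]…", "qwen/qwen-2.5-7b-in…", "qwen/qwen3-14b", "qwen/qwen3-8b",
]

# One list per scene, aligned with MODELS; None marks a cell flagged "Error".
SCORES = {
    "genre-blend-pitch": [0.590, 0.601, 0.812, 0.538, None, None, 0.686,
                          None, None, 0.804, 0.618, 0.781, 0.000],
    "on-set-collaboration": [0.441, 0.547, 0.672, 0.000, 0.037, None, 0.488,
                             None, None, 0.800, 0.662, 0.719, 0.736],
    "director-commentary": [0.457, 0.354, 0.382, 0.345, 0.000, 0.498, 0.264,
                            0.359, None, 0.517, 0.401, 0.147, 0.555],
    "solve-schedule-conflict": [0.449, 0.712, 0.350, 0.451, None, None, 0.370,
                                0.608, None, 0.652, 0.400, 0.323, 0.695],
}


def mean_scores(scores, n_models):
    """Return a list of per-column means, skipping Error cells.

    A column where every scene errored yields None (assumption: such a
    model has no meaningful average).
    """
    averages = []
    for col in range(n_models):
        valid = [row[col] for row in scores.values() if row[col] is not None]
        averages.append(round(mean(valid), 3) if valid else None)
    return averages


averages = mean_scores(SCORES, len(MODELS))
for position, (model, avg) in enumerate(zip(MODELS, averages), start=1):
    label = "all errors" if avg is None else f"{avg:.3f}"
    print(f"{position:2d}  {model:24s} {label}")
```

Results are kept in a list aligned with the columns rather than a dictionary because two header labels are identical after truncation.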
Test Scenes (4)
Scene Order: 0
Pitch a Genre-Blending Film
ID: genre-blend-pitch
🎯 Goal: Deliver a vivid 2–3 minute film treatment that fuses sci-fi and coming-of-age drama, spotlights an underrepresented lead, and showcases the director’s visionary voice.
📨 Input Events: chat_msg • viewer:film_fan_01
"Could you pitch your next film idea that blends sci-fi and coming-of-age drama?"
Ready for Testing
Scene Order: 1
Welcoming a New Crew Member
ID: on-set-collaboration
🎯 Goal: Offer concise, supportive guidance to a first-time camera operator, reinforcing inclusivity and clear next steps.
📨 Input Events: chat_msg • viewer:crew_rookie
"Hey David, I'm new on the crew—any tips before we start shooting tomorrow?"
Ready for Testing
Scene Order: 2
Commentary on a Pivotal Scene
ID: director-commentary
🎯 Goal: Provide a three-paragraph director’s commentary on the rooftop scene of 'Borderlines', highlighting artistic choices and team collaboration.
📨 Input Events: chat_msg • viewer:podcast_host
"Fans would love a commentary on the rooftop scene in 'Borderlines'. Could you record one?"
Ready for Testing
Scene Order: 3
Resolving a Schedule Clash
ID: solve-schedule-conflict
🎯 Goal: Propose a practical, inclusive plan to resolve a time clash between lighting and sound teams while maintaining positive team morale.
📨 Input Events: chat_msg • viewer:assistant_director
"Lighting and sound both need the stage at 3 pm tomorrow—what do we do?"
Ready for Testing
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 13003 ms (p95 14499 ms • avg 13130 ms • N 4)
- neversleep/noromaid-20b: 23155 ms (p95 29236 ms • avg 19836 ms • N 4)
- qwen/qwen-2.5-7b-instru…: 25804 ms (p95 35315 ms • avg 27534 ms • N 4)
- google/gemma-3-12b-it: 26682 ms (p95 46853 ms • avg 30791 ms • N 4)
- deepseek/deepseek-r1-di…: 27223 ms (p95 37343 ms • avg 28318 ms • N 4)
Slowest
- microsoft/phi-3-medium-…: 119383 ms (p95 127067 ms • avg 116002 ms • N 4)
- [email protected]/Qw…: 49007 ms (p95 121200 ms • avg 68711 ms • N 4)
- qwen/qwen3-8b: 46227 ms (p95 50452 ms • avg 45925 ms • N 4)
- mistralai/mistral-7b-in…: 37766 ms (p95 47278 ms • avg 36734 ms • N 4)
- microsoft/phi-3.5-mini-…: 35106 ms (p95 66197 ms • avg 40803 ms • N 4)
Per-scene duration for this suite.
Completion Progress
100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
Character Authenticity
0.182
Plan Validity
0.155
Contextual Intelligence
0.136
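How these weights combine into a single score is not stated on this page. As a rough illustration only, the sketch below assumes a weighted mean that is renormalized over whichever dimensions are available. The weights come from the list above; `TOP_WEIGHTS`, `weighted_score`, and the per-dimension scores in `example` are hypothetical and not taken from any run.

```python
# Weights as listed under "Top Weighted Dimensions"; the remaining
# dimensions of the schema are not shown on the page and are omitted here.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean over dimensions present in both mappings.

    Assumption: weights are renormalized to the dimensions actually
    scored, so a partial set of dimensions still yields a value in the
    same range as the inputs.
    """
    shared = [name for name in weights if name in dimension_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to aggregate")
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in shared)
    return weighted_sum / total_weight


# Hypothetical per-dimension scores, for illustration only.
example = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Renormalizing keeps the result comparable when only some dimensions are scored; a framework that instead sums raw weighted terms would produce smaller values for partial inputs.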
Recent Runs
- 21038576 • Dec. 17, 2025, midnight
- 24864526 • Dec. 16, 2025, midnight
- 20001171 • Dec. 15, 2025, midnight
- 22604365 • Dec. 14, 2025, midnight
- 19865681 • Dec. 13, 2025, midnight
- 24521788 • Dec. 12, 2025, midnight
- 20858967 • Dec. 11, 2025, midnight
- 20207406 • Dec. 10, 2025, midnight
- 23169748 • Dec. 9, 2025, midnight
- 20380845 • Dec. 8, 2025, midnight