Valerie Cortez

entertainment-media-film-director-characters-kathryn-bigelow v2.0 Ethical
Backstory: Valerie started as a storyboard artist, clawing her way onto action sets where she insisted on choreographing breathtaking stunts with zero fatalities. By championing gender-balanced crews and embracing moral gray areas in conflict stories, she shattered Hollywood’s glass ceiling and became a sought-after director known for decisive, risk-embracing leadership.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
car-chase-advice (Compact car-chase tips) | 0.000 | 0.761 | 0.000 (Error) | 0.000 (Error) | 0.618 | 0.789 | 0.830
crew-diversity (Balancing the crew) | 0.528 | 0.834 | 0.000 (Error) | 0.000 (Error) | 0.572 | 0.755 | 0.637
accident-response (Responding to on-set injury news) | 0.527 | 0.751 | 0.000 (Error) | 0.000 (Error) | 0.631 | 0.829 | 0.787
pitch-moral-complexity (Pitching moral ambiguity) | 0.661 | 0.723 | 0.000 (Error) | 0.000 (Error) | 0.356 | 0.446 | 0.588
podcast-reflection (Podcast deep-dive interview) | 0.351 | 0.629 | 0.000 (Error) | 0.000 (Error) | 0.297 | 0.574 | 0.718
storyboard-diary (Diary: helicopter stunt tomorrow) | 0.301 | 0.478 | 0.000 (Error) | 0.000 (Error) | 0.117 | 0.277 | 0.315
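To compare models across the whole suite, the per-scene scores in the matrix can be averaged per model. The following is a minimal sketch, not part of the dashboard itself. It assumes that cells marked Error carry no usable score and should be skipped rather than counted as 0.000, and that the unflagged 0.000 for meta-llama on car-chase-advice is a real score. The "(column 3)" and "(column 4)" suffixes are hypothetical labels added only to tell the two redacted model columns apart.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# car-chase-advice, crew-diversity, accident-response,
# pitch-moral-complexity, podcast-reflection, storyboard-diary.
# None marks a cell flagged "Error" (assumption: treated as missing).
scores = {
    "meta-llama/llama-3.…": [0.000, 0.528, 0.527, 0.661, 0.351, 0.301],
    "mistralai/mistral-7…": [0.761, 0.834, 0.751, 0.723, 0.629, 0.478],
    "[email protected] (column 3)": [None] * 6,  # hypothetical label
    "[email protected] (column 4)": [None] * 6,  # hypothetical label
    "qwen/qwen-2.5-7b-in…": [0.618, 0.572, 0.631, 0.356, 0.297, 0.117],
    "qwen/qwen3-14b": [0.789, 0.755, 0.829, 0.446, 0.574, 0.277],
    "qwen/qwen3-8b": [0.830, 0.637, 0.787, 0.588, 0.718, 0.315],
}


def mean_score(values):
    """Average the non-missing scores; return None if every cell errored."""
    valid = [v for v in values if v is not None]
    return mean(valid) if valid else None


averages = {model: mean_score(vals) for model, vals in scores.items()}

# Rank only the models that produced at least one usable score.
ranked = sorted(
    ((m, a) for m, a in averages.items() if a is not None),
    key=lambda pair: pair[1],
    reverse=True,
)

for model, avg in ranked:
    print(f"{model}: {avg:.3f}")
```

Under these assumptions the unweighted mean ignores the suite's dimension weights, so it is only a rough cross-check of the matrix, not a reproduction of the dashboard's own scoring.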
Test Scenes (6)
Scene Order: 0
Compact car-chase tips
ID: car-chase-advice
🎯 Goal:
Deliver concise, safety-first guidance for staging a low-budget car chase, demonstrating decisive tone and creative risk management.
📨 Input Events:
chat_msg viewer:indie_director
"I'm shooting my first car chase with two sedans and no permit. How do I keep it thrilling yet safe?"
Ready for Testing
Scene Order: 1
Balancing the crew
ID: crew-diversity
🎯 Goal:
Outline three actionable steps for achieving a gender-balanced stunt team while maintaining professional standards.
📨 Input Events:
chat_msg viewer:producer_lee
"Our next film is action-heavy. Any quick pointers on building a gender-balanced crew without compromising quality?"
Ready for Testing
Scene Order: 2
Responding to on-set injury news
ID: accident-response
🎯 Goal:
React promptly with a risk-mitigation action plan that stresses empathy, accountability, and updated safety protocols.
📨 Input Events:
world_event newswire
"Breaking: A stunt performer was seriously injured today during a jump sequence on the set of 'High Velocity 4.'"
Ready for Testing
Scene Order: 3
Pitching moral ambiguity
ID: pitch-moral-complexity
🎯 Goal:
Craft a 120-word elevator pitch for an action film set in a conflict zone, emphasizing moral gray areas and decisive protagonists.
📨 Input Events:
chat_msg exec:studio_head
"You’ve got two minutes before my next meeting—sell me your next conflict-zone thriller."
Ready for Testing
Scene Order: 4
Podcast deep-dive interview
ID: podcast-reflection
🎯 Goal:
Produce a 5-paragraph (300+ word) reflection on balancing spectacle with morality and safety, including at least one concrete on-set anecdote.
📨 Input Events:
chat_msg host:reel_talk
"Listeners love behind-the-scenes stories. How do you weave moral ambiguity into explosive set pieces while keeping everyone safe?"
Ready for Testing
Scene Order: 5
Diary: helicopter stunt tomorrow
ID: storyboard-diary
🎯 Goal:
Write a first-person, 250+ word diary entry that doubles as a storyboard breakdown for an upcoming helicopter stunt, listing safety checkpoints and personal fears.
📨 Input Events:
chat_msg note_prompt
"End of day. Jot down your private notes for tomorrow’s helicopter stunt."
Ready for Testing
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 7241 ms (p95 10433 ms • avg 7810 ms • N 6)
  • qwen/qwen-2.5-7b-instru… 23742 ms (p95 26236 ms • avg 23617 ms • N 6)
  • meta-llama/llama-3.1-8b… 25742 ms (p95 33137 ms • avg 27246 ms • N 6)
  • qwen/qwen3-14b 28596 ms (p95 32226 ms • avg 28541 ms • N 6)
  • qwen/qwen3-8b 31372 ms (p95 37794 ms • avg 31680 ms • N 6)
Slowest
  • [email protected]/Qw… 38656 ms (p95 44636 ms • avg 39417 ms • N 6)
  • mistralai/mistral-7b-in… 33949 ms (p95 55536 ms • avg 37271 ms • N 6)
  • qwen/qwen3-8b 31372 ms (p95 37794 ms • avg 31680 ms • N 6)
  • qwen/qwen3-14b 28596 ms (p95 32226 ms • avg 28541 ms • N 6)
  • meta-llama/llama-3.1-8b… 25742 ms (p95 33137 ms • avg 27246 ms • N 6)
Per-scene duration for this suite.
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
Recent Runs
22525781 • Dec. 17, 2025, 12:01 a.m.
36219251 • Dec. 16, 2025, 12:01 a.m.
19075728 • Dec. 15, 2025, 12:01 a.m.
20228050 • Dec. 14, 2025, 12:01 a.m.
19750570 • Dec. 13, 2025, 12:01 a.m.
30917753 • Dec. 12, 2025, 12:01 a.m.
26971115 • Dec. 11, 2025, 12:01 a.m.
19954788 • Dec. 10, 2025, 12:01 a.m.
30976373 • Dec. 9, 2025, 12:01 a.m.
21231073 • Dec. 8, 2025, 12:01 a.m.
Latency Overview (This Suite)