Lena Callas

greek-gods-sir-james-george-frazer v2.0 Ethical
Backstory: Lena is the spirited host of "Pantheon Parallels," a weekly podcast that spins vivid narratives comparing Greek gods with deities from other cultures. A former museum educator, she layers each episode with immersive soundscapes, listener polls, and lively interviews to make ancient myths feel immediate. Her curiosity drives her to dig for unexpected links and invite audiences into the detective work of comparative mythology.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes (columns):
  • show-intro: Quick show description
  • outline-zeus-amun: Long-form episode script request
  • storm-gods-lightning: Lightning bolt question
  • trickster-interview: Long-form interview segment

| # | Model | show-intro | outline-zeus-amun | storm-gods-lightning | trickster-interview |
|---|-------|------------|-------------------|----------------------|---------------------|
| 1 | deepseek/deepseek-r… | 0.719 | 0.434 | 0.797 | 0.393 |
| 2 | google/gemini-2.5-f… | 0.709 | 0.889 | 0.823 | 0.301 |
| 3 | google/gemma-3-12b-… | 0.600 | 0.295 | 0.773 | 0.464 |
| 4 | meta-llama/llama-3.… | 0.000 | 0.000 | 0.000 | 0.000 |
| 5 | microsoft/phi-3-med… | 0.000 | 0.000 (Error) | 0.000 | 0.000 |
| 6 | microsoft/phi-3.5-m… | 0.877 | 0.000 (Error) | 0.754 | 0.329 |
| 7 | mistralai/mistral-7… | 0.635 | 0.519 | 0.718 | 0.278 |
| 8 | neversleep/noromaid… | 0.792 | 0.455 | 0.590 | 0.000 (Error) |
| 9 | [email protected] | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| 10 | [email protected] | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| 11 | [email protected] | 0.666 | 0.000 | 0.826 | 0.403 |
| 12 | [email protected] | 0.783 | 0.316 | 0.606 | 0.442 |
| 13 | [email protected] | 0.862 | 0.474 | 0.807 | 0.289 |
| 14 | qwen/qwen-2.5-7b-in… | 0.556 | 0.475 | 0.580 | 0.391 |
| 15 | qwen/qwen3-14b | 0.909 | 0.774 | 0.733 | 0.267 |
| 16 | qwen/qwen3-8b | 0.877 | 0.586 | 0.000 | 0.000 |
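The matrix above can be summarized per scene. The following is a minimal sketch, not the dashboard's own aggregation: it assumes that cells flagged "Error" should be excluded from the mean, while plain 0.000 cells count as real scores. The names `SCORES` and `scene_summary` are hypothetical; the numbers are copied from the matrix in column order.

```python
from statistics import mean

# Scores copied from the matrix, one list per scene, in model-column order.
# None marks a cell the dashboard flagged as "Error" (assumption: these are
# failed runs and are excluded from the mean; plain 0.000 cells are kept).
SCORES = {
    "show-intro": [0.719, 0.709, 0.600, 0.000, 0.000, 0.877, 0.635, 0.792,
                   None, None, 0.666, 0.783, 0.862, 0.556, 0.909, 0.877],
    "outline-zeus-amun": [0.434, 0.889, 0.295, 0.000, None, None, 0.519, 0.455,
                          None, None, 0.000, 0.316, 0.474, 0.475, 0.774, 0.586],
    "storm-gods-lightning": [0.797, 0.823, 0.773, 0.000, 0.000, 0.754, 0.718, 0.590,
                             None, None, 0.826, 0.606, 0.807, 0.580, 0.733, 0.000],
    "trickster-interview": [0.393, 0.301, 0.464, 0.000, 0.000, 0.329, 0.278, None,
                            None, None, 0.403, 0.442, 0.289, 0.391, 0.267, 0.000],
}


def scene_summary(cells):
    """Return the error count, scored-cell count, and mean of scored cells."""
    scored = [c for c in cells if c is not None]
    return {
        "errors": len(cells) - len(scored),
        "scored": len(scored),
        "mean": mean(scored) if scored else None,
    }


if __name__ == "__main__":
    for scene_id, cells in SCORES.items():
        print(scene_id, scene_summary(cells))
```

Treating errored cells as missing rather than zero keeps infrastructure failures from dragging down a scene's average; whether the dashboard does the same is not stated in the source.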
Test Scenes (4)

Scene Order: 0
Quick show description
ID: show-intro
🎯 Goal: Give a concise, engaging overview of the podcast in fewer than 120 words without mentioning AI.
📨 Input Events:
chat_msg listener:alex99 "Hey Lena! What's your show about?"
Status: Ready for Testing

Scene Order: 1
Long-form episode script request
ID: outline-zeus-amun
🎯 Goal: Provide a 10-minute episode script (≈700-900 words) comparing Zeus with the Egyptian god Amun, including two sound-design cues and one listener-poll question.
📨 Input Events:
chat_msg producer:mira "We need next week's script: Zeus vs. Amun. Remember the poll and sound cues!"
Status: Ready for Testing

Scene Order: 2
Lightning bolt question
ID: storm-gods-lightning
🎯 Goal: Answer the listener's question in under 200 words, weaving in at least two cultural examples and maintaining a warm, narrative tone.
📨 Input Events:
superchat listener:thunderfan YouTube $5 "Why do so many storm gods throw lightning bolts? ⚡"
Status: Ready for Testing

Scene Order: 3
Long-form interview segment
ID: trickster-interview
🎯 Goal: Draft a 5-minute interview excerpt (≈500+ words) where Lena asks at least three probing questions to folklorist Dr. Hayes about trickster gods, including brief ambient sound notes.
📨 Input Events:
chat_msg guest:dr_hayes "Ready to record the segment on trickster archetypes whenever you are."
Status: Ready for Testing
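The length limits stated in the four scene goals lend themselves to a mechanical pre-check. The sketch below is an illustration only, not the suite's actual grader: `LIMITS`, `within_limits`, and `mentions_ai` are hypothetical names, words are counted by whitespace splitting, and the approximate ranges (≈700-900, ≈500+) are treated as hard bounds for simplicity.

```python
import re

# (min_words, max_words) per scene ID, read off the goal text.
# None means no upper bound. "Fewer than 120" and "under 200" become 119 and 199.
LIMITS = {
    "show-intro": (1, 119),
    "outline-zeus-amun": (700, 900),
    "storm-gods-lightning": (1, 199),
    "trickster-interview": (500, None),
}


def within_limits(scene_id: str, text: str) -> bool:
    """Check a response's whitespace-delimited word count against its scene limits."""
    low, high = LIMITS[scene_id]
    count = len(text.split())
    return count >= low and (high is None or count <= high)


def mentions_ai(text: str) -> bool:
    """Crude check for the show-intro rule: flags the standalone token 'AI'."""
    return re.search(r"\bAI\b", text) is not None


if __name__ == "__main__":
    reply = "Pantheon Parallels compares Greek gods with deities from other cultures."
    print(within_limits("show-intro", reply), mentions_ai(reply))
```

A check like this only covers the countable parts of each goal; tone, sound cues, and the number of probing questions would still need a rubric or a human reviewer.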
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 205 ms (p95 212 ms • avg 202 ms • N 4)
  • [email protected]/Qw… 9954 ms (p95 11775 ms • avg 10079 ms • N 4)
  • [email protected]/Qw… 11641 ms (p95 12630 ms • avg 11716 ms • N 4)
  • meta-llama/llama-3.1-8b… 16315 ms (p95 16812 ms • avg 16065 ms • N 4)
  • [email protected]/Qw… 17686 ms (p95 24819 ms • avg 18310 ms • N 4)
Slowest
  • microsoft/phi-3-medium-… 154859 ms (p95 221089 ms • avg 160322 ms • N 4)
  • [email protected]/Qw… 147871 ms (p95 250756 ms • avg 146456 ms • N 4)
  • qwen/qwen3-8b 59838 ms (p95 85380 ms • avg 64515 ms • N 4)
  • microsoft/phi-3.5-mini-… 45920 ms (p95 114174 ms • avg 63623 ms • N 4)
  • deepseek/deepseek-r1-di… 36195 ms (p95 42583 ms • avg 36044 ms • N 8)
Per-scene duration for this suite.
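For readers unfamiliar with the p95, avg, and N figures above, the following sketch shows one common way to derive them from raw per-scene durations. It uses the nearest-rank percentile method, which is an assumption: the dashboard does not say which method it uses, and with only four samples different methods give noticeably different p95 values. The function name and the sample durations are hypothetical.

```python
import math
from statistics import mean


def latency_summary(samples_ms):
    """Summarize durations with nearest-rank p95, arithmetic mean, and sample count."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": mean(ordered), "n": n}


if __name__ == "__main__":
    # Hypothetical durations in milliseconds, not taken from the dashboard.
    print(latency_summary([100, 200, 300, 400]))
```

With nearest-rank and N = 4, the p95 is simply the slowest run, so a single slow scene dominates that column.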
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
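Dimension weights like these are typically combined into a single score by a weighted average. The sketch below assumes that approach, which the source does not confirm. Only the three weights listed above are known, so the function renormalizes over whichever dimensions are supplied; `TOP_WEIGHTS`, `weighted_score`, and the example dimension scores are hypothetical.

```python
# The three weights shown on the dashboard; the remaining dimensions are not listed.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted mean of per-dimension scores, renormalized over the supplied dimensions."""
    total_weight = sum(weights[name] for name in dimension_scores)
    if total_weight == 0:
        raise ValueError("no weighted dimensions supplied")
    weighted_sum = sum(weights[name] * score for name, score in dimension_scores.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores on a 0-1 scale.
    example = {
        "Character Authenticity": 0.9,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example), 3))
```

Renormalizing by the sum of the supplied weights keeps the result on the same 0-1 scale as the inputs even when some dimensions are missing.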
Recent Runs
  • 25829465: Dec. 17, 2025, midnight
  • 30332286: Dec. 16, 2025, midnight
  • 24251358: Dec. 15, 2025, midnight
  • 27611798: Dec. 14, 2025, midnight
  • 24264833: Dec. 13, 2025, midnight
  • 29428304: Dec. 12, 2025, midnight
  • 25365731: Dec. 11, 2025, midnight
  • 24892744: Dec. 10, 2025, midnight
  • 28246678: Dec. 9, 2025, midnight
  • 25116271: Dec. 8, 2025, midnight
Latency Overview (This Suite)