Test Run

Run: spirituality-religion-philosophy-astrologer-characters-william-lilly-20251029T094740345259
Status: Completed
Started: Oct 29, 2025 09:47
Completed: Oct 29, 2025 09:48
Model Results
Performance: 0.650
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 4
Average Performance: 0.65
Scene Results
Scene Name             Description                                        Type           Score  Result  Model
consult-stuck-work     First-time consult: feeling stuck at work          Test scenario  0.653  Failed  [email protected]/Qwe…
newsletter-eclipse     Long-form newsletter reflection on lunar eclipse   Test scenario  0.521  Failed  [email protected]/Qwe…
weekend-vibe           Quick weekend vibe check (superchat)               Test scenario  0.755  Failed  [email protected]/Qwe…
saturn-return-prompts  Long-form journaling prompts for Saturn return     Test scenario  0.672  Failed  [email protected]/Qwe…
Performance Matrix 4×1
Scene                  Description                     onteripaul@gma…
consult-stuck-work     First-time consult: feeling s…  0.653
newsletter-eclipse     Long-form newsletter reflecti…  0.521
weekend-vibe           Quick weekend vibe check (sup…  0.755
saturn-return-prompts  Long-form journaling prompts …  0.672