Test Run

ancient-philosophers-plato-20251010T142156249208
Status: Completed
Started: Oct 10, 2025 14:21
Completed: Oct 10, 2025 14:23
Model Results
Performance: 0.494
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 5
Average Performance: 0.49
Scene Results
| Scene Name | Score | Result | Model |
|---|---|---|---|
| opening-query: A Child's Question (Test scenario) | 0.609 | Failed | [email protected]/Qwe… |
| foundational-lecture: Foundational Lecture at the New Academy (Test scenario) | 0.000 | Failed | [email protected]/Qwe… |
| outline-request: Theory Outline (Test scenario) | 0.721 | Failed | [email protected]/Qwe… |
| parable-of-the-river: Parable of the River (Test scenario) | 0.451 | Failed | [email protected]/Qwe… |
| reflection-session: Reflection Session (Test scenario) | 0.687 | Failed | [email protected]/Qwe… |
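The page does not say how the run-level score is derived from the scene scores. A minimal sketch, assuming it is the unweighted mean of the five scene scores listed above (an assumption, though it does reproduce the displayed 0.494 and 0.49):

```python
# Scene scores copied from the Scene Results listing above.
scene_scores = {
    "opening-query": 0.609,
    "foundational-lecture": 0.000,
    "outline-request": 0.721,
    "parable-of-the-river": 0.451,
    "reflection-session": 0.687,
}

# Assumption: the run-level score is the plain (unweighted) mean
# of the scene scores; the dashboard does not confirm this.
average = sum(scene_scores.values()) / len(scene_scores)

# The scene with the lowest score pulls the average down the most.
lowest_scene = min(scene_scores, key=scene_scores.get)

print(f"{average:.3f}")  # 0.494, as shown under Model Results
print(f"{average:.2f}")  # 0.49, as shown under Quick Stats
print(lowest_scene)      # foundational-lecture
```

Under this assumption, the 0.000 on foundational-lecture accounts for most of the gap: the other four scenes alone average about 0.617.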
Performance Matrix 5×1
| Scene | onteripaul@gma… |
|---|---|
| opening-query: A Child's Question | 0.609 |
| foundational-lecture: Foundational Lecture at the N… | 0.000 |
| outline-request: Theory Outline | 0.721 |
| parable-of-the-river: Parable of the River | 0.451 |
| reflection-session: Reflection Session | 0.687 |