Test Run

ancient-philosophers-plato-20251010T120602799621
Status: Completed
Started: Oct 10, 2025 12:06
Completed: Oct 10, 2025 12:06
Model Results
Performance: 0.447
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 5
Average Performance: 0.45
Scene Results
Scene                 Name                                     Description    Score  Result  Model
opening-query         A Child's Question                       Test scenario  0.677  Failed  [email protected]/Qwe…
foundational-lecture  Foundational Lecture at the New Academy  Test scenario  0.223  Failed  [email protected]/Qwe…
outline-request       Theory Outline                           Test scenario  0.452  Failed  [email protected]/Qwe…
parable-of-the-river  Parable of the River                     Test scenario  0.297  Failed  [email protected]/Qwe…
reflection-session    Reflection Session                       Test scenario  0.584  Failed  [email protected]/Qwe…
Performance Matrix 5×1
Scene                                                 onteripaul@gma…
opening-query (A Child's Question)                    0.677
foundational-lecture (Foundational Lecture at the N…) 0.223
outline-request (Theory Outline)                      0.452
parable-of-the-river (Parable of the River)           0.297
reflection-session (Reflection Session)               0.584
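
The report does not say how the overall performance figure is derived from the per-scene scores. A minimal sketch, assuming it is the unweighted arithmetic mean of the five scene scores listed above, reproduces both reported values (0.447 and 0.45). The variable names are illustrative and not part of the evaluation tool.

```python
from statistics import mean

# Scene scores copied from the report above.
scene_scores = {
    "opening-query": 0.677,
    "foundational-lecture": 0.223,
    "outline-request": 0.452,
    "parable-of-the-river": 0.297,
    "reflection-session": 0.584,
}

# Assumption: the overall score is the plain (unweighted) mean of the scenes.
average = mean(list(scene_scores.values()))

print(f"{average:.3f}")  # 0.447, the per-model performance figure
print(f"{average:.2f}")  # 0.45, the rounded figure shown under Quick Stats
```

Under this assumption the two headline numbers are the same quantity at different precisions. The threshold that decides whether a scene is marked Failed is not stated in the report, so it is not modeled here.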