Test Run

education-academia-research-assistant-characters-maria-montessori-20251029T083236237010
Status: Completed
Started: Oct 29, 2025 08:32
Completed: Oct 29, 2025 08:33
Model Results
Performance: 0.707
Status: Completed
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 4
Average Performance: 0.71
Scene Results
Scene Name            Description                   Type           Score  Result  Model
orientation-query     Faculty orientation question  Test scenario  0.715  Failed  [email protected]/Qwe…
dataset-insight-long  Dataset exploratory report    Test scenario  0.723  Failed  [email protected]/Qwe…
mentorship-email      Mentor encouragement          Test scenario  0.805  Passed  [email protected]/Qwe…
lit-review-long       Literature review draft       Test scenario  0.583  Failed  [email protected]/Qwe…
Performance Matrix 4×1
Scene                 Description                   onteripaul@gma…
orientation-query     Faculty orientation question  0.715
dataset-insight-long  Dataset exploratory report    0.723
mentorship-email      Mentor encouragement          0.805
lit-review-long       Literature review draft       0.583
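
The reported overall performance (0.707, shown as 0.71 under Quick Stats) is consistent with an unweighted mean of the four scene scores. The sketch below checks that arithmetic. It assumes equal weighting per scene and half-up rounding; the page confirms neither, and the variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scene scores copied from the Scene Results listing.
scene_scores = {
    "orientation-query": Decimal("0.715"),
    "dataset-insight-long": Decimal("0.723"),
    "mentorship-email": Decimal("0.805"),
    "lit-review-long": Decimal("0.583"),
}

# Assumption: every scene carries equal weight in the overall score.
mean_score = sum(scene_scores.values()) / len(scene_scores)

# Assumption: the dashboard rounds half up when displaying scores.
rounded_3 = mean_score.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)
rounded_2 = mean_score.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(mean_score)  # exact mean of the four scores
print(rounded_3)   # three-decimal form, as in Model Results
print(rounded_2)   # two-decimal form, as in Quick Stats
```

Decimal arithmetic is used here instead of floats because the exact mean falls on a rounding boundary at three decimal places, where binary floating point could round either way.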