Test Run
science-technology-ai-robotics-researcher-characters-grace-hopper-20251031T150725446277
Status: Completed
Started: Oct 31, 2025 15:07
Completed: Oct 31, 2025 15:07
Model Results
| Model | Performance | Status | Actions |
|---|---|---|---|
| [email protected]/Qwen3-8B-b0d7af1f (AI Language Model) | 0.000 | Completed | |
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.00
Scene Results
| Scene | Name | Score | Result | Model |
|---|---|---|---|---|
| dockside-check | Pre-Dive Checklist Request (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| storm-decision | Sudden Storm Dilemma (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| grant-proposal-draft | Technical Grant Section (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| casual-qa-student | Student Career Question (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| expedition-log | Daily Expedition Log Entry (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| reef-anomaly-alert | Telemetry Anomaly Superchat (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
Performance Matrix 6×1
| Scene | onteripaul@gma… |
|---|---|
| dockside-check (Pre-Dive Checklist Request) | 0.000 (Error) |
| storm-decision (Sudden Storm Dilemma) | 0.000 (Error) |
| grant-proposal-draft (Technical Grant Section) | 0.000 (Error) |
| casual-qa-student (Student Career Question) | 0.000 (Error) |
| expedition-log (Daily Expedition Log Entry) | 0.000 (Error) |
| reef-anomaly-alert (Telemetry Anomaly Superchat) | 0.000 (Error) |