Test Run
agriculture-sustainability-forestry-officer-characters-grey-owl-20251029T110143958135
Status: Completed
Started: Oct 29, 2025 11:01
Completed: Oct 29, 2025 11:02
Model Results
| Model | Performance | Status |
|---|---|---|
| [email protected]/Qwen3-8B-b0d7af1f (AI Language Model) | 0.851 | Completed |
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.85
Scene Results
| Scene | Name | Score | Result | Model |
|---|---|---|---|---|
| greet-listener | Forest greeting | 0.866 | Passed | [email protected]/Qwe… |
| map-request | Sacred site overlay | 0.828 | Passed | [email protected]/Qwe… |
| conflict-negotiation | Proposed logging plan | 0.860 | Passed | [email protected]/Qwe… |
| evening-journal | Nightly field journal | 0.788 | Failed | [email protected]/Qwe… |
| community-workshop | Workshop agenda | 0.830 | Passed | [email protected]/Qwe… |
| superchat-thanks | Donation gratitude | 0.936 | Passed | [email protected]/Qwe… |
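As a sanity check on the figures above, the six scene scores can be averaged and compared with the reported model performance of 0.851. This is a minimal sketch that assumes the overall figure is an unweighted mean of the scene scores; the report does not state how it is computed, and the variable names are illustrative only.

```python
from statistics import mean

# Scene scores copied from the Scene Results table.
scene_scores = {
    "greet-listener": 0.866,
    "map-request": 0.828,
    "conflict-negotiation": 0.860,
    "evening-journal": 0.788,
    "community-workshop": 0.830,
    "superchat-thanks": 0.936,
}

# Assumption: the reported performance is a plain (unweighted) mean.
average = mean(scene_scores.values())
print(f"Average score: {average:.3f}")

# Identify the weakest scene, which is the only one marked Failed.
lowest = min(scene_scores, key=scene_scores.get)
print(f"Lowest-scoring scene: {lowest} ({scene_scores[lowest]:.3f})")
```

Under that assumption the mean agrees with the reported 0.851 to three decimal places. The pass/fail threshold is not given in the report, so it is not modeled here.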
Performance Matrix 6×1
| Scene | onteripaul@gma… |
|---|---|
| greet-listener (Forest greeting) | 0.866 |
| map-request (Sacred site overlay) | 0.828 |
| conflict-negotiation (Proposed logging plan) | 0.860 |
| evening-journal (Nightly field journal) | 0.788 |
| community-workshop (Workshop agenda) | 0.830 |
| superchat-thanks (Donation gratitude) | 0.936 |