Test Run
agriculture-sustainability-forestry-officer-characters-wangari-maathai-20251029T110411727271
Status: Completed
Started: Oct 29, 2025 11:04
Completed: Oct 29, 2025 11:05
Model Results
| Model | Performance | Status |
|---|---|---|
| [email protected]/Qwen3-8B-b0d7af1f (AI Language Model) | 0.768 | Completed |
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models: 1
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.77
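The average can be cross-checked against the six per-scene scores reported under Scene Results. The following is a minimal sketch, assuming the dashboard computes a plain unweighted mean over scenes (the report does not state the aggregation method); the variable names are illustrative only.

```python
from statistics import mean

# Per-scene scores copied from the Scene Results table in this report.
scene_scores = {
    "greet-villager": 0.723,
    "species-recommendation": 0.715,
    "women-coop-grant-plan": 0.670,
    "school-event": 0.914,
    "radio-segment": 0.786,
    "budget-breakdown": 0.801,
}

# Assumption: the average is an unweighted mean over all scenes.
average = mean(list(scene_scores.values()))

print(f"{average:.3f}")  # 0.768, matching the Model Results performance
print(f"{average:.2f}")  # 0.77, matching the Quick Stats average
```

Under that assumption, both the three-decimal model score and the two-decimal Quick Stats figure agree with the scene-level data.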
Scene Results
| Scene | Name | Score | Result | Model |
|---|---|---|---|---|
| greet-villager | Morning greeting from a farmer | 0.723 | Failed | [email protected]/Qwe… |
| species-recommendation | Choosing fruit trees for a dry plot | 0.715 | Failed | [email protected]/Qwe… |
| women-coop-grant-plan | Grant application action plan | 0.670 | Failed | [email protected]/Qwe… |
| school-event | Invited to speak at local school | 0.914 | Passed | [email protected]/Qwe… |
| radio-segment | Community radio script request | 0.786 | Failed | [email protected]/Qwe… |
| budget-breakdown | Seedling budget clarification | 0.801 | Passed | [email protected]/Qwe… |
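The report does not state the pass criterion. The highest failing score is 0.786 and the lowest passing score is 0.801, so any cutoff in between would reproduce the results. The sketch below assumes a threshold of 0.80 purely for illustration; `ASSUMED_PASS_THRESHOLD` and `classify` are hypothetical names, not part of the test platform.

```python
# Assumption: a scene passes when its score is at least 0.80.
# The report only implies a cutoff above 0.786 and at or below 0.801.
ASSUMED_PASS_THRESHOLD = 0.80

# (score, reported result) pairs copied from the Scene Results table.
reported = {
    "greet-villager": (0.723, "Failed"),
    "species-recommendation": (0.715, "Failed"),
    "women-coop-grant-plan": (0.670, "Failed"),
    "school-event": (0.914, "Passed"),
    "radio-segment": (0.786, "Failed"),
    "budget-breakdown": (0.801, "Passed"),
}


def classify(score, threshold=ASSUMED_PASS_THRESHOLD):
    """Return the result label implied by the assumed threshold."""
    return "Passed" if score >= threshold else "Failed"


# Scenes whose reported result disagrees with the assumed threshold.
mismatches = [
    name for name, (score, result) in reported.items()
    if classify(score) != result
]
passed = [name for name, (_, result) in reported.items() if result == "Passed"]

print(mismatches)  # []
print(len(passed), "of", len(reported), "scenes passed")  # 2 of 6 scenes passed
```

With the assumed cutoff, every reported result is reproduced, and two of the six scenes pass.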
Performance Matrix 6×1
| Scene | onteripaul@gma… |
|---|---|
| greet-villager: Morning greeting from a farmer | 0.723 |
| species-recommendation: Choosing fruit trees for a dr… | 0.715 |
| women-coop-grant-plan: Grant application action plan | 0.670 |
| school-event: Invited to speak at local sch… | 0.914 |
| radio-segment: Community radio script request | 0.786 |
| budget-breakdown: Seedling budget clarification | 0.801 |