Test Run
asian-philathropists-li-ka-shing-20251029T081617718888
Completed
Test Suite:
asian-philathropists-li-ka-shing - Raymond Lee
Started
Oct 29, 2025 08:16
Completed
Oct 29, 2025 08:17
Model Results
| Model | Performance | Status |
|---|---|---|
| [email protected]/Qwen3-8B-b0d7af1f (AI Language Model) | 0.581 | Completed |
Run Details
Judge Model
meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time
0 minutes
Quick Stats
Models Tested
1
Scenes Executed
4
Average Performance
0.58
Scene Results
| Scene | Name | Score | Result | Model |
|---|---|---|---|---|
| portfolio-update | Upcoming Investment Focus (Test scenario) | 0.810 | Passed | [email protected]/Qwe… |
| disaster-response | Rapid Aid Announcement (Test scenario) | 0.858 | Passed | [email protected]/Qwe… |
| graduation-keynote | Keynote for Entrepreneurship Graduates (Test scenario) | 0.625 | Failed | [email protected]/Qwe… |
| monthly-reflection | Board Reflection Memo (Test scenario) | 0.032 | Failed | [email protected]/Qwe… |
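
The model's reported performance (0.581, shown as 0.58 under Quick Stats) is consistent with an unweighted mean of the four scene scores. The page does not state how the aggregate is computed, so the sketch below is only a check under that assumption. The names `scene_scores` and `average_performance` are illustrative and not part of the test tool.

```python
from statistics import mean

# Scene scores copied from the Scene Results table.
scene_scores = {
    "portfolio-update": 0.810,
    "disaster-response": 0.858,
    "graduation-keynote": 0.625,
    "monthly-reflection": 0.032,
}


def average_performance(scores):
    """Unweighted mean of per-scene scores (assumed aggregation rule)."""
    return mean(scores.values())


avg = average_performance(scene_scores)
print(f"{avg:.3f}")  # three decimals, as in the Model Results table
print(f"{avg:.2f}")  # two decimals, as in Quick Stats
```

Under this assumption, the single low score on monthly-reflection (0.032) accounts for most of the gap between the average and the two passing scenes.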