Test Run
science-technology-ai-data-privacy-lawyer-characters-louis-brandeis-20251031T150529800138
Status: Completed
Started: Oct 31, 2025 15:05
Completed: Oct 31, 2025 15:06
Model Results
| Model | Performance | Status | Actions |
|---|---|---|---|
| [email protected]/Qwen3-8B-b0d7af1f (AI Language Model) | 0.000 | Completed | |
Run Details
Judge Model: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
Generator Models (1)
Execution Time: 0 minutes
Quick Stats
Models Tested: 1
Scenes Executed: 6
Average Performance: 0.00
Scene Results
| Scene | Name | Score | Result | Model |
|---|---|---|---|---|
| email-marketing-gdpr | Email list compliance check (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| framework-struck-down | Breaking news world event (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| superchat-top-steps | Paid priority question (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| privacy-by-design-article | Long-form blog post (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| internal-policy-draft | Long-form internal policy (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
| cross-border-payroll-tool | Quick cross-border query (Test scenario) | 0.000 | Failed (Error) | [email protected]/Qwe… |
Performance Matrix 6×1
| Scene | onteripaul@gma… |
|---|---|
| email-marketing-gdpr (Email list compliance check) | 0.000 (Error) |
| framework-struck-down (Breaking news world event) | 0.000 (Error) |
| superchat-top-steps (Paid priority question) | 0.000 (Error) |
| privacy-by-design-article (Long-form blog post) | 0.000 (Error) |
| internal-policy-draft (Long-form internal policy) | 0.000 (Error) |
| cross-border-payroll-tool (Quick cross-border query) | 0.000 (Error) |