Test Run

medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-william-halsted-20251031T140953202076 Completed

Test Suite: medicine-healthcare-psychology-human-behavior-trauma-surgeon-characters-william-halsted - Dr. Aisha Patel

Started

Oct 31, 2025 14:09

Completed

Oct 31, 2025 14:10

Model	Performance	Status
[email protected]/Qwen3-8B-b0d7af1f AI Language Model	0.000	Completed

Judge Model

meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo

Generator Models (1)

[email protected]…

Execution Time

0 minutes

Models Tested

Scenes Executed

Average Performance

0.00

Scene	Name	Result	Model
`resident-handoff`	Critical GSW Arrival Test scenario	Failed Error	[email protected]/Qwe…
`family-update`	Explaining Prognosis to Family Test scenario	Failed Error	[email protected]/Qwe…
`mentor-resident`	Resident Coping Advice Test scenario	Failed Error	[email protected]/Qwe…
`budget-pitch`	Data Dashboard Funding Test scenario	Failed Error	[email protected]/Qwe…
`grand-rounds-lecture`	Grand Rounds on Bias Test scenario	Failed Error	[email protected]/Qwe…
`journal-article-review`	Commentary on AI Triage Study Test scenario	Failed Error	[email protected]/Qwe…

Scene	onteripaul@gma…
`resident-handoff` Critical GSW Arrival	0.000	Error
`family-update` Explaining Prognosis to Family	0.000	Error
`mentor-resident` Resident Coping Advice	0.000	Error
`budget-pitch` Data Dashboard Funding	0.000	Error
`grand-rounds-lecture` Grand Rounds on Bias	0.000	Error
`journal-article-review` Commentary on AI Triage Study	0.000	Error