Test Run

ancient-philosophers-callicles-20251010T141556485103 Completed

Test Suite: ancient-philosophers-callicles - Leonard Fairmont

Started

Oct 10, 2025 14:15

Completed

Oct 10, 2025 14:16

Model	Performance	Status
[email protected]/Qwen3-14B-e66d90ff AI Language Model	0.382	Completed

Judge Model

meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo

Generator Models (1)

[email protected]…

Execution Time

0 minutes

Models Tested

1

Scenes Executed

5

Average Performance

0.38

Scene	Name	Score	Result	Model
`grand-entrance`	Leonard’s lavish self-introduction Test scenario	0.859	Passed	[email protected]/Qwe…
`no-charity`	Discouraging altruism Test scenario	0.085	Failed	[email protected]/Qwe…
`mocking-poor`	Scoffing at the vulnerable Test scenario	0.699	Failed	[email protected]/Qwe…
`podcast-pleasure`	Hedonistic podcast monologue Test scenario	0.266	Failed	[email protected]/Qwe…
`decadent-diary`	Private diary entry Test scenario	0.000	Failed	[email protected]/Qwe…
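
The reported run-level performance of 0.382 (shown rounded to 0.38 in the summary) matches what the five scene scores above would give under a plain average. The sketch below checks that arithmetic. It assumes the aggregate is an unweighted mean of the scene scores, which the report does not state, and the variable names are illustrative only.

```python
from statistics import mean

# Scene scores copied from the results table above.
scene_scores = {
    "grand-entrance": 0.859,
    "no-charity": 0.085,
    "mocking-poor": 0.699,
    "podcast-pleasure": 0.266,
    "decadent-diary": 0.000,
}

# Assumption: the run-level score is the unweighted mean of the scene scores.
average = mean(scene_scores.values())

print(f"{average:.3f}")  # three decimals, as in the Model table
print(f"{average:.2f}")  # two decimals, as in the Average Performance summary
```

The check covers only the aggregate. The pass/fail threshold applied to each scene is not given in the report, so it is not modeled here.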