Dr. Adrian Hale
ancient-philosophers-aristotle
v2.0
Ethical
Backstory: Dr. Adrian Hale is a meticulous polymath who unites natural observation with rigorous logical deduction, compiling encyclopedic treatises that span the sciences and humanities. Years spent cataloging field notes have fostered an unwavering commitment to precision and traceable evidence. Whether summarizing phenomena or solving riddles, Adrian structures knowledge methodically and avoids unsupported claims.
100% Complete
5/5 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Methodical Greeting (methodical-greeting) | 0.547 | 0.594 | 0.547 | 0.390 | 0.053 | 0.521 | 0.557 | 0.470 | 0.565 | 0.000 (Error) | 0.566 | 0.571 | 0.530 | 0.595 | 0.572 | 0.623 |
| Encyclopedic Entry – Lightning (encyclopedic-entry-lightning) | 0.379 | 0.155 | 0.431 | 0.305 | 0.000 | 0.691 | 0.501 | 0.333 | 0.424 | 0.000 (Error) | 0.221 | 0.793 | 0.634 | 0.539 | 0.119 | 0.637 |
| Deductive Riddle (deductive-riddle) | 0.757 | 0.521 | 0.489 | 0.000 | 0.000 (Error) | 0.512 | 0.391 | 0.322 | 0.418 | 0.000 (Error) | 0.699 | 0.764 | 0.788 | 0.546 | 0.000 | 0.510 |
| Cross-Discipline Treatise (cross-discipline-treatise) | 0.821 | 0.626 | 0.229 | 0.000 | 0.000 | 0.700 | 0.592 | 0.000 (Error) | 0.395 | 0.000 (Error) | 0.470 | 0.440 | 0.369 | 0.557 | 0.152 | 0.600 |
| Rapid Reference Check (rapid-reference) | 0.270 | 0.574 | 0.464 | 0.000 | 0.000 | 0.291 | 0.678 | 0.202 | 0.458 | 0.000 (Error) | 0.609 | 0.635 | 0.560 | 0.339 | 0.000 | 0.649 |
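The scores above can be summarized per scene. The following is a minimal sketch, not the suite's own aggregation: it assumes that cells flagged as Error should be excluded from the average rather than counted as 0.000, which the page does not confirm. The function names are hypothetical, and only the methodical-greeting row is transcribed.

```python
from statistics import mean

# Scores for the methodical-greeting row, in column order.
# None marks the one cell the page flags as "Error" (displayed as 0.000).
methodical_greeting = [
    0.547, 0.594, 0.547, 0.390, 0.053, 0.521, 0.557, 0.470,
    0.565, None, 0.566, 0.571, 0.530, 0.595, 0.572, 0.623,
]

def mean_excluding_errors(row):
    """Average the numeric cells, skipping cells flagged as errors.

    Assumption: an errored run carries no score. Returns None if
    every cell in the row is an error.
    """
    valid = [score for score in row if score is not None]
    return mean(valid) if valid else None

def mean_counting_errors_as_zero(row):
    """Alternative reading: an errored run scores 0.0, as displayed."""
    return mean(0.0 if score is None else score for score in row)

if __name__ == "__main__":
    print(mean_excluding_errors(methodical_greeting))
    print(mean_counting_errors_as_zero(methodical_greeting))
```

The two functions differ only in how the Error cell is treated, so comparing their results shows how much a single failed run moves a scene's average.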
Test Scenes (5)

Scene Order 0: Methodical Greeting
ID: methodical-greeting
🎯 Goal: Deliver a concise, precise self-introduction and outline of analytical approach without any forbidden filler.
📨 Input Events: chat_msg • viewer:user_1 • "Greetings, who am I speaking with?"
Ready for Testing

Scene Order 1: Encyclopedic Entry – Lightning
ID: encyclopedic-entry-lightning
🎯 Goal: Produce a structured, ~300-word encyclopedic entry on lightning covering definition, causes, historical study, and modern applications.
📨 Input Events: chat_msg • viewer:user_1 • "Please draft an encyclopedic entry on lightning."
Ready for Testing

Scene Order 2: Deductive Riddle
ID: deductive-riddle
🎯 Goal: Solve the riddle using step-by-step logical deduction and clearly state the answer.
📨 Input Events: chat_msg • viewer:user_1 • "Riddle: What walks on four legs in the morning, two at noon, and three in the evening?"
Ready for Testing

Scene Order 3: Cross-Discipline Treatise
ID: cross-discipline-treatise
🎯 Goal: Write a 500-word mini-treatise that links botany and mathematics, demonstrating empirical observation and logical structure.
📨 Input Events: chat_msg • viewer:user_1 • "Compose a treatise linking botany and mathematics."
Ready for Testing

Scene Order 4: Rapid Reference Check
ID: rapid-reference
🎯 Goal: In two sentences, cite one verifiable fact from your earlier lightning entry and provide a relevant reference source.
📨 Input Events: chat_msg • viewer:user_1 • "Quickly remind me of one key fact about lightning with a source."
Ready for Testing
Latency by Model (This Suite)
Fastest
- [email protected]/Qw… 6897 ms (p95 8549 ms • avg 7112 ms • N 5)
- [email protected]/Qw… 9589 ms (p95 10303 ms • avg 9373 ms • N 5)
- [email protected]/Qw… 10379 ms (p95 19110 ms • avg 12053 ms • N 5)
- [email protected]/Qw… 11703 ms (p95 14174 ms • avg 11380 ms • N 5)
- [email protected]/Qw… 11733 ms (p95 13823 ms • avg 11802 ms • N 5)
Slowest
- microsoft/phi-3-medium-… 215143 ms (p95 296289 ms • avg 220048 ms • N 16)
- qwen/qwen3-8b 62953 ms (p95 155488 ms • avg 75143 ms • N 14)
- microsoft/phi-3.5-mini-… 33428 ms (p95 93648 ms • avg 49571 ms • N 20)
- deepseek/deepseek-r1-di… 31953 ms (p95 64265 ms • avg 40740 ms • N 18)
- mistralai/mistral-7b-in… 29100 ms (p95 32774 ms • avg 28858 ms • N 14)
Per-scene duration for this suite.
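The p95, avg, and N figures reported for each model could be derived from per-scene durations roughly as follows. This is a sketch under stated assumptions: the page does not say which percentile method it uses, so the nearest-rank definition is assumed here, and the `latency_summary` helper and the sample durations are hypothetical.

```python
import math

def latency_summary(durations_ms):
    """Return (p95, avg, n) for a non-empty list of durations in milliseconds.

    Assumption: p95 uses the nearest-rank method, i.e. the smallest
    observed value with at least 95% of samples at or below it.
    """
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = math.ceil(0.95 * n)      # 1-based rank of the 95th percentile
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n

if __name__ == "__main__":
    # Made-up durations for illustration; not taken from the page.
    sample = [6500, 7000, 6900, 8600, 6800]
    p95, avg, n = latency_summary(sample)
    print(f"p95 {p95} ms • avg {avg:.0f} ms • N {n}")
```

With only five samples, the nearest-rank p95 is simply the slowest run, so for small N the p95 column is dominated by a single outlier.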
Completion Progress
100%
5 of 5 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
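One way dimension weights like these could combine into a single score is a weighted average. The page does not state how the framework aggregates dimensions, so the sketch below is an assumption: it normalizes by the sum of the weights that are present, since only the top three weights are listed. The `weighted_score` helper and the example dimension scores are hypothetical.

```python
# Weights as listed on the page; only the top three dimensions are shown there.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted average over the dimensions present in both mappings.

    Assumption: weights are renormalized over the dimensions actually
    scored, so a partial weight list still yields a value in the same
    range as the inputs.
    """
    shared = [name for name in weights if name in dimension_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to combine")
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in shared)
    return weighted_sum / total_weight

if __name__ == "__main__":
    # Hypothetical per-dimension scores for one response.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.5,
        "Contextual Intelligence": 0.6,
    }
    print(round(weighted_score(example), 3))
```

Renormalizing keeps the result comparable to the 0 to 1 scene scores even when some dimensions are missing; a framework that instead sums raw weight times score would produce smaller values.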
Recent Runs
- 52072572 • Dec. 17, 2025, midnight
- 58759314 • Dec. 16, 2025, midnight
- 49265405 • Dec. 15, 2025, midnight
- 50828948 • Dec. 14, 2025, midnight
- 48589074 • Dec. 13, 2025, midnight
- 58332236 • Dec. 12, 2025, midnight
- 51264144 • Dec. 11, 2025, midnight
- 49962905 • Dec. 10, 2025, midnight
- 55791515 • Dec. 9, 2025, midnight
- 50353580 • Dec. 8, 2025, midnight