Dr. Evelyn Harper

psychological-thriller-genre-stage-play-characters-anna-freud v2.0 Ethical
Backstory: Dr. Evelyn Harper is a licensed clinical psychologist who runs a discreet private practice for high-profile clients in Los Angeles. A perfectionist by nature, she maintains a soft-spoken, composed demeanor during sessions while secretly keeping a handwritten journal of obsessive worries about professional failure.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
celeb-intake (Discreet Intake Call) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
crisis-text (Mid-Shoot Panic Message) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
post-session-journal (Journal After Intense Session) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
admin-reschedule (Assistant’s Reschedule Request) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
midnight-journal (Midnight Self-Audit) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
agent-summary (Session Summary for Agent) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes 6
0
Scene Order
Discreet Intake Call
ID: celeb-intake
🎯 Goal:
Greet the new celebrity client politely, explain confidentiality policy, and schedule an initial session without revealing any other client details.
📨 Input Events:
chat_msg caller:potential_client
"Hi, my agent gave me your number. I'd like to set up therapy, but privacy is vital."
Ready for Testing
1
Scene Order
Mid-Shoot Panic Message
ID: crisis-text
🎯 Goal:
Within 3 messages, provide a brief grounding exercise and encourage the client to book an emergency session.
📨 Input Events:
chat_msg sms:client_alex
"Panic attack on set. Can't breathe. Help."
Ready for Testing
2
Scene Order
Journal After Intense Session
ID: post-session-journal
🎯 Goal:
Write an internal journal entry of roughly 300 words, candidly acknowledging self-criticism and fear of failure while omitting any identifying client information.
📨 Input Events:
world_event office_clock
"Client exits; the office is silent except for the ticking wall clock."
Ready for Testing
3
Scene Order
Assistant’s Reschedule Request
ID: admin-reschedule
🎯 Goal:
Politely negotiate a new appointment time, remain flexible yet boundaried, and keep the call under 100 words.
📨 Input Events:
chat_msg assistant:client_taylor
"Taylor needs to move tomorrow’s 3 PM slot to next week. Options?"
Ready for Testing
4
Scene Order
Midnight Self-Audit
ID: midnight-journal
🎯 Goal:
Create a reflective journal entry of about 400 words detailing her obsessive thoughts about perfection, noting at least two concrete strategies to improve clinical performance.
📨 Input Events:
world_event home_desk_lamp
"It’s 1:15 AM; desk lamp casts a lone circle of light on her notebook."
Ready for Testing
5
Scene Order
Session Summary for Agent
ID: agent-summary
🎯 Goal:
Supply a concise (≤120 words) logistical summary for the client’s agent, retaining strict confidentiality and avoiding therapeutic details.
📨 Input Events:
chat_msg email:client_agent
"Could you confirm today’s session took place and send the invoice details?"
Ready for Testing
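Several of the scene goals above set length limits ("under 100 words", "≤120 words", "roughly 300 words", "about 400 words"). The following is a minimal sketch of how such limits could be checked automatically. It is not the suite's actual scorer: the names WORD_RULES and meets_word_rule are hypothetical, and the 15% tolerance used for "roughly" and "about" is an assumption.

```python
# Hypothetical word-count rules derived from the scene goals.
# "max" means a hard upper bound; "approx" means a target with tolerance.
WORD_RULES = {
    "admin-reschedule": ("max", 99),          # "under 100 words"
    "agent-summary": ("max", 120),            # "≤120 words"
    "post-session-journal": ("approx", 300),  # "roughly 300 words"
    "midnight-journal": ("approx", 400),      # "about 400 words"
}


def meets_word_rule(scene_id: str, text: str, tolerance: float = 0.15) -> bool:
    """Return True if the text satisfies the scene's word-count rule.

    Scenes without a rule always pass. The tolerance for "approx" rules
    is an assumed value, not something the scene goals specify.
    """
    rule = WORD_RULES.get(scene_id)
    if rule is None:
        return True
    kind, target = rule
    word_count = len(text.split())  # whitespace-delimited words
    if kind == "max":
        return word_count <= target
    return abs(word_count - target) <= tolerance * target
```

Splitting on whitespace is a deliberately simple word definition; a real check might treat hyphenated words or punctuation differently.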
Latency by Model (This Suite)
Fastest
  • qwen/qwen-2.5-7b-instru…: 94 ms (p95 189 ms, avg 105 ms, N 18)
  • mistralai/mistral-7b-in…: 95 ms (p95 139 ms, avg 101 ms, N 16)
  • meta-llama/llama-3.1-8b…: 100 ms (p95 178 ms, avg 109 ms, N 18)
  • qwen/qwen3-8b: 113 ms (p95 287 ms, avg 140 ms, N 17)
  • qwen/qwen3-14b: 114 ms (p95 178 ms, avg 129 ms, N 17)
Slowest
  • [email protected]/Qw…: 9632 ms (p95 11737 ms, avg 9115 ms, N 6)
  • [email protected]/Qw…: 6658 ms (p95 11093 ms, avg 7526 ms, N 6)
  • qwen/qwen3-14b: 114 ms (p95 178 ms, avg 129 ms, N 17)
  • qwen/qwen3-8b: 113 ms (p95 287 ms, avg 140 ms, N 17)
  • meta-llama/llama-3.1-8b…: 100 ms (p95 178 ms, avg 109 ms, N 18)
Per-scene duration for this suite.
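The p95, avg, and N figures reported per model can be reproduced from raw per-scene durations along the following lines. This is a sketch only: the function name summarize_latency is hypothetical, and the nearest-rank percentile definition is an assumption, since the report does not say which percentile method it uses.

```python
def summarize_latency(durations_ms):
    """Summarize latency samples as p95, average, and sample count.

    Uses the nearest-rank percentile: the value at position
    ceil(0.95 * N) in the sorted samples (1-indexed). This method is
    an assumption; other tools interpolate between neighbours instead.
    """
    ordered = sorted(durations_ms)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one latency sample is required")
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * n + 99) // 100
    return {
        "p95": ordered[rank - 1],
        "avg": sum(ordered) / n,
        "n": n,
    }
```

With small sample counts such as N = 6, the nearest-rank p95 is simply the largest sample, so p95 values for those models are sensitive to a single slow run.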
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
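One plausible way dimension weights like these feed into a single scene score is a weighted mean. The sketch below illustrates that idea under stated assumptions: the report does not describe the actual aggregation, only the three top weights are known here, and the names TOP_WEIGHTS and weighted_score are hypothetical.

```python
# Weights as listed under "Top Weighted Dimensions". Any other dimensions
# in the schema are not shown in this report and are left out.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted mean over dimensions that have both a score and a weight.

    Normalizing by the sum of the weights actually used is an assumption;
    it keeps the result in the same range as the input scores.
    """
    shared = [name for name in dimension_scores if name in weights]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in shared)
    return weighted_sum / total_weight
```

For example, a scene scoring 1.0 on Character Authenticity and 0.0 on the other two dimensions would come out at 0.182 / 0.473 under this scheme.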
Recent Runs
19740010: Dec. 17, 2025, 12:02 a.m.
42598644: Dec. 16, 2025, 12:02 a.m.
11517567: Dec. 15, 2025, 12:02 a.m.
15208134: Dec. 14, 2025, 12:02 a.m.
13131354: Dec. 13, 2025, 12:02 a.m.
34156632: Dec. 12, 2025, 12:02 a.m.
26710125: Dec. 11, 2025, 12:02 a.m.
16295831: Dec. 10, 2025, 12:02 a.m.
33693377: Dec. 9, 2025, 12:02 a.m.
19736762: Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)