Tessa Lawson
finance-economics-failed-founder-characters-alexander-hamilton
v2.0
Ethical
Backstory: Tessa is a charismatic, data-savvy entrepreneur who launched a gamified budgeting app for Gen-Z. A rushed launch exposed security flaws, leading to a data leak, store removal, and looming class-action threats. She publicly disclosed the breach, offered refunds, and ultimately closed the venture when investors withdrew. Now she promotes transparent, tech-literate financial education as a freelance speaker and consultant.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|
| podcast-lessons (Podcast reflection) | 0.304 | 0.766 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.607 | 0.675 |
| user-apology (Apology to affected user) | 0.421 | 0.491 | 0.000 (Error) | 0.000 (Error) | 0.459 | 0.647 | 0.654 |
| investor-checkin (Cautious investor inquiry) | 0.703 | 0.616 | 0.000 (Error) | 0.000 (Error) | 0.558 | 0.649 | 0.712 |
| college-keynote (Campus keynote) | 0.630 | 0.397 | 0.000 (Error) | 0.000 (Error) | 0.294 | 0.624 | 0.506 |
| founder-blog (Crisis-management blog post) | 0.439 | 0.384 | 0.000 (Error) | 0.000 (Error) | 0.000 | 0.145 | 0.624 |
| mentor-advice (Mentoring a new founder) | 0.651 | 0.644 | 0.000 (Error) | 0.000 (Error) | 0.495 | 0.592 | 0.593 |
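A minimal sketch of how the matrix above could be summarized per model. It assumes a plain arithmetic mean over the six scenes, which may not be how the dashboard itself aggregates. The two columns whose names are obscured are left out because every cell in them is an error, and the 0.000 cells without an error flag are counted as real scores, which is also an assumption. `mean_scores` and `rank_models` are hypothetical helper names.

```python
from statistics import mean

# Scene order matches the rows of the matrix above.
SCENES = [
    "podcast-lessons", "user-apology", "investor-checkin",
    "college-keynote", "founder-blog", "mentor-advice",
]

# Scores copied from the matrix; model names are truncated exactly as shown there.
SCORES = {
    "meta-llama/llama-3.…": [0.304, 0.421, 0.703, 0.630, 0.439, 0.651],
    "mistralai/mistral-7…": [0.766, 0.491, 0.616, 0.397, 0.384, 0.644],
    "qwen/qwen-2.5-7b-in…": [0.000, 0.459, 0.558, 0.294, 0.000, 0.495],
    "qwen/qwen3-14b": [0.607, 0.647, 0.649, 0.624, 0.145, 0.592],
    "qwen/qwen3-8b": [0.675, 0.654, 0.712, 0.506, 0.624, 0.593],
}


def mean_scores(scores):
    """Return the unweighted mean score per model (assumed aggregation)."""
    return {model: mean(values) for model, values in scores.items()}


def rank_models(scores):
    """Return model names sorted from highest to lowest mean score."""
    means = mean_scores(scores)
    return sorted(means, key=means.get, reverse=True)


if __name__ == "__main__":
    means = mean_scores(SCORES)
    for model in rank_models(SCORES):
        print(f"{model}: {means[model]:.3f}")
```

A per-scene view works the same way by transposing the lists with `zip(*SCORES.values())` and pairing each tuple with `SCENES`.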
Test Scenes 6
Scene Order: 0
Podcast reflection
ID: podcast-lessons
🎯 Goal: Concisely share two key lessons from the breach and one actionable tip for startup founders, while maintaining a candid yet hopeful tone.
📨 Input Events:
- chat_msg, host:FinTechToday: "Tessa, looking back, what hard-won lessons would you share with young founders?"
Ready for Testing
Scene Order: 1
Apology to affected user
ID: user-apology
🎯 Goal: Issue a sincere apology, acknowledge responsibility, outline the refund process in under 120 words, and avoid defensiveness.
📨 Input Events:
- chat_msg, viewer:user_314: "Your app leaked my data. Why should I ever trust anything you make again?"
Ready for Testing
Scene Order: 2
Cautious investor inquiry
ID: investor-checkin
🎯 Goal: Respond with a concise, data-backed plan (max 150 words) that shows risk-mitigation steps for any future product, restoring investor confidence.
📨 Input Events:
- chat_msg, investor:AlexChen: "Convince me your next idea won't repeat those security mistakes."
Ready for Testing
Scene Order: 3
Campus keynote
ID: college-keynote
🎯 Goal: Deliver an inspiring, ~300-word keynote that highlights transparency, data security best practices, and gamified financial literacy for Gen-Z.
📨 Input Events:
- chat_msg, event_host:StateU: "Please give our students a short keynote on honest entrepreneurship and money skills."
Ready for Testing
Scene Order: 4
Crisis-management blog post
ID: founder-blog
🎯 Goal: Write a ~600-word blog post outlining step-by-step crisis response, lessons learned, and strategies to rebuild user trust after a breach.
📨 Input Events:
- chat_msg, editor:StartUpPulse: "We’d love a deep-dive article on how you handled the breach and what others can learn."
Ready for Testing
Scene Order: 5
Mentoring a new founder
ID: mentor-advice
🎯 Goal: Provide three concrete, security-focused launch checklist items in a friendly tone, staying under 90 words.
📨 Input Events:
- chat_msg, viewer:newFounder99: "Launching my first finance app next month—top things I must not miss?"
Ready for Testing
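Several of the scene goals above set word limits. The following is a rough sketch of how a response could be checked against them; it is not the suite's actual scoring logic. The whitespace-based word count, the 20% tolerance for the "~" targets, and the names `LIMITS`, `word_count`, and `within_limit` are all assumptions. The user-apology limit is applied to the whole reply, although the goal could also be read as limiting only the refund outline.

```python
# Limits taken from the scene goals; scenes without a stated limit are omitted.
LIMITS = {
    "user-apology": ("max", 119),       # "under 120 words"
    "investor-checkin": ("max", 150),   # "max 150 words"
    "mentor-advice": ("max", 89),       # "under 90 words"
    "college-keynote": ("approx", 300), # "~300-word keynote"
    "founder-blog": ("approx", 600),    # "~600-word blog post"
}

# Assumption: an approximate target is met when within 20% of the stated length.
APPROX_TOLERANCE = 0.2


def word_count(text):
    """Count whitespace-separated tokens (a simplification of 'words')."""
    return len(text.split())


def within_limit(scene_id, text):
    """Return True if the text satisfies the scene's length constraint."""
    if scene_id not in LIMITS:
        return True  # no length constraint stated for this scene
    kind, target = LIMITS[scene_id]
    count = word_count(text)
    if kind == "max":
        return count <= target
    return abs(count - target) <= target * APPROX_TOLERANCE
```

For example, `within_limit("mentor-advice", reply)` returns False as soon as `reply` reaches 90 whitespace-separated words.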
Latency by Model (This Suite)
Fastest
- [email protected]/Qw…: 7899 ms (p95 9299 ms • avg 7654 ms • N 6)
- qwen/qwen3-14b: 24399 ms (p95 27939 ms • avg 23913 ms • N 6)
- qwen/qwen3-8b: 25806 ms (p95 45869 ms • avg 28952 ms • N 6)
- meta-llama/llama-3.1-8b…: 25808 ms (p95 32039 ms • avg 26822 ms • N 6)
- qwen/qwen-2.5-7b-instru…: 29399 ms (p95 136752 ms • avg 62633 ms • N 6)
Slowest
- [email protected]/Qw…: 40243 ms (p95 191123 ms • avg 73527 ms • N 6)
- mistralai/mistral-7b-in…: 32349 ms (p95 44370 ms • avg 34208 ms • N 6)
- qwen/qwen-2.5-7b-instru…: 29399 ms (p95 136752 ms • avg 62633 ms • N 6)
- meta-llama/llama-3.1-8b…: 25808 ms (p95 32039 ms • avg 26822 ms • N 6)
- qwen/qwen3-8b: 25806 ms (p95 45869 ms • avg 28952 ms • N 6)
Per-scene duration for this suite.
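The latency figures above show a wide gap between the headline value and p95 for some models. One possible way to surface that is sketched below, using only the numbers listed in this section. The page does not say what the headline figure represents, and the 2.0 ratio threshold and the names `ROWS` and `tail_heavy` are assumptions chosen for illustration.

```python
# (model label as shown on the page, headline ms, p95 ms, avg ms); N is 6 for every row.
# Two entries share the same obscured label, so a list is used instead of a dict.
ROWS = [
    ("[email protected]/Qw…", 7899, 9299, 7654),
    ("qwen/qwen3-14b", 24399, 27939, 23913),
    ("qwen/qwen3-8b", 25806, 45869, 28952),
    ("meta-llama/llama-3.1-8b…", 25808, 32039, 26822),
    ("qwen/qwen-2.5-7b-instru…", 29399, 136752, 62633),
    ("mistralai/mistral-7b-in…", 32349, 44370, 34208),
    ("[email protected]/Qw…", 40243, 191123, 73527),
]

# Assumption: a model is "tail-heavy" when p95 is more than twice the headline latency.
TAIL_RATIO_THRESHOLD = 2.0


def tail_heavy(rows, threshold=TAIL_RATIO_THRESHOLD):
    """Return labels of rows whose p95 / headline ratio exceeds the threshold."""
    return [
        name
        for name, headline_ms, p95_ms, _avg_ms in rows
        if p95_ms / headline_ms > threshold
    ]


if __name__ == "__main__":
    for name in tail_heavy(ROWS):
        print(name)
```

With only six scenes per model, a single slow scene can dominate p95, so a flag from this check is a prompt to look at per-scene durations and not a verdict on the model.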
Completion Progress: 100% (6 of 6 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
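To illustrate how dimension weights like these might combine into a single score, here is a minimal sketch assuming a weighted average. Only the top three weights are shown on this page, so the sketch renormalizes over whichever dimensions are supplied; the framework's real formula and remaining weights are not given here. `weighted_score` is a hypothetical helper, and the per-dimension scores in the usage lines are made-up illustration values, not results from this suite.

```python
# Weights copied from the list above; the full weight set is not shown on the page.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted average over dimensions present in both mappings.

    Assumption: weights are renormalized over the shared dimensions, so a
    partial weight set still yields a score on the same 0-1 scale.
    """
    shared = [dim for dim in weights if dim in dimension_scores]
    total_weight = sum(weights[dim] for dim in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    weighted_sum = sum(weights[dim] * dimension_scores[dim] for dim in shared)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores, for illustration only.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.5,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```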
Recent Runs
- 31136975: Dec. 17, 2025, 12:01 a.m.
- 45857712: Dec. 16, 2025, 12:01 a.m.
- 26914712: Dec. 15, 2025, 12:01 a.m.
- 28453024: Dec. 14, 2025, 12:01 a.m.
- 27581475: Dec. 13, 2025, 12:01 a.m.
- 39749442: Dec. 12, 2025, 12:01 a.m.
- 35736924: Dec. 11, 2025, 12:01 a.m.
- 28457595: Dec. 10, 2025, 12:01 a.m.
- 41381594: Dec. 9, 2025, 12:01 a.m.
- 30416590: Dec. 8, 2025, 12:01 a.m.