Tessa Lawson

finance-economics-failed-founder-characters-alexander-hamilton v2.0 Ethical
Backstory: Tessa is a charismatic, data-savvy entrepreneur who launched a gamified budgeting app for Gen-Z. A rushed launch exposed security flaws, leading to a data leak, store removal, and looming class-action threats. She publicly disclosed the breach, offered refunds, and ultimately closed the venture when investors withdrew. Now she promotes transparent, tech-literate financial education as a freelance speaker and consultant.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene (ID)                                    meta-llama/llama-3.…  mistralai/mistral-7…  [email protected]     [email protected]     qwen/qwen-2.5-7b-in…  qwen/qwen3-14b        qwen/qwen3-8b
Podcast reflection (podcast-lessons)          0.304                 0.766                 0.000 (Error)         0.000 (Error)         0.000                 0.607                 0.675
Apology to affected user (user-apology)       0.421                 0.491                 0.000 (Error)         0.000 (Error)         0.459                 0.647                 0.654
Cautious investor inquiry (investor-checkin)  0.703                 0.616                 0.000 (Error)         0.000 (Error)         0.558                 0.649                 0.712
Campus keynote (college-keynote)              0.630                 0.397                 0.000 (Error)         0.000 (Error)         0.294                 0.624                 0.506
Crisis-management blog post (founder-blog)    0.439                 0.384                 0.000 (Error)         0.000 (Error)         0.000                 0.145                 0.624
Mentoring a new founder (mentor-advice)       0.651                 0.644                 0.000 (Error)         0.000 (Error)         0.495                 0.592                 0.593
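The matrix can be condensed into one number per model. The following is a minimal Python sketch that assumes a plain unweighted mean across the six scenes; the dashboard does not state how it aggregates scores, so this is an illustration rather than the suite's own method. The two [email protected] columns are left out because every one of their cells carries an Error status, and the function names are hypothetical.

```python
from statistics import mean

# Scores transcribed from the Scene Performance Matrix, in scene order:
# podcast-lessons, user-apology, investor-checkin, college-keynote,
# founder-blog, mentor-advice. Model names are truncated as on the page.
# A 0.000 without an Error flag is treated as a genuine score (assumption).
SCORES = {
    "meta-llama/llama-3.…": [0.304, 0.421, 0.703, 0.630, 0.439, 0.651],
    "mistralai/mistral-7…": [0.766, 0.491, 0.616, 0.397, 0.384, 0.644],
    "qwen/qwen-2.5-7b-in…": [0.000, 0.459, 0.558, 0.294, 0.000, 0.495],
    "qwen/qwen3-14b": [0.607, 0.647, 0.649, 0.624, 0.145, 0.592],
    "qwen/qwen3-8b": [0.675, 0.654, 0.712, 0.506, 0.624, 0.593],
}


def mean_scores(scores):
    """Return {model: unweighted mean score}, rounded to 3 decimals."""
    return {model: round(mean(vals), 3) for model, vals in scores.items()}


def rank_models(scores):
    """Return model names ordered from highest to lowest mean score."""
    means = mean_scores(scores)
    return sorted(means, key=means.get, reverse=True)


if __name__ == "__main__":
    means = mean_scores(SCORES)
    for model in rank_models(SCORES):
        print(f"{model:<24} {means[model]:.3f}")
```

Under this assumption qwen/qwen3-8b has the highest mean (0.627) and the qwen-2.5 column the lowest (0.301), largely because of its two 0.000 cells.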
Test Scenes (6)
Scene Order: 0
Podcast reflection
ID: podcast-lessons
🎯 Goal:
Concisely share two key lessons from the breach and one actionable tip for startup founders, while maintaining a candid yet hopeful tone.
📨 Input Events:
chat_msg host:FinTechToday
"Tessa, looking back, what hard-won lessons would you share with young founders?"
Ready for Testing
Scene Order: 1
Apology to affected user
ID: user-apology
🎯 Goal:
Issue a sincere apology, acknowledge responsibility, outline the refund process in under 120 words, and avoid defensiveness.
📨 Input Events:
chat_msg viewer:user_314
"Your app leaked my data. Why should I ever trust anything you make again?"
Ready for Testing
Scene Order: 2
Cautious investor inquiry
ID: investor-checkin
🎯 Goal:
Respond with a concise, data-backed plan (max 150 words) that shows risk-mitigation steps for any future product, restoring investor confidence.
📨 Input Events:
chat_msg investor:AlexChen
"Convince me your next idea won't repeat those security mistakes."
Ready for Testing
Scene Order: 3
Campus keynote
ID: college-keynote
🎯 Goal:
Deliver an inspiring, ~300-word keynote that highlights transparency, data security best practices, and gamified financial literacy for Gen-Z.
📨 Input Events:
chat_msg event_host:StateU
"Please give our students a short keynote on honest entrepreneurship and money skills."
Ready for Testing
Scene Order: 4
Crisis-management blog post
ID: founder-blog
🎯 Goal:
Write a ~600-word blog post outlining step-by-step crisis response, lessons learned, and strategies to rebuild user trust after a breach.
📨 Input Events:
chat_msg editor:StartUpPulse
"We’d love a deep-dive article on how you handled the breach and what others can learn."
Ready for Testing
Scene Order: 5
Mentoring a new founder
ID: mentor-advice
🎯 Goal:
Provide three concrete, security-focused launch checklist items in a friendly tone, staying under 90 words.
📨 Input Events:
chat_msg viewer:newFounder99
"Launching my first finance app next month—top things I must not miss?"
Ready for Testing
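Several scene goals carry explicit length budgets ("under 120 words", "max 150 words", "~300-word", "~600-word", "under 90 words"). A minimal Python sketch of how a response could be checked against them follows. It assumes words are whitespace-separated tokens and that each budget applies to the whole reply. The 15% tolerance for the approximate targets is a hypothetical choice, not something the suite specifies, and podcast-lessons is omitted because its goal gives no number.

```python
# Word budgets read off the scene goals. "max" is a hard ceiling;
# "approx" is a target with a tolerance band (the band is an assumption).
LIMITS = {
    "user-apology": ("max", 119),      # "under 120 words"
    "investor-checkin": ("max", 150),  # "max 150 words"
    "mentor-advice": ("max", 89),      # "under 90 words"
    "college-keynote": ("approx", 300),
    "founder-blog": ("approx", 600),
}
TOLERANCE = 0.15  # hypothetical: accept +/-15% around an approximate target


def word_count(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def within_budget(scene_id, text):
    """Return True if the text respects the scene's word budget."""
    kind, target = LIMITS[scene_id]
    n = word_count(text)
    if kind == "max":
        return n <= target
    return abs(n - target) <= TOLERANCE * target
```

For example, `within_budget("mentor-advice", reply)` is false as soon as the reply reaches 90 words.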
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 7899 ms (p95 9299 ms • avg 7654 ms • N 6)
  • qwen/qwen3-14b: 24399 ms (p95 27939 ms • avg 23913 ms • N 6)
  • qwen/qwen3-8b: 25806 ms (p95 45869 ms • avg 28952 ms • N 6)
  • meta-llama/llama-3.1-8b…: 25808 ms (p95 32039 ms • avg 26822 ms • N 6)
  • qwen/qwen-2.5-7b-instru…: 29399 ms (p95 136752 ms • avg 62633 ms • N 6)
Slowest
  • [email protected]/Qw…: 40243 ms (p95 191123 ms • avg 73527 ms • N 6)
  • mistralai/mistral-7b-in…: 32349 ms (p95 44370 ms • avg 34208 ms • N 6)
  • qwen/qwen-2.5-7b-instru…: 29399 ms (p95 136752 ms • avg 62633 ms • N 6)
  • meta-llama/llama-3.1-8b…: 25808 ms (p95 32039 ms • avg 26822 ms • N 6)
  • qwen/qwen3-8b: 25806 ms (p95 45869 ms • avg 28952 ms • N 6)
Per-scene duration for this suite.
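The avg and p95 figures are summaries over N = 6 per-scene durations. A minimal Python sketch of one way to derive them follows, assuming the nearest-rank definition of a percentile; the dashboard does not say which percentile method it uses, and the sample durations below are invented for illustration, not taken from any run.

```python
from statistics import mean


def p95_nearest_rank(samples):
    """95th percentile by the nearest-rank method.

    rank = ceil(0.95 * n), computed with integer arithmetic to avoid
    floating-point rounding surprises.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(durations_ms):
    """Return the avg, p95, and N for a list of durations in ms."""
    return {
        "avg": round(mean(durations_ms)),
        "p95": p95_nearest_rank(durations_ms),
        "n": len(durations_ms),
    }


# Hypothetical per-scene durations (ms) for one model, for illustration only.
example = [21000, 23000, 24000, 26000, 30000, 46000]
print(summarize(example))
```

With only six samples, nearest-rank p95 is simply the slowest scene. Under that assumption, a single slow scene would be enough to produce a p95 far above the average.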
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity: 0.182
  • Plan Validity: 0.155
  • Contextual Intelligence: 0.136
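The three weights shown sum to 0.473, so the remaining weight belongs to dimensions not listed here. A minimal Python sketch of how such weights might combine per-dimension scores into one number follows. It assumes a weighted average renormalized over whichever dimensions are supplied, which the page does not confirm; the function name and the example scores are hypothetical.

```python
# Weights as listed under Top Weighted Dimensions; other dimensions of the
# framework are not shown on the page and are therefore absent here.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted average over the dimensions present in both mappings.

    Weights are renormalized so the result stays in the same range as
    the inputs even though only a subset of dimensions is known.
    """
    used = {d: w for d, w in weights.items() if d in dimension_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored dimension has a known weight")
    return sum(dimension_scores[d] * w for d, w in used.items()) / total


# Hypothetical per-dimension scores for a single response.
example = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example), 3))
```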
Recent Runs
  • 31136975: Dec. 17, 2025, 12:01 a.m.
  • 45857712: Dec. 16, 2025, 12:01 a.m.
  • 26914712: Dec. 15, 2025, 12:01 a.m.
  • 28453024: Dec. 14, 2025, 12:01 a.m.
  • 27581475: Dec. 13, 2025, 12:01 a.m.
  • 39749442: Dec. 12, 2025, 12:01 a.m.
  • 35736924: Dec. 11, 2025, 12:01 a.m.
  • 28457595: Dec. 10, 2025, 12:01 a.m.
  • 41381594: Dec. 9, 2025, 12:01 a.m.
  • 30416590: Dec. 8, 2025, 12:01 a.m.
Latency Overview (This Suite)