Dylan Park

finance-economics-tax-consultant-characters-adam-smith v2.0 Ethical
Backstory: Dylan grew up in a multilingual household and studied accounting and international business before spending six years at a Big Four firm. Now an independent consultant, he handles cross-border taxation for mid-size tech start-ups, balancing aggressive planning with strict legal compliance. He mentors junior accountants and volunteers teaching financial literacy in immigrant communities.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scene IDs and titles:
us-to-canada-royalties: Treaty royalty withholding
community-workshop-itin: Immigrant community tax workshop
germany-subsidiary-memo: German subsidiary planning memo
mentoring-email-transfer-pricing: Mentoring email on transfer pricing

Scores by model (rows) and scene (columns); cells the dashboard flagged as failed are marked (Error):
Model | us-to-canada-royalties | community-workshop-itin | germany-subsidiary-memo | mentoring-email-transfer-pricing
deepseek/deepseek-r… | 0.607 | 0.284 | 0.093 | 0.265
google/gemini-2.5-f… | 0.852 | 0.849 | 0.829 | 0.319
google/gemma-3-12b-… | 0.482 | 0.786 | 0.133 | 0.612
meta-llama/llama-3.… | 0.535 | 0.556 | 0.028 | 0.558
microsoft/phi-3-med… | 0.034 | 0.000 | 0.000 | 0.000
microsoft/phi-3.5-m… | 0.454 | 0.829 | 0.000 | 0.000
mistralai/mistral-7… | 0.737 | 0.719 | 0.652 | 0.480
neversleep/noromaid… | 0.363 | 0.546 | 0.000 (Error) | 0.000
[email protected] | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
[email protected] | 0.492 | 0.503 | 0.615 | 0.462
qwen/qwen-2.5-7b-in… | 0.516 | 0.554 | 0.385 | 0.310
qwen/qwen3-14b | 0.511 | 0.554 | 0.307 | 0.000
qwen/qwen3-8b | 0.707 | 0.699 | 0.424 | 0.596
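To compare models across the four scenes, the matrix scores can be averaged per model. The following is a minimal sketch, not the dashboard's own aggregation: it assumes an unweighted mean, counts errored cells as 0.000 (as the matrix displays them), and uses a hypothetical helper name, `rank_by_mean`. Model labels are kept truncated as shown, and the two identically labelled columns are told apart by their column position.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order:
# us-to-canada-royalties, community-workshop-itin,
# germany-subsidiary-memo, mentoring-email-transfer-pricing.
# Errored cells are kept at 0.000, as the matrix shows them.
SCORES = [
    ("deepseek/deepseek-r…", [0.607, 0.284, 0.093, 0.265]),
    ("google/gemini-2.5-f…", [0.852, 0.849, 0.829, 0.319]),
    ("google/gemma-3-12b-…", [0.482, 0.786, 0.133, 0.612]),
    ("meta-llama/llama-3.…", [0.535, 0.556, 0.028, 0.558]),
    ("microsoft/phi-3-med…", [0.034, 0.000, 0.000, 0.000]),
    ("microsoft/phi-3.5-m…", [0.454, 0.829, 0.000, 0.000]),
    ("mistralai/mistral-7…", [0.737, 0.719, 0.652, 0.480]),
    ("neversleep/noromaid…", [0.363, 0.546, 0.000, 0.000]),
    ("[email protected] (column 9)", [0.000, 0.000, 0.000, 0.000]),
    ("[email protected] (column 10)", [0.492, 0.503, 0.615, 0.462]),
    ("qwen/qwen-2.5-7b-in…", [0.516, 0.554, 0.385, 0.310]),
    ("qwen/qwen3-14b", [0.511, 0.554, 0.307, 0.000]),
    ("qwen/qwen3-8b", [0.707, 0.699, 0.424, 0.596]),
]


def rank_by_mean(scores):
    """Return (label, mean score) pairs sorted from highest to lowest mean."""
    pairs = [(label, mean(values)) for label, values in scores]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for label, avg in rank_by_mean(SCORES):
        print(f"{label:32s} {avg:.3f}")
```

An unweighted mean hides the difference between a low score and a failed run, so the (Error) cells are worth reviewing separately before drawing conclusions from the ranking.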
Test Scenes 4
Scene Order: 0
Treaty royalty withholding
ID: us-to-canada-royalties
🎯 Goal:
Give a precise, sub-120-word answer on the applicable withholding tax rate and treaty article reference, maintaining a professional tone.
📨 Input Events:
chat_msg viewer:startup_ceo
"We're a Delaware C-Corp licensing our software to Canadian customers. What's the withholding tax rate on royalties under the treaty?"
Ready for Testing
Scene Order: 1
Immigrant community tax workshop
ID: community-workshop-itin
🎯 Goal:
Explain, in simple language, why an ITIN is needed and the steps to obtain one in under 150 words.
📨 Input Events:
chat_msg viewer:community_member
"I’m not a U.S. citizen but my child was born here. Do I need an ITIN? How do I get it?"
Ready for Testing
Scene Order: 2
German subsidiary planning memo
ID: germany-subsidiary-memo
🎯 Goal:
Produce a 400–500-word memo summarizing corporate tax rates, CFC rules, and IP structuring options for a U.S. SaaS company opening a GmbH in Germany, using clear headings.
📨 Input Events:
chat_msg viewer:cfo_tech_startup
"Please draft a memo on the main German tax considerations if we set up a subsidiary in Berlin next quarter."
Ready for Testing
Scene Order: 3
Mentoring email on transfer pricing
ID: mentoring-email-transfer-pricing
🎯 Goal:
Write a mentoring email of at least 300 words guiding a junior accountant through preparing an OECD Local File for a SaaS client expanding to Japan, including a checklist.
📨 Input Events:
chat_msg viewer:junior_accountant
"Dylan, could you walk me through what I should cover in the Local File for our client entering Japan?"
Ready for Testing
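Each scene goal above states a length target (under 120 words, under 150 words, 400 to 500 words, at least 300 words). The sketch below shows one way such targets could be checked against a model response. It is an illustration only: the source does not say how, or whether, the evaluator enforces length, and the names `WORD_LIMITS`, `word_count`, and `meets_length_goal` are hypothetical. Words are counted by splitting on whitespace, which is also an assumption.

```python
# Length targets taken from the scene goals, as (minimum, maximum) word counts.
# None means the goal states no bound on that side.
WORD_LIMITS = {
    "us-to-canada-royalties": (None, 119),            # "sub-120-word"
    "community-workshop-itin": (None, 149),           # "under 150 words"
    "germany-subsidiary-memo": (400, 500),            # "400–500-word memo"
    "mentoring-email-transfer-pricing": (300, None),  # "at least 300 words"
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def meets_length_goal(scene_id: str, response: str) -> bool:
    """Return True if the response length falls within the scene's target."""
    minimum, maximum = WORD_LIMITS[scene_id]
    count = word_count(response)
    if minimum is not None and count < minimum:
        return False
    if maximum is not None and count > maximum:
        return False
    return True
```

A check like this only covers the length part of each goal; tone, accuracy, headings, and the requested checklist would need separate evaluation.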
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 13429 ms (p95 15060 ms • avg 13194 ms • N 4)
  • google/gemini-2.5-flash 19500 ms (p95 29483 ms • avg 21867 ms • N 4)
  • qwen/qwen3-14b 22055 ms (p95 43934 ms • avg 25931 ms • N 4)
  • qwen/qwen3-8b 25585 ms (p95 33516 ms • avg 27357 ms • N 4)
  • meta-llama/llama-3.1-8b… 26283 ms (p95 36345 ms • avg 26873 ms • N 4)
Slowest
  • microsoft/phi-3.5-mini-… 138311 ms (p95 244945 ms • avg 139551 ms • N 4)
  • microsoft/phi-3-medium-… 128863 ms (p95 141395 ms • avg 130891 ms • N 4)
  • [email protected]/Qw… 43307 ms (p95 105906 ms • avg 60811 ms • N 4)
  • neversleep/noromaid-20b 35711 ms (p95 47159 ms • avg 37331 ms • N 4)
  • mistralai/mistral-7b-in… 34106 ms (p95 35564 ms • avg 33918 ms • N 4)
Per-scene duration for this suite.
Suite Actions
Completion Progress 100%
4 of 4 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
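The three weights listed above sum to 0.473, so they cover only part of the schema; the remaining dimensions and their weights are not shown here. As a rough sketch of how weighted dimensions could combine into one score, the function below takes a weighted mean over whichever dimensions are supplied and renormalizes by the weights present. This is an assumption about the aggregation, not the framework's documented formula, and the per-dimension scores in the usage example are hypothetical.

```python
# Top three weights as listed in the evaluation schema.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean of dimension scores, renormalized over the weights used.

    Only dimensions present in both mappings contribute. Renormalizing is an
    assumption made because the full weight set is not available.
    """
    shared = [name for name in dimension_scores if name in weights]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    weighted_sum = sum(dimension_scores[name] * weights[name] for name in shared)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical per-dimension scores, for illustration only.
    example = {
        "Character Authenticity": 0.8,
        "Plan Validity": 0.6,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Renormalizing keeps the result on the same 0 to 1 scale as the inputs, but a score computed from three dimensions is not comparable to one computed from the full schema.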
Recent Runs
23102990 (Dec. 17, 2025, midnight)
27368087 (Dec. 16, 2025, midnight)
21967854 (Dec. 15, 2025, midnight)
24990311 (Dec. 14, 2025, midnight)
21892372 (Dec. 13, 2025, midnight)
26871486 (Dec. 12, 2025, midnight)
22793687 (Dec. 11, 2025, midnight)
22167032 (Dec. 10, 2025, midnight)
25535316 (Dec. 9, 2025, midnight)
22479580 (Dec. 8, 2025, midnight)
Latency Overview (This Suite)