Natalia Vega

art-design-creativity-tiktok-star-characters-pablo-picasso v2.0 Ethical
Backstory: Natalia is a 24-year-old multimedia artist who mixes traditional sketching with AR filters to create captivating, eco-friendly design content on TikTok. Growing up bilingual, she naturally switches between English and Spanish to reach a global audience and inspire followers to reuse everyday materials in their art.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
Scenes:
  urban-sketch-tips: Urban sketching on a budget
  superchat-ar-filter: Superchat shout-out with AR idea
  speedpaint-storyboard: Long-form storyboard breakdown
  diy-desk-organizer: Script a sustainable DIY project

Model                 urban-sketch-tips  superchat-ar-filter  speedpaint-storyboard  diy-desk-organizer
deepseek/deepseek-r…  0.584              0.844                0.485                  0.220
google/gemini-2.5-f…  0.759              0.685                0.752                  0.665
google/gemma-3-12b-…  0.705              0.832                0.679                  0.197
meta-llama/llama-3.…  0.882              0.743                0.360                  0.028
microsoft/phi-3-med…  0.028              0.025                0.000                  0.000
microsoft/phi-3.5-m…  0.690              0.780                0.470                  0.000
mistralai/mistral-7…  0.699              0.759                0.535                  0.348
neversleep/noromaid…  0.643              0.571                0.530                  0.212
[email protected]       0.000 (Error)      0.000 (Error)        0.000 (Error)          0.000 (Error)
[email protected]       0.826              0.671                0.000                  0.000
qwen/qwen-2.5-7b-in…  0.591              0.706                0.670                  0.281
qwen/qwen3-14b        0.661              0.763                0.492                  0.291
qwen/qwen3-8b         0.809              0.800                0.510                  0.561
Test Scenes 4
Scene Order: 0
Urban sketching on a budget
ID: urban-sketch-tips
🎯 Goal:
Deliver a concise reply that includes at least one Spanish sentence and suggests upcycled materials for urban sketching.
📨 Input Events:
chat_msg viewer:alex99
"Any tips for starting urban sketching with limited supplies?"
Ready for Testing
Scene Order: 1
Superchat shout-out with AR idea
ID: superchat-ar-filter
🎯 Goal:
Thank the donor by name, propose a fresh AR overlay concept, and keep the message under 80 words.
📨 Input Events:
superchat viewer:chiara_art tiktok $10
"Love your work! Can you pronounce my name and show us a new AR idea?"
Ready for Testing
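Two parts of this goal (naming the donor and staying under 80 words) can be checked mechanically; whether the AR concept is "fresh" cannot. The following is a minimal sketch of such a check, not the suite's actual scorer. The function name `check_superchat_reply`, the whitespace-based word count, and the sample reply are all assumptions made for illustration.

```python
def check_superchat_reply(reply: str, donor: str, max_words: int = 80) -> dict:
    """Check the two mechanically verifiable parts of the superchat goal.

    Words are counted by splitting on whitespace, which is an assumption;
    the real evaluator may tokenize differently.
    """
    words = reply.split()
    return {
        # Case-insensitive substring match on the donor's name.
        "mentions_donor": donor.lower() in reply.lower(),
        "word_count": len(words),
        # "Under 80 words" is read as strictly fewer than max_words.
        "under_word_limit": len(words) < max_words,
    }


# Hypothetical reply, written only to exercise the checker.
sample_reply = (
    "¡Gracias, Chiara! Next up: an AR overlay that turns bottle caps "
    "into floating paint palettes."
)
result = check_superchat_reply(sample_reply, donor="chiara")
print(result)
```

The donor string here is the first part of the handle `chiara_art`; how a handle maps to a spoken name is left to the caller.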
Scene Order: 2
Long-form storyboard breakdown
ID: speedpaint-storyboard
🎯 Goal:
Provide a 150–250 word storyboard for a 60-second speed-paint video that blends hand sketches with an AR time-lapse; include one Spanish phrase and mention repurposed cardboard.
📨 Input Events:
chat_msg viewer:sam_draws
"Could you walk us through your next speed-paint concept?"
Ready for Testing
Scene Order: 3
Script a sustainable DIY project
ID: diy-desk-organizer
🎯 Goal:
Write a bullet-point script (8–12 bullets) for a 60-sec TikTok showing how to make a desk organizer from cereal boxes, ending with an upbeat bilingual call-to-action.
📨 Input Events:
chat_msg viewer:lena_green
"Any quick DIY desk projects using recycled stuff?"
Ready for Testing
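The 8–12 bullet requirement in this goal is another constraint that can be verified without a model. The sketch below counts bullet lines and tests the range. The helper names and the set of accepted bullet markers (`-`, `*`, `•`, or a number followed by `.` or `)`) are assumptions, and the bilingual call-to-action is not checked.

```python
import re

# A bullet line starts with -, *, • or a number like "1." / "1)" (assumed markers).
BULLET_RE = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def count_bullets(script: str) -> int:
    """Count the lines in the script that look like bullet points."""
    return sum(1 for line in script.splitlines() if BULLET_RE.match(line))


def bullet_count_ok(script: str, lo: int = 8, hi: int = 12) -> bool:
    """Return True when the script has between lo and hi bullets, inclusive."""
    return lo <= count_bullets(script) <= hi


# Hypothetical nine-bullet script used only to exercise the helpers.
sample_script = "\n".join(f"- Step {i}" for i in range(1, 10))
print(count_bullets(sample_script), bullet_count_ok(sample_script))
```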
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 10058 ms (p95 13299 ms, avg 10209 ms, N 4)
  • meta-llama/llama-3.1-8b…: 21594 ms (p95 31255 ms, avg 24146 ms, N 4)
  • google/gemini-2.5-flash: 22107 ms (p95 38370 ms, avg 25991 ms, N 4)
  • qwen/qwen3-14b: 24029 ms (p95 31880 ms, avg 25051 ms, N 4)
  • qwen/qwen-2.5-7b-instru…: 25465 ms (p95 26927 ms, avg 25737 ms, N 4)
Slowest
  • microsoft/phi-3-medium-…: 122983 ms (p95 126553 ms, avg 122668 ms, N 4)
  • neversleep/noromaid-20b: 44932 ms (p95 55604 ms, avg 43505 ms, N 4)
  • [email protected]/Qw…: 42058 ms (p95 220470 ms, avg 93670 ms, N 4)
  • qwen/qwen3-8b: 39526 ms (p95 46315 ms, avg 38084 ms, N 4)
  • microsoft/phi-3.5-mini-…: 38494 ms (p95 213178 ms, avg 86873 ms, N 4)
Per-scene duration for this suite.
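For readers who want to reproduce figures like p95, avg, and N from raw per-scene durations, here is a minimal sketch. It assumes the nearest-rank percentile definition, which the dashboard does not confirm, and the duration values are hypothetical. Under nearest-rank with N = 4, the p95 is simply the slowest of the four runs.

```python
import math
from statistics import mean


def p95_nearest_rank(samples: list[float]) -> float:
    """95th percentile by the nearest-rank method (an assumed definition)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def summarize(samples: list[float]) -> dict:
    """Return the avg, p95, and sample count for one model's durations."""
    return {"avg": mean(samples), "p95": p95_nearest_rank(samples), "n": len(samples)}


# Hypothetical per-scene durations in milliseconds for one model.
durations_ms = [9000, 9500, 10000, 12000]
summary = summarize(durations_ms)
print(summary)
```

Other percentile definitions (for example, linear interpolation) give different values on small samples, so results from this sketch need not match the dashboard.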
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
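One common way to combine dimension weights like these into a single score is a weighted average. The sketch below illustrates that idea only; the page does not state the framework's actual aggregation formula. The three weights come from the list above, the per-dimension scores are hypothetical, and the function normalizes by the sum of the weights it is given because the remaining dimensions are not shown.

```python
# Weights for the three top dimensions listed on the page.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores: dict, weights: dict) -> float:
    """Weighted average over the dimensions present in both mappings.

    Normalizing by the sum of the used weights is an assumption; it keeps
    the result in the same 0..1 range as the inputs when only a subset of
    dimensions is available.
    """
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    return sum(dimension_scores[name] * w for name, w in used.items()) / total_weight


# Hypothetical per-dimension scores for one model response.
example_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example_scores, WEIGHTS), 3))
```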
Recent Runs
15753169: Dec. 17, 2025, midnight
18748088: Dec. 16, 2025, midnight
15007835: Dec. 15, 2025, midnight
16369344: Dec. 14, 2025, midnight
14818382: Dec. 13, 2025, midnight
18534954: Dec. 12, 2025, midnight
15901974: Dec. 11, 2025, midnight
15164010: Dec. 10, 2025, midnight
17517927: Dec. 9, 2025, midnight
14869871: Dec. 8, 2025, midnight
Latency Overview (This Suite)