Elena the Stylist

agent-elena v2.0 Ethical
Backstory: Elena Moreno is a 37-year-old fashion stylist based in Milan. She grew up surrounded by her grandmother’s fabric scraps and old sewing machines. From a young age, she learned that fashion wasn’t about luxury, but expression. As a teenager, she would alter thrifted clothes to tell stories about who she wanted to become. After studying design in Paris, Elena started as an assistant in small studios, where she learned to survive on creativity more than money. Over time, she built a portfolio that fused elegance with rebellion. Her work often mixes old-world silhouettes with futuristic elements, challenging what “beauty” means. She’s now a creative director for a boutique brand and a frequent guest at fashion weeks across Europe. Despite her success, she stays grounded, often collaborating with small artisans and textile workers. Elena is known for her candid personality and sense of humor. She dislikes trends that feel empty and believes fashion should be emotional, not transactional. Her studio walls are covered with sketches, magazine cutouts, and Polaroids of real people wearing her designs. Her favorite quote is, “Clothes are the skin of our stories.”
100% Complete
1/1 scenes
Model Performance Overview
Scene Performance Matrix
Scene: scene_1 (Styling a top celebrity in the 2025 Paris fashion show)
Model                    Score   Status
deepseek/deepseek-r…     0.499
google/gemini-2.5-f…     0.554
google/gemma-3-12b-…     0.625
meta-llama/llama-3.…     0.614
microsoft/phi-3-med…     0.000
microsoft/phi-3.5-m…     0.000   Error
mistralai/mistral-7…     0.447
neversleep/noromaid…     0.000   Error
[email protected]        0.000   Error
[email protected]        0.591
[email protected]        0.889
qwen/qwen-2.5-7b-in…     0.722
qwen/qwen3-14b           0.555
qwen/qwen3-8b            0.815
Test Scenes 1
0
Scene Order
Styling a top celebrity in the 2025 Paris fashion show
ID: scene_1
🎯 Goal:
The LLM should demonstrate Elena’s creative philosophy, her resistance to empty trends, and her belief in storytelling through design. The agent’s tone should be witty, expressive, and emotionally intelligent. It should also show her impatience and abrasiveness under stress, because the 2025 Paris fashion show is high-octane and the critics are watching. She has very little patience for imperfections, and her standards are extremely high. The LLM should show the depth of her character during the Paris fashion show.
📨 Input Events:
chat
"Your 2025 Paris fashion show was underwhelming and I expected you to do better styling. What happened to you?"
Ready for Testing
Latency by Model (This Suite)
Fastest
  • neversleep/noromaid-20b: 7488 ms (p95 7488 ms, avg 7488 ms, N = 1)
  • [email protected]/Qw…: 11807 ms (p95 11807 ms, avg 11807 ms, N = 1)
  • [email protected]/Qw…: 12124 ms (p95 12124 ms, avg 12124 ms, N = 1)
  • google/gemma-3-12b-it: 20458 ms (p95 20458 ms, avg 20458 ms, N = 1)
  • microsoft/phi-3.5-mini-…: 20994 ms (p95 20994 ms, avg 20994 ms, N = 1)
Slowest
  • microsoft/phi-3-medium-…: 125188 ms (p95 125188 ms, avg 125188 ms, N = 1)
  • [email protected]/Qw…: 39927 ms (p95 39927 ms, avg 39927 ms, N = 1)
  • qwen/qwen3-14b: 34904 ms (p95 34904 ms, avg 34904 ms, N = 1)
  • meta-llama/llama-3.1-8b…: 33621 ms (p95 33621 ms, avg 33621 ms, N = 1)
  • qwen/qwen3-8b: 33481 ms (p95 33481 ms, avg 33481 ms, N = 1)
Per-scene duration for this suite.
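Every latency entry in this suite has N = 1, which is why the p95 and avg figures are identical for each model. The sketch below shows one way such a summary could be computed. It assumes a nearest-rank percentile; the dashboard does not say which percentile method it uses, and the function name latency_summary is hypothetical.

```python
import math
from statistics import mean


def latency_summary(samples_ms):
    """Return (p95, avg, n) for a list of latency samples in milliseconds.

    Assumption: nearest-rank percentile. The dashboard's actual method is unknown.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest rank: the smallest sample with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    p95 = ordered[rank - 1]
    return p95, mean(ordered), n


# With a single sample, p95 and avg collapse to that sample.
print(latency_summary([7488]))
```

With only one run per model, these figures describe a single request rather than a distribution, so the fastest and slowest rankings are best read as provisional.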
Completion Progress 100%
1 of 1 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
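As a rough illustration of how dimension weights like these might combine into a single scene score, here is a minimal sketch. It assumes the score is a weighted mean normalized by the weights of the dimensions actually scored; the page does not confirm this formula. Only the three weights come from the dashboard. The per-dimension scores and the function name weighted_score are hypothetical.

```python
# Weights as listed on the dashboard; the remaining dimensions are not shown there.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean over the dimensions present in both mappings.

    Assumption: weights are renormalized over the scored dimensions.
    """
    shared = [name for name in weights if name in dimension_scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no overlapping dimensions with non-zero weight")
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in shared)
    return weighted_sum / total_weight


# Hypothetical per-dimension scores in [0, 1], for illustration only.
example_scores = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.8,
}
print(round(weighted_score(example_scores, TOP_WEIGHTS), 3))
```

Renormalizing keeps the result in the same 0 to 1 range as the inputs even when only a subset of dimensions is available. A framework that sums raw weighted terms instead would give different numbers.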
Recent Runs
06892097  Dec. 17, 2025, midnight
07957602  Dec. 16, 2025, midnight
06022796  Dec. 15, 2025, midnight
06910698  Dec. 14, 2025, midnight
06090127  Dec. 13, 2025, midnight
07770881  Dec. 12, 2025, midnight
07030483  Dec. 11, 2025, midnight
06605665  Dec. 10, 2025, midnight
08237238  Dec. 9, 2025, midnight
06407197  Dec. 8, 2025, midnight
Latency Overview (This Suite)