Jules Ortega
Trend Researcher
v2.0
Ethical
Backstory: Jules grew up in Barcelona surrounded by a family of journalists who dissected headlines like football matches. He was quiet as a child, listening more than speaking, observing how ideas moved through people. He became obsessed with how small shifts in culture ripple through society.
His defining moment came during the global financial crisis, when, as a university student, he correctly predicted that young people would drive a resurgence in DIY culture and ethical consumption. A blog post he wrote on the subject unexpectedly went viral, landing him consulting work at 21.
Jules is analytical, reserved, and slightly eccentric. He collects obscure magazines from the 1970s and can talk for hours about why sneaker trends mirror economic cycles. He’s brilliant, but sometimes struggles to simplify his language for non-specialists.
He values clarity, foresight, and pattern recognition, but is wary of being pigeonholed as a “trend oracle.” He knows trends are messy, human, and full of contradictions.
100% Complete
5/5 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scene_1: Trend Forecast | 0.725 | 0.765 | 0.848 | 0.785 | 0.025 | 0.748 | 0.864 | 0.000 (Error) | 0.000 (Error) | 0.789 | 0.821 | 0.777 | 0.883 |
| scene_2: Simplifying Insight | 0.803 | 0.756 | 0.783 | 0.727 | 0.000 (Error) | 0.813 | 0.836 | 0.843 | 0.000 (Error) | 0.839 | 0.710 | 0.796 | 0.888 |
| scene_3: Pushback | 0.866 | 0.775 | 0.761 | 0.659 | 0.000 | 0.843 | 0.883 | 0.027 | 0.000 (Error) | 0.859 | 0.772 | 0.885 | 0.876 |
| scene_4: Historical Parallel | 0.797 | 0.869 | 0.779 | 0.532 | 0.000 | 0.852 | 0.630 | 0.000 (Error) | 0.000 (Error) | 0.906 | 0.887 | 0.901 | 0.028 |
| scene_5: Ethical Angle | 0.814 | 0.761 | 0.710 | 0.848 | 0.000 (Error) | 0.824 | 0.766 | 0.000 (Error) | 0.000 (Error) | 0.876 | 0.624 | 0.637 | 0.022 |
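A per-model average makes the matrix easier to compare at a glance. The sketch below is illustrative only: it copies three columns from the matrix, treats cells flagged "Error" as missing rather than as a score of zero (an assumption; the dashboard does not say how it aggregates errors), and uses a hypothetical helper named `summarize`.

```python
from statistics import mean

# Scores copied from three columns of the scene performance matrix.
# None marks a cell the dashboard flags as "Error" (assumed to mean "no score").
scores = {
    "qwen/qwen3-14b": [0.777, 0.796, 0.885, 0.901, 0.637],
    "qwen/qwen3-8b": [0.883, 0.888, 0.876, 0.028, 0.022],
    "neversleep/noromaid…": [None, 0.843, 0.027, None, None],
}


def summarize(values):
    """Return (mean over non-error cells rounded to 3 places, error count)."""
    valid = [v for v in values if v is not None]
    errors = len(values) - len(valid)
    avg = round(mean(valid), 3) if valid else None
    return avg, errors


summary = {model: summarize(vals) for model, vals in scores.items()}
for model, (avg, errors) in summary.items():
    print(f"{model}: mean={avg}, errors={errors}")
```

Reporting the error count next to the mean keeps a model with few scored scenes from looking comparable to one that completed all five.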
Test Scenes: 5
Scene Order: 0
Trend Forecast
ID: scene_1
🎯 Goal: Tone: Analytical, grounded. Backstory: Early blog post success. Testing: Pattern recognition.
📨 Input Events: chat
"You say: “Predict one major consumer trend for the next year.”"
Ready for Testing
Scene Order: 1
Simplifying Insight
ID: scene_2
🎯 Goal: Tone: Warm, clear. Testing: Concept translation.
📨 Input Events: chat
"You say: “Explain your forecast like you’re talking to a 12-year-old.”"
Ready for Testing
Scene Order: 2
Pushback
ID: scene_3
🎯 Goal: Tone: Calm, evidence-based. Testing: Reasoning under challenge.
📨 Input Events: chat
"You say: “I don’t think your prediction is realistic.”"
Ready for Testing
Scene Order: 3
Historical Parallel
ID: scene_4
🎯 Goal: Tone: Reflective, nerdy. Testing: Historical linkage.
📨 Input Events: chat
"You ask: “What past trend resembles this one?”"
Ready for Testing
Scene Order: 4
Ethical Angle
ID: scene_5
🎯 Goal: Tone: Ethical, thoughtful. Testing: Moral reasoning.
📨 Input Events: chat
"You say: “Should we capitalize on a harmful trend?”"
Ready for Testing
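The five scene definitions above can be captured as structured data so that ordering and IDs are checked before a run. This is a minimal sketch under assumptions: the `Scene` class, its field names, and the `validate` helper are hypothetical and are not the suite's actual schema. The values are taken from the scene list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """One test scene; field names are illustrative, not the suite's schema."""
    order: int
    scene_id: str
    name: str
    tone: str
    testing: str
    prompt: str
    backstory: str = ""


SCENES = [
    Scene(0, "scene_1", "Trend Forecast", "Analytical, grounded",
          "Pattern recognition",
          "Predict one major consumer trend for the next year.",
          backstory="Early blog post success"),
    Scene(1, "scene_2", "Simplifying Insight", "Warm, clear",
          "Concept translation",
          "Explain your forecast like you’re talking to a 12-year-old."),
    Scene(2, "scene_3", "Pushback", "Calm, evidence-based",
          "Reasoning under challenge",
          "I don’t think your prediction is realistic."),
    Scene(3, "scene_4", "Historical Parallel", "Reflective, nerdy",
          "Historical linkage",
          "What past trend resembles this one?"),
    Scene(4, "scene_5", "Ethical Angle", "Ethical, thoughtful",
          "Moral reasoning",
          "Should we capitalize on a harmful trend?"),
]


def validate(scenes):
    """Return a list of problems: duplicate IDs or a gap in the ordering."""
    problems = []
    ids = [s.scene_id for s in scenes]
    if len(ids) != len(set(ids)):
        problems.append("duplicate scene_id")
    if [s.order for s in scenes] != list(range(len(scenes))):
        problems.append("scene order is not 0..N-1")
    return problems


print(validate(SCENES))
```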
Latency by Model (This Suite)
Fastest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 7890 ms | 14287 ms | 9037 ms | 5 |
| neversleep/noromaid-20b | 9299 ms | 47632 ms | 20427 ms | 16 |
| [email protected]/Qw… | 13167 ms | 15789 ms | 12952 ms | 5 |
| google/gemini-2.5-flash | 18130 ms | 34848 ms | 20151 ms | 23 |
| qwen/qwen-2.5-7b-instru… | 20718 ms | 35017 ms | 22309 ms | 24 |
Slowest
| Model | Latency | p95 | avg | N |
|---|---|---|---|---|
| microsoft/phi-3-medium-… | 406800 ms | 533960 ms | 394957 ms | 23 |
| qwen/qwen3-8b | 37179 ms | 145417 ms | 67951 ms | 19 |
| deepseek/deepseek-r1-di… | 32122 ms | 41793 ms | 32661 ms | 24 |
| microsoft/phi-3.5-mini-… | 32092 ms | 44725 ms | 32887 ms | 22 |
| meta-llama/llama-3.1-8b… | 30689 ms | 55092 ms | 33790 ms | 11 |
Per-scene duration for this suite.
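The p95, avg, and N figures reported per model can be derived from a list of per-scene durations. The sketch below shows one common way to do it, using the nearest-rank method for the 95th percentile. The percentile method the dashboard actually uses is not stated, so this is an assumption. The function name `latency_summary` and the sample durations are made up for illustration.

```python
import math


def latency_summary(durations_ms):
    """Return nearest-rank p95, rounded mean, and sample count for durations."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(95 * n / 100)
    return {"p95": ordered[rank - 1], "avg": round(sum(ordered) / n), "n": n}


# Hypothetical durations in milliseconds, not taken from the dashboard.
sample = [1000, 1200, 1100, 5000, 900]
print(latency_summary(sample))
```

With only five samples, the nearest-rank p95 is simply the slowest run, so a single outlier dominates it. That is worth keeping in mind when comparing models with N of 5 against models with N above 20.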
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
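Dimension weights like these are typically combined into a single score as a weighted average. The sketch below is one plausible reading, not the framework's confirmed formula: it uses only the three weights listed here, renormalizes by their sum because the remaining dimensions and weights are not shown, and takes made-up per-dimension scores. The function name `weighted_score` is hypothetical.

```python
# Weights as listed under Top Weighted Dimensions; other dimensions are not shown.
WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=WEIGHTS):
    """Weighted mean over the dimensions present in both mappings.

    Dividing by the sum of the weights actually used keeps the result in
    the same 0..1 range as the inputs even when some dimensions are missing.
    """
    used = [d for d in weights if d in dimension_scores]
    if not used:
        raise ValueError("no scored dimensions match the weights")
    total_weight = sum(weights[d] for d in used)
    return sum(weights[d] * dimension_scores[d] for d in used) / total_weight


# Hypothetical per-dimension scores for one scene, for illustration only.
example = {
    "Character Authenticity": 0.9,
    "Plan Validity": 0.8,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example), 3))
```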
Recent Runs
- 59169131: Dec. 17, 2025, midnight
- 08206207: Dec. 16, 2025, 12:01 a.m.
- 56130572: Dec. 15, 2025, midnight
- 57411156: Dec. 14, 2025, midnight
- 55143809: Dec. 13, 2025, midnight
- 07094977: Dec. 12, 2025, 12:01 a.m.
- 59823518: Dec. 11, 2025, midnight
- 56703143: Dec. 10, 2025, midnight
- 03763936: Dec. 9, 2025, 12:01 a.m.
- 58077543: Dec. 8, 2025, midnight