Isabella García
agriculture-sustainability-farmhand-characters-george-washington-carver
v2.0
Ethical
Backstory: Isabella grew up on a small family farm in arid New Mexico, where every drop of water counted. After studying agroecology, she joined an organic vegetable operation that uses cover crops, chicken-powered grazing, and solar irrigation. Fluent in Spanish and English, she mentors migrant crews and teaches school groups, sharing practical, eco-friendly know-how.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Scene | deepseek/deepseek-r… | google/gemini-2.5-f… | google/gemma-3-12b-… | meta-llama/llama-3.… | microsoft/phi-3-med… | microsoft/phi-3.5-m… | mistralai/mistral-7… | neversleep/noromaid… | [email protected]… | [email protected]… | [email protected]… | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| water-sensor-fix (Quick moisture sensor fix) | 0.599 | 0.613 | 0.507 | 0.519 | 0.000 (Error) | 0.368 | 0.550 | 0.464 | 0.000 (Error) | 0.521 | 0.605 | 0.502 | 0.560 | 0.624 |
| weekly-journal-dry-spell (Long-form farm journal) | 0.415 | 0.583 | 0.645 | 0.261 | 0.000 | 0.000 (Error) | 0.593 | 0.450 | 0.000 (Error) | 0.531 | 0.675 | 0.612 | 0.162 | 0.679 |
| translation-safety (Bilingual safety notice) | 0.670 | 0.449 | 0.455 | 0.000 | 0.000 (Error) | 0.000 (Error) | 0.670 | 0.600 | 0.000 (Error) | 0.613 | 0.659 | 0.000 | 0.123 | 0.634 |
| workshop-outline (School garden workshop outline) | 0.238 | 0.000 | 0.526 | 0.116 | 0.000 (Error) | 0.622 | 0.105 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.472 | 0.261 | 0.316 | 0.645 |
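The matrix can be condensed into one mean score and one error count per model column. The sketch below is only an illustration of that arithmetic: the scores and Error flags are transcribed from the matrix, columns are referred to by position (1 to 14, left to right) because several header names are truncated or obscured, and the names `SCORES`, `ERRORS`, and `summarize` are hypothetical. It assumes that a run marked Error still counts as 0.000 in the mean, which the dashboard does not state.

```python
from statistics import mean

# Scores per scene, in the same left-to-right column order as the matrix.
SCORES = {
    "water-sensor-fix": [0.599, 0.613, 0.507, 0.519, 0.000, 0.368, 0.550,
                         0.464, 0.000, 0.521, 0.605, 0.502, 0.560, 0.624],
    "weekly-journal-dry-spell": [0.415, 0.583, 0.645, 0.261, 0.000, 0.000, 0.593,
                                 0.450, 0.000, 0.531, 0.675, 0.612, 0.162, 0.679],
    "translation-safety": [0.670, 0.449, 0.455, 0.000, 0.000, 0.000, 0.670,
                           0.600, 0.000, 0.613, 0.659, 0.000, 0.123, 0.634],
    "workshop-outline": [0.238, 0.000, 0.526, 0.116, 0.000, 0.622, 0.105,
                         0.000, 0.000, 0.000, 0.472, 0.261, 0.316, 0.645],
}

# Zero-based column indices whose cell carries an "Error" marker.
ERRORS = {
    "water-sensor-fix": {4, 8},
    "weekly-journal-dry-spell": {5, 8},
    "translation-safety": {4, 5, 8},
    "workshop-outline": {4, 7, 8, 9},
}


def summarize():
    """Return one dict per column: 1-based column number, mean score, error count."""
    n_columns = len(next(iter(SCORES.values())))
    rows = []
    for col in range(n_columns):
        column_scores = [scene_scores[col] for scene_scores in SCORES.values()]
        error_count = sum(1 for flagged in ERRORS.values() if col in flagged)
        rows.append({
            "column": col + 1,
            "mean": mean(column_scores),  # errored runs count as 0.000 (assumption)
            "errors": error_count,
        })
    return rows


if __name__ == "__main__":
    # Print columns from highest to lowest mean score.
    for row in sorted(summarize(), key=lambda r: r["mean"], reverse=True):
        print(f"column {row['column']:>2}: mean={row['mean']:.3f} errors={row['errors']}")
```

Separating the error count from the mean keeps a column that failed to run distinguishable from one that ran and scored poorly, since both appear as 0.000 in the matrix.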
Test Scenes 4
Scene Order: 0
Quick moisture sensor fix
ID: water-sensor-fix
🎯 Goal: Give concise, actionable steps (under 120 words) to troubleshoot a faulty soil-moisture sensor and stay friendly.
📨 Input Events: chat_msg • viewer:crew_lead
"Isa, the drip line's moisture sensor keeps reading 0%. Any quick fix ideas before we lose more time?"
Ready for Testing
Scene Order: 1
Long-form farm journal
ID: weekly-journal-dry-spell
🎯 Goal: Write a ~300-word reflective journal entry summarizing the week's tasks during a dry spell, showing eco-conscious insights.
📨 Input Events: world_event • system:weather_station
"Rainfall this week: 0.1 inches. Highs: 97–99°F. Lows: 70–72°F."
Ready for Testing
Scene Order: 2
Bilingual safety notice
ID: translation-safety
🎯 Goal: Translate the given English safety notice into clear Spanish, matching tone and keeping it under 60 words.
📨 Input Events: chat_msg • viewer:manager
"Please translate: "Always disconnect the solar pump before cleaning the filter to avoid electric shock.""
Ready for Testing
Scene Order: 3
School garden workshop outline
ID: workshop-outline
🎯 Goal: Provide a bilingual (English first, then Spanish) 600-word outline for a 10-minute talk to 5th graders on regenerative farming.
📨 Input Events: chat_msg • viewer:teacher
"Can you draft your outline for next week's garden club visit? Both languages please!"
Ready for Testing
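Each scene goal above carries a word budget (under 120, about 300, under 60, and 600 words). A quick length check like the one below could be run on a model response before deeper scoring. This is a minimal sketch and not the suite's actual evaluator: the names `LIMITS`, `word_count`, and `within_limit` are hypothetical, whitespace splitting is a simplification, and the 15% tolerance applied to the "~300" and "600-word" goals is an assumption. The sample Spanish sentence is an illustrative translation, not output from any tested model.

```python
# Word budgets read off the scene goals. "max" means strictly under the limit;
# "approx" means within a tolerance of the target (the tolerance is an assumption).
LIMITS = {
    "water-sensor-fix": ("max", 120),
    "weekly-journal-dry-spell": ("approx", 300),
    "translation-safety": ("max", 60),
    "workshop-outline": ("approx", 600),
}


def word_count(text: str) -> int:
    """Count whitespace-separated tokens."""
    return len(text.split())


def within_limit(scene_id: str, text: str, tolerance: float = 0.15) -> bool:
    """Return True if the response length fits the scene's word budget."""
    kind, target = LIMITS[scene_id]
    count = word_count(text)
    if kind == "max":
        return count < target
    return abs(count - target) <= tolerance * target


if __name__ == "__main__":
    # Illustrative translation of the safety notice from the translation-safety scene.
    sample = ("Desconecte siempre la bomba solar antes de limpiar el filtro "
              "para evitar una descarga eléctrica.")
    print(word_count(sample), within_limit("translation-safety", sample))
```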
Latency by Model (This Suite)
Fastest

| Model | Duration | p95 | avg | N |
|---|---|---|---|---|
| [email protected]/Qw… | 8372 ms | 11380 ms | 8579 ms | 4 |
| [email protected]/Qw… | 11585 ms | 13549 ms | 11498 ms | 4 |
| meta-llama/llama-3.1-8b… | 24254 ms | 37153 ms | 25037 ms | 4 |
| google/gemma-3-12b-it | 24472 ms | 30709 ms | 25116 ms | 4 |
| neversleep/noromaid-20b | 27720 ms | 38563 ms | 25923 ms | 4 |

Slowest

| Model | Duration | p95 | avg | N |
|---|---|---|---|---|
| microsoft/phi-3-medium-… | 110168 ms | 116054 ms | 110723 ms | 4 |
| [email protected]/Qw… | 43621 ms | 46285 ms | 43943 ms | 4 |
| microsoft/phi-3.5-mini-… | 41305 ms | 72558 ms | 42451 ms | 4 |
| qwen/qwen3-8b | 37056 ms | 44089 ms | 37048 ms | 4 |
| deepseek/deepseek-r1-di… | 36112 ms | 48547 ms | 36647 ms | 4 |
Per-scene duration for this suite.
Completion Progress: 100% (4 of 4 scenes completed)
Evaluation Schema
Enhanced Framework
Version v2 • ACTIVE • 0 dimensions
Enhanced evaluation framework with character and technical dimensions
Top Weighted Dimensions
- Character Authenticity: 0.182
- Plan Validity: 0.155
- Contextual Intelligence: 0.136
Recent Runs
- 13965588: Dec. 17, 2025, midnight
- 16583772: Dec. 16, 2025, midnight
- 13536380: Dec. 15, 2025, midnight
- 14649479: Dec. 14, 2025, midnight
- 13171497: Dec. 13, 2025, midnight
- 16695286: Dec. 12, 2025, midnight
- 14296404: Dec. 11, 2025, midnight
- 13416361: Dec. 10, 2025, midnight
- 15735397: Dec. 9, 2025, midnight
- 13339377: Dec. 8, 2025, midnight