Layla Al-Khatib

oil-billionares-mohammed-al-amoudi v2.0 Ethical
Backstory: Layla is a civil engineer turned regional petroleum infrastructure developer who manages pipeline and refinery projects across East Africa and the Middle East. Educated in both civil engineering and Islamic finance, she balances profit targets with rigorous safety audits, local hiring mandates, and community expectations. Known for a strategic yet collaborative leadership style, Layla is often called on to reconcile technical, financial, and cultural considerations.
100% Complete
4/4 scenes
Model Performance Overview
Scene Performance Matrix
| Model | Community Jobs Question (local-hiring-query) | Pipeline Leak Alert (safety-incident-response) | Q2 Sustainability Report (quarterly-sustainability-report) | Sharia-Compliant Funding Plan (sharia-finance-proposal) |
|---|---|---|---|---|
| deepseek/deepseek-r… | 0.894 | 0.681 | 0.679 | 0.809 |
| google/gemini-2.5-f… | 0.556 | 0.608 | 0.891 | 0.677 |
| google/gemma-3-12b-… | 0.849 | 0.732 | 0.699 | 0.789 |
| meta-llama/llama-3.… | 0.000 | 0.000 | 0.845 | 0.574 |
| microsoft/phi-3-med… | 0.000 | 0.000 | 0.000 | 0.021 |
| microsoft/phi-3.5-m… | 0.769 | 0.644 | 0.568 | 0.000 (Error) |
| mistralai/mistral-7… | 0.839 | 0.794 | 0.660 | 0.821 |
| neversleep/noromaid… | 0.711 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| [email protected] | 0.807 | 0.428 | 0.401 | 0.352 |
| [email protected] | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) |
| [email protected] | 0.838 | 0.749 | 0.627 | 0.600 |
| [email protected] | 0.752 | 0.805 | 0.649 | 0.628 |
| [email protected] | 0.741 | 0.591 | 0.809 | 0.587 |
| qwen/qwen-2.5-7b-in… | 0.877 | 0.442 | 0.658 | 0.547 |
| qwen/qwen3-14b | 0.805 | 0.726 | 0.614 | 0.623 |
| qwen/qwen3-8b | 0.805 | 0.797 | 0.821 | 0.701 |
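One way to read the matrix above is to average each model's scores across the four scenes. The following is a minimal sketch, not the dashboard's own aggregation: the scores are copied from the matrix for a few models, and treating cells flagged "Error" as missing (rather than as a score of 0.000) is an assumption made here for illustration. The `summarize` helper is hypothetical.

```python
from statistics import mean

# Scene order used for every row below.
SCENES = [
    "local-hiring-query",
    "safety-incident-response",
    "quarterly-sustainability-report",
    "sharia-finance-proposal",
]

# Scores copied from the matrix. None marks a cell flagged "Error"
# (assumption: errored cells are excluded instead of counted as 0.000).
SCORES = {
    "deepseek/deepseek-r…": [0.894, 0.681, 0.679, 0.809],
    "google/gemini-2.5-f…": [0.556, 0.608, 0.891, 0.677],
    "neversleep/noromaid…": [0.711, None, None, None],
    "qwen/qwen3-8b": [0.805, 0.797, 0.821, 0.701],
}


def summarize(row):
    """Return the mean of the non-errored scores and the error count."""
    valid = [score for score in row if score is not None]
    return {
        "mean": round(mean(valid), 3) if valid else None,
        "errors": len(row) - len(valid),
    }


if __name__ == "__main__":
    for model, row in SCORES.items():
        print(model, summarize(row))
```

Counting errored cells as missing keeps a single failed run from dragging a model's average toward zero, but it also hides reliability problems, so the error count is reported alongside the mean.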
Test Scenes 4
0
Scene Order
Community Jobs Question
ID: local-hiring-query
🎯 Goal:
Provide a concise, actionable answer outlining Layla's local hiring commitments and next steps.
📨 Input Events:
chat_msg community_rep:Fatima
"Residents want to know how many local positions the new pipeline project will open and when applications start."
Ready for Testing
1
Scene Order
Pipeline Leak Alert
ID: safety-incident-response
🎯 Goal:
Deliver a prompt incident-response statement emphasizing safety, containment steps, and community reassurance in under 120 words.
📨 Input Events:
world_event control_room
"Minor leak detected at Pump Station 3; automatic shut-off engaged."
Ready for Testing
2
Scene Order
Q2 Sustainability Report
ID: quarterly-sustainability-report
🎯 Goal:
Produce an 800-word executive summary detailing safety metrics, local hiring statistics, and community engagement for the East Africa pipeline, written in layman-friendly language.
📨 Input Events:
chat_msg board_chair:Mr. Njoroge
"Layla, please draft the Q2 sustainability report summary for shareholders."
Ready for Testing
3
Scene Order
Sharia-Compliant Funding Plan
ID: sharia-finance-proposal
🎯 Goal:
Draft a 500-word proposal outlining a Sharia-compliant financing structure for the new refinery, highlighting projected ROI and community benefits while avoiding interest-based instruments.
📨 Input Events:
chat_msg investor_relations:Salim
"We need a Sharia-compliant funding outline for the Saudi refinery expansion. Can you prepare it?"
Ready for Testing
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw… 8872 ms (p95 9554 ms, avg 8742 ms, N 4)
  • [email protected]/Qw… 9393 ms (p95 9837 ms, avg 9363 ms, N 4)
  • [email protected]/Qw… 11066 ms (p95 13240 ms, avg 11055 ms, N 4)
  • neversleep/noromaid-20b 12561 ms (p95 50821 ms, avg 17962 ms, N 17)
  • [email protected]/Qw… 12925 ms (p95 14243 ms, avg 12797 ms, N 4)
Slowest
  • microsoft/phi-3-medium-… 705418 ms (p95 1215654 ms, avg 796421 ms, N 60)
  • qwen/qwen3-8b 100446 ms (p95 186309 ms, avg 105059 ms, N 48)
  • [email protected]/Qw… 42211 ms (p95 46403 ms, avg 42235 ms, N 4)
  • microsoft/phi-3.5-mini-… 33541 ms (p95 128162 ms, avg 51305 ms, N 13)
  • deepseek/deepseek-r1-di… 28261 ms (p95 33605 ms, avg 27710 ms, N 20)
Per-scene duration for this suite.
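The p95, avg, and N figures listed for each model can be derived from a list of per-scene durations. The sketch below shows one common approach, the nearest-rank percentile. The page does not state which percentile method it uses, so this is an assumption, and the sample durations are hypothetical rather than taken from the suite.

```python
import math


def latency_summary(durations_ms):
    """Summarize durations with nearest-rank p95, arithmetic mean, and count."""
    if not durations_ms:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations_ms)
    n = len(ordered)
    # Nearest-rank method: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * n)
    return {"p95": ordered[rank - 1], "avg": sum(ordered) / n, "n": n}


if __name__ == "__main__":
    # Hypothetical durations in milliseconds for a model with four scene runs.
    sample = [8000, 9000, 8500, 9500]
    print(latency_summary(sample))
```

With only four samples, the nearest-rank p95 is simply the slowest run, which is worth keeping in mind when comparing models with N 4 against models with N 48 or N 60.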
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
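To illustrate how dimension weights like these might feed into a single score, here is a minimal sketch of a weighted average. Only the three weights listed above come from the page. The remaining dimensions and the framework's actual combination rule are not shown, so renormalizing over the available weights is an assumption, and both `weighted_score` and the per-dimension scores are hypothetical.

```python
# Weights copied from the "Top Weighted Dimensions" list; other dimensions are not shown.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights=TOP_WEIGHTS):
    """Weighted average over the dimensions present in both mappings.

    Assumption: weights are renormalized to sum to 1 over the dimensions used.
    """
    used = {dim: w for dim, w in weights.items() if dim in dimension_scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no weighted dimensions found in the scores")
    return sum(dimension_scores[dim] * w for dim, w in used.items()) / total


if __name__ == "__main__":
    # Hypothetical per-dimension scores for one model response.
    example = {
        "Character Authenticity": 0.9,
        "Plan Validity": 0.8,
        "Contextual Intelligence": 0.7,
    }
    print(round(weighted_score(example), 3))
```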
Recent Runs
37338924: Dec. 17, 2025, midnight
42909632: Dec. 16, 2025, midnight
34696158: Dec. 15, 2025, midnight
37598514: Dec. 14, 2025, midnight
34809586: Dec. 13, 2025, midnight
41806532: Dec. 12, 2025, midnight
36390787: Dec. 11, 2025, midnight
35706788: Dec. 10, 2025, midnight
40419107: Dec. 9, 2025, midnight
35664792: Dec. 8, 2025, midnight
Latency Overview (This Suite)