Jamal Rahman

education-academia-phd-researcher-characters-albert-einstein v2.0 Ethical
Backstory: Jamal is a bilingual educational policy researcher completing a dissertation on how different community college funding models affect long-term success for immigrant students. A big-picture thinker with sharp data-analytics skills, he regularly briefs state legislators and collaborates with non-profits to turn research into practice. His speaking style balances evidence-driven rigor with accessible explanations for diverse audiences.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
legislator-elevator-pitch (Three-sentence pitch to a state legislator) | 0.641 | 0.900 | 0.000 (Error) | 0.000 (Error) | 0.591 | 0.651 | 0.575
nonprofit-data-point (Non-profit seeks a key statistic) | 0.000 | 0.400 | 0.000 (Error) | 0.000 (Error) | 0.330 | 0.155 | 0.606
policy-memo-longform (Full policy memo for committee review) | 0.000 | 0.504 | 0.000 (Error) | 0.000 (Error) | 0.565 | 0.605 | 0.617
spanish-translation (Translate brief into Spanish) | 0.197 | 0.610 | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.301 | 0.660
limitations-discussion (Address methodological limitations) | 0.475 | 0.581 | 0.000 (Error) | 0.000 (Error) | 0.364 | 0.579 | 0.593
podcast-script-longform (Podcast segment script) | 0.021 | 0.445 | 0.000 (Error) | 0.000 (Error) | 0.395 | 0.425 | 0.517
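The matrix above reports one score per scene and model but no per-model aggregate. As a rough illustration, the sketch below averages three of the columns in Python. It assumes that cells marked "Error" should be treated as missing rather than counted as 0.000; the dashboard does not say how it handles them, and `mean_score` is a hypothetical helper, not part of the evaluation framework.

```python
from statistics import mean

# Scores copied from the Scene Performance Matrix, in scene order.
# None marks a cell flagged "Error" (assumption: errors are excluded,
# not scored as zero).
scores = {
    "mistralai/mistral-7…": [0.900, 0.400, 0.504, 0.610, 0.581, 0.445],
    "qwen/qwen-2.5-7b-in…": [0.591, 0.330, 0.565, None, 0.364, 0.395],
    "qwen/qwen3-8b": [0.575, 0.606, 0.617, 0.660, 0.593, 0.517],
}


def mean_score(values):
    """Average the non-missing scores; return None if every cell is missing."""
    valid = [v for v in values if v is not None]
    return mean(valid) if valid else None


summary = {model: mean_score(vals) for model, vals in scores.items()}

for model, avg in summary.items():
    print(f"{model}: {avg:.3f}")
```

Counting error cells as zero instead would lower the average for any model with failed scenes, so the choice should be stated alongside any ranking derived from these numbers.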
Test Scenes 6
Scene Order: 0
Three-sentence pitch to a state legislator
ID: legislator-elevator-pitch
🎯 Goal:
Deliver a clear, persuasive elevator pitch (max three sentences) summarizing Jamal’s key dissertation finding and its policy relevance.
📨 Input Events:
chat_msg viewer:rep_sam
"I’ve only got a minute before my next vote—what’s the headline finding of your research?"
Ready for Testing
Scene Order: 1
Non-profit seeks a key statistic
ID: nonprofit-data-point
🎯 Goal:
Provide one precise, sourced statistic from Jamal’s dataset and briefly explain its significance in one sentence.
📨 Input Events:
chat_msg viewer:nonprofit_director
"Can you give me a concrete number we can cite in our grant proposal?"
Ready for Testing
Scene Order: 2
Full policy memo for committee review
ID: policy-memo-longform
🎯 Goal:
Write a structured policy memo of 500–600 words that includes an executive summary, background, three actionable recommendations, and a short bibliography.
📨 Input Events:
chat_msg viewer:committee_clerk
"The committee wants a full memo by tomorrow. Can you draft it now?"
Ready for Testing
Scene Order: 3
Translate brief into Spanish
ID: spanish-translation
🎯 Goal:
Translate Jamal’s previously delivered three-sentence pitch into clear, professional Spanish without altering meaning.
📨 Input Events:
chat_msg viewer:rep_sam
"Our Latino caucus needs that pitch in Spanish, please."
Ready for Testing
Scene Order: 4
Address methodological limitations
ID: limitations-discussion
🎯 Goal:
List two key limitations of Jamal’s study and suggest one way future research could address each, all within 120 words.
📨 Input Events:
chat_msg viewer:grad_peer
"Reviewers are asking about your study’s limitations—what are the main ones?"
Ready for Testing
Scene Order: 5
Podcast segment script
ID: podcast-script-longform
🎯 Goal:
Draft a conversational script (~2 minutes spoken, about 300 words) explaining the study’s findings to a general audience; include at least one statistic and one real-world anecdote.
📨 Input Events:
chat_msg viewer:podcast_host
"Let’s prep your segment—give me a script that feels engaging for listeners who aren’t policy wonks."
Ready for Testing
Latency by Model (This Suite)
Fastest
  • [email protected]/Qw…: 6302 ms (p95 11377 ms, avg 7296 ms, N 6)
  • qwen/qwen-2.5-7b-instru…: 17843 ms (p95 19086 ms, avg 15252 ms, N 6)
  • meta-llama/llama-3.1-8b…: 23757 ms (p95 77779 ms, avg 35285 ms, N 6)
  • qwen/qwen3-14b: 23815 ms (p95 33152 ms, avg 25777 ms, N 6)
  • qwen/qwen3-8b: 25116 ms (p95 33444 ms, avg 26288 ms, N 6)
Slowest
  • [email protected]/Qw…: 41215 ms (p95 189718 ms, avg 73624 ms, N 6)
  • mistralai/mistral-7b-in…: 27270 ms (p95 30768 ms, avg 26505 ms, N 6)
  • qwen/qwen3-8b: 25116 ms (p95 33444 ms, avg 26288 ms, N 6)
  • qwen/qwen3-14b: 23815 ms (p95 33152 ms, avg 25777 ms, N 6)
  • meta-llama/llama-3.1-8b…: 23757 ms (p95 77779 ms, avg 35285 ms, N 6)
Per-scene duration for this suite.
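The p95, avg, and N figures above could be derived from per-scene durations along the following lines. This is a sketch only: the dashboard does not state which percentile method it uses, so the nearest-rank definition here is an assumption, `summarize_latency` is a hypothetical helper, and the sample durations are invented for illustration.

```python
import math


def summarize_latency(durations_ms):
    """Return (p95, avg, n) for a list of durations in milliseconds.

    Uses the nearest-rank percentile: the value at 1-based rank
    ceil(0.95 * n) in the sorted list. This method is an assumption.
    """
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    n = len(ordered)
    rank = math.ceil(0.95 * n)
    p95 = ordered[rank - 1]
    avg = sum(ordered) / n
    return p95, avg, n


# Hypothetical durations for six scenes (not taken from the dashboard).
sample = [21000, 22500, 23000, 24500, 26000, 33000]
p95, avg, n = summarize_latency(sample)
print(f"p95 {p95} ms, avg {avg:.0f} ms, N {n}")
```

With only six scenes per model, the nearest-rank p95 is simply the slowest scene, so a single outlier can dominate that column.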
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
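One plausible way dimension weights like these feed into a scene score is a weighted average. The sketch below assumes that form; the framework's actual aggregation is not shown here. Only the three listed weights come from the dashboard. The per-dimension scores are hypothetical, and because the remaining dimensions and their weights are not listed, the sketch renormalizes over the weights it has.

```python
# Weights as listed under "Top Weighted Dimensions".
weights = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}

# Hypothetical per-dimension scores for a single response.
dimension_scores = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.5,
    "Contextual Intelligence": 0.6,
}


def weighted_score(scores, weights):
    """Weighted average over dimensions that have both a score and a weight."""
    shared = [d for d in scores if d in weights]
    total_weight = sum(weights[d] for d in shared)
    if total_weight == 0:
        raise ValueError("no weighted dimensions to aggregate")
    return sum(scores[d] * weights[d] for d in shared) / total_weight


overall = weighted_score(dimension_scores, weights)
print(f"overall: {overall:.3f}")
```

Dividing by the sum of the weights in use keeps the result on the same 0 to 1 scale as the inputs even when only a subset of dimensions is available.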
Recent Runs
20463864: Dec. 17, 2025, 12:01 a.m.
33800248: Dec. 16, 2025, 12:01 a.m.
17012529: Dec. 15, 2025, 12:01 a.m.
18141238: Dec. 14, 2025, 12:01 a.m.
17672501: Dec. 13, 2025, 12:01 a.m.
28801487: Dec. 12, 2025, 12:01 a.m.
24664273: Dec. 11, 2025, 12:01 a.m.
17920411: Dec. 10, 2025, 12:01 a.m.
28451636: Dec. 9, 2025, 12:01 a.m.
18994744: Dec. 8, 2025, 12:01 a.m.
Latency Overview (This Suite)