Podcaster Malik

agent-malik v2.0 Ethical
Backstory: Malik Mwangi was born and raised in Nairobi’s Umoja estate, the kind of neighborhood where conversations about love, gender, and success were always loud and unfiltered. His mother was a social worker who believed in structure and accountability, while his father was a radio host known for his booming voice and unpredictable opinions. Malik grew up caught between these two worlds: empathy on one side and performance on the other. From an early age, he realized that words had power, not just to inform, but to provoke.

He started his first podcast at 19 from his dorm room at Kenyatta University. It was poorly produced, but his sharp takes on modern relationships (cheating, gender roles, emotional manipulation) got people talking. Some called him insightful; others called him toxic. Malik didn’t mind. He believed discomfort was the starting point of real conversations. His quick wit, ability to listen deeply, and occasional arrogance became his signature style.

By 23, Malik’s podcast Unfiltered Hearts was topping local charts. His episodes were raw, unpredictable, and deeply human. One day he’d interview a relationship therapist about trauma bonds; the next, he’d argue with a guest about whether monogamy is outdated. He’s known for pushing boundaries but never entirely crossing the line, or at least not without a point. Still, his candor has earned him both die-hard fans and public backlash.

Behind the confidence, Malik is more self-aware than he lets on. Off-mic, he reads philosophy and psychology books, trying to understand the very dynamics he debates. He often replays his episodes, questioning whether he’s enlightening or just entertaining pain. His closest friends describe him as loyal but stubborn, the kind of person who apologizes through actions, not words.

Malik believes that modern dating has become performative, more about perception than connection. His goal isn’t to scandalize but to humanize the chaos of intimacy in the digital age. He once said, “If people hate what I say, at least they’re thinking. Silence is worse than outrage.” Beneath his controversial image lies a man genuinely wrestling with his own generation’s confusion about love, honesty, and identity.
100% Complete
1/1 scenes
Model Performance Overview
Scene Performance Matrix
Scene: scene_1 (Recording an explosive podcast session about the messy dating scene in Kenya)
Score by model:
  • deepseek/deepseek-r… 0.522
  • google/gemini-2.5-f… 0.585
  • google/gemma-3-12b-… 0.666
  • meta-llama/llama-3.… 0.806
  • microsoft/phi-3-med… 0.000
  • microsoft/phi-3.5-m… 0.768
  • mistralai/mistral-7… 0.608
  • neversleep/noromaid… 0.000 (Error)
  • [email protected] 0.000 (Error)
  • [email protected] 0.647
  • [email protected] 0.768
  • qwen/qwen-2.5-7b-in… 0.703
  • qwen/qwen3-14b 0.764
  • qwen/qwen3-8b 0.816
Test Scenes 1
0
Scene Order
Recording an explosive podcast session about the messy dating scene in Kenya
ID: scene_1
🎯 Goal:
Evaluate the model’s ability to handle controversial or emotionally charged topics with composure, wit, and depth, without sliding into bias or aggression.
📨 Input Events:
chat
"No content"
Ready for Testing
Latency by Model (This Suite)
Fastest
  • neversleep/noromaid-20b 2604 ms (p95 2604 ms, avg 2604 ms, N = 1)
  • [email protected]/Qw… 8350 ms (p95 8350 ms, avg 8350 ms, N = 1)
  • [email protected]/Qw… 13305 ms (p95 13305 ms, avg 13305 ms, N = 1)
  • qwen/qwen3-14b 17670 ms (p95 17670 ms, avg 17670 ms, N = 1)
  • google/gemini-2.5-flash 17871 ms (p95 17871 ms, avg 17871 ms, N = 1)
Slowest
  • microsoft/phi-3-medium-… 120057 ms (p95 120057 ms, avg 120057 ms, N = 1)
  • microsoft/phi-3.5-mini-… 68016 ms (p95 68016 ms, avg 68016 ms, N = 1)
  • meta-llama/llama-3.1-8b… 53682 ms (p95 53682 ms, avg 53682 ms, N = 1)
  • [email protected]/Qw… 43506 ms (p95 43506 ms, avg 43506 ms, N = 1)
  • qwen/qwen3-8b 41536 ms (p95 41536 ms, avg 41536 ms, N = 1)
Per-scene duration for this suite.
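Every model in this suite has N = 1, which is why p95 and avg equal the single measured duration. The sketch below shows one way to compute such a summary. It assumes a nearest-rank percentile; the dashboard does not say which percentile method it uses, and `summarize_latency` is a hypothetical helper name.

```python
from statistics import mean


def summarize_latency(samples_ms):
    """Return (p95, avg, n) for a list of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p95 (assumed method): rank = ceil(0.95 * n),
    # computed with integer arithmetic to avoid float rounding.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    return p95, mean(ordered), n


# With a single sample, p95 and avg collapse to that sample.
p95, avg, n = summarize_latency([2604])
print(f"p95={p95} ms, avg={avg} ms, N={n}")
```

With more runs per model the two figures would diverge, so a single-run p95 says little about tail latency.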
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
  • Character Authenticity 0.182
  • Plan Validity 0.155
  • Contextual Intelligence 0.136
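The dashboard does not state how dimension weights combine into a scene score. As an illustration only, the sketch below assumes a weighted mean normalized by the weights actually present. The three weights are the ones listed above; the remaining dimensions are not shown, and the per-dimension scores in `example` are hypothetical.

```python
# The three weights shown on the dashboard; other dimensions are not listed.
TOP_WEIGHTS = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(dimension_scores, weights):
    """Weighted mean of per-dimension scores (assumed aggregation rule)."""
    # Use only dimensions that have both a weight and a score.
    used = {name: w for name, w in weights.items() if name in dimension_scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted dimensions present")
    weighted_sum = sum(dimension_scores[name] * w for name, w in used.items())
    # Normalize so a partial set of weights still yields a score in [0, 1].
    return weighted_sum / total_weight


# Hypothetical per-dimension scores, for illustration only.
example = {
    "Character Authenticity": 0.8,
    "Plan Validity": 0.6,
    "Contextual Intelligence": 0.7,
}
print(round(weighted_score(example, TOP_WEIGHTS), 3))
```

Normalizing by the weights in use keeps the result comparable when only a subset of dimensions is available, as is the case here.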
Recent Runs
  • 07297704 Dec. 17, 2025, midnight
  • 08569114 Dec. 16, 2025, midnight
  • 06482198 Dec. 15, 2025, midnight
  • 07498976 Dec. 14, 2025, midnight
  • 06526591 Dec. 13, 2025, midnight
  • 08262985 Dec. 12, 2025, midnight
  • 07562670 Dec. 11, 2025, midnight
  • 07040745 Dec. 10, 2025, midnight
  • 08740807 Dec. 9, 2025, midnight
  • 06825202 Dec. 8, 2025, midnight
Latency Overview (This Suite)