Kevin Santiago

science-technology-ai-data-privacy-lawyer-characters-ada-lovelace v2.0 Ethical
Backstory: Kevin, a Puerto Rican attorney who began his career as a software engineer, now specializes in algorithmic accountability and tech-law compliance. He drafts AI policy frameworks for public agencies, teaches night courses on AI ethics at a community college, and volunteers with civic-tech nonprofits to bridge law, code, and community impact.
100% Complete
6/6 scenes
Model Performance Overview
Scene Performance Matrix
Scene | meta-llama/llama-3.… | mistralai/mistral-7… | [email protected] | [email protected] | qwen/qwen-2.5-7b-in… | qwen/qwen3-14b | qwen/qwen3-8b
office-hours (Student office hours question) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
city-hall-hearing (City hall testimony prep) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
nonprofit-grant-brief (Nonprofit grant compliance brief) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
algorithmic-impact-assessment (AIA checklist request) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
night-class-lecture (Night class mini-lecture (long-form)) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
podcast-episode (Civic tech podcast episode (long-form)) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error) | 0.000 (Error)
Test Scenes 6
Scene Order: 0
Student office hours question
ID: office-hours
🎯 Goal:
Provide a concise, optimistic explanation of how GDPR applies to biased training data and suggest one actionable study resource.
📨 Input Events:
chat_msg student:marisol
"Professor Santiago, how does GDPR deal with biased datasets used to train public-sector AI?"
Ready for Testing
Scene Order: 1
City hall testimony prep
ID: city-hall-hearing
🎯 Goal:
Outline three key talking points for tomorrow’s city-hall hearing on algorithmic transparency, formatted as bullet points.
📨 Input Events:
chat_msg colleague:alex
"Kevin, can you draft quick talking points for the transparency hearing?"
Ready for Testing
Scene Order: 2
Nonprofit grant compliance brief
ID: nonprofit-grant-brief
🎯 Goal:
Summarize in under 120 words whether the nonprofit’s open-source AI tool triggers any COPPA obligations.
📨 Input Events:
chat_msg nonprofit-director:luis
"We built an open-source chatbot for teens. Do we need to think about COPPA?"
Ready for Testing
Scene Order: 3
AIA checklist request
ID: algorithmic-impact-assessment
🎯 Goal:
Deliver a five-item checklist for an algorithmic impact assessment (AIA) that a municipal agency can complete in one week.
📨 Input Events:
chat_msg agency-staffer:renee
"Our deadline is tight. What should our AIA cover this week?"
Ready for Testing
Scene Order: 4
Night class mini-lecture (long-form)
ID: night-class-lecture
🎯 Goal:
Produce a 400-word lecture script contrasting rule-based and machine-learning approaches to legal compliance, maintaining an optimistic, practical tone.
📨 Input Events:
chat_msg course-platform
"Upload tonight’s lecture script on rule-based vs ML compliance."
Ready for Testing
Scene Order: 5
Civic tech podcast episode (long-form)
ID: podcast-episode
🎯 Goal:
Draft a 3-minute podcast segment (~450 words) explaining how community audits improve public trust in AI systems, including one real example and a closing call-to-action.
📨 Input Events:
chat_msg podcast-producer
"Kevin, record a short segment on community audits and trust."
Ready for Testing
Latency by Model (This Suite)
Fastest
  • mistralai/mistral-7b-in… 100 ms (p95 175 ms • avg 112 ms • N 18)
  • qwen/qwen-2.5-7b-instru… 105 ms (p95 180 ms • avg 111 ms • N 16)
  • meta-llama/llama-3.1-8b… 107 ms (p95 322 ms • avg 151 ms • N 15)
  • qwen/qwen3-8b 116 ms (p95 146 ms • avg 115 ms • N 18)
  • qwen/qwen3-14b 133 ms (p95 162 ms • avg 133 ms • N 16)
Slowest
  • [email protected]/Qw… 7209 ms (p95 9523 ms • avg 7285 ms • N 6)
  • [email protected]/Qw… 5711 ms (p95 6312 ms • avg 5462 ms • N 6)
  • qwen/qwen3-14b 133 ms (p95 162 ms • avg 133 ms • N 16)
  • qwen/qwen3-8b 116 ms (p95 146 ms • avg 115 ms • N 18)
  • meta-llama/llama-3.1-8b… 107 ms (p95 322 ms • avg 151 ms • N 15)
Per-scene duration for this suite.
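
The p95, avg, and N figures listed above could be derived from raw per-scene durations roughly as follows. This is only a minimal sketch: it assumes the nearest-rank definition of the 95th percentile, which this page does not confirm, and the function name `summarize_latency` and the sample values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Union


def summarize_latency(samples_ms: List[float]) -> Dict[str, Union[float, int]]:
    """Summarize per-scene durations as p95, average, and sample count.

    Assumption: p95 uses the nearest-rank method (smallest value such that
    at least 95% of samples are less than or equal to it).
    """
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding issues.
    rank = (95 * n + 99) // 100
    return {"p95_ms": ordered[rank - 1], "avg_ms": mean(ordered), "n": n}


# Hypothetical durations in milliseconds, not taken from the dashboard.
example = summarize_latency([100, 105, 110, 98, 175, 102])
print(example["p95_ms"], example["n"])
```

With small N, as in the rows where N is 6, the nearest-rank p95 is simply the largest sample, so a single slow run dominates that figure.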
Completion Progress 100%
6 of 6 scenes completed
Evaluation Schema
Enhanced Framework
Version v2 ACTIVE
0 dimensions

Enhanced evaluation framework with character and technical dimensions

Top Weighted Dimensions
Character Authenticity: 0.182
Plan Validity: 0.155
Contextual Intelligence: 0.136
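
One way the dimension weights above might feed into a single scene score is a weight-normalized average. The sketch below is an assumption, not the framework's documented formula: the page lists only the top three weights, so the function normalizes over whichever dimensions are supplied, and `weighted_score` and the example scores are hypothetical.

```python
from typing import Dict

# Weights copied from the "Top Weighted Dimensions" list. The remaining
# dimensions are not shown on this page, so they are left out here.
TOP_WEIGHTS: Dict[str, float] = {
    "Character Authenticity": 0.182,
    "Plan Validity": 0.155,
    "Contextual Intelligence": 0.136,
}


def weighted_score(scores: Dict[str, float],
                   weights: Dict[str, float] = TOP_WEIGHTS) -> float:
    """Return the weight-normalized average over dimensions present in both maps.

    Assumption: each dimension score lies in [0, 1]. Dimensions missing from
    either map are ignored; if none overlap, the score is 0.0.
    """
    shared = [name for name in weights if name in scores]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        return 0.0
    return sum(weights[name] * scores[name] for name in shared) / total_weight


# Hypothetical per-dimension scores for one scene.
print(round(weighted_score({
    "Character Authenticity": 1.0,
    "Plan Validity": 0.0,
    "Contextual Intelligence": 0.0,
}), 3))
```

Normalizing by the sum of the weights actually used keeps the result in the same range as the inputs even when only a subset of dimensions is available.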
Recent Runs
26186963
Dec. 17, 2025, 12:02 a.m.
49836406
Dec. 16, 2025, 12:02 a.m.
17784709
Dec. 15, 2025, 12:02 a.m.
21527854
Dec. 14, 2025, 12:02 a.m.
19080517
Dec. 13, 2025, 12:02 a.m.
41588006
Dec. 12, 2025, 12:02 a.m.
32991560
Dec. 11, 2025, 12:02 a.m.
22482631
Dec. 10, 2025, 12:02 a.m.
40303657
Dec. 9, 2025, 12:02 a.m.
26079617
Dec. 8, 2025, 12:02 a.m.
Latency Overview (This Suite)