Free Live Kickoff

    AI Evals Kickoff: Stop Shipping AI on Vibes

    Join Kadamb Goswami (Product Leader · Amazon) for a free live session.

📅 April 19, 2026
⏰ 5:00 PM PDT
⏱ 60 minutes
🆓 Free to Join
Kadamb Goswami

    Product Leader · Amazon

    ⭐ 4.8 / 5

    If you can't measure AI quality, you can't ship AI products.

One of the biggest gaps in AI product development is not models or features: it is evaluation. Traditional product metrics break down in AI systems. Accuracy is incomplete, user feedback is noisy, and behavior is inconsistent. That leaves most teams guessing: Is what we built actually better? Is it reliable enough to ship? What does 'good' even mean?

This sprint teaches you how to think about quality, reliability, and performance in AI systems from a product perspective, designing evaluation frameworks that combine quantitative metrics, human judgment, and real-world usage signals. It is not about becoming an ML engineer. It is about becoming a product leader who can make informed decisions about AI behavior.

6 Weeks · Live instruction
3 Projects · Real deliverables
30 Seats · Per cohort, capped

    What You'll Learn

    📊

    Data Collection for Evals

    Instrument your AI system to capture the right signals. Generate synthetic data to bootstrap evals before you have real users.

    🔍

    Error Analysis at Speed

    Apply data analysis techniques to find systematic failures in your AI product — fast — regardless of use case.

    ⚖️

    LLM-as-a-Judge

    Build custom, high-quality evaluators aligned to your product goals and stakeholder trust — not generic off-the-shelf benchmarks.

    🏗️

    Architecture-Specific Evals

    Measure RAG retrieval quality, debug multi-step pipelines, and handle multi-modal settings with the right eval strategies.
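To make the LLM-as-a-judge idea concrete, here is a minimal sketch of a custom evaluator: a rubric prompt, a strict parser for the judge's verdict, and a stub in place of the real model call. The criteria names (`faithfulness`, `completeness`, `tone`) and the `fake_judge_model` stub are illustrative assumptions, not the sprint's actual framework.

```python
import json

# Rubric the judge grades against. Criteria here are illustrative examples,
# not the sprint's curriculum; in practice they come from your product goals.
RUBRIC = """You are grading an AI assistant's answer.
Score each criterion 1-5 and return JSON:
{"faithfulness": int, "completeness": int, "tone": int}"""

def build_judge_prompt(question: str, answer: str) -> str:
    """Assemble the grading prompt sent to the judge model."""
    return f"{RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}\nJSON:"

def parse_verdict(raw: str, criteria=("faithfulness", "completeness", "tone")) -> dict:
    """Parse the judge's JSON reply; reject malformed or incomplete verdicts."""
    verdict = json.loads(raw)
    missing = [c for c in criteria if c not in verdict]
    if missing:
        raise ValueError(f"judge omitted criteria: {missing}")
    return {c: int(verdict[c]) for c in criteria}

# Stand-in for a real LLM provider call, so the sketch runs offline.
def fake_judge_model(prompt: str) -> str:
    return '{"faithfulness": 5, "completeness": 4, "tone": 5}'

prompt = build_judge_prompt("What is our refund window?", "30 days from delivery.")
scores = parse_verdict(fake_judge_model(prompt))
print(scores)  # per-criterion scores, e.g. {'faithfulness': 5, ...}
```

The strict parser matters: a judge that silently drops a criterion produces scores you cannot compare across runs.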

    Who Is This For?

    This sprint is designed for:

    🤖

    AI PMs Shipping on Vibes

    Who know their AI feature could be better but have no systematic way to measure or improve it.

    📈

    Senior PMs & Leads

    Who need to set quality standards for AI teams and brief engineering on what 'good' looks like.

    🔧

    Technical PMs & AI Leads

    Who want rigorous eval frameworks they can implement themselves — not just theory.

    Sprint Outline

    6 weeks · 3 sessions per week

    Projects You'll Ship

    Leave with real work to show, not just a certificate.

    01

    Eval Instrumentation Plan

    Design the observability layer for a real AI product — what to log, how to sample, and what to synthesize when you have no data.

    02

    LLM-as-a-Judge Scorecard

    Build a custom evaluator for a specific AI use case, validated against human judgment and stakeholder sign-off.

    03

    Production Eval Pipeline

    End-to-end: automated eval gates, experiment tracking, and safety guardrails — presented live in the final session.
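An automated eval gate of the kind this project describes can be sketched in a few lines: run the eval suite, compute a pass rate, and block the deploy if it falls below a threshold. The function name, the 90% threshold, and the toy results are assumptions for illustration only.

```python
# Minimal sketch of a CI eval gate, assuming each eval case yields a boolean.
def eval_gate(results, threshold=0.9):
    """Return (passes, pass_rate) for a list of per-case pass/fail booleans."""
    pass_rate = sum(results) / len(results)
    return pass_rate >= threshold, pass_rate

# Toy data: a baseline that clears the gate and a candidate that regresses.
baseline = [True] * 18 + [False] * 2    # 90% pass rate
candidate = [True] * 16 + [False] * 4   # 80% pass rate

for name, results in [("baseline", baseline), ("candidate", candidate)]:
    ok, rate = eval_gate(results)
    print(f"{name}: {rate:.0%} -> {'DEPLOY' if ok else 'BLOCK'}")
```

Wired into CI, a failing gate exits nonzero and the regression never reaches production, which is the whole point of the pipeline.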

    Your Instructors

Kadamb Goswami

    Product Leader · Amazon

    ⭐ 4.8 / 5

    Kadamb is a Product Leader at Amazon, specializing in building and scaling AI-driven systems for high-volume, mission-critical workflows. He focuses on helping PMs evaluate AI products rigorously — turning ambiguity into clear decisions and measurable outcomes.

    What Students Say

    ⭐⭐⭐⭐⭐

    "I went from shipping AI features and hoping they worked to having a systematic way to know they work. This is the missing piece for every AI PM."

Tyler Bennett

    Product Manager · Databricks

    ⭐⭐⭐⭐⭐

    "The LLM-as-a-judge module alone was worth the entire sprint. We shipped it to production two weeks after the session."

Sara Hoffman

    Senior PM · Cloudflare

    ⭐⭐⭐⭐⭐

"Finally a sprint that treats AI quality as a PM problem, not just an engineering problem. Kadamb's frameworks are immediately usable."

James Kim

    AI Product Lead · Rippling

    Sprint Schedule

    All sessions are instructor-led and live. Recordings available within 24 hours.

    SUNDAY

    9:00 AM PDT

    Live Class

    Primary topic deep dive with instructor. Includes lecture, case studies, and live Q&A.

    WEDNESDAY

    6:00 PM PDT

    Coaching Session

    Small group coaching. Bring your eval questions and current blockers.

    THURSDAY

    6:00 PM PDT

    Practice Session

    Hands-on practice and peer review. Build your evals with cohort support.


    LIVE KICKOFF

    AI Evals Kickoff: Stop Shipping AI on Vibes

    with Kadamb Goswami · Product Leader, Amazon

    📅 April 19, 2026
    5:00 PM PDT
    60 minutes
    💻 Live on Zoom

    What you'll walk away with:

    Run a live error analysis on a provided AI product dataset — identify one real failure pattern in the session
    Build a 3-criteria LLM-as-a-judge eval for a provided use case — score it against human judgment
    Audit a provided AI feature's eval coverage and identify the first gap that would let a regression slip through
    Detailed preview of the 6-week sprint
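Scoring a judge against human judgment, as in the second exercise above, usually starts with a simple agreement rate plus a confusion breakdown. The labels below are toy data invented for illustration.

```python
# Validate an LLM judge against human labels on the same cases.
from collections import Counter

# Toy labels -- in the session these would come from annotated eval cases.
human = ["pass", "pass", "fail", "pass", "fail", "pass"]
judge = ["pass", "fail", "fail", "pass", "fail", "pass"]

# Fraction of cases where judge and human agree.
agreement = sum(h == j for h, j in zip(human, judge)) / len(human)
print(f"judge/human agreement: {agreement:.0%}")

# Confusion breakdown shows where the judge diverges from humans.
confusion = Counter(zip(human, judge))
for (h, j), n in sorted(confusion.items()):
    print(f"human={h:4s} judge={j:4s}: {n}")
```

A low agreement rate means the judge needs rubric work before its scores can gate anything; the confusion breakdown tells you which direction it errs.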

    🎁 Bonus for attendees:

    Get "The AI PM Eval Starter Kit"

    Templates for annotation, LLM-as-a-judge, and CI/CD eval gates

    Claim your free seat

    Skills you can deploy on Monday morning.