Solutech

AI-201

Building Production AI Systems

Take a working LLM prototype and turn it into a system you can put on call rotation without flinching.

ABOUT THIS COURSE

What you will learn.

Most LLM features ship in a state that is functionally a demo. They work on the happy path, they fall over on the long tail, and the team has no way of knowing whether a model change made things better or worse. This course is about closing that gap.

We spend six weeks on the operational disciplines: building eval harnesses that catch regressions, designing prompts and tool-use contracts that degrade gracefully, instrumenting the system so that you can answer the question “did anything get worse this week,” running incident response when the answer is yes, and managing cost without breaking the product. You ship four substantial projects across the six weeks, with weekly code review from the instructor and peer review from your cohort.

Alumni from this course routinely tell us it is the program that changed how their team operates, not just how they build. That is the intended outcome.

WHAT YOU’LL BUILD

Four substantial projects.

Project 01

An eval harness with regression gates

Build an eval suite that runs on every model and prompt change, fails CI on regression, and reports honestly.
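To give a feel for what a regression gate means in practice, here is a minimal sketch in Python. It is an illustration, not the course's reference implementation: the names `EvalResult`, `pass_rate`, and `regression_gate`, and the `max_drop` tolerance of two percentage points, are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Outcome of one eval case (hypothetical shape, for illustration only)."""
    case_id: str
    passed: bool


def pass_rate(results):
    """Fraction of eval cases that passed."""
    if not results:
        raise ValueError("no eval results to score")
    return sum(r.passed for r in results) / len(results)


def regression_gate(baseline, candidate, max_drop=0.02):
    """Compare a candidate run against a baseline run.

    Returns (ok, report). `ok` is False when the candidate's pass rate
    falls more than `max_drop` below the baseline's. The report also lists
    cases that passed at baseline and fail now, so the failure is explainable.
    """
    base = pass_rate(baseline)
    cand = pass_rate(candidate)
    passed_before = {r.case_id for r in baseline if r.passed}
    newly_failing = sorted(
        r.case_id for r in candidate
        if not r.passed and r.case_id in passed_before
    )
    ok = (base - cand) <= max_drop
    return ok, {"baseline": base, "candidate": cand, "newly_failing": newly_failing}
```

In a CI job, a wrapper script could call `regression_gate` and exit non-zero when `ok` is False, which is what fails the build. Reporting the newly failing case IDs alongside the aggregate number is one way to keep the report honest: a flat pass rate can hide cases that swapped places.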

Project 02

A traced and budgeted production endpoint

Add tracing, per-request cost budgeting, and a circuit breaker to an existing LLM-backed endpoint.
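As a rough illustration of two of these pieces, the sketch below shows a per-request cost budget and a simple circuit breaker in Python. Everything here is hypothetical: the class names, the thresholds, and the dollar amounts are assumptions chosen for the example, and a production version would also need thread safety and tracing hooks.

```python
import time


class BudgetExceeded(Exception):
    """Raised when a request would spend more than its allotted budget."""


class CircuitOpen(Exception):
    """Raised when calls are rejected because the breaker is open."""


class RequestBudget:
    """Tracks spend for a single request against a hard cap in USD."""

    def __init__(self, max_usd):
        self.max_usd = max_usd
        self.spent_usd = 0.0

    def charge(self, usd):
        # Refuse the charge up front rather than discovering the overrun later.
        if self.spent_usd + usd > self.max_usd:
            raise BudgetExceeded(f"budget of ${self.max_usd} would be exceeded")
        self.spent_usd += usd


class CircuitBreaker:
    """Opens after consecutive failures; allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpen("upstream model is marked unhealthy")
            # Half-open: permit one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

A handler might create one `RequestBudget` per incoming request, charge it before each model call, and route every model call through a shared `CircuitBreaker`, so that a degraded upstream fails fast instead of stacking up retries.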

Project 03

An incident playbook

Run a tabletop incident against a degraded model and produce the postmortem document.

Project 04

A cost-and-latency dashboard

Stand up the minimum observability needed for your manager to stop worrying about your AI bill.

CURRICULUM

Week by week.

FIT

Who this is for — and who it is not.

For you if

  • You have shipped an LLM feature and now own the pager for it.
  • You are a tech lead inheriting an AI codebase from someone who has moved on.
  • You are a senior engineer about to be put in front of a board on the question “how do we know our AI works?”

Probably not for you if

  • You have not yet shipped anything against a model API; take AI-101 first.
  • You are a researcher focused on pre-training or evaluation methodology; this is operational engineering.
  • You are looking for a one-evening overview; this is a six-week commitment.

YOUR INSTRUCTOR

Taught by an operator.

Staff Engineer

Marcus Hale

Marcus spent eight years at Stripe, where he led the team responsible for the Radar risk-decisioning platform. Before that he wrote infrastructure at Square and Twilio. He thinks about LLM systems the way payments engineers think about payments — with a healthy paranoia about retries, idempotency, observability, and the long tail of failure modes that only show up at 3 a.m. on a Saturday.

FAQ

Questions we’re asked often.

AI-201 · Next cohort starts soon

Building Production AI Systems

$1,800

Secure payment · 14-day refund · Invoice on request