Track 4
Production agents
Reliability, evaluation, observability, deployment, and scaling.
Module 1
Reliability
Failure modes, retries, fallbacks, validation, and runtime policy.
Why agents fail: taxonomy of failure modes
Understanding the ways agents break.
Retry strategies and fallbacks
Graceful degradation for agent systems.
Output validation and self-correction
Making agents check their own work.
Runtime policy enforcement
Dynamically enabling and disabling tools based on context.
Module 2
Evaluation
Testing non-deterministic systems with evals, rubrics, and judge models.
Module 3
Observability
Structured logging, tracing, dashboards, and alerting.
Module 4
Deployment and scaling
Containerization, queues, cost management, and human-in-the-loop.
Containerizing agent systems
Docker for agent deployments.
Queue-based architectures
Async agent workloads with job queues.
Cost management and rate limiting
Keeping production agent costs under control.
Human-in-the-loop patterns
When and how to involve humans in agent workflows.