Hands-on techniques for evaluating AI systems.

Step-by-step evaluation methods, with worked examples and code. Practical tools for measuring AI behaviour, from confidence scoring to root cause analysis.