When does learning from data work (math starting from basic probability)

VC Dimension and the Fundamental Theorem of Statistical Learning — from Scratch May 2026 This post answers a single question: when does learning from data actually work? You train a model on samples, it performs well on those samples, and you hope it performs well on new data. When is that hope justified? The answer turns out to be a clean equivalence: a hypothesis class is learnable if and only if it has finite VC dimension. This is the Fundamental Theorem of Statistical Learning. We'll build..