When does learning from data work (math starting from basic probability)
VC Dimension and the Fundamental Theorem of Statistical Learning — from Scratch
May 2026
This post answers a single question: when does learning from data actually work?
You train a model on samples, it performs well on those samples, and you hope it performs well on new data. When is that hope justified? The answer turns out to be a clean equivalence: a hypothesis class is learnable if and only if it has finite VC dimension. This is the Fundamental Theorem of Statistical Learning.
We'll build..
