Hierarchical skill KB improves performance of weaker models

The dominant paradigm for teaching autonomous language‑model agents is to let each instance wander through its own training episodes, rediscovering the same sub‑tasks over and over. That redundancy inflates exploration budgets and leaves even modest models struggling on long‑horizon problems. A fully automated pipeline that extracts reusable, hierarchical behaviors from a collective pool of trajectories flips the script. Historically, agents have relied on flat replay buffers or hand‑crafted mac