I've been building a personal AI assistant for my developer blog – you know, one of those floating chat widgets that answers questions about my projects. The idea was simple: feed in my content, hook it up to an AI API, and let visitors chat with it. But my first implementation was a disaster. Visitors would type a question, see the spinner spin for ten seconds, and then get the entire response dumped at once. It felt like using dial-up. The problem wasn't the AI itself; it was how I was consumi