I spent a month trying to predict multi-agent AI failures. It failed — here's what the failure taught me.

I had a hypothesis I was pretty excited about: that you could detect a multi-agent system going off the rails before it actually fails, early enough to stop it. If true, that's a product. If false, I wanted to know in a month, not a year. So I ran it as an actual experiment. Here's what happened, including the part where the result didn't just come back negative: it came back backwards.

The setup I cared most about: not fooling myself

The easy way to "validate" an idea like this is to build a