Users need more advanced customizable workflows for AI agents, beyond basic plugin systems, with improved hooks. Additionally, robust evaluation harnesses are required to ensure consistency and reliability of AI agent updates.
AI infrastructure and AI patterns are currently stuck in single player mode. This is a pattern we've seen before. Back in the day it was massively frustrating to deliver software to production. No git, no CI/CD, no k8s, no containers etc. Often deployments were manual, debugging was manual and maintenance windows took hours spanning into days. We got better at software delivery in part by evolving from single player to multi-player approaches. We even changed how we structure teams to deal with the challenges and evolutions. Now? "It worked on my machine" has become far less common to hear or see. We have to do it again with AI, and a big part of the lift is getting away from single shot prompting and one-off, single player workflows. All of the major AI influencers are showing off amazing aspects of how the emerging AI toolchain works, but they are doing it on small projects with solo developers.