Self-improvement, where LLMs autonomously make themselves better, is looking increasingly feasible in the near term. These language models are already writing prod code (either via copy/paste or Cursor or whoever Devin’s clients are), but we’re rapidly heading toward a future where they will also orchestrate entire workflows. This opens the door to the concept of ‘single-use software’: when producing software becomes so cheap that we just write code for everything. The problem isn’t a lack of tasks that could benefit from software, it’s that we lack the resources to develop customized solutions for every use case. Every industry is full of repetitive processes that could be optimized with code, but hiring a dev (or allocating the time of the devs you do have) just isn’t worth it. More full-featured frameworks like CrewAI and AutoGen are pushing the boundaries of what’s possible in multi-agent systems, enabling role delegation, tool use, and task management. I’ve played around with CrewAI, and GPT-4o and Claude were pretty good at helping me get something running quickly, but they weren’t zero-shotting it.
This is why super simple frameworks like OpenAI’s Swarm excite me. Swarm is a lightweight framework for coordinating tasks across multiple agents. It lets people (likely with AI assistance) easily create systems where different LLM agents work together to achieve a goal. People have already demonstrated use cases like lead generation, custom data extraction from websites, and customer service, and I’m sure more will come. What’s striking about Swarm is its deliberate simplicity: it’s designed not to be feature-rich but (my guess here) to be something that LLMs can easily comprehend and build upon. This minimalist approach mirrors recent work in Automated Design of Agentic Systems, where the goal was to create adaptable, iterative agents using simple building blocks. In both cases, simplicity is not just a design decision but a strategic choice: it gives LLMs a framework in which they can more effectively write the code that creates these workflows.
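To make the "simple building blocks" idea concrete, here’s a minimal sketch of the agents-plus-handoffs pattern in plain Python. To be clear, this is not Swarm’s actual API: the `Agent` class, the `run` loop, and the keyword-matching `respond` functions are all hypothetical, and the `respond` functions are stand-ins for where a real LLM call would go.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Agent:
    """Hypothetical building block: a name plus a function that answers a message.

    `respond` returns (reply, next_agent). A next_agent of None means
    "I'm done"; anything else is a handoff to another agent.
    """
    name: str
    respond: Callable[[str], Tuple[str, Optional["Agent"]]]


def run(start: Agent, message: str, max_handoffs: int = 5) -> List[Tuple[str, str]]:
    """Pass the message from agent to agent until one finishes without a handoff."""
    agent = start
    trace: List[Tuple[str, str]] = []
    for _ in range(max_handoffs + 1):
        reply, next_agent = agent.respond(message)
        trace.append((agent.name, reply))
        if next_agent is None:
            return trace
        agent = next_agent
    raise RuntimeError("too many handoffs")


# Stand-ins for LLM calls: in a real system each of these would prompt a model.
def extractor_respond(message: str) -> Tuple[str, Optional[Agent]]:
    return f"extracting data for: {message}", None


def support_respond(message: str) -> Tuple[str, Optional[Agent]]:
    return f"support answer for: {message}", None


extractor = Agent("extractor", extractor_respond)
support = Agent("support", support_respond)


def triage_respond(message: str) -> Tuple[str, Optional[Agent]]:
    # Assumption for illustration: route on a keyword instead of asking a model.
    target = extractor if "extract" in message.lower() else support
    return f"handing off to {target.name}", target


triage = Agent("triage", triage_respond)

if __name__ == "__main__":
    for name, reply in run(triage, "Extract prices from example.com"):
        print(f"[{name}] {reply}")
```

The whole "framework" here is one dataclass and one loop, which is kind of the point: there’s very little surface area for an LLM to misunderstand.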
It’s clear that 2024’s LLMs could already be deployed on automatic workflow generation: not necessarily full self-improvement yet, but certainly creating agentic systems. A minimalist framework can really open the door for this by letting LLMs spend a bunch of hours figuring out how to build agentic systems most effectively. The goal is that any debugging the LLM does should be about the problem it’s solving, not the framework it’s operating within.
If we provide a simple enough starting point, today’s LLMs can begin experimenting with self-improvement and creating complete workflows autonomously, moving beyond engineers copying code snippets out of ChatGPT/Claude/whatever to truly generating new systems from scratch. I’d bet they’ll figure it out before we do…