A World of Multi-AI-Agents
There’s no doubt that the topic of AI Agents has been dominating conversations about AI in recent times. There’s a lot of excitement about the potential impact of this technology, with the promise of increased productivity and AI-augmented workers at the top of my list.
Perhaps it’s worth starting at the beginning: What is really different about AI Agents? One way I think about it is that traditional systems are based on single-purpose algorithms, in the sense that the goal or objective of the system is somewhat narrow and pre-defined at the time of implementation. Think of an algorithm to sort numbers, or even an algorithm to forecast a time series.
Instead, AI Agents receive a user-specified goal as input. This means the goal or objective of the system is much wider, and even undefined before implementation. Once the goal is received as input, the AI Agent can interact autonomously with an environment, take actions in that environment, and decide which actions to take based on the provided goal and the environment’s state. This enables AI Agents to operate in complex, dynamic, and stochastic environments, as they have some capability to adapt and plan as the situation evolves towards the goal.
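To make that loop explicit, here is a minimal sketch of the observe-decide-act cycle just described. Everything in it (the toy `GridEnv`, the greedy decision rule) is a hypothetical stand-in for illustration, not any particular framework’s API:

```python
# Toy environment and agent loop; illustrative assumptions only.

class GridEnv:
    """Toy 1-D environment: the agent moves along a line of cells."""
    def __init__(self, start: int = 0):
        self.position = start

    def observe(self) -> int:
        return self.position

    def step(self, action: str) -> None:
        self.position += 1 if action == "right" else -1

def run_agent(goal: int, env: GridEnv, max_steps: int = 100) -> int:
    """Generic agent loop: observe the state, decide from goal + state, act."""
    for _ in range(max_steps):
        state = env.observe()
        if state == goal:  # goal reached: stop acting
            break
        # Decision step: in a real agent, this is where planning
        # and adaptation to the evolving environment would happen.
        action = "right" if state < goal else "left"
        env.step(action)
    return env.observe()

print(run_agent(goal=5, env=GridEnv()))  # -> 5
```

The key point is that the goal arrives as an input at run time, while the decision step inside the loop is where a real agent would do its planning and adaptation.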
Let me make that a bit more concrete. In the physical world, robots such as self-driving cars are an example of AI Agents. In that case, the car receives a user-provided goal: the user’s destination or target address. The car needs to interact with a complex environment: public roads and highways, occupied by other road users, which also behave in complex ways. The car can take some actions: steer, accelerate, brake. The car also needs to plan routes, react to an ever-changing environment, and drive safely towards the user’s destination. Likewise, we can think of AI Agents in the digital world. Prominent examples are Salesforce AgentForce agents.
OK, but AI Agents are not new, right? After all, we’ve seen self-driving car demos since the 1980s. However, recent advances in LLMs and language technologies mean that this new wave of AI Agents comes equipped with two key and very interesting properties. First, agents can now take the user-specified goal in natural language. This gives users a much more powerful and intuitive interface for providing their goal or intent: simply state it in words. This interface change reminds me of the Yahoo!-style predefined hierarchical directory browsing, which evolved into the Google-style search box two decades ago. Second, the language ability also means that these models can write programming code while also being able to access, read, and consume written documentation for APIs and other tools. In a way, this means the AI Agent’s action space is much bigger, as it could potentially use thousands of pre-built APIs and tools. If the AI Agent’s reasoning and planning abilities are sufficiently advanced, it can potentially tackle much more complex tasks.
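As a rough sketch of that second property, the snippet below shows agent-style dispatch over a registry of documented tools. The tools and the keyword-based selection are toy assumptions; in a real system, an LLM would reason over the tool documentation to choose a tool and its arguments:

```python
# Toy tool registry; the tools and docs are illustrative assumptions.

def get_weather(city: str) -> str:
    return f"Forecast for {city}: sunny"  # stubbed API call

def add_numbers(a: float, b: float) -> float:
    return a + b

TOOLS = {
    "get_weather": (get_weather, "get_weather(city) -> forecast for a city"),
    "add_numbers": (add_numbers, "add_numbers(a, b) -> sum of two numbers"),
}

def select_tool(goal: str) -> str:
    # Placeholder for the LLM's reasoning step over the tool docs;
    # here, a trivial keyword match stands in for that reasoning.
    return "get_weather" if "weather" in goal.lower() else "add_numbers"

goal = "What's the weather in Lisbon?"  # goal stated in natural language
name = select_tool(goal)
tool, doc = TOOLS[name]
print(doc)             # the agent "reads" the tool documentation
print(tool("Lisbon"))  # then invokes the chosen tool
```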
As with anything discussed in academic research circles, this is still a nascent technology. Many future choices will determine the evolution of AI Agents and their impact. One of the most intriguing is the choice between single-agent and multi-agent architectures. Single-agent architectures can be quite effective in various application domains. However, some tasks may be so complex that they require deep specialization across a very wide range of domains. In that case, it may be desirable to modularize the architecture into multiple specialized agents that collaborate towards the goal.
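A minimal sketch of that modularization might look like the following, where specialized agents each own a narrow domain and a simple router decomposes work toward the goal. The specialists and the routing rule are illustrative assumptions, not a specific multi-agent framework:

```python
# Toy multi-agent setup: specialized agents plus a simple router.

class Agent:
    def __init__(self, name: str, domain: str):
        self.name, self.domain = name, domain

    def handle(self, subtask: str) -> str:
        # In practice, this would be an LLM call with a domain-tuned prompt.
        return f"[{self.name}] handled: {subtask}"

specialists = {
    "legal": Agent("LegalAgent", "legal"),
    "finance": Agent("FinanceAgent", "finance"),
}

def route(subtask: str) -> str:
    # Toy router: match a subtask to the specialist whose domain it names.
    for domain, agent in specialists.items():
        if domain in subtask.lower():
            return agent.handle(subtask)
    return f"no specialist found for: {subtask}"

for subtask in ["review the legal terms", "estimate the finance impact"]:
    print(route(subtask))
```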
This line of thought raises interesting questions about single-agent vs multi-agent architectures:
- What is the task complexity-efficiency trade-off between single and multi-agent settings?
- Are there benefits in terms of fault tolerance and redundancy in multi-agent vs single-agent systems?
- In terms of security risk, would a single centralized AI agent be preferable over a distributed multi-agent system?
- Do we need to build standard agent-to-agent communication protocols, and should they be human-readable?
- What should agent collaboration look like?
- How do we resolve conflicting advice from specialized agents? Is there a central entity making final decisions after considering the conflicting inputs from various agents? (A minimal coordinator sketch follows this list.)
- What are some applications that would be most suitable for single-agent and multi-agent solutions?
- Could multi-agent systems solve problems in unique/unexpected ways that are outside the reach of single-agent systems?
- Should we expect emergent behaviors in multi-agent systems? Are there risks in such potential emergent behaviors, or would they be potentially useful?
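On the conflict-resolution question flagged above, one conceivable (and deliberately simplistic) design is a central coordinator that aggregates the specialists’ recommendations. The confidence-weighted vote and the message format below are purely illustrative assumptions, though the dictionary messages also hint at what a human-readable agent-to-agent exchange could look like:

```python
# Toy central coordinator resolving conflicting agent recommendations.
from collections import defaultdict

# Each specialist sends a human-readable message with its recommendation
# and a confidence in [0, 1]; both fields are illustrative assumptions.
advice = {
    "SafetyAgent":  {"recommendation": "reject",  "confidence": 0.9},
    "RevenueAgent": {"recommendation": "approve", "confidence": 0.7},
    "LegalAgent":   {"recommendation": "reject",  "confidence": 0.6},
}

def resolve(advice: dict) -> str:
    """Aggregate conflicting inputs by summing confidence per option."""
    scores = defaultdict(float)
    for agent, message in advice.items():
        scores[message["recommendation"]] += message["confidence"]
    return max(scores, key=scores.get)

print(resolve(advice))  # -> "reject" (0.9 + 0.6 outweighs 0.7)
```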
As usual, I have more questions than answers. Let me know your thoughts!