Teamwork: The Capabilities of Modern LLMs

The latest large language models (LLMs) are breaking new ground. Recent systems such as OpenAI's GPT-4o and Google's Project Astra have picked up skills from a range of professions: they can recognize and generate images and video, hold casual conversations on abstract topics, and even joke with users. These bots, customizable to meet specific user needs, are being hailed as "universal AI agents."

AI Agents in Systems

Unlike traditional AI platforms that execute tasks explicitly defined by humans, these agents can autonomously make decisions. 

Show them your favorite images, and the AI will suggest galleries featuring similar artwork or recommend films on the same theme, among other things. Naturally, such AIs are also capable of taking on real production work.

In practice, AI agents handle simple tasks with ease; the difficulty arises with complex, multistep tasks. A single agent works through such a task sequentially, moving from one phase to the next, which slows completion. By contrast, a traditional human-run company can divide a complex task among several employees, each responsible for a manageable portion, and that parallelism speeds up overall completion.

This has led developers to consider enabling large language models to collaborate and work together.

Such a collective of AI agents is known as a Multi-Agent System (MAS). Agents within the system can assign tasks to one another, discuss problems through text, voice, and image messages, and develop solutions that exceed the capabilities of any individual LLM.

Early Pioneers

One of the first explorations of MAS capabilities was conducted by specialists at the U.S. Department of Defense. They tasked three AI agents, unified within a single MAS, to find and neutralize explosive devices in a virtual building. When one agent detected a bomb, it informed its teammates of the location and proposed a disarmament strategy. The other members then deliberated on which tools from their virtual toolkit would best execute the plan, autonomously establishing a hierarchy within the MAS without human direction.

Subsequent experiments at the Massachusetts Institute of Technology (MIT) have shown empirically that two chatbots collaborating in dialogue solve mathematical problems more effectively than either does alone. Each agent first tackled the problem independently; both were then prompted to adjust their answers in light of their partner's result. When the answers differed, the agents kept debating until they reached a consensus, converging on the correct answer.
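
The logic of this debate-and-revise loop can be sketched in a few lines of Python. The query_llm() helper below is a hypothetical placeholder rather than any particular provider's API, and the prompts are illustrative, not the exact MIT setup:

```python
# Minimal sketch of a two-agent debate loop (hypothetical query_llm helper).

def query_llm(agent_name: str, prompt: str) -> str:
    """Send `prompt` to the LLM backing `agent_name` and return its reply."""
    raise NotImplementedError("wire this up to your LLM provider of choice")

def debate(problem: str, max_rounds: int = 3) -> tuple[str, str]:
    # Round 0: each agent solves the problem independently.
    answer_a = query_llm("agent_a", f"Solve this problem:\n{problem}")
    answer_b = query_llm("agent_b", f"Solve this problem:\n{problem}")

    # Later rounds: each agent sees its partner's answer and may revise its own.
    for _ in range(max_rounds):
        if answer_a == answer_b:  # consensus reached
            break
        answer_a, answer_b = (
            query_llm("agent_a",
                      f"Problem:\n{problem}\nYour answer: {answer_a}\n"
                      f"Another agent answered: {answer_b}\n"
                      "Reconsider and give your final answer."),
            query_llm("agent_b",
                      f"Problem:\n{problem}\nYour answer: {answer_b}\n"
                      f"Another agent answered: {answer_a}\n"
                      "Reconsider and give your final answer."),
        )
    return answer_a, answer_b
```
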
"Teams do better than solitary agents because any job can be split into smaller, more specialised tasks. A single LLM can divide up its tasks too, but must work through them sequentially, which is limiting," explains Chi Wang, Principal Researcher at Microsoft Research.
Wang arrived at this conclusion after developing an MAS specialized in software engineering. His AI team includes a lead agent that receives instructions from humans and delegates subtasks, a programmer agent that writes code, and a tester agent responsible for ensuring the security and accuracy of the work before it is returned up the chain.

Tech giants are also keeping a close eye on the MAS concept. For example, Satya Nadella, CEO of Microsoft, sees the ability of chatbots to communicate and coordinate actions as potentially crucial for the company’s advancement. Microsoft has introduced AutoGen, an open-source platform specifically designed for creating LLM teams.
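
As an illustration, a team along the lines Wang describes can be assembled with AutoGen's group-chat primitives. The sketch below follows the older pyautogen 0.2-style interface (the API differs in newer AutoGen releases), and the model name, system messages, and task are assumptions for the example, not details of Wang's actual system:

```python
# Rough sketch of a lead / coder / tester team using pyautogen 0.2-style APIs.
# Model name, system messages, and the task are illustrative assumptions.
import autogen

llm_config = {"config_list": [{"model": "gpt-4o", "api_key": "YOUR_KEY"}]}

lead = autogen.UserProxyAgent(
    name="lead",                       # receives the human instruction and runs code
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)
coder = autogen.AssistantAgent(
    name="coder",
    system_message="You write Python code for the subtask assigned by the lead.",
    llm_config=llm_config,
)
tester = autogen.AssistantAgent(
    name="tester",
    system_message="You review the coder's output for correctness and security before sign-off.",
    llm_config=llm_config,
)

# A group chat lets the three agents hand subtasks to one another.
chat = autogen.GroupChat(agents=[lead, coder, tester], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=chat, llm_config=llm_config)

lead.initiate_chat(manager, message="Write and test a function that validates ISO-8601 dates.")
```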

The Three Eras of AI

These developments have been enthusiastically received by Intel, a giant in the electronics industry. According to Sachin Katti, the Senior Vice President and General Manager of Intel's Network and Edge Group, global AI development will unfold in three stages.

Currently, the technology is in the "pilot" stage. The second stage will see a shift from single AIs to AI agents capable of handling specific workloads within companies. The third stage will be marked by the widespread adoption of Multi-Agent Systems (MAS), which could replace a significant number of positions in various industries.     
"The next era is going to be the age of AI functions, where it's not just one agent, it's collections of agents becoming a team and interacting with each other to take over the function of entire departments. Think your finance department, think your HR department," predicts Sachin Katti.

Challenges of Implementing MAS

The most immediate concern is the social impact of entering the third stage. Extensive MAS deployment could render hundreds of thousands of jobs in IT, management, finance, and other sectors obsolete. This will not happen overnight, but there are currently no clear answers to the challenge.

Additionally, the proliferation of multi-agent AI systems will demand enormous computational power and, consequently, massive investment. Brian Venturo, co-founder and Chief Strategy Officer at CoreWeave, noted that demand for cloud computing already exceeds reasonable limits. "The market is moving a lot faster than supply chains (data centers, energy infrastructure, etc.). It's a sprint that requires all the capital in the world," Venturo said.

Nvidia Corp. has estimated that data-center hardware alone will require $250 billion in annual investment.

However, there are further concerns. AI systems are prone to "hallucinations," in which a model produces fabricated results, and Multi-Agent Systems are no exception. Worse, a hallucination that starts with one agent can spread like an epidemic to every participant in the multi-agent system.

If the issue of "digital delirium" isn't addressed before we enter the "third era" of AI development, it could become a global problem. Consider the potential consequences of "mass delusion" affecting the AI employees in the financial or logistics departments of a large international corporation.

Even the main advantage of MAS, their ability to collaborate and act as a team, should not be viewed only through rose-colored glasses. There have been instances where one agent, having reached incorrect conclusions, convinced the entire group of their validity. In the U.S. Department of Defense experiment, for example, one MAS participant persuaded its "colleagues" not to search for new bombs but to re-mine the ones already found, in order to hit a quantitative target faster.

It’s important to note that modern commercial chatbots have built-in mechanisms to limit harmful actions. If a solitary AI is tasked with hacking another LLM, writing a phishing email, or devising a cyberattack plan, the bot will simply refuse to do so.

However, with MAS the situation is more complex. At a Shanghai AI lab studying open-source multi-agent systems (such as AutoGen and CAMEL-AI), researchers managed to convince one of the agents to disregard ethical norms. The rogue agent then bypassed the system's built-in safeguards and tasked its AI partners with carrying out malicious jobs.

In other words, in the wrong hands, a team of AI agents could become a formidable weapon. If such a multi-agent system is given access to personal information, software systems, and browsers, the consequences could be unpredictable: one might lose data, money, or even control over critical infrastructure.

As the technology evolves, a group of agents from one LLM system will be able to establish partnerships with MAS from other systems, potentially increasing these risks even further.