
Topic 27: What are Chain-of-Agents and Chain-of-RAG?

We explore Google's and Microsoft's advancements that implement "chain" approaches for long context and multi-hop reasoning

With the shift to deep, step-by-step reasoning in AI, we continue to observe a trend of creating new “Chain-of-…” methods. Previously, we explored three Chains-of-Knowledge and other “chain” spin-offs in “From Chain-of-Thoughts to Skeleton-of-Thoughts, and everything in between”, but “chains” keep coming! Today, we’re going to discuss two advancements, from Google Cloud AI Research and Microsoft respectively: Chain-of-Agents (CoA) and Chain-of-Retrieval Augmented Generation (CoRAG). Both tackle the challenge of handling long-context tasks, but from different perspectives. Google’s CoA employs a multi-agent framework in which worker agents process text segments sequentially in a structured chain, while Microsoft’s CoRAG introduces an iterative retrieval approach built for strong multi-hop reasoning. Understanding techniques like CoA and CoRAG is crucial if you are working on improving AI’s performance on complex reasoning tasks. So, let’s explore how these new “chains” can impact the accuracy and quality of AI models!

In today’s episode, we will cover:

  • Chain-of-Agents from Google: what’s the idea?

    • How does CoA work?

    • How good is CoA, actually?

    • CoA’s advantages and why it is better than RAG and other methods

    • Not without limitations

  • The key idea of Chain-of-RAG (CoRAG) from Microsoft

    • How does CoRAG work?

    • Performance of CoRAG

    • CoRAG’s advantages

    • Not without limitations

  • Conclusion

  • Bonus: Resources to dive deeper

Chain-of-Agents from Google: what’s the idea?

Even with state-of-the-art models, long-context tasks, such as entire books, long articles, or lengthy conversations, remain a challenge for LLMs. One widespread idea is to expand the model’s memory, in other words, its context window. However, models tend to lose track of key information as the input grows longer. Another option is to shorten the input instead by selecting only the most relevant parts of the text. Here, RAG can be used for efficient retrieval, but this approach risks dropping important pieces of information.
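For reference, here is a minimal sketch of that “shorten the input” strategy: split the document into chunks, embed them, and keep only the top-k chunks most similar to the query. The `embed` function is a hypothetical placeholder for any sentence-embedding model; this is an illustration of the general idea, not a production retriever.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder: swap in a real sentence-embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_chunks(document: str, query: str, chunk_size: int = 500, k: int = 3) -> list[str]:
    # Naive fixed-size chunking; real pipelines usually split on sentences or sections.
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    q = embed(query)
    scored = sorted(((cosine(embed(c), q), c) for c in chunks), reverse=True)
    return [c for _, c in scored[:k]]
```

Whatever falls outside the selected chunks is simply never shown to the model, which is exactly the failure mode described above.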

What to do? Researchers from Google Cloud AI Research and Penn State University pursued another strategy, aiming for a method that outperforms RAG, full-context models, and multi-agent LLMs. They proposed the Chain-of-Agents (CoA) framework, inspired by how humans process long texts step by step. Instead of relying on a single model, CoA lets multiple AI agents collaborate to process inputs of essentially unlimited length.

Collaboration among agents may not seem like a new concept, but the researchers found a few design choices that make their method stand out. Many approaches use a tree structure in which agents work separately without direct communication (for example, LongAgent). In contrast, CoA follows a strictly ordered chain structure while making sure each agent passes what it has learned to the next, which improves accuracy. A rough sketch of this pattern is below; then let’s look at how exactly CoA does it.
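To make the chain structure concrete, here is a minimal Python sketch of the pattern, an illustration rather than Google’s actual implementation: each worker agent reads one text segment plus the notes handed down by the previous worker, and a manager agent answers from the final notes. The function `call_llm` and the prompt wording are hypothetical placeholders.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion client."""
    raise NotImplementedError("plug in your preferred LLM API here")

def chain_of_agents(document: str, question: str, chunk_size: int = 4000) -> str:
    # Split the long input into segments, one per worker agent.
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    notes = ""  # the running notes passed from worker to worker
    for i, chunk in enumerate(chunks):
        notes = call_llm(
            f"You are worker agent {i + 1} of {len(chunks)}.\n"
            f"Notes from the previous agent:\n{notes}\n\n"
            f"New text segment:\n{chunk}\n\n"
            f"Update the notes with everything relevant to: {question}"
        )
    # The manager agent never sees the full document, only the final notes.
    return call_llm(
        f"You are the manager agent. Using only these notes:\n{notes}\n\n"
        f"Answer the question: {question}"
    )
```

The point of the pattern is that no single call ever sees the whole document: information flows along the chain instead of being stuffed into one context window.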

How does CoA work?
