Introduction
Since the introduction of Chain-of-Thought (CoT) prompting by Google Brain at NeurIPS 2022, the concept has sparked a wave of innovative methods and research, leading to a proliferation of "chain" spin-offs like Zero-Shot Chain-of-Thought and Multimodal CoT. We've covered the evolution of these ideas in From Chain-of-Thoughts to Skeleton-of-Thoughts, and everything in between (it’s free to read), keeping a close watch on how CoT has inspired new lines of inquiry.
Today, we want to dive into the latest development: Chain-of-Knowledge (CoK). Since there are three (!) recent papers that claim to introduce the concept, we aim to clear up misunderstandings that have arisen due to researchers being unaware of each other's work. Our goal is also to map the influence of CoT and its successors, providing a clear understanding of how these approaches are reshaping the field of AI prompting and reasoning.
In today’s episode, we will cover:
Recap of Chain-of-Thought Fundamentals
Limitations of CoT Reasoning
Here comes Chain-of-Knowledge. Or three Chains-of-Knowledge…
Key contributions of each paper
Experiments and results
Can they be combined?
Scenario: Complex, Multi-Domain Q-A Systems
Conclusion: A Path to Stronger AI Reasoning
Tags: AI 101, Method/Techniques, Chain-of-Knowledge
Recap of Chain-of-Thought Fundamentals
The question of whether AI can truly reason like humans is a big topic in the field, with some thinking it might be a step toward artificial general intelligence (AGI). One method explored to enhance AI's reasoning is CoT prompting. Unlike zero-shot prompting, which doesn’t provide any examples, or few-shot prompting, which includes a few examples, CoT prompting adds detailed reasoning steps to the examples. This makes it particularly useful for tasks that require more complex and logical thinking.

Image Credit: CoT Original Paper
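To make the difference between the three prompt styles concrete, here is a minimal Python sketch. The example question, the wording of the reasoning steps, and the function names are our own illustrative assumptions, not text taken from the CoT paper.

```python
# Illustrative prompt builders. The worked example and all names are assumptions.
EXAMPLE_Q = "A shelf holds 5 books. We add 2 boxes of 3 books each. How many books are there now?"
EXAMPLE_STEPS = "The shelf starts with 5 books. 2 boxes of 3 books each is 6 books. 5 + 6 = 11."
EXAMPLE_A = "11"


def zero_shot(question: str) -> str:
    """No examples: the model only sees the question."""
    return f"Q: {question}\nA:"


def few_shot(question: str) -> str:
    """One solved example, showing only the final answer."""
    return f"Q: {EXAMPLE_Q}\nA: {EXAMPLE_A}\n\n" + zero_shot(question)


def chain_of_thought(question: str) -> str:
    """The same solved example, now with its intermediate reasoning steps."""
    demo = f"Q: {EXAMPLE_Q}\nA: {EXAMPLE_STEPS} The answer is {EXAMPLE_A}."
    return demo + "\n\n" + zero_shot(question)


if __name__ == "__main__":
    print(chain_of_thought("A bus has 12 seats and 4 rows are added with 2 seats each. How many seats?"))
```

The only difference between few_shot and chain_of_thought is the reasoning text inside the demonstration, which is exactly what CoT prompting adds.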
CoT prompting was developed to overcome the limitations of simpler prompting methods. It works by providing not just examples of problems and their solutions but also breaking down the reasoning process into a series of steps. This helps the model follow a logical sequence of thought, improving its ability to handle tasks that require more in-depth problem-solving. But it has its limitations as well →
Limitations of CoT Reasoning
Potential for Inaccuracy:
CoT reasoning can sometimes produce inaccurate or made-up information (so-called ‘hallucinations’). Even though it's good for structuring complex thought processes, it relies heavily on the model's internal knowledge, which might not always be spot-on or up-to-date.
Limited Knowledge Integration:
CoT reasoning mainly depends on what the model already knows, which may not be enough for tasks that need the latest, most accurate, or specialized knowledge. If the training data is old or missing pieces, mistakes can happen.
Challenges with Using External Knowledge:
CoT doesn't naturally pull in or adapt to external information. Without a way to incorporate relevant outside knowledge, CoT reasoning might struggle to stay grounded in real-world facts, which can affect its reliability.
Gaps in Structured Reasoning:
Traditional CoT might not make the best use of structured data, like that found in knowledge graphs. This can limit its ability to handle more complex relationships or follow specific rules needed for nuanced reasoning.
Risk of Being Too Rigid:
While CoT is good with straightforward logical steps, it might become too rigid when rules are strictly defined. This rigidity can limit its adaptability in more dynamic situations, potentially making it less effective.
Here comes Chain-of-Knowledge. Or three Chains-of-Knowledge…
Paper 1️⃣ Chain-of-knowledge: Grounding large language models (LLMs) via dynamic knowledge adapting over heterogeneous sources
In February 2024, researchers from DAMO Academy (Alibaba Group), Nanyang Technological University, Singapore University of Technology and Design, Salesforce Research, and Hupan Lab introduced the Chain-of-Knowledge (CoK) framework, which augments LLMs with external knowledge drawn dynamically from heterogeneous sources.
The framework consists of three stages: reasoning preparation, dynamic knowledge adapting, and answer consolidation. The goal is to improve the factual accuracy of LLMs by progressively correcting rationales based on retrieved knowledge, so that an error in an early step does not propagate and lead to hallucination. In this way, the researchers try to ground the models in accurate information and reduce reliance on potentially inaccurate internal memory.

Image Credit: The Original paper
The three stages in more detail:
Reasoning Preparation: This stage involves generating preliminary rationales and identifying the relevant knowledge domains. If a majority consensus is not reached among the generated answers, the process moves to dynamic knowledge adapting.
Dynamic Knowledge Adapting: An Adaptive Query Generator (AQG) generates queries for structured knowledge sources (using query languages such as SPARQL or SQL) and for unstructured ones (e.g., Wikipedia, flashcards). The retrieved knowledge is then used to progressively correct the rationales.
Answer Consolidation: The final corrected rationales are used to generate a consolidated answer, which is expected to be more accurate.
Their code is available at: https://github.com/DAMO-NLP-SG/chain-of-knowledge.
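The three stages can be sketched in a few lines of Python. This is not the authors' implementation (see the repository above for that): the function names, the "more than half" consensus rule, and the toy stand-ins for the LLM and the retriever are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Sample = Tuple[List[str], str]  # (rationale steps, final answer)


def majority_answer(answers: List[str]) -> Optional[str]:
    """Return the answer given by more than half of the samples, else None."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count > len(answers) / 2 else None


def chain_of_knowledge(
    question: str,
    sample: Callable[[str], List[Sample]],
    retrieve: Callable[[str], List[str]],
    correct: Callable[[str, List[str], List[str]], str],
    consolidate: Callable[[str, List[str]], str],
) -> str:
    # Stage 1: reasoning preparation. Sample several rationales and answers.
    samples = sample(question)
    agreed = majority_answer([answer for _, answer in samples])
    if agreed is not None:
        return agreed  # consensus reached, no external knowledge needed

    # Stage 2: dynamic knowledge adapting. Correct the steps one at a time,
    # passing the already corrected steps along so errors do not propagate.
    corrected: List[str] = []
    for step in samples[0][0]:
        evidence = retrieve(step)
        corrected.append(correct(step, evidence, corrected))

    # Stage 3: answer consolidation.
    return consolidate(question, corrected)


# Toy stand-ins (hypothetical) for the LLM calls and the knowledge source.
FACTS = {"capital of Australia": "Canberra is the capital of Australia."}


def toy_sample(question: str) -> List[Sample]:
    return [
        (["The capital of Australia is Sydney."], "Sydney"),
        (["The capital of Australia is Canberra."], "Canberra"),
    ]


def toy_retrieve(step: str) -> List[str]:
    return [fact for key, fact in FACTS.items() if key.lower() in step.lower()]


def toy_correct(step: str, evidence: List[str], previous: List[str]) -> str:
    return evidence[0] if evidence else step


def toy_consolidate(question: str, steps: List[str]) -> str:
    return "Canberra" if any("Canberra" in s for s in steps) else "unknown"


if __name__ == "__main__":
    print(chain_of_knowledge("What is the capital of Australia?",
                             toy_sample, toy_retrieve, toy_correct, toy_consolidate))
```

Because the two sampled answers disagree, the sketch skips the early return, replaces the wrong rationale with the retrieved fact, and only then produces the final answer.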
Paper 2️⃣ Chain-of-Knowledge Prompting
In June 2024, researchers from East China Normal University and The University of Hong Kong proposed Chain-of-Knowledge (CoK) prompting to improve the reasoning of LLMs. CoK breaks down reasoning into evidence triples and explanation hints, addressing the hallucinations seen in CoT prompting. The method includes F2-Verification, which assesses the factuality and faithfulness of the generated reasoning and prompts the model to rethink erroneous evidence. Extensive tests on commonsense, factual, symbolic, and arithmetic reasoning tasks show that CoK significantly outperforms standard in-context learning (ICL) and CoT approaches.

Image Credit: The Original paper
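A rough Python sketch of the idea behind evidence triples, verification, and rethinking follows. It is a simplification under stated assumptions: factuality is an exact lookup in a toy knowledge base, faithfulness is a crude string check, and the equal weighting and the 0.75 threshold are our own choices. The paper's actual scoring is more involved.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def factuality(triples: List[Triple], kb: Set[Triple]) -> float:
    """Fraction of generated evidence triples that appear in the knowledge base."""
    if not triples:
        return 0.0
    return sum(t in kb for t in triples) / len(triples)


def faithfulness(triples: List[Triple], answer: str) -> float:
    """Crude proxy (assumption): 1.0 if the answer is mentioned in the evidence."""
    mentioned = any(answer.lower() in " ".join(t).lower() for t in triples)
    return 1.0 if mentioned else 0.0


def needs_rethinking(triples: List[Triple], answer: str, kb: Set[Triple],
                     alpha: float = 0.5, threshold: float = 0.75) -> bool:
    """Combine both scores; a low score means the chain should be revisited."""
    score = alpha * factuality(triples, kb) + (1 - alpha) * faithfulness(triples, answer)
    return score < threshold


def corrections(triples: List[Triple], kb: Set[Triple]) -> List[Triple]:
    """For each unsupported triple, collect KB triples with the same relation
    that share its subject or object. These would be fed back into the prompt."""
    fixes: List[Triple] = []
    for s, r, o in triples:
        if (s, r, o) in kb:
            continue
        for ks, kr, ko in sorted(kb):
            if kr == r and (ks == s or ko == o):
                fixes.append((ks, kr, ko))
    return fixes


if __name__ == "__main__":
    kb = {("Canberra", "capitalOf", "Australia"), ("Sydney", "locatedIn", "Australia")}
    generated = [("Sydney", "capitalOf", "Australia")]
    if needs_rethinking(generated, "Sydney", kb):
        print("Rethink with:", corrections(generated, kb))
```

In this toy run the generated triple is faithful to the answer but not factual, so the chain is flagged and the matching knowledge-base triple is offered as corrected evidence.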
Paper 3️⃣ Chain-of-knowledge: Integrating knowledge reasoning into LLMs by learning from knowledge graphs
Later in June 2024, researchers from Fudan University introduced a Chain-of-Knowledge (CoK) framework enhanced by a Trial and Error (T&E) mechanism. This T&E mechanism allows LLMs to simulate the human process of knowledge exploration. The symbolic agent selects a plausible rule and begins reasoning. If the necessary facts to support this rule are missing, the system records the error and switches to a different rule, continuing this process until it finds a reasoning path with sufficient supporting facts. This iterative method helps mitigate the risk of rule overfitting, where models might otherwise rely too heavily on previously encountered rules without adequate supporting evidence.

Note: We think the authors meant Vanilla CoT here (Image Credit: The Original paper)
This CoK T&E approach integrates knowledge reasoning into LLMs by leveraging knowledge graphs, using rule-based methods to enhance reasoning capabilities. The KNOWREASON dataset, specifically constructed for this purpose, supports structured reasoning, emphasizing rule mining from knowledge graphs rather than dynamic adaptation to diverse knowledge sources.
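The trial-and-error loop described above can be sketched as follows. This is a toy illustration, not the paper's training procedure: facts live in an explicit set of triples, a rule is simplified to a chain of relations, and the entity "Ada", the relation names, and the rule list are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)
Rule = List[str]               # a chain of relations to follow


def follow(kg: Set[Triple], start: str, rule: Rule) -> Optional[str]:
    """Follow the relations in order; return None as soon as a fact is missing."""
    current = start
    for relation in rule:
        targets = sorted(o for (s, r, o) in kg if s == current and r == relation)
        if not targets:
            return None
        current = targets[0]
    return current


def trial_and_error(kg: Set[Triple], entity: str,
                    rules: List[Rule]) -> Tuple[Optional[str], Optional[Rule], List[Rule]]:
    """Try rules in order, recording each failed attempt before switching."""
    errors: List[Rule] = []
    for rule in rules:
        result = follow(kg, entity, rule)
        if result is not None:
            return result, rule, errors
        errors.append(rule)  # supporting facts are missing, so record and move on
    return None, None, errors


if __name__ == "__main__":
    kg = {("Ada", "bornIn", "London"), ("London", "locatedIn", "UK")}
    candidate_rules = [
        ["citizenOf"],
        ["livesIn", "locatedIn"],
        ["bornIn", "locatedIn"],
    ]
    answer, used_rule, failed = trial_and_error(kg, "Ada", candidate_rules)
    print(answer, used_rule, failed)
```

The first two rules fail because their supporting facts are absent, so the loop falls through to the third rule, which is the behavior meant to counter rule overfitting.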
Key contributions of each paper
1️⃣ Chain-of-Knowledge (CoK):
Adaptive Query Generator (AQG): Introduces a versatile AQG capable of generating queries across different formats, including SPARQL, SQL, and natural language, to retrieve relevant knowledge.
Progressive Rationale Correction: CoK corrects intermediate rationales progressively to prevent error propagation, ensuring more accurate final answers.
2️⃣ Chain-of-Knowledge Prompting
Factuality and Faithfulness Verification (F2 Verification): Proposes a verification method to evaluate the factuality and faithfulness of the generated reasoning chains, reducing errors and hallucinations in model outputs.
Rethinking Process: Implements a rethinking mechanism that allows LLMs to reassess unreliable responses and correct errors iteratively, improving overall reliability in reasoning tasks.
3️⃣ Chain-of-Knowledge (CoK T&E):
KNOWREASON Dataset: Creates a new dataset via rule mining from KGs to support the model learning process and improve knowledge reasoning.
Trial-and-Error Mechanism: Enhances LLMs' knowledge reasoning by simulating human-like internal knowledge exploration through a trial-and-error approach, which mitigates overfitting to specific rules.
Experiments and results
While all three papers aim to address hallucinations in LLMs and improve CoT results, Paper 1 excels in multi-step factual reasoning across different domains, Paper 2 improves both factuality and faithfulness by integrating structured triples into the reasoning process, and Paper 3 focuses on leveraging knowledge graphs for robust rule-based reasoning.
Paper 1️⃣

Image Credit: Chain-of-knowledge: Grounding large language models (LLMs) via dynamic knowledge adapting over heterogeneous sources
Paper 2️⃣

Image Credit: Chain-of-Knowledge Prompting
Paper 3️⃣

Image Credit: CoK T&E
Can they be combined?
Though the authors don’t seem to be in touch with each other, their methods might still be aligned. Combining the approaches from these three papers could be possible and even beneficial in scenarios that require complex, multi-faceted reasoning, where the strengths of each method complement the others. Here's a practical overview of how and when they might be combined, without hallucinating or over-speculating:
Scenario: Complex, Multi-Domain Q-A Systems
If you are designing a system that needs to answer questions across multiple knowledge domains – such as medical, legal, and scientific queries – while also requiring high factual accuracy, reasoning, and adaptability, combining the approaches could enhance performance in these areas.
Combining the methods
Each paper's approach targets different weaknesses in LLM reasoning, so their strengths could be merged to create a more robust system:
Dynamic Knowledge Adaptation (Paper 1): This method could be used as a foundation for dynamically adapting knowledge sources based on the question's domain. For instance, when dealing with factual or highly specialized domains like medical or scientific questions, the system could automatically pull knowledge from structured and unstructured external databases (Wikidata, medical databases, etc.) and adjust the reasoning process accordingly.
Chain-of-Knowledge Prompting (Paper 2): To ensure that the system generates explanations or reasoning chains that are both human-readable and factually accurate, the Chain-of-Knowledge prompting method could be incorporated at the final reasoning stage. This would provide explicit evidence triples and reasoning steps, making the output more interpretable and reducing the chance of hallucinations.
Trial-and-Error with Knowledge Graphs (Paper 3): This mechanism could be integrated during the reasoning step, particularly when handling structured knowledge like legal or regulatory data that can be represented in knowledge graphs. By leveraging the trial-and-error mechanism, the system would avoid overfitting to specific rules or reasoning paths, which is useful in handling ambiguous or uncertain questions.
Practical Use Case: Multi-Source Knowledge Querying in a Legal Context
Imagine a legal AI assistant designed to answer queries from lawyers across multiple domains:
A lawyer asks the system for both factual case law precedents (Paper 1: dynamic knowledge from unstructured sources) and structured legal reasoning based on the case's facts (Paper 3: trial-and-error reasoning over legal rules from a knowledge graph).
Once the system pulls the relevant case law and rules, it needs to ensure the reasoning chain it presents is both logically sound and well-supported (Paper 2: CoK prompting with evidence triples).
By combining these approaches, the system ensures that:
It adapts the knowledge source dynamically based on the nature of the query.
It reasons logically and avoids committing to erroneous conclusions.
It presents the evidence in a clear, verifiable manner, reducing the chance of hallucination.
When Not to Combine Them
While this integration could enhance many systems, it may not always be beneficial. For instance:
Low-stakes factual queries (e.g., simple trivia) might not need the overhead of trial-and-error reasoning or explicit evidence triples.
Systems focused purely on unstructured, commonsense reasoning (e.g., casual chatbots) may find knowledge graphs or structured reasoning methods unnecessarily complex.
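One way to picture the "combine or not" decision is a small planner that switches components on per query. The sketch below is purely hypothetical: the Query fields, the component names, and the rules for enabling them are our assumptions, loosely following the ordering suggested above (retrieval first, rule-based reasoning next, evidence triples at the final stage).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Query:
    text: str
    high_stakes: bool               # e.g. legal, medical, scientific
    needs_external_knowledge: bool  # the answer depends on outside facts
    has_knowledge_graph: bool       # the domain is available as a KG


def plan(query: Query) -> List[str]:
    """Pick which of the three approaches to run, in order."""
    steps: List[str] = []
    if query.needs_external_knowledge:
        steps.append("dynamic_knowledge_adapting")          # Paper 1
    if query.high_stakes and query.has_knowledge_graph:
        steps.append("trial_and_error_rules")               # Paper 3
    if query.high_stakes:
        steps.append("evidence_triples_with_verification")  # Paper 2
    return steps or ["plain_answer"]


if __name__ == "__main__":
    legal = Query("Which precedents apply to this contract dispute?", True, True, True)
    trivia = Query("Who wrote Hamlet?", False, True, False)
    chat = Query("How was your day?", False, False, False)
    for q in (legal, trivia, chat):
        print(q.text, "->", plan(q))
```

A high-stakes legal query gets all three components, a trivia question only gets retrieval, and casual chat skips the machinery altogether.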
Conclusion: A Path to Stronger AI Reasoning
The three recent papers on Chain-of-Knowledge (CoK) show how much effort is going into improving the reasoning abilities of large language models (LLMs). Each of these works tackles similar issues from a different perspective. The question that comes to mind is: what if these methods were brought together?
By combining dynamic knowledge adaptation, Chain-of-Knowledge prompting, and trial-and-error reasoning using knowledge graphs, there’s potential to build a more versatile AI system. Such a model could be particularly useful in fields like law, healthcare, and science, where the stakes are high, and accuracy matters. The idea is simple: an AI that pulls relevant knowledge dynamically, corrects itself iteratively, and provides clear reasoning steps could be a powerful tool for complex problem-solving.
Of course, simpler tasks do not require dynamic adaptation or deep reasoning, so it's important to know when to apply these advanced methods.
Ultimately, bringing together these different research approaches might unlock new possibilities for AI, but it requires more collaboration and awareness among researchers. As we see in this case, AI’s evolution benefits when different ideas and approaches are in conversation with each other.