Automated Fact Checking using LLMs and SERP

In today’s digital age, misinformation spreads rapidly, making it crucial to verify facts quickly and accurately. Leveraging Large Language Models (LLMs) and Search Engine Results Pages (SERP) APIs, we can automate the fact-checking process. Here’s a step-by-step guide on how to implement this.

What is SERP?

SERP stands for Search Engine Results Page. It is the page displayed by search engines in response to a user’s query. SERPs typically contain a list of web pages, snippets, and other relevant information that matches the search intent. In automated fact-checking, SERP APIs allow us to programmatically retrieve these results for further analysis.

1: Getting User Input and Generating SERP Prompts

The process begins by collecting user input: a statement or claim that needs verification. To identify and extract claims from the input, we can use Large Language Models (LLMs) for advanced understanding and structuring. Alternatively, Named Entity Recognition (NER) and traditional NLP techniques can be employed to extract claims into a structured schema.

Once claims are identified, we generate SERP queries tailored to each claim. For example, if the claim is “The Eiffel Tower is the tallest structure in Paris,” the SERP query might be “tallest structure in Paris” or “height of Eiffel Tower compared to other buildings in Paris.”

In practice, we tested both a local LLM (Llama 3.2 via Ollama) and a cloud-based model (Gemini 2.5 Pro); Gemini proved more effective for claim extraction and query generation.
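Step 1 can be sketched as two small helpers: one that builds an extraction prompt asking the LLM for claims in a structured (JSON) schema, and one that parses the response. This is a minimal illustration; the function names and the prompt wording are my own, and how you actually call Ollama or Gemini is left out.

```python
import json


def build_claim_extraction_prompt(user_input: str) -> str:
    # Ask the LLM to return claims as a JSON array so the response
    # can be parsed into a structured schema downstream.
    return (
        "Extract every verifiable factual claim from the text below. "
        "Respond with a JSON array of strings, one claim per element.\n\n"
        f"Text: {user_input}"
    )


def parse_claims(llm_response: str) -> list[str]:
    # The LLM was asked for a JSON array; fall back to an empty
    # list if the response is not valid JSON or not a list.
    try:
        claims = json.loads(llm_response)
    except json.JSONDecodeError:
        return []
    if not isinstance(claims, list):
        return []
    return [c for c in claims if isinstance(c, str)]
```

For the Eiffel Tower example above, a well-behaved model would return something like `["The Eiffel Tower is the tallest structure in Paris"]`, which `parse_claims` turns into a Python list ready for query generation.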

2: Querying the SERP Provider

With the generated queries, we use a SERP API provider to fetch search results. The API returns results in a structured format, such as JSON, which includes snippets, URLs, and sometimes direct answers. Parsing this data allows us to gather relevant evidence for each claim.
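The parsing half of this step might look like the sketch below. Field names such as `organic_results`, `link`, and `snippet` vary between SERP providers, so treat them as assumptions to be adapted to whichever API you use; the network call itself is omitted.

```python
def extract_evidence(serp_json: dict, max_results: int = 5) -> list[dict]:
    # Many SERP providers return a list of organic results, each with
    # a title, URL, and text snippet. The exact field names here are
    # assumptions; check your provider's response schema.
    evidence = []
    for item in serp_json.get("organic_results", [])[:max_results]:
        evidence.append({
            "title": item.get("title", ""),
            "url": item.get("link", ""),
            "snippet": item.get("snippet", ""),
        })
    return evidence
```

Capping the result count keeps the evidence small enough to fit comfortably into the validation prompt in the next step.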

3: Validating Claims with LLMs

Next, we prompt the LLM to validate the extracted claims using the evidence gathered from the SERP results. The LLM analyzes each claim alongside the retrieved information and provides a verdict on whether the claim is supported, refuted, or requires further investigation.
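One way to structure this step is to build a prompt that pairs the claim with its evidence and constrains the LLM to a one-word verdict, then parse that verdict defensively. Again, this is a sketch with invented function names, and the verdict parsing is a deliberately simple keyword heuristic.

```python
def build_verification_prompt(claim: str, evidence: list[dict]) -> str:
    # Present the claim and numbered evidence snippets, then constrain
    # the model to a fixed verdict vocabulary for easy parsing.
    lines = [f"Claim: {claim}", "", "Evidence:"]
    for i, item in enumerate(evidence, 1):
        lines.append(f"{i}. {item['snippet']} (source: {item['url']})")
    lines.append("")
    lines.append(
        "Based only on the evidence above, answer with exactly one word: "
        "SUPPORTED, REFUTED, or UNCERTAIN."
    )
    return "\n".join(lines)


def parse_verdict(llm_response: str) -> str:
    # Simple keyword scan; real systems would want stricter output
    # parsing (e.g. structured/JSON responses) to avoid edge cases
    # like "NOT SUPPORTED" matching "SUPPORTED".
    text = llm_response.strip().upper()
    for verdict in ("SUPPORTED", "REFUTED", "UNCERTAIN"):
        if verdict in text:
            return verdict.lower()
    return "uncertain"
```

Defaulting to "uncertain" when the model's answer is unparseable errs on the side of caution, which is usually the right trade-off in fact-checking.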

4: Presenting the Results

The final step involves presenting the fact-checking results to the user in a clear and concise manner. This can be done through a web interface, a report, or an API response, depending on the application. We can include things like:

  • original claim
  • verdict (supported/refuted/uncertain)
  • evidence snippets
  • source URLs
  • confidence score

This workflow streamlines the fact-checking process, combining the strengths of LLMs and real-time web data to deliver accurate and efficient verification.

The sample code for implementing this workflow can be found in my GitHub repository. It was inspired by a demo from Exa.ai, in which they presented an LLM hallucination detector.
