RelevanCite: your personal research-validating AI assistant

Motivation

Academic research depends on accurate citations, yet verifying whether a referenced paper truly supports a claim is tedious and error-prone. In a large literature review, checking every claim by hand becomes nearly impossible. RelevanCite was created to address this challenge, aiming to reduce errors, save researchers time, and strengthen the integrity of scholarly work.

Introduction

RelevanCite is an automated system that checks whether the citation claims in a manuscript are actually supported by the referenced research papers. Unlike tools that rely on cloud services, RelevanCite runs completely offline, using local models to preserve privacy and keep sensitive research data secure. All downloaded papers are stored locally, and all processing happens on the user's machine. It is designed to be useful to authors, reviewers, and editors alike.

How It Works

The system processes manuscripts through a series of modular steps:

  1. Citation Extraction
    GROBID parses each document into structured metadata, including references, authors, and other bibliographic information.

  2. Full-Text Retrieval
    Papers are automatically retrieved from open-access sources or manually uploaded and stored locally.

  3. Claim Verification
    Claims from the manuscript are compared with the relevant sections of cited papers using vector embeddings and similarity search.

  4. Structured Reporting
    Verification results and evidence passages are produced in JSON format, ensuring transparency and reproducibility.
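The core of step 3 can be sketched as follows. The real system uses neural vector embeddings from a local model; the bag-of-words `embed` below is only a hypothetical stand-in to show the shape of the similarity search that matches a claim against candidate passages from a cited paper:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" -- an illustrative stand-in for the
    # local neural embedding model the real pipeline would use.
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_evidence(claim, passages):
    # Rank candidate passages from the cited paper by similarity
    # to the claim and return the strongest match with its score.
    claim_vec = embed(claim)
    return max((cosine_similarity(claim_vec, embed(p)), p) for p in passages)

claim = "transformer models outperform recurrent networks on translation"
passages = [
    "we describe a dataset of annotated bird songs",
    "the transformer outperforms recurrent networks on translation benchmarks",
]
score, passage = best_evidence(claim, passages)
```

With a real embedding model the score would come from dense vectors rather than word overlap, but the ranking-and-thresholding logic stays the same.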

System Design and Architecture

RelevanCite employs a modular and loosely coupled architecture, designed for flexibility, maintainability, and extensibility. Each component can operate independently, allowing:

  • Easy replacement or upgrades of embedding models, vector stores, or frontend interfaces.
  • Transparent data processing, with all intermediate artifacts stored as JSON.
  • Secure offline operation, ensuring sensitive research data is never uploaded to external servers.

Main Components

  1. Citation Extraction Module

    • Uses GROBID to parse manuscripts into structured citation metadata.
    • Produces bibliographic fields and reference lists as JSON.
  2. Document Retrieval Module

    • Fetches full-text papers from open-access sources like Unpaywall.
    • Supports manual PDF uploads and maintains a local PDF repository.
  3. Embedding and Similarity Engine

    • Converts claims and paper sections into vector embeddings.
    • Performs similarity search to identify relevant evidence.
    • Runs entirely on local GPU/CPU for security and privacy.
  4. Verification and Reporting

    • Compares claims against extracted evidence and produces structured JSON reports.
    • Ensures reproducibility and easy debugging.
  5. Watchdog and Service Layer

    • Monitors embedding and verification tasks.
    • Supports concurrent processing of multiple manuscripts.
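As a concrete sketch of the extraction module's job: GROBID returns parsed manuscripts as TEI XML, with each reference in a `biblStruct` element. A minimal parser over a trimmed sample (the element structure here is simplified from real GROBID output) could turn those entries into the JSON-ready records the pipeline stores:

```python
import xml.etree.ElementTree as ET

# Trimmed sample of the TEI XML returned by GROBID's
# processFulltextDocument endpoint (structure simplified).
TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><back><div><listBibl>
    <biblStruct xml:id="b0">
      <analytic>
        <title level="a">Attention Is All You Need</title>
        <author><persName>
          <forename>Ashish</forename><surname>Vaswani</surname>
        </persName></author>
      </analytic>
    </biblStruct>
  </listBibl></div></back></text>
</TEI>"""

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

def extract_references(tei_xml):
    # Pull the title and author surnames out of each biblStruct,
    # producing dicts ready to be dumped as JSON.
    root = ET.fromstring(tei_xml)
    refs = []
    for bibl in root.iterfind(".//tei:biblStruct", NS):
        title = bibl.findtext(".//tei:title", default="", namespaces=NS)
        authors = [s.text for s in bibl.iterfind(".//tei:surname", NS)]
        refs.append({"title": title, "authors": authors})
    return refs

refs = extract_references(TEI)
```

Keeping this step as plain XML-to-JSON translation is what lets the extraction module be swapped or upgraded independently of the rest of the pipeline.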

Data Handling

  • Papers: Stored locally as PDFs.
  • Intermediate results: All structured data, embeddings, and verification results are JSON.
  • Reports: Detailed JSON reports indicating claim support and evidence sections.
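A verification report in this scheme might look like the following. The field names are hypothetical, chosen only to illustrate the kind of structured, reproducible output described above, not RelevanCite's actual schema:

```python
import json

# Illustrative report structure -- field names are assumptions,
# not the tool's real output format.
report = {
    "manuscript": "draft.pdf",
    "claims": [
        {
            "claim": "Transformers outperform RNNs on translation.",
            "citation": "b0",
            "supported": True,
            "similarity": 0.87,
            "evidence": "The Transformer achieves higher BLEU scores ...",
        }
    ],
}

# Plain JSON on disk keeps runs reproducible and easy to diff or debug.
serialized = json.dumps(report, indent=2, sort_keys=True)
restored = json.loads(serialized)
```

Because every intermediate artifact is JSON, a failed verification can be inspected and replayed without rerunning the whole pipeline.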

Future Enhancements

RelevanCite is evolving into a full research assistant with features such as:

  • Support for custom local models to improve embeddings or verification logic.
  • Integration with additional open-access sources for wider coverage.
  • Citation recommendation features to suggest more relevant or higher-quality references.
  • CPU-only execution mode for systems without GPU, while retaining offline local operation.

The goal is to create a system that not only verifies citations but also actively assists researchers in producing well-supported, high-quality work.

This architecture ensures that RelevanCite remains flexible, secure, and easy to extend for future research support capabilities.

Project Demonstration

GitHub Repository: github.com/Aayushstha03/RelevanCite
