RelevanCite: your personal research-validating AI assistant

Motivation

Academic research depends on accurate citations, yet verifying whether a referenced paper truly supports a claim can be tedious and error-prone. In large literature reviews, this task becomes nearly impossible to perform manually for every claim. RelevanCite was created to address this challenge, aiming to reduce errors, save researchers time, and strengthen the integrity of scholarly work.

Introduction

RelevanCite is an automated system that checks whether the claims a manuscript attributes to its citations are actually supported by the referenced papers. Unlike tools that rely on cloud services, RelevanCite does all of its processing on the user's machine with local models, which preserves privacy and keeps sensitive research data secure. Downloaded papers are stored locally as well. The tool is intended for authors, reviewers, and editors.

How It Works

The system processes manuscripts through a series of modular steps:

  1. Citation Extraction
    Structured metadata is extracted from documents using GROBID, including references, authors, and bibliographic information.

  2. Full-Text Retrieval
    Papers are retrieved automatically from open-access sources or uploaded manually; in both cases they are stored locally.

  3. Claim Verification
    Claims from the manuscript are compared with the relevant sections of cited papers using vector embeddings and similarity search.

  4. Structured Reporting
    Verification results and evidence passages are produced in JSON format, ensuring transparency and reproducibility.
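
The claim verification step can be illustrated with a minimal sketch. It is not RelevanCite's actual code: the `embed` function below is a toy bag-of-words stand-in for a real local embedding model, and the names `embed`, `cosine_similarity`, and `rank_passages` are hypothetical. It only shows the general idea of ranking passages from a cited paper by their similarity to a claim.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(claim: str, passages: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Return the top_k passages most similar to the claim, best first."""
    claim_vec = embed(claim)
    scored = [(cosine_similarity(claim_vec, embed(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A real embedding model would also match paraphrases that share no words with the claim, which this word-count stand-in cannot do. The ranking logic would stay the same.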

System Design and Architecture

RelevanCite employs a modular and loosely coupled architecture, designed for flexibility, maintainability, and extensibility. Each component can operate independently, allowing:

  • Easy replacement or upgrades of embedding models, vector stores, or frontend interfaces.
  • Transparent data processing, with all intermediate artifacts stored as JSON.
  • Secure offline operation, ensuring sensitive research data is never uploaded to external servers.
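
One common way to make a component replaceable is to have the rest of the system depend on a small interface rather than on a concrete model. The sketch below shows that pattern under assumed names (`Embedder`, `CharFrequencyEmbedder`, `LengthEmbedder`, and `Verifier` are all hypothetical, and both embedders are toys). It is not taken from the RelevanCite codebase.

```python
from typing import Protocol, Sequence


class Embedder(Protocol):
    """Anything that turns text into a vector of floats."""

    def embed(self, text: str) -> Sequence[float]: ...


class CharFrequencyEmbedder:
    """Toy embedder: relative frequency of each letter a-z."""

    def embed(self, text: str) -> list[float]:
        letters = [c for c in text.lower() if "a" <= c <= "z"]
        total = len(letters) or 1  # avoid division by zero on empty input
        return [letters.count(chr(ord("a") + i)) / total for i in range(26)]


class LengthEmbedder:
    """Second toy embedder, showing that implementations are interchangeable."""

    def embed(self, text: str) -> list[float]:
        return [float(len(text)), float(len(text.split()))]


class Verifier:
    """Depends only on the Embedder interface, not on a concrete model."""

    def __init__(self, embedder: Embedder) -> None:
        self.embedder = embedder

    def vector_size(self, text: str) -> int:
        return len(self.embedder.embed(text))
```

Swapping models then means passing a different object to `Verifier`, with no change to the verification code itself.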

Main Components

  1. Citation Extraction Module
    • Uses GROBID to parse manuscripts into structured citation metadata.
    • Produces bibliographic fields and reference lists as JSON.

  2. Document Retrieval Module
    • Fetches full-text papers from open-access sources like Unpaywall.
    • Supports manual PDF uploads and maintains a local PDF repository.

  3. Embedding and Similarity Engine
    • Converts claims and paper sections into vector embeddings.
    • Performs similarity search to identify relevant evidence.
    • Runs entirely on local GPU/CPU for security and privacy.

  4. Verification and Reporting
    • Compares claims against extracted evidence and produces structured JSON reports.
    • Ensures reproducibility and easy debugging.

  5. Watchdog and Service Layer
    • Monitors embedding and verification tasks.
    • Supports concurrent processing of multiple manuscripts.
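
Concurrent processing with per-task monitoring, as in the service layer above, could look roughly like the following. This is a sketch using Python's standard `concurrent.futures` module and is not RelevanCite's implementation: `process_all` and the `verify` callback are hypothetical names, and the status dictionary is an assumed format.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def process_all(
    manuscripts: list[str],
    verify: Callable[[str], dict],
    max_workers: int = 4,
) -> dict[str, dict]:
    """Run verify() on each manuscript concurrently and record each outcome."""
    outcomes: dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its manuscript path so results can be labeled.
        futures = {pool.submit(verify, path): path for path in manuscripts}
        for future in as_completed(futures):
            path = futures[future]
            try:
                outcomes[path] = {"status": "done", "result": future.result()}
            except Exception as exc:  # a failed task must not stop the others
                outcomes[path] = {"status": "failed", "error": str(exc)}
    return outcomes
```

Recording failures instead of raising them lets one unreadable manuscript be reported without interrupting the rest of the batch.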

Data Handling

  • Papers: Stored locally as PDFs.
  • Intermediate results: All structured data, embeddings, and verification results are stored as JSON.
  • Reports: Detailed JSON reports indicating claim support and evidence sections.
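
To make the report format more concrete, here is one possible shape for it. The field names (`claim`, `reference_id`, `supported`, `evidence`, `section`, `passage`, `similarity`) are assumptions chosen for illustration, not the actual RelevanCite schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Evidence:
    section: str       # section of the cited paper the passage came from
    passage: str       # the text offered as evidence
    similarity: float  # similarity score between claim and passage


@dataclass
class ClaimResult:
    claim: str
    reference_id: str
    supported: bool
    evidence: list[Evidence] = field(default_factory=list)


def to_report(results: list[ClaimResult]) -> str:
    """Serialize verification results to a JSON string."""
    return json.dumps({"results": [asdict(r) for r in results]}, indent=2)
```

Keeping the evidence passages next to each verdict lets a reader check how a result was reached without rerunning the pipeline.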

Future Enhancements

RelevanCite is evolving into a full research assistant with features such as:

  • Support for custom local models to improve embeddings or verification logic.
  • Integration with additional open-access sources for wider coverage.
  • Citation recommendation features to suggest more relevant or higher-quality references.
  • CPU-only execution mode for systems without GPU, while retaining offline local operation.

The goal is to create a system that not only verifies citations but also actively assists researchers in producing well-supported, high-quality work.

This architecture ensures that RelevanCite remains flexible, secure, and easy to extend for future research support capabilities.

Project Demonstration

GitHub Repository: github.com/Aayushstha03/RelevanCite
