Harris, Sheetal, Thong Ta, Vinh, Trovati, Marcello (ORCID: 0000-0001-6607-422X), Nakhla, Ghada, Latif, Faiza and Korkontzelos, Ioannis (2026) Multimodal misinformation detection across diverse languages using RAG and LLMs. Journal of Intelligent Information Systems. ISSN 0925-9902
PDF (VOR) - Published Version. Available under License Creative Commons Attribution. 3MB.
Official URL: https://doi.org/10.1007/s10844-026-01042-x
Abstract
The rapid spread of multimodal fake news (FN) on Online Social Networks (OSNs) threatens digital information ecosystems, particularly in low-resource languages. Existing multimodal fake news detection (FND) methods are largely limited to high-resource settings, restricting their global applicability. We propose M&M-RAG, a Multilingual & Multimodal Retrieval-Augmented Generation framework that leverages Large Vision-Language Models (LVLMs) and Large Language Models (LLMs) to verify news claims across English, Chinese and Urdu. M&M-RAG integrates real-time multilingual evidence retrieval, language-aware prompting, and cross-modal reasoning for fact verification. We further propose Multi-Ax-to-Grind Urdu, the first large-scale, multi-domain multimodal benchmark for FND in Urdu. Experiments on typologically diverse monolingual multimodal datasets demonstrate that M&M-RAG achieves state-of-the-art (SOTA) performance, with 94.6% accuracy and a 94.2% F1 score, surpassing models such as SpotFake, MPFN, MMCFND, and Semi-FND. The proposed framework remains robust in zero-shot and cross-lingual scenarios under frozen-model inference without task-specific fine-tuning. The results underscore the scalability and interpretability of LVLM-based approaches for combating multimodal misinformation, particularly in under-represented and typologically diverse languages.
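The abstract describes a pipeline of multilingual evidence retrieval, language-aware prompting, and cross-modal reasoning under frozen-model inference. The sketch below illustrates that overall flow only; every function name, prompt wording, and data structure is an illustrative assumption, not the authors' released code or the paper's actual prompts.

```python
# Minimal, hypothetical sketch of a retrieval-augmented verification loop in the
# spirit of M&M-RAG: retrieve evidence, build a language-aware prompt, query a
# frozen vision-language model. Stubs stand in for the retriever and the LVLM.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Claim:
    text: str        # news claim in its source language
    image_path: str  # path to the accompanying image, if any
    language: str    # e.g. "en", "zh", "ur"


def build_prompt(claim: Claim, evidence: List[str]) -> str:
    """Assemble a language-aware prompt from the claim and retrieved evidence."""
    evidence_block = "\n".join(f"- {snippet}" for snippet in evidence)
    return (
        f"Language: {claim.language}\n"
        f"Claim: {claim.text}\n"
        f"Evidence:\n{evidence_block}\n"
        "Question: Considering the evidence and the attached image, is the claim "
        "'real' or 'fake'? Answer with a verdict and a brief rationale."
    )


def verify_claim(
    claim: Claim,
    retrieve: Callable[[str, str], List[str]],  # (query, language) -> evidence snippets
    query_lvlm: Callable[[str, str], str],      # (prompt, image_path) -> model answer
) -> str:
    """Frozen-model inference: no fine-tuning, only retrieval plus prompting."""
    evidence = retrieve(claim.text, claim.language)
    prompt = build_prompt(claim, evidence)
    return query_lvlm(prompt, claim.image_path)


if __name__ == "__main__":
    # Stub components so the sketch runs end to end without any external service.
    stub_retriever = lambda query, lang: ["No reputable outlet reports this event."]
    stub_lvlm = lambda prompt, image: "fake - the retrieved evidence contradicts the claim."
    claim = Claim(
        text="City X was hit by a magnitude 9 earthquake today.",
        image_path="quake.jpg",
        language="en",
    )
    print(verify_claim(claim, stub_retriever, stub_lvlm))
```

In a real deployment the retriever would query multilingual news or web sources at inference time and the LVLM call would pass both the prompt and the image to a frozen model; the point of the sketch is only the division of labour between retrieval, prompting, and cross-modal reasoning.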