A Knowledge Graph Based Diagnostic Framework for Analyzing Hallucinations in Arabic Machine Reading Comprehension

2026; diagnostic analysis; application; framework

Najwa Abdullah AlGhamdi, Alan Bundy, Kwabena Nuamah, Sadam Al-Azani

DOI: https://doi.org/10.18653/v1/2026.abjadnlp-1.49
OpenAlex: W7140083488

Abstract Quality: usable

GPT-5.5 Abstract Analysis

Problems Identified (5)

LLM hallucination: Large language models can produce fluent answers that are not fully grounded in the provided context.

Arabic MRC hallucination gap: Hallucination detection has received comparatively little attention in Arabic machine reading comprehension, especially in Qur’anic-text settings.

Question misalignment: Arabic MRC systems may produce answers misaligned with the question, requiring diagnostic analysis.

Surface similarity metric limits: Surface-level similarity metrics can miss systematic hallucination patterns, especially for justification or abstract-interpretation questions.
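
To illustrate the limitation of surface-level similarity, the sketch below computes a token-overlap F1 score, one common surface metric for reading comprehension. The sentences are hypothetical English examples, not data from the paper or from QRCD, and token F1 is an assumed stand-in for whichever metrics the paper has in mind. The predicted answer replaces the justification with an unrelated one, yet most tokens still overlap.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical justification-style answers (illustration only).
reference = "the traveler rested because the road was long"
unsupported = "the traveler rested because the king was angry"

score = token_f1(unsupported, reference)
print(f"token F1 = {score:.2f}")
```

Six of the eight tokens match, so the score stays high even though the stated reason has changed completely. A structured comparison of what is asserted is one way to catch this kind of case.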

Proposed Solutions (5)

Knowledge graph diagnostic framework: The paper presents a knowledge-graph-based diagnostic framework for analyzing hallucinations and question misalignment in Arabic MRC.

Triple-level answer analysis: The framework compares subject-relation-object representations derived from the passage, question, and answer to provide interpretable triple-level analysis.

Question-aware weak-supervision workflow: The approach uses question-aware filtering under weak supervision with automatic analysis and targeted human adjudication.
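
A minimal sketch of how such a triple-level comparison with question-aware filtering could look is given below. It is not the paper's implementation. The `Triple` type, the filtering rule (shared subject or relation), the three labels, and the example triples are all assumptions made for illustration, and the step that extracts triples from Arabic text is omitted entirely.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


def question_aware_filter(passage, question):
    """Keep passage triples that share a subject or relation with the question.

    Hypothetical filtering rule; the paper's actual criterion is not stated here.
    """
    subjects = {t.subject for t in question}
    relations = {t.relation for t in question}
    return {t for t in passage if t.subject in subjects or t.relation in relations}


def diagnose(passage, question, answer):
    """Assign each answer triple one of three assumed diagnostic labels."""
    relevant = question_aware_filter(passage, question)
    report = {}
    for t in answer:
        if t in relevant:
            report[t] = "supported"    # grounded and relevant to the question
        elif t in passage:
            report[t] = "misaligned"   # grounded but not about the question
        else:
            report[t] = "unsupported"  # not in the passage: possible hallucination
    return report


# Hypothetical triples (illustration only).
passage = {
    Triple("traveler", "rested_at", "inn"),
    Triple("traveler", "rested_because", "long_road"),
    Triple("king", "ruled", "city"),
}
question = {Triple("traveler", "rested_because", "?")}
answer = [
    Triple("traveler", "rested_because", "long_road"),
    Triple("king", "ruled", "city"),
    Triple("traveler", "rested_because", "angry_king"),
]

report = diagnose(passage, question, answer)
# Automatic labels act as weak supervision; only flagged triples go to a human.
needs_review = [t for t, label in report.items() if label == "unsupported"]
```

In this sketch the automatic pass handles the clear cases and only the triples in `needs_review` are sent for human adjudication, which mirrors the division of labour described above in a simplified form.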

Results (3)

Systematic hallucination pattern exposure:

Structured diagnostic evaluation value:

QRCD application:

Research Domain

Arabic NLP; machine reading comprehension; LLM hallucination analysis
