A Knowledge Graph Based Diagnostic Framework for Analyzing Hallucinations in Arabic Machine Reading Comprehension
Najwa Abdullah AlGhamdi, Alan Bundy, Kwabena Nuamah, Sadam Al-Azani
Problems Identified (4)
LLM hallucination: Large language models can produce fluent answers that are not fully grounded in the provided context.
Arabic MRC hallucination gap: Hallucination detection has received comparatively little attention in Arabic machine reading comprehension, especially in Qur’anic-text settings.
Question misalignment: Arabic MRC systems may produce answers misaligned with the question, requiring diagnostic analysis.
Surface similarity metric limits: Surface-level similarity metrics can miss systematic hallucination patterns, especially for justification or abstract-interpretation questions.
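To illustrate why a surface-level metric can hide an ungrounded answer, here is a minimal sketch of token-level F1, a metric commonly used in reading comprehension. It is not necessarily the metric the paper evaluates, and the English strings are hypothetical toy data, not taken from the paper or from QRCD.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data (assumption, for illustration only).
reference = "the traveler rested near the well"
faithful = "the traveler rested near the well"
hallucinated = "the traveler rested near the mountain"

faithful_score = token_f1(faithful, reference)
hallucinated_score = token_f1(hallucinated, reference)
print(faithful_score, hallucinated_score)
```

In this toy case the second answer replaces the one word that carries the fact, yet five of its six tokens still match, so the score stays high. A check based on token overlap alone would not separate the two answers clearly.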
Proposed Solutions (3)
Knowledge graph diagnostic framework: The paper presents a knowledge-graph-based diagnostic framework for analyzing hallucinations and question misalignment in Arabic MRC.
Triple-level answer analysis: The framework compares subject-relation-object representations derived from the passage, question, and answer to provide interpretable triple-level analysis.
Question-aware weak-supervision workflow: The approach uses question-aware filtering under weak supervision with automatic analysis and targeted human adjudication.
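The triple-level comparison and question-aware filtering described above might be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The `Triple` type, the `diagnose` function, the three category names, and the toy English triples are all hypothetical. Triple extraction from Arabic text is assumed to have happened upstream, and the question-relevance rule (shared subject or relation) is a deliberately simple stand-in.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


def diagnose(
    passage: List[Triple], question: List[Triple], answer: List[Triple]
) -> Dict[str, List[Triple]]:
    """Sort answer triples into supported, unsupported, and off-question."""
    passage_set = set(passage)
    # Assumed question-aware filter: an answer triple is relevant if it
    # shares a subject or a relation with some question triple.
    q_subjects = {t.subject for t in question}
    q_relations = {t.relation for t in question}

    report: Dict[str, List[Triple]] = {
        "supported": [],
        "unsupported": [],
        "off_question": [],
    }
    for t in answer:
        if t.subject not in q_subjects and t.relation not in q_relations:
            report["off_question"].append(t)  # candidate misalignment
        elif t in passage_set:
            report["supported"].append(t)  # grounded in the passage
        else:
            report["unsupported"].append(t)  # candidate hallucination
    return report


# Hypothetical toy triples (assumption, for illustration only).
passage = [
    Triple("traveler", "rested_near", "well"),
    Triple("traveler", "carried", "water"),
]
question = [Triple("traveler", "rested_near", "?")]
answer = [
    Triple("traveler", "rested_near", "well"),
    Triple("traveler", "rested_near", "mountain"),
    Triple("merchant", "sold", "cloth"),
]

report = diagnose(passage, question, answer)
for label, triples in report.items():
    print(label, triples)
```

In a workflow like the one summarized here, the automatic pass would only propose labels. Triples in the `unsupported` and `off_question` buckets are the natural candidates to route to a human reviewer for adjudication.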
Results (3)
Systematic hallucination pattern exposure:
Structured diagnostic evaluation value:
QRCD application:
Research Domain
Arabic NLP; machine reading comprehension; LLM hallucination analysis