Assessing Large Language Models Suitability for Knowledge Graph Construction
Vasile Ionut Remus Iga, Gheorghe Cosmin Silaghi
Neurosymbolic Artificial Intelligence
Problems Identified (3)
LLM hallucination and nondeterminism: Large language models are prone to hallucinated information and non-deterministic outputs that can cause flawed reasoning.
Pipeline integration limits: The unpredictability of LLM outputs limits their integration into automated NLP pipelines such as chatbots and task-oriented dialogue systems.
LLM suitability for KG tasks: The paper investigates the potential and limitations of LLMs for knowledge graph tasks, especially static knowledge graph construction.
Proposed Solutions (4)
LLM KG construction evaluation: The study evaluates Mixtral-8x7b-Instruct-v0.1, GPT-3.5-Turbo-0125, and GPT-4o on static knowledge graph construction.
TELeR prompt scenarios: The approach uses TELeR-taxonomy-based prompts in zero-shot and one-shot scenarios for task-oriented dialogue contexts.
Flexible KG evaluation framework: The paper proposes a flexible evaluation framework that captures usable model-generated information in addition to strict metrics.
TODSet benchmark dataset: The paper introduces TODSet, a dataset for measuring LLM performance on knowledge graph-related tasks.
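The difference between strict metrics and a flexible evaluation can be illustrated with a minimal sketch. This summary does not specify the paper's actual metrics, so everything below is an assumption made for illustration: the triple representation, the `normalize` rule, the function names `strict_score` and `flexible_score`, and the sample data are all hypothetical. The sketch only shows the general idea that a lenient comparison can credit output that is usable but not an exact string match.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); assumed representation


def normalize(term: str) -> str:
    """Hypothetical normalization: lowercase, underscores to spaces, collapse whitespace."""
    return " ".join(term.replace("_", " ").lower().split())


def f1(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """Standard F1 over two sets of triples; 0.0 when there is no overlap."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def strict_score(predicted: Iterable[Triple], gold: Iterable[Triple]) -> float:
    """Exact string match on every element of every triple."""
    return f1(set(predicted), set(gold))


def flexible_score(predicted: Iterable[Triple], gold: Iterable[Triple]) -> float:
    """Match after normalization, so formatting differences are not penalized."""
    def norm(triples: Iterable[Triple]) -> Set[Triple]:
        return {tuple(normalize(part) for part in triple) for triple in triples}
    return f1(norm(predicted), norm(gold))


# Hypothetical gold triples and model output (not taken from TODSet).
gold = [
    ("Employee_1", "hasName", "John Smith"),
    ("Employee_1", "hasRole", "Manager"),
]
predicted = [
    ("employee 1", "hasname", "john smith"),  # right content, wrong surface form
    ("Employee_1", "hasRole", "Manager"),     # exact match
]

print(strict_score(predicted, gold))
print(flexible_score(predicted, gold))
```

In this toy case the strict score counts only the exactly matching triple, while the flexible score also credits the triple whose content is correct but whose casing and separators differ. A real framework would likely need a more careful notion of "usable" output than this simple normalization.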
Results (3)
Prompt detail improves LLM KG construction:
TODSet introduced:
Flexible evaluation framework introduced:
Research Domain
Large language models for knowledge graph construction