Solved Status: 850 Problems
GPT-5.5-pro analyzed every identified research problem in the knowledge graph (KG) field. Not a single one is fully solved.
| Status | Problems | Share |
|---|---|---|
| Active | 761 | 89.5% |
| Superseded | 55 | 6.5% |
| Partial | 34 | 4% |
| Fully Solved | 0 | 0% |

Avg Confidence: 0.75
No problem is marked fully solved or abandoned: all 850 are active, partially solved, or superseded. Problems get partially addressed or reframed, but none is fully closed. LLM-related problems (hallucination, knowledge limitations) are the fastest-growing, appearing only since 2021.
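The status shares follow directly from the counts. As a minimal sketch of that arithmetic (the variable names are illustrative assumptions, not part of the original analysis):

```python
# Status counts as reported in the summary above.
counts = {"Active": 761, "Superseded": 55, "Partial": 34, "Fully Solved": 0}

# Total number of problems across all statuses.
total = sum(counts.values())

# Share of each status as a percentage, rounded to one decimal place.
shares = {status: round(100 * n / total, 1) for status, n in counts.items()}

for status, pct in shares.items():
    print(f"{status}: {counts[status]} of {total} ({pct}%)")
```

The three non-zero categories sum to the 850 problems in the title, so the shares cover the whole set.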
Top Active Problems (by paper count)
| Problem | Summary | Papers | Years | Status | Conf |
|---|---|---|---|---|---|
| kg incompleteness | KG incompleteness remains a fundamental open-world problem: embedding-based knowledge graph completion, open-world KGC benchmarks, representation lear... | 1,726 | 1995–2026 | active | 0.93 |
| kg question answering | KG question answering has seen substantial progress through semantic parsing, entity-linking pipelines, neural retrieval/reasoning models, KGQA system... | 348 | 2015–2026 | active | 0.92 |
| automated knowledge graph construction | Automated knowledge graph construction has progressed through extraction/linking pipelines, domain-specific construction methods such as course knowle... | 303 | 2009–2026 | active | 0.90 |
| knowledge graph embedding learning | Knowledge graph embedding learning has become a mature and widely used approach for KG completion and link prediction, with standard families such as ... | 233 | 2015–2026 | active | 0.88 |
| llm hallucination | LLM hallucination in KG contexts is being actively mitigated through KG-enhanced LLMs, KG-enhanced RAG, graph reasoning, QA systems, and agentic retri... | 197 | 2023–2026 | active | 0.92 |
| temporal kg reasoning | Temporal KG reasoning has seen strong progress through temporal KG embeddings, attention/GNN-based temporal encoders, subgraph extraction, adaptive te... | 184 | 2016–2026 | active | 0.90 |
| high-dimensional kge cost | High-dimensional KGE cost has been attacked through tensor/matrix factorization, more efficient training frameworks, convolutional or dynamic embeddin... | 181 | 2014–2026 | active | 0.84 |
| kg recommendation data sparsity | Knowledge graphs have clearly helped alleviate recommender-system data sparsity by injecting entity relations, attributes, paths, and neighborhoods in... | 160 | 2016–2026 | active | 0.86 |
| llm knowledge limitations | LLM knowledge limitations remain an active problem: KG-enhanced LLMs, KG-enhanced RAG, graph retrieval, prompting, and KG reasoning have improved fact... | 141 | 2021–2026 | active | 0.88 |
| kg link prediction | KG link prediction has made substantial progress through knowledge graph embeddings, tensor/matrix factorization, GNN-based models, temporal regulariz... | 130 | 2015–2026 | active | 0.88 |
| data sparsity and cold start | KG-based recommendation has become the dominant mitigation strategy for sparsity and cold start, using entity relations, embeddings, graph attention/p... | 130 | 2016–2026 | active | 0.89 |
| heterogeneous data integration | Heterogeneous data integration remains an open KG problem: ontology/semantic integration, KG construction pipelines, embeddings/representation learnin... | 127 | 2005–2026 | active | 0.87 |
| domain-specific kg construction challenges | The literature has moved from conventional domain-specific construction pipelines, graph databases, and manually engineered schemas toward LLM-assiste... | 112 | 2016–2026 | active | 0.82 |
| kg entity alignment | KG entity alignment has seen substantial methodological progress, moving from embedding and GCN/attention-based approaches toward hierarchical alignme... | 94 | 2018–2026 | active | 0.88 |
| rag knowledge conflicts | RAG knowledge conflicts are a very recent and rapidly growing problem, especially where biomedical answers must reconcile retrieved evidence, knowledg... | 94 | 2024–2026 | active | 0.78 |
| multi-hop kg reasoning | Multi-hop KG reasoning remains an active open problem: early RL/path-based methods and later GNN, subgraph, noise-aware, KGQA, and LLM-augmented appro... | 89 | 2017–2026 | active | 0.91 |
| static kg temporal limitation | The community has mainly addressed static KG temporal limitations by moving toward temporal knowledge graphs and time-aware embedding or tensor models... | 76 | 2015–2026 | active | 0.84 |
| llm hallucination in kgc | LLM hallucination in knowledge graph completion has become a sharply active problem as LLMs are increasingly used to predict missing KG facts, generat... | 75 | 2020–2026 | active | 0.88 |
| joint entity-relation extraction | Joint entity-relation extraction has seen substantial progress from neural joint models, especially transformer-based span, table-filling, sequence-to... | 75 | 2008–2026 | active | 0.88 |
| large-scale knowledge graph management | Large-scale knowledge graph management has seen incremental techniques such as scalable reasoning, distributed KG use, embedding variants, corpus augm... | 65 | 2015–2026 | active | 0.78 |
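For readers who want to work with these rows programmatically, the sketch below shows one way to turn the table's text cells into typed values, rank by paper count, and filter by first year. The `ProblemRow` type and `parse_row` helper are hypothetical names introduced for illustration. Only four rows are copied from the table, and the en dash in the year range is assumed to match the table's formatting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemRow:
    """One row of the table; a hypothetical record type for illustration."""
    label: str
    papers: int
    first_year: int
    last_year: int
    conf: float


def parse_row(label: str, papers: str, years: str, conf: str) -> ProblemRow:
    # Paper counts use a thousands separator ("1,726"); years use an en dash.
    first, last = years.split("\u2013")
    return ProblemRow(label, int(papers.replace(",", "")),
                      int(first), int(last), float(conf))


# A few rows copied from the table above.
rows = [
    parse_row("llm knowledge limitations", "141", "2021\u20132026", "0.88"),
    parse_row("kg incompleteness", "1,726", "1995\u20132026", "0.93"),
    parse_row("llm hallucination", "197", "2023\u20132026", "0.92"),
    parse_row("kg question answering", "348", "2015\u20132026", "0.92"),
]

# Rank by paper count, largest first, as the table does.
ranked = sorted(rows, key=lambda r: r.papers, reverse=True)

# Problems whose first recorded paper is from 2021 or later.
recent = sorted(r.label for r in rows if r.first_year >= 2021)

for r in ranked:
    print(f"{r.label}: {r.papers} papers, {r.first_year}-{r.last_year}")
```

Parsing the count and year range into integers avoids the string-sorting pitfall where "94" would rank above "1,726".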
Partially Solved (34)
| Problem | Papers | Key Solution | Year | Conf |
|---|---|---|---|---|
| cf sparsity and cold start | 26 | knowledge-graph collaborative filtering with knowledge-enhanced GNNs and relation-aware attention propagation | 2021 | 0.78 |
| multi-modal kg resource need | 24 | Multimodal KG construction and enrichment using DBpedia/Wikidata extensions, domain-specific MMKG construction, and multimodal knowledge fusion/representation learning | 2022 | 0.67 |
| up-to-date biomedical knowledge access | 17 | automated biomedical knowledge graph construction using biomedical NER/relation extraction, disease-specific KG generators, and KG-enhanced retrieval interfaces | 2021 | 0.72 |
| biodiversity data integration | 14 | domain-specific biodiversity knowledge graphs with FAIR/global identifier reconciliation | 2018 | 0.74 |
| author name disambiguation | 13 | metadata-rich, graph-based author identity resolution using multimodal literal features, bibliometric/semantic KG context, and curated evaluation datasets | 2021 | 0.68 |
| research hotspot identification | 13 | CiteSpace-based bibliometric knowledge graph analysis with co-citation, keyword co-occurrence, burst detection, and visualization | 2006 | 0.76 |
| traditional recommender limitations | 12 | KG-enhanced recommendation using knowledge graph embeddings and GNN-based reasoning | 2019 | 0.72 |
| heterogeneous vulnerability data management | 12 | ontology- and RDF/SPARQL-based domain knowledge graphs, often reusing Wikidata or domain ontologies | 2020 | 0.68 |
| human disease mechanism representation | 10 | PheKnowLator-style biomedical knowledge graph construction with ontology/data-source integration and OWL reasoning closure | 2021 | 0.72 |
| collaborative filtering limitations | 9 | knowledge graph-enhanced collaborative filtering / knowledge graph-based recommendation | 2020 | 0.74 |
| relation pattern modeling | 8 | RotatE-style relational rotation embeddings with self-adversarial negative sampling | 2019 | 0.78 |
| heterogeneous climate data integration | 7 | linked-data/RDF climate knowledge graph platforms with SPARQL endpoints, graph databases, and emerging virtual knowledge graph integration | 2022 | 0.72 |
| scalable reasoning integration | 7 | restricted-fragment scalable reasoning, especially OWL 2 RL/Datalog-style distributed materialization, with probabilistic soft logic for soft or uncertain inference | 2016 | 0.66 |
| seed-dependent entity alignment | 7 | semi-supervised seed expansion and one-to-one matching, later complemented by LLM/text-based entity alignment | 2021 | 0.72 |
| distant supervision label noise | 7 | bag-level multi-instance learning with attention/instance selection, often combined with relation embeddings | 2016 | 0.78 |
| medical named entity recognition | 7 | BERT-based neural sequence labeling, especially BERT-BiLSTM-CRF and KG-infused BERT variants | 2020 | 0.74 |
| systematic review need | 6 | systematic literature reviews and paradigm-comparison/resource-brief frameworks | 2023 | 0.68 |
| drug discovery data integration | 6 | biomedical knowledge graph construction with RDF/SPARQL access, exemplified by OREGANO drug knowledge graph resources | 2023 | 0.68 |
| incomplete type constraints | 6 | local closed-world type approximation, later complemented by soft type-constrained latent and embedding models | 2015 | 0.68 |
| knowledge noise in kg-enhanced text | 6 | K-BERT-style knowledge injection with soft-position embeddings and visible-matrix attention masking | 2019 | 0.68 |
| missing semantic hierarchy modeling | 6 | HAKE-style polar-coordinate / radial-angular hierarchy-aware embeddings | 2019 | 0.74 |
| semantic scholarly querying | 6 | Open Research Knowledge Graph (ORKG) infrastructure with semantic scholarly APIs and structured contribution modeling | 2018 | 0.72 |
| translation kge misses relation patterns | 6 | relation-pattern-aware KGE models, especially RotatE/HRotatE-style rotation embeddings and rule-augmented embedding methods | 2019 | 0.72 |
| using side information for recommendation | 5 | KG-enhanced recommendation using multi-task feature learning, especially MKR-style cross-compress feature sharing between recommendation and KG representation learning | 2019 | 0.70 |
| power equipment information retrieval | 5 | domain-specific power knowledge graph construction with a semantic retrieval layer | 2019 | 0.62 |
| knowledge graph querying | 5 | standardized graph query languages and optimized graph/RDF query engines, especially SPARQL/SPARQL 1.1, triplestores, graph-database APIs, and abstract query-interface layers | 2013 | 0.68 |
| loss of kg structure in linearized encoders | 5 | structure-preserving graph encoders, especially bidirectional Graph2Seq/subgraph-GCN encoders with node-level copy decoders | 2019 | 0.64 |
| scalable rdf knowledge graph creation from complex data | 5 | RML-family declarative mappings with scalable RDFizer engines such as SDM-RDFizer and optimized logical-operator execution | 2020 | 0.70 |
| multi-domain dialogue state tracking | 5 | DST-as-question-answering with turn-level domain-slot questions, augmented by dynamic or conversational knowledge-graph state representations | 2019 | 0.62 |
| user-item graph insufficiency | 5 | knowledge graph-based recommendation with KG-enhanced user/item representations, propagation/attention, and contrastive-learning variants | 2022 | 0.68 |
| no datasets for video kg extraction | 5 | video KG extraction task formulations with heterogeneous video knowledge graph datasets | 2020 | 0.68 |
| traditional database limitations | 5 | native graph databases and RDF/graph stores, especially Neo4j graph storage, combined with graph-specific indexing, pruning, and query optimization | 2015 | 0.64 |
| ned-based retrieval limitation | 5 | CLOCQ-style top-k KB item retrieval with KB-aware pruning | 2021 | 0.68 |
| persistent dataset metadata and citation | 5 | JSON-LD knowledge-graph dataset descriptions paired with persistent DOI metadata records | 2026 | 0.68 |
Superseded (55)
| Problem | Papers | What Happened | Conf |
|---|---|---|---|
| scholarly kg knowledge discovery | 34 | The standalone problem of doing knowledge discovery directly over scholarly KGs has not converged on a standard solution: most work is proof-of-concept, with small clusters around semantic graph parti... | 0.72 |
| incomplete knowledge bases | 21 | The underlying issue of incomplete knowledge bases has not been solved, but the problem label has largely been absorbed into knowledge graph completion, link prediction, entity linking, cross-lingual... | 0.74 |
| triple-independent kg embeddings | 16 | The limitation of treating KG triples as independent has been mitigated by coupled tensor-matrix factorization, latent imputation, neighborhood attention, multi-hop reasoning, and rule/logic-enhanced... | 0.74 |
| external knowledge underuse | 14 | Work on external knowledge underuse mainly tried KG/GNN-style augmentation, including syntax-knowledge GCNs, KG representations and embeddings, heterogeneous document graphs, entity-comparison network... | 0.72 |
| knowledge base completion | 13 | Knowledge base completion was not truly solved; embedding, classification, co-occurrence, multilingual, hyperbolic, and GNN-based methods improved benchmark link prediction but did not eliminate open-... | 0.78 |
| natural language question understanding | 11 | As a stand-alone KGQA problem, natural language question understanding has not been resolved by a single broadly adopted solution. The listed work is mostly complete KGQA systems, domain-specific know... | 0.72 |
| target-specific kg exploitation | 11 | Target-specific KG exploitation saw a small burst of work around 2020–2022 using target-specific subgraph distillation, knowledge-aware attention, KG embeddings, and entity-description augmentation, b... | 0.68 |
| pipeline error propagation | 10 | Pipeline error propagation has not been cleanly solved as a standalone entity-linking problem; instead, it has been mitigated through joint/end-to-end EL and KGQA models, zero-shot evaluation splits,... | 0.72 |
| covid-19 information overload | 10 | COVID-19 information overload was addressed mainly through COVID-specific knowledge graph applications, Neo4j-backed graph stores, biomedical entity enrichment, and discovery frameworks, which helped... | 0.72 |
| covid-19 drug repurposing | 9 | COVID-19 drug repurposing with knowledge graphs was mainly an urgent early-pandemic application: groups built or merged biomedical KGs, selected semantic predications, and applied probabilistic reason... | 0.82 |
| document-based scholarly communication | 8 | The original framing, moving scholarly communication from static documents to machine-actionable KG-based artifacts, had a small burst of work around 2019, mainly around ORKG-style scholarly KG infrastr... | 0.74 |
| entity-pair relation prediction | 8 | Entity-pair relation prediction had a small burst of work around 2018–2019 using latent path modeling, structural supervised methods, variational architectures such as DiVa, attention embeddings such... | 0.74 |
| scalable semantic web kr | 8 | The exact framing of "scalable semantic web KR" shows little sustained activity: after one 2008 paper, it reappeared sparsely from 2018 to 2022, mostly as representation, benchmark, survey, visualizat... | 0.74 |
| specific disease kg reasoning gap | 8 | Early work tried to close disease-specific KG reasoning gaps through NER/relation-extraction benchmarks, SDKG construction pipelines/resources, curated schema frameworks, and embedding or tensor-based... | 0.66 |
| language-specific relation extraction limitation | 8 | As a standalone KG extraction problem, language-specific relation extraction limitation has not been convincingly solved; the proposed fixes are one-off supervised resources, transfer experiments, fea... | 0.72 |