Benchmarking Knowledge Editing using Logical Rules
Abstract: Large Language Models (LLMs) are increasingly deployed in real-world applications that require access to up-to-date knowledge. However, retraining LLMs is computationally expensive. Therefore, knowledge editing techniques are crucial for maintaining current information and correcting erroneous assertions within pre-trained models. Current benchmarks for knowledge editing primarily focus on recalling edited facts, often neglecting their logical consequences. To address this limitation, we introduce a new benchmark designed to evaluate how knowledge editing methods handle the logical consequences of a single fact edit. Our benchmark extracts relevant logical rules from a knowledge graph for a given edit. Then, it generates multi-hop questions based on these rules to assess the impact on logical consequences. Our findings indicate that while existing knowledge editing approaches can accurately insert direct assertions into LLMs, they frequently fail to inject entailed knowledge. Specifically, experiments with popular methods like ROME and FT reveal a substantial performance gap, up to 24%, between evaluations on directly edited knowledge and on entailed knowledge. This highlights the critical need for semantics-aware evaluation frameworks in knowledge editing.
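To make the pipeline in the abstract concrete, here is a minimal sketch of how one fact edit, combined with a logical rule over a knowledge graph, yields an entailed fact and a multi-hop question. Everything in it is an assumption for illustration: the toy graph, the relation names (`works_for`, `headquartered_in`, `works_in`), the single hard-coded rule, and the question template are hypothetical and are not taken from the benchmark, which extracts its rules from a real knowledge graph.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical toy knowledge graph (not data from the benchmark).
KG = {
    Triple("Alice", "works_for", "AcmeCorp"),
    Triple("AcmeCorp", "headquartered_in", "Berlin"),
    Triple("GlobexInc", "headquartered_in", "Tokyo"),
}

# Hypothetical rule: works_for(x, y) AND headquartered_in(y, z) => works_in(x, z)
RULE_BODY = ("works_for", "headquartered_in")
RULE_HEAD = "works_in"


def apply_edit(kg, edit):
    """Replace any existing (subject, relation, *) triple with the edited one."""
    kept = {
        t for t in kg
        if not (t.subject == edit.subject and t.relation == edit.relation)
    }
    return kept | {edit}


def entailed_facts(kg, edit):
    """Apply the two-hop rule, using the edited fact as the first body atom."""
    if edit.relation != RULE_BODY[0]:
        return []
    return [
        Triple(edit.subject, RULE_HEAD, t.obj)
        for t in sorted(kg)
        if t.subject == edit.obj and t.relation == RULE_BODY[1]
    ]


def multi_hop_question(edit):
    """Hypothetical template asking about the consequence, not the edit itself."""
    return (
        f"In which city is the company that {edit.subject} "
        "works for headquartered?"
    )


edit = Triple("Alice", "works_for", "GlobexInc")
edited_kg = apply_edit(KG, edit)
consequences = entailed_facts(edited_kg, edit)

print(consequences)
print(multi_hop_question(edit))
```

In this sketch, a model that only memorized the edited triple could still answer the direct question ("Who does Alice work for?") while failing the multi-hop one, which is the kind of gap between directly edited and entailed knowledge that the abstract reports.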
| Comments: | Accepted at the 24th International Semantic Web Conference 2025 |
| Subjects: | Computation and Language (cs.CL); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2606.10554 [cs.CL] (or arXiv:2606.10554v1 [cs.CL] for this version) |
| | https://doi.org/10.48550/arXiv.2606.10554 (arXiv-issued DOI via DataCite, pending registration) |
| Journal reference: | The Semantic Web. ISWC 2025. ISWC 2025. Lecture Notes in Computer Science, vol 16141. Springer, Cham |
| Related DOI: | https://doi.org/10.1007/978-3-032-09530-5_3 |
Submission history
From: Tatiana Moteu Ngoli
[v1] Tue, 9 Jun 2026 08:21:56 UTC (585 KB)