RDF-TR: Exploiting structural redundancies to boost RDF compression

Hernández-Illera, Antonio and Martínez-Prieto, Miguel A. and Fernández, Javier D. (2020) RDF-TR: Exploiting structural redundancies to boost RDF compression. Information Sciences, 508. pp. 234-259. ISSN 0020-0255

The number and volume of semantic datasets have grown impressively over the last decade, promoting compression as an essential tool for RDF preservation, sharing and management. In contrast to universal compressors, RDF compression techniques are able to detect and exploit specific forms of redundancy in RDF data. Thus, state-of-the-art RDF compressors excel at exploiting syntactic and semantic redundancies, i.e., repetitions in the serialization format and information that can be inferred implicitly. However, little attention has been paid to the existence of structural patterns within the RDF dataset, i.e., structural redundancy. In this paper, we analyze structural regularities in real-world datasets, and show three schema-based sources of redundancy that underpin the schema-relaxed nature of RDF. Then, we propose RDF-Tr (RDF Triples Reorganizer), a preprocessing technique that discovers and removes this kind of redundancy before the RDF dataset is effectively compressed. In particular, RDF-Tr groups subjects that are described by the same predicates, and locally re-codes the objects related to these predicates. Finally, we integrate RDF-Tr with two RDF compressors, HDT and k²-triples. Our experiments show that using RDF-Tr with these compressors improves their original effectiveness by up to 2.3 times, outperforming the most prominent state-of-the-art techniques.
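
The two steps named in the abstract (grouping subjects by the predicates that describe them, and re-coding objects locally per predicate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the function names, the toy triples and the choice of 1-based local identifiers are all assumptions made here for illustration only.

```python
from collections import defaultdict


def group_by_predicate_set(triples):
    """Group subjects that are described by exactly the same predicates.

    Hypothetical helper: maps each distinct (sorted) predicate set to the
    sorted list of subjects that use it.
    """
    preds_of = defaultdict(set)
    for s, p, _ in triples:
        preds_of[s].add(p)
    groups = defaultdict(list)
    for s, preds in preds_of.items():
        groups[tuple(sorted(preds))].append(s)
    return {preds: sorted(subjects) for preds, subjects in groups.items()}


def recode_objects_locally(triples):
    """Assign each object an ID that is local to the predicate it occurs with.

    Hypothetical helper: local IDs start at 1 for every predicate, so they
    stay small regardless of how many objects the whole dataset contains.
    """
    objects_of = defaultdict(set)
    for _, p, o in triples:
        objects_of[p].add(o)
    return {
        p: {o: i for i, o in enumerate(sorted(objs), start=1)}
        for p, objs in objects_of.items()
    }


# Toy data (assumed, not from the paper): two people share a predicate set.
triples = [
    ("alice", "name", "Alice"),
    ("alice", "age", "30"),
    ("bob", "name", "Bob"),
    ("bob", "age", "30"),
    ("paper1", "title", "RDF-TR"),
]

groups = group_by_predicate_set(triples)
local_ids = recode_objects_locally(triples)
```

In this toy example, alice and bob fall into one group because both are described by exactly the predicates age and name, while each predicate numbers its own objects starting from 1. How RDF-Tr actually encodes these groups and identifiers on top of HDT and k²-triples is described in the paper itself, not in this sketch.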

Item Type: Article
Keywords: RDF compression, Linked Data
Divisions: Departments > Informationsverarbeitung u Prozessmanag. > Informationswirtschaft
Departments > Informationsverarbeitung u Prozessmanag. > Informationswirtschaft > Polleres
Version of the Document: Submitted
Variance from Published Version: Typographical
Depositing User: Doris Wyk
Date Deposited: 25 Mar 2020 07:36
Last Modified: 27 Mar 2020 10:17
Related URLs:
FIDES Link: https://bach.wu.ac.at/d/research/results/94929/
URI: https://epub.wu.ac.at/id/eprint/7526

