SERIMI - Resource Description Similarity, RDF Instance Matching and Interlinking

Araujo, Samur; Hidders, Jan; Schwabe, Daniel; de Vries, Arjen P.

Abstract:

The interlinking of datasets published in the Linked Data Cloud is a challenging problem and a key factor for the success of the Semantic Web. Manual rule-based methods are the most effective solution for the problem, but they require skilled human data publishers to go through a laborious, error-prone and time-consuming process of manually describing rules that map instances between two datasets. Thus, an automatic approach for solving this problem is more than welcome. In this paper, we propose a novel interlinking method, SERIMI, for solving this problem automatically. SERIMI matches instances between a source and a target dataset, without prior knowledge of the data, domain or schema of these datasets. Experiments conducted with benchmark collections demonstrate that our approach considerably outperforms state-of-the-art automatic approaches for solving the interlinking problem on the Linked Data Cloud.

Reference:

SERIMI - Resource Description Similarity, RDF Instance Matching and Interlinking (Samur Araujo, Jan Hidders, Daniel Schwabe, Arjen P. de Vries), In arXiv preprint arXiv:1107.1104, 2011.

Bibtex Entry:

@article{Araujo2011,
abstract = {The interlinking of datasets published in the Linked Data Cloud is a challenging problem and a key factor for the success of the Semantic Web. Manual rule-based methods are the most effective solution for the problem, but they require skilled human data publishers to go through a laborious, error-prone and time-consuming process of manually describing rules that map instances between two datasets. Thus, an automatic approach for solving this problem is more than welcome. In this paper, we propose a novel interlinking method, SERIMI, for solving this problem automatically. SERIMI matches instances between a source and a target dataset, without prior knowledge of the data, domain or schema of these datasets. Experiments conducted with benchmark collections demonstrate that our approach considerably outperforms state-of-the-art automatic approaches for solving the interlinking problem on the Linked Data Cloud.},
archivePrefix = {arXiv},
arxivId = {1107.1104},
author = {Araujo, Samur and Hidders, Jan and Schwabe, Daniel and de Vries, Arjen P.},
eprint = {1107.1104},
journal = {arXiv preprint arXiv:1107.1104},
keywords = {SML-LIB-BIBLIO,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
month = jul,
title = {{SERIMI - Resource Description Similarity, RDF Instance Matching and Interlinking}},
url = {http://arxiv.org/abs/1107.1104},
year = {2011}
}