SimPack: A Generic Java Library for Similarity Measures in Ontologies

Bernstein, Abraham; Kaufmann, Esther; Kiefer, Christoph; Bürki, Christoph

Abstract:

Good similarity measures are central for techniques such as retrieval, matchmaking, clustering, data-mining, ontology translations, automatic database schema matching, and simple object comparisons. Measures for the use with complex (or aggregated) objects in ontologies are, however, rare, even though they are central for semantic web applications. This paper first introduces SimPack, a library of similarity measures for the use in ontologies (of complex objects). The measures of the library are then experimentally compared with a similarity “gold standard” established by surveying 94 human subjects in two ontologies. Results show that human and algorithm assessments vary (both between people and across ontologies), but can be grouped into cohesive clusters, each of which is well modeled by one of the measures in the library. Furthermore, we show two increasingly accurate methods to predict the cluster membership of the subjects providing the foundation for the construction of personalized similarity measures.

Reference:

SimPack: A Generic Java Library for Similarity Measures in Ontologies (Abraham Bernstein, Esther Kaufmann, Christoph Kiefer, Christoph Bürki), Technical report, University of Zurich Department of Informatics, 2005.

Bibtex Entry:

@techreport{Bernstein2006,
abstract = {Good similarity measures are central for techniques such as retrieval, matchmaking, clustering, data-mining, ontology translations, automatic database schema matching, and simple object comparisons. Measures for the use with complex (or aggregated) objects in ontologies are, however, rare, even though they are central for semantic web applications. This paper first introduces SimPack, a library of similarity measures for the use in ontologies (of complex objects). The measures of the library are then experimentally compared with a similarity “gold standard” established by surveying 94 human subjects in two ontologies. Results show that human and algorithm assessments vary (both between people and across ontologies), but can be grouped into cohesive clusters, each of which is well modeled by one of the measures in the library. Furthermore, we show two increasingly accurate methods to predict the cluster membership of the subjects providing the foundation for the construction of personalized similarity measures.},
author = {Bernstein, Abraham and Kaufmann, Esther and Kiefer, Christoph and B\"{u}rki, Christoph},
institution = {University of Zurich Department of Informatics},
keywords = {SML-LIB-BIBLIO,lang:ENG,semantic similarity},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG,semantic similarity},
title = {{SimPack: A Generic Java Library for Similarity Measures in Ontologies}},
url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.91.8152},
year = {2005}
}