Comparing geospatial entity classes: an asymmetric and context-dependent similarity measure

by M. Andrea Rodríguez, Max J. Egenhofer
Abstract:
Semantic similarity plays an important role in geographic information systems as it supports the identification of objects that are conceptually close, but not identical. Similarity assessments are particularly important for retrieval of geospatial data in such settings as digital libraries, heterogeneous databases, and the World Wide Web. Although some computational models for semantic similarity assessment exist, these models are typically limited by their inability to handle such important cognitive properties of similarity judgements as their inherent asymmetry and their dependence on context. This paper defines the Matching-Distance Similarity Measure (MDSM) for determining semantic similarity among spatial entity classes, taking into account the distinguishing features of these classes (parts, functions, and attributes) and their semantic interrelations (is-a and part-whole relations). A matching process is combined with a semantic-distance calculation to obtain asymmetric values of similarity that depend on the degree of generalization of entity classes. MDSM's matching process is also driven by contextual considerations, where the context determines the relative importance of distinguishing features. Based on a human-subject experiment, MDSM results correlate well with people's judgements of similarity. When contextual information is used for determining the importance of distinguishing features, this correlation increases; however, the major component of the correlation between MDSM results and people's judgements is due to a detailed definition of entity classes.
Reference:
Comparing geospatial entity classes: an asymmetric and context-dependent similarity measure (M. Andrea Rodríguez, Max J. Egenhofer), In International Journal of Geographical Information Science, Taylor & Francis, volume 18, 2004.
Bibtex Entry:
@article{AndreaRodriguez2004,
abstract = {Semantic similarity plays an important role in geographic information systems as it supports the identification of objects that are conceptually close, but not identical. Similarity assessments are particularly important for retrieval of geospatial data in such settings as digital libraries, heterogeneous databases, and the World Wide Web. Although some computational models for semantic similarity assessment exist, these models are typically limited by their inability to handle such important cognitive properties of similarity judgements as their inherent asymmetry and their dependence on context. This paper defines the Matching-Distance Similarity Measure (MDSM) for determining semantic similarity among spatial entity classes, taking into account the distinguishing features of these classes (parts, functions, and attributes) and their semantic interrelations (is-a and part-whole relations). A matching process is combined with a semantic-distance calculation to obtain asymmetric values of similarity that depend on the degree of generalization of entity classes. MDSM's matching process is also driven by contextual considerations, where the context determines the relative importance of distinguishing features. Based on a human-subject experiment, MDSM results correlate well with people's judgements of similarity. When contextual information is used for determining the importance of distinguishing features, this correlation increases; however, the major component of the correlation between MDSM results and people's judgements is due to a detailed definition of entity classes.},
author = {Rodr\'{\i}guez, M. Andrea and Egenhofer, Max J.},
doi = {10.1080/13658810310001629592},
issn = {1365-8816},
journal = {International Journal of Geographical Information Science},
keywords = {SML-LIB-BIBLIO,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
month = apr,
number = {3},
pages = {229--256},
publisher = {Taylor \& Francis},
title = {{Comparing geospatial entity classes: an asymmetric and context-dependent similarity measure}},
url = {http://dx.doi.org/10.1080/13658810310001629592},
volume = {18},
year = {2004}
}