Ontology-based information content computation

by David Sánchez, Montserrat Batet, David Isern
Abstract:
The information content (IC) of a concept provides an estimation of its degree of generality/concreteness, a dimension which enables a better understanding of concept's semantics. As a result, IC has been successfully applied to the automatic assessment of the semantic similarity between concepts. In the past, IC has been estimated as the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered due to corpora dependency and data sparseness. More recently, some authors proposed IC-based measures using taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches for IC computation and propose several improvements aimed to better capture the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpora and ontology-based ones) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations which are better correlated with human judgements than related works.
Reference:
Ontology-based information content computation (David Sánchez, Montserrat Batet, David Isern), In Knowledge-Based Systems, volume 24, number 2, pages 297-303, 2011.
Bibtex Entry:
@article{Sanchez2011a,
abstract = {The information content (IC) of a concept provides an estimation of its degree of generality/concreteness, a dimension which enables a better understanding of concept's semantics. As a result, IC has been successfully applied to the automatic assessment of the semantic similarity between concepts. In the past, IC has been estimated as the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered due to corpora dependency and data sparseness. More recently, some authors proposed IC-based measures using taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches for IC computation and propose several improvements aimed to better capture the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpora and ontology-based ones) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations which are better correlated with human judgements than related works.},
author = {S\'{a}nchez, David and Batet, Montserrat and Isern, David},
issn = {0950-7051},
journal = {Knowledge-Based Systems},
keywords = {SML-LIB-BIBLIO,information content,lang:ENG,semantic similarity},
mendeley-tags = {SML-LIB-BIBLIO,information content,lang:ENG,semantic similarity},
month = mar,
number = {2},
pages = {297--303},
title = {{Ontology-based information content computation}},
volume = {24},
year = {2011}
}