An Intrinsic Information Content Metric for Semantic Similarity in WordNet

by Nuno Seco, Tony Veale, Jer Hayes
Abstract:
Information Content (IC) is an important dimension of word knowledge when assessing the similarity of two terms or word senses. The conventional way of measuring the IC of word senses is to combine knowledge of their hierarchical structure from an ontology like WordNet with statistics on their actual usage in text as derived from a large corpus. In this paper we present a wholly intrinsic measure of IC that relies on hierarchical structure alone. We report that this measure is consequently easier to calculate, yet when used as the basis of a similarity mechanism it yields judgments that correlate more closely with human assessments than other, extrinsic measures of IC that additionally employ corpus analysis.
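As a rough illustration of an IC measure that uses hierarchical structure alone, the sketch below scores a concept by how many hyponyms it has, using the form usually attributed to this paper: IC(c) = 1 - log(hypo(c) + 1) / log(max), where hypo(c) is the number of hyponyms of c and max is the total number of concepts. The formula is not stated on this page, and the toy taxonomy and the names `TAXONOMY`, `count_hyponyms`, and `intrinsic_ic` are hypothetical, so treat this as a sketch under those assumptions rather than the authors' implementation.

```python
import math

# Hypothetical toy taxonomy (parent -> children); not taken from WordNet.
TAXONOMY = {
    "entity": ["animal", "artifact"],
    "animal": ["dog", "cat"],
    "artifact": ["car"],
    "dog": [],
    "cat": [],
    "car": [],
}


def count_hyponyms(concept, taxonomy):
    """Count all descendants of `concept`, excluding the concept itself."""
    seen = set()
    stack = list(taxonomy[concept])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(taxonomy[node])
    return len(seen)


def intrinsic_ic(concept, taxonomy):
    """Assumed form: IC(c) = 1 - log(hypo(c) + 1) / log(total concepts).

    Leaves score 1 (most informative); a root that subsumes every
    other concept scores 0 (least informative).
    """
    total = len(taxonomy)
    hypo = count_hyponyms(concept, taxonomy)
    return 1.0 - math.log(hypo + 1) / math.log(total)


if __name__ == "__main__":
    for name in TAXONOMY:
        print(name, round(intrinsic_ic(name, TAXONOMY), 4))
```

No corpus counts appear anywhere in the computation: a concept with many hyponyms is treated as general and therefore less informative, which is the sense in which such a measure is intrinsic to the taxonomy.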
Reference:
An Intrinsic Information Content Metric for Semantic Similarity in WordNet (Nuno Seco, Tony Veale, Jer Hayes), In 16th European Conference on Artificial Intelligence, IOS Press, 2004.
Bibtex Entry:
@inproceedings{Seco2004,
abstract = {Information Content (IC) is an important dimension of word knowledge when assessing the similarity of two terms or word senses. The conventional way of measuring the IC of word senses is to combine knowledge of their hierarchical structure from an ontology like WordNet with statistics on their actual usage in text as derived from a large corpus. In this paper we present a wholly intrinsic measure of IC that relies on hierarchical structure alone. We report that this measure is consequently easier to calculate, yet when used as the basis of a similarity mechanism it yields judgments that correlate more closely with human assessments than other, extrinsic measures of IC that additionally employ corpus analysis.},
author = {Seco, Nuno and Veale, Tony and Hayes, Jer},
booktitle = {16th European Conference on Artificial Intelligence},
keywords = {Intrinsic IC,SML-LIB-BIBLIO,Semantic Similarity,information content,lang:ENG},
mendeley-tags = {Intrinsic IC,SML-LIB-BIBLIO,Semantic Similarity,information content,lang:ENG},
pages = {1--5},
publisher = {IOS Press},
title = {{An Intrinsic Information Content Metric for Semantic Similarity in WordNet}},
year = {2004}
}