A New Model of Information Content for Semantic Similarity in WordNet

by Zili Zhou, Yanna Wang, Junzhong Gu
Abstract:
Information Content (IC) is an important dimension of assessing the semantic similarity between two terms or word senses in word knowledge. The conventional method of obtaining the IC of word senses is to combine knowledge of their hierarchical structure from an ontology like WordNet with actual usage in text as derived from a large corpus. In this paper, a new model of IC is presented, which relies on hierarchical structure alone. The model considers not only the hyponyms of each word sense but also its depth in the structure. The IC value is easier to calculate with our model, and when used as the basis of a similarity approach it yields judgments that correlate more closely with human assessments than approaches that use IC values obtained by considering only the hyponyms or by employing corpus analysis.
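The abstract does not state the formula itself, so the following is only a minimal sketch of how a structure-only IC could combine the two ingredients it names (hyponym count and depth). The function name `intrinsic_ic`, the weighting parameter `k`, the logarithmic normalisation, and the toy hierarchy sizes are all assumptions made for illustration, not details taken from this page.

```python
import math


def intrinsic_ic(hyponym_count, depth, total_nodes, max_depth, k=0.5):
    """Illustrative structure-only IC (hypothetical form, not the paper's text).

    hyponym_count: number of descendants of the word sense
    depth:         depth of the word sense, counting the root as 1
    total_nodes:   number of nodes in the whole hierarchy
    max_depth:     depth of the deepest node in the hierarchy
    k:             assumed weight between the hyponym and depth terms
    """
    # Hyponym term: fewer descendants means a more specific sense, so higher IC.
    hypo_term = 1.0 - math.log(hyponym_count + 1) / math.log(total_nodes)
    # Depth term: a deeper sense is more specific, so higher IC.
    depth_term = math.log(depth) / math.log(max_depth)
    return k * hypo_term + (1.0 - k) * depth_term


# Toy hierarchy (assumed): 8 nodes, deepest node at depth 4.
root_ic = intrinsic_ic(hyponym_count=7, depth=1, total_nodes=8, max_depth=4)
shallow_leaf_ic = intrinsic_ic(hyponym_count=0, depth=2, total_nodes=8, max_depth=4)
deep_leaf_ic = intrinsic_ic(hyponym_count=0, depth=4, total_nodes=8, max_depth=4)
```

Under these assumptions the root gets the lowest value and the deepest leaf the highest, and two leaves with no hyponyms are still told apart by their depth, which is the distinction a hyponym-only measure cannot make.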
Reference:
A New Model of Information Content for Semantic Similarity in WordNet (Zili Zhou, Yanna Wang, Junzhong Gu), In FGCNS'08 Proceedings of the 2008 Second International Conference on Future Generation Communication and Networking Symposia - Volume 03, IEEE Computer Society, 2008.
Bibtex Entry:
@inproceedings{Zhou2008,
abstract = {Information Content (IC) is an important dimension of assessing the semantic similarity between two terms or word senses in word knowledge. The conventional method of obtaining the IC of word senses is to combine knowledge of their hierarchical structure from an ontology like WordNet with actual usage in text as derived from a large corpus. In this paper, a new model of IC is presented, which relies on hierarchical structure alone. The model considers not only the hyponyms of each word sense but also its depth in the structure. The IC value is easier to calculate with our model, and when used as the basis of a similarity approach it yields judgments that correlate more closely with human assessments than approaches that use IC values obtained by considering only the hyponyms or by employing corpus analysis.},
author = {Zhou, Zili and Wang, Yanna and Gu, Junzhong},
booktitle = {FGCNS'08 Proceedings of the 2008 Second International Conference on Future Generation Communication and Networking Symposia - Volume 03},
doi = {10.1109/FGCNS.2008.16},
isbn = {978-1-4244-3430-5},
keywords = {Information Content,SML-LIB-BIBLIO,Semantic Similarity,WordNet,information content,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,Semantic Similarity,information content,lang:ENG},
month = dec,
pages = {85--89},
publisher = {IEEE Computer Society},
title = {{A New Model of Information Content for Semantic Similarity in WordNet}},
year = {2008}
}