Using WordNet-based context vectors to estimate the semantic relatedness of concepts

Patwardhan, Siddharth; Pedersen, Ted

Abstract:

In this paper, we introduce a WordNet-based measure of semantic relatedness by combining the structure and content of WordNet with co-occurrence information derived from raw text. We use the co-occurrence information along with the WordNet definitions to build gloss vectors corresponding to each concept in WordNet. Numeric scores of relatedness are assigned to a pair of concepts by measuring the cosine of the angle between their respective gloss vectors. We show that this measure compares favorably to other measures with respect to human judgments of semantic relatedness, and that it performs well when used in a word sense disambiguation algorithm that relies on semantic relatedness. This measure is flexible in that it can make comparisons between any two concepts without regard to their part of speech. In addition, it can be adapted to different domains, since any plain text corpus can be used to derive the co-occurrence information.
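
The idea in the abstract (build a vector for each concept from the co-occurrence vectors of the words in its definition, then compare concepts by cosine) can be illustrated with a minimal sketch. This is a toy approximation and not the authors' implementation: the corpus, the glosses, the window size, and the function names `cooccurrence_vectors`, `gloss_vector`, `cosine`, and `relatedness` are all assumptions made for illustration, and no real WordNet data, stop-word filtering, or weighting is used.

```python
import math
from collections import Counter


def cooccurrence_vectors(corpus, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def gloss_vector(gloss, word_vectors):
    """Sum the co-occurrence vectors of the gloss words that appear in the corpus."""
    total = Counter()
    for token in gloss.lower().split():
        total.update(word_vectors.get(token, Counter()))
    return total


def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(concept_a, concept_b, glosses, word_vectors):
    """Relatedness of two concepts as the cosine between their gloss vectors."""
    return cosine(
        gloss_vector(glosses[concept_a], word_vectors),
        gloss_vector(glosses[concept_b], word_vectors),
    )


# Hypothetical toy data: a three-sentence "corpus" and hand-written glosses.
corpus = [
    "the doctor treats the patient in the hospital",
    "the nurse helps the patient in the hospital",
    "the bank lends money to the customer",
]
glosses = {
    "doctor": "person who treats patient",
    "nurse": "person who helps patient",
    "bank": "institution that lends money",
}

word_vectors = cooccurrence_vectors(corpus)
print(relatedness("doctor", "nurse", glosses, word_vectors))
print(relatedness("doctor", "bank", glosses, word_vectors))
```

Because the score depends only on the gloss text and the corpus, the same function applies to any pair of concepts regardless of part of speech, and swapping in a different corpus changes the co-occurrence vectors without changing the rest of the procedure.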

Reference:

Using WordNet-based context vectors to estimate the semantic relatedness of concepts (Siddharth Patwardhan, Ted Pedersen), In EACL Workshop Making Sense of Sense --- Bringing Computational Linguistics and Psycholinguistics Together, 2006.

Bibtex Entry:

@inproceedings{Patwardhan2006,
abstract = {In this paper, we introduce a WordNet-based measure of semantic relatedness by combining the structure and content of WordNet with co-occurrence information derived from raw text. We use the co-occurrence information along with the WordNet definitions to build gloss vectors corresponding to each concept in WordNet. Numeric scores of relatedness are assigned to a pair of concepts by measuring the cosine of the angle between their respective gloss vectors. We show that this measure compares favorably to other measures with respect to human judgments of semantic relatedness, and that it performs well when used in a word sense disambiguation algorithm that relies on semantic relatedness. This measure is flexible in that it can make comparisons between any two concepts without regard to their part of speech. In addition, it can be adapted to different domains, since any plain text corpus can be used to derive the co-occurrence information.},
author = {Patwardhan, Siddharth and Pedersen, Ted},
booktitle = {EACL Workshop Making Sense of Sense --- Bringing Computational Linguistics and Psycholinguistics Together},
keywords = {SML-LIB-BIBLIO,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
pages = {1--8},
title = {{Using WordNet-based context vectors to estimate the semantic relatedness of concepts}},
url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.116.9473},
year = {2006}
}