Enhancing MEDLINE document clustering by incorporating MeSH semantic similarity.

by Shanfeng Zhu, Jia Zeng, Hiroshi Mamitsuka
Abstract:
Clustering MEDLINE documents is usually conducted by the vector space model, which computes the content similarity between two documents by basically using the inner-product of their word vectors. Recently, the semantic information of MeSH (Medical Subject Headings) thesaurus is being applied to clustering MEDLINE documents by mapping documents into MeSH concept vectors to be clustered. However, current approaches of using MeSH thesaurus have two serious limitations: first, important semantic information may be lost when generating MeSH concept vectors, and second, the content information of the original text has been discarded.
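As a rough illustration of the vector space model mentioned in the abstract, the sketch below computes the inner product of two bag-of-words vectors. It is a minimal example and not the method of the paper: the function names, the whitespace tokenizer, the raw term counts (no TF-IDF weighting or normalization), and the two toy documents are all assumptions made here for illustration.

```python
from collections import Counter


def word_vector(text):
    """Build a sparse bag-of-words vector (term -> count) from raw text.

    Tokenization is a plain lowercase whitespace split, an assumption
    chosen for brevity.
    """
    return Counter(text.lower().split())


def inner_product(u, v):
    """Inner product of two sparse term-count vectors."""
    return sum(count * v[term] for term, count in u.items() if term in v)


# Hypothetical toy documents, not taken from MEDLINE.
doc_a = word_vector("protein kinase signaling in yeast")
doc_b = word_vector("kinase signaling pathways in human cells")

# The documents share the terms "kinase", "signaling", and "in".
similarity = inner_product(doc_a, doc_b)
print(similarity)
```

In practice the counts would typically be weighted and length-normalized before comparison, so that long documents do not dominate the similarity scores.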
Reference:
Enhancing MEDLINE document clustering by incorporating MeSH semantic similarity. (Shanfeng Zhu, Jia Zeng, Hiroshi Mamitsuka), In Bioinformatics (Oxford, England), volume 25, number 15, pages 1944-1951, 2009.
Bibtex Entry:
@article{Zhu2009,
abstract = {Clustering MEDLINE documents is usually conducted by the vector space model, which computes the content similarity between two documents by basically using the inner-product of their word vectors. Recently, the semantic information of MeSH (Medical Subject Headings) thesaurus is being applied to clustering MEDLINE documents by mapping documents into MeSH concept vectors to be clustered. However, current approaches of using MeSH thesaurus have two serious limitations: first, important semantic information may be lost when generating MeSH concept vectors, and second, the content information of the original text has been discarded.},
author = {Zhu, Shanfeng and Zeng, Jia and Mamitsuka, Hiroshi},
doi = {10.1093/bioinformatics/btp338},
issn = {1367-4811},
journal = {Bioinformatics (Oxford, England)},
keywords = {Abstracting and Indexing as Topic,Algorithms,Cluster Analysis,Computational Biology,Computational Biology: methods,Database Management Systems,Information Storage and Retrieval,MEDLINE,Medical Subject Headings,Pattern Recognition, Automated,Pattern Recognition, Automated: methods,SML-LIB-BIBLIO,Vocabulary, Controlled,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
month = aug,
number = {15},
pages = {1944--1951},
pmid = {19497938},
title = {{Enhancing MEDLINE document clustering by incorporating MeSH semantic similarity.}},
url = {http://www.ncbi.nlm.nih.gov/pubmed/19497938},
volume = {25},
year = {2009}
}