A novel insight into Gene Ontology semantic similarity

by Yungang Xu, Maozu Guo, Wenli Shi, Xiaoyan Liu, Chunyu Wang
Abstract:
Existing methods for computing the semantic similarity between Gene Ontology (GO) terms are often based on external datasets and, therefore, are not intrinsic to GO. Furthermore, they not only fail to handle identical annotations but also show a strong bias toward well-annotated proteins when used to measure the similarity of proteins. Inspired by the concept of cellular differentiation and dedifferentiation in developmental biology, we propose a shortest semantic differentiation distance (SSDD) based on the concept of semantic totipotency to measure the semantic similarity of GO terms and further compare the functional similarity of proteins. Using human ratings and a benchmark dataset, SSDD was found to improve upon existing methods for computing the semantic similarity of GO terms. An in-depth analysis shows that SSDD is able to distinguish identical annotations and does not depend on annotation richness, thus producing more unbiased and reliable results. Online services can be accessed at the Gene Functional Similarity Analysis Tools website (GFSAT: http://nclab.hit.edu.cn/GFSAT).
Reference:
A novel insight into Gene Ontology semantic similarity (Yungang Xu, Maozu Guo, Wenli Shi, Xiaoyan Liu, Chunyu Wang), In Genomics, Elsevier B.V., volume 101, 2013.
Bibtex Entry:
@article{Xu2013,
abstract = {Existing methods for computing the semantic similarity between Gene Ontology (GO) terms are often based on external datasets and, therefore, are not intrinsic to GO. Furthermore, they not only fail to handle identical annotations but also show a strong bias toward well-annotated proteins when used to measure the similarity of proteins. Inspired by the concept of cellular differentiation and dedifferentiation in developmental biology, we propose a shortest semantic differentiation distance (SSDD) based on the concept of semantic totipotency to measure the semantic similarity of GO terms and further compare the functional similarity of proteins. Using human ratings and a benchmark dataset, SSDD was found to improve upon existing methods for computing the semantic similarity of GO terms. An in-depth analysis shows that SSDD is able to distinguish identical annotations and does not depend on annotation richness, thus producing more unbiased and reliable results. Online services can be accessed at the Gene Functional Similarity Analysis Tools website (GFSAT: http://nclab.hit.edu.cn/GFSAT).},
author = {Xu, Yungang and Guo, Maozu and Shi, Wenli and Liu, Xiaoyan and Wang, Chunyu},
doi = {10.1016/j.ygeno.2013.04.010},
issn = {0888-7543},
journal = {Genomics},
keywords = {SML-LIB-BIBLIO,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
month = apr,
number = {6},
pages = {368--375},
publisher = {Elsevier B.V.},
title = {{A novel insight into Gene Ontology semantic similarity}},
url = {http://linkinghub.elsevier.com/retrieve/pii/S0888754313000876},
volume = {101},
year = {2013}
}