Fuzzy Measures on the Gene Ontology for Gene Product Similarity

by Mihail Popescu, James M Keller, Joyce A Mitchell
Abstract:
One of the most important objects in bioinformatics is a gene product (protein or RNA). For many gene products, functional information is summarized in a set of Gene Ontology (GO) annotations. For these genes, it is reasonable to include similarity measures based on the terms found in the GO or other taxonomy. In this paper, we introduce several novel measures for computing the similarity of two gene products annotated with GO terms. The fuzzy measure similarity (FMS) has the advantage that it takes into consideration the context of both complete sets of annotation terms when computing the similarity between two gene products. When the two gene products are not annotated by common taxonomy terms, we propose a method that avoids a zero similarity result. To account for the variations in the annotation reliability, we propose a similarity measure based on the Choquet integral. These similarity measures provide extra tools for the biologist in search of functional information for gene products. The initial testing on a group of 194 sequences representing three protein families shows a higher correlation of the FMS and Choquet similarities to the BLAST sequence similarities than the traditional similarity measures such as pairwise average or pairwise maximum.
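The abstract names the Choquet integral but does not give its form. As a rough illustration only, the sketch below computes the standard discrete Choquet integral of a set of scores with respect to a fuzzy measure. It is not the authors' FMS or Choquet similarity: the function name `choquet_integral`, the example `scores` and `weights`, and the three toy measures are all hypothetical and chosen just to show how the choice of measure changes the aggregate.

```python
from typing import Callable, Dict, FrozenSet, Hashable


def choquet_integral(values: Dict[Hashable, float],
                     measure: Callable[[FrozenSet], float]) -> float:
    """Discrete Choquet integral of `values` with respect to `measure`.

    `measure` maps a frozenset of keys to a number in [0, 1]; it is assumed
    to be monotone, with measure(empty set) = 0 and measure(all keys) = 1.
    """
    # Visit the sources in order of decreasing value.
    ordered = sorted(values, key=values.get, reverse=True)
    total = 0.0
    previous = 0.0  # measure of the empty set
    chosen = set()
    for key in ordered:
        chosen.add(key)
        current = measure(frozenset(chosen))
        # Weight each value by the gain in measure from adding its source.
        total += values[key] * (current - previous)
        previous = current
    return total


# Hypothetical scores and per-source weights (illustrative numbers only).
scores = {"a": 0.9, "b": 0.4, "c": 0.1}
weights = {"a": 0.5, "b": 0.3, "c": 0.2}


def additive(subset: FrozenSet) -> float:
    # Additive measure: the integral reduces to a weighted average.
    return sum(weights[k] for k in subset)


def any_source(subset: FrozenSet) -> float:
    # Measure that is 1 on every non-empty set: the integral is the maximum.
    return 1.0 if subset else 0.0


def all_sources(subset: FrozenSet) -> float:
    # Measure that is 1 only on the full set: the integral is the minimum.
    return 1.0 if len(subset) == len(scores) else 0.0


weighted_mean = choquet_integral(scores, additive)
maximum = choquet_integral(scores, any_source)
minimum = choquet_integral(scores, all_sources)
```

With an additive measure the result is the ordinary weighted mean, while the two extreme measures recover the maximum and the minimum, so a non-additive fuzzy measure lets the aggregate fall anywhere between those behaviours.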
Reference:
Fuzzy Measures on the Gene Ontology for Gene Product Similarity (Mihail Popescu, James M Keller, Joyce A Mitchell), In IEEE/ACM Transactions on Computational Biology and Bioinformatics, volume 3, 2006.
Bibtex Entry:
@article{Popescu2006,
abstract = {One of the most important objects in bioinformatics is a gene product (protein or RNA). For many gene products, functional information is summarized in a set of Gene Ontology (GO) annotations. For these genes, it is reasonable to include similarity measures based on the terms found in the GO or other taxonomy. In this paper, we introduce several novel measures for computing the similarity of two gene products annotated with GO terms. The fuzzy measure similarity (FMS) has the advantage that it takes into consideration the context of both complete sets of annotation terms when computing the similarity between two gene products. When the two gene products are not annotated by common taxonomy terms, we propose a method that avoids a zero similarity result. To account for the variations in the annotation reliability, we propose a similarity measure based on the Choquet integral. These similarity measures provide extra tools for the biologist in search of functional information for gene products. The initial testing on a group of 194 sequences representing three protein families shows a higher correlation of the FMS and Choquet similarities to the BLAST sequence similarities than the traditional similarity measures such as pairwise average or pairwise maximum.},
author = {Popescu, Mihail and Keller, James M and Mitchell, Joyce A},
doi = {10.1109/TCBB.2006.37},
institution = {Health Management and Informatics Department, University of Missouri, Columbia, MO 65211, USA. popescum@missouri.edu},
issn = {1545-5963},
journal = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
keywords = {SML-LIB-BIBLIO,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
number = {3},
pages = {263--274},
pmid = {17048464},
title = {{Fuzzy Measures on the Gene Ontology for Gene Product Similarity}},
url = {http://doi.ieeecomputersociety.org/10.1109/TCBB.2006.37},
volume = {3},
year = {2006}
}