A graph-theoretic modeling on GO space for biological interpretation of gene clusters.

by Sung Geun Lee, Jung Uk Hur, Yang Seok Kim
Abstract:
MOTIVATION: With the advent of DNA microarray technologies, the parallel quantification of genome-wide transcriptions has been a great opportunity to systematically understand the complicated biological phenomena. Amidst the enthusiastic investigations into the intricate gene expression data, clustering methods have been the useful tools to uncover the meaningful patterns hidden in those data. The mathematical techniques, however, entirely based on the numerical expression data, do not show biologically relevant information on the clustering results.

RESULTS: We present a novel methodology for biological interpretation of gene clusters. Our graph theoretic algorithm extracts common biological attributes of the genes within a cluster or a group of interest through the modified structure of gene ontology (GO) called GO tree. After genes are annotated with GO terms, the hierarchical nature of GO terms is used to find the representative biological meanings of the gene clusters. In addition, the biological significance of gene clusters can be assessed quantitatively by defining a distance function on the GO tree. Our approach has a complementary meaning to many statistical clustering techniques; we can see clustering problems from a different viewpoint by use of biological ontology. We applied this algorithm to the well-known data set and successfully obtained the biological features of the gene clusters with the quantitative biological assessment of clustering quality through GO Biological Process.
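Illustrative sketch (not the authors' code): the abstract mentions two ideas, using the hierarchy to find a representative term for a group of annotations and defining a distance on the GO tree. The abstract does not give the paper's actual definitions, so the Python below swaps in two standard stand-ins as assumptions: the deepest common ancestor as the "representative" term, and the number of edges on the path between two terms as the distance. The tree and its term labels are hypothetical, not real GO identifiers.

```python
# Toy GO-like tree: child term -> parent term.
# Labels are hypothetical placeholders, not real GO identifiers.
PARENT = {
    "metabolism": "biological_process",
    "cell_cycle": "biological_process",
    "carbohydrate_metabolism": "metabolism",
    "lipid_metabolism": "metabolism",
    "glycolysis": "carbohydrate_metabolism",
    "mitosis": "cell_cycle",
}


def path_to_root(term):
    """Return the list of terms from `term` up to the root, inclusive."""
    path = [term]
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path


def common_ancestor(terms):
    """Deepest term lying on every given term's path to the root.

    Assumption: used here as a stand-in for a 'representative' term.
    """
    paths = [path_to_root(t) for t in terms]
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking up from the first term, the first shared term is the deepest.
    return next(t for t in paths[0] if t in shared)


def tree_distance(a, b):
    """Number of edges on the tree path between terms a and b.

    Assumption: a simple path-length distance, not the paper's definition.
    """
    pa, pb = path_to_root(a), path_to_root(b)
    lca = common_ancestor([a, b])
    return pa.index(lca) + pb.index(lca)


if __name__ == "__main__":
    cluster_terms = ["glycolysis", "lipid_metabolism"]
    print(common_ancestor(cluster_terms))
    print(tree_distance("glycolysis", "mitosis"))
```

In this toy setting, a cluster whose annotations share a deep common ancestor and have small pairwise distances would count as more biologically coherent than one whose terms meet only at the root.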
Reference:
A graph-theoretic modeling on GO space for biological interpretation of gene clusters. (Sung Geun Lee, Jung Uk Hur, Yang Seok Kim), In Bioinformatics (Oxford, England), volume 20, number 3, pages 381-388, 2004.
Bibtex Entry:
@article{Lee2004,
abstract = {MOTIVATION: With the advent of DNA microarray technologies, the parallel quantification of genome-wide transcriptions has been a great opportunity to systematically understand the complicated biological phenomena. Amidst the enthusiastic investigations into the intricate gene expression data, clustering methods have been the useful tools to uncover the meaningful patterns hidden in those data. The mathematical techniques, however, entirely based on the numerical expression data, do not show biologically relevant information on the clustering results. RESULTS: We present a novel methodology for biological interpretation of gene clusters. Our graph theoretic algorithm extracts common biological attributes of the genes within a cluster or a group of interest through the modified structure of gene ontology (GO) called GO tree. After genes are annotated with GO terms, the hierarchical nature of GO terms is used to find the representative biological meanings of the gene clusters. In addition, the biological significance of gene clusters can be assessed quantitatively by defining a distance function on the GO tree. Our approach has a complementary meaning to many statistical clustering techniques; we can see clustering problems from a different viewpoint by use of biological ontology. We applied this algorithm to the well-known data set and successfully obtained the biological features of the gene clusters with the quantitative biological assessment of clustering quality through GO Biological Process.},
author = {Lee, Sung Geun and Hur, Jung Uk and Kim, Yang Seok},
doi = {10.1093/bioinformatics/btg420},
issn = {1367-4803},
journal = {Bioinformatics (Oxford, England)},
keywords = {Algorithms,Automated,Cluster Analysis,Database Management Systems,Databases,Gene Expression Profiling,Gene Expression Profiling: methods,Genetic,Information Storage and Retrieval,Information Storage and Retrieval: methods,Natural Language Processing,Pattern Recognition,Proteins,Proteins: classification,Proteins: genetics,Reproducibility of Results,SML-LIB-BIBLIO,Saccharomyces cerevisiae Proteins,Saccharomyces cerevisiae Proteins: classification,Saccharomyces cerevisiae Proteins: genetics,Sensitivity and Specificity,lang:ENG},
mendeley-tags = {SML-LIB-BIBLIO,lang:ENG},
month = feb,
number = {3},
pages = {381--388},
pmid = {14960465},
title = {{A graph-theoretic modeling on GO space for biological interpretation of gene clusters}},
url = {http://www.ncbi.nlm.nih.gov/pubmed/14960465},
volume = {20},
year = {2004}
}