Phylogenomic inference of protein molecular function: advances and challenges
- PMID: 14734307
- DOI: 10.1093/bioinformatics/bth021
Abstract
Motivation: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes. As a number of studies have shown, standard methods of protein function prediction produce systematic errors on these data. Phylogenomic analysis, which combines phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs, has been proposed to address these errors and improve the accuracy of functional classification. The explicit integration of structure prediction and analysis in this framework, which we call structural phylogenomics, provides additional insights into protein superfamily evolution.
Results: Functional classification of proteins by phylogenomic analysis shows fewer expected false positives overall than classification by pairwise methods. We present an overview of the motivations and fundamental principles of phylogenomic analysis, new methods developed for the key tasks and benchmark datasets for these tasks (when available), and we suggest procedures to increase accuracy. We also discuss some of the methods used in the Celera Genomics high-throughput phylogenomic classification of the human genome.
Availability: Software tools from the Berkeley Phylogenomics Group are available at http://phylogenomics.berkeley.edu
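Illustration: The differentiation of orthologs and paralogs named in the Motivation can be sketched on a toy gene tree. The example below is a minimal sketch, not the method described in the article: it assumes the widely used species-overlap heuristic (an internal node is treated as a duplication when its child subtrees share a species, and as a speciation otherwise). The tree, the `species|gene` leaf naming and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy gene tree as nested tuples; leaves are "species|gene" labels.
TREE = ((("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")), "yeast|A")


def species(leaf):
    """Return the species part of a 'species|gene' leaf label."""
    return leaf.split("|")[0]


def leaves(node):
    """Collect all leaf labels under a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def classify_pairs(node, out=None):
    """Label every leaf pair by the event inferred at its last common ancestor.

    Assumption: species-overlap rule. If any two child subtrees share a
    species, the node is called a duplication (pairs across it are paralogs);
    otherwise it is called a speciation (pairs across it are orthologs).
    """
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    child_leaves = [leaves(child) for child in node]
    child_species = [set(map(species, ls)) for ls in child_leaves]
    is_duplication = any(a & b for a, b in combinations(child_species, 2))
    label = "paralog" if is_duplication else "ortholog"
    # Pairs whose last common ancestor is this node span two different children.
    for left, right in combinations(child_leaves, 2):
        for x in left:
            for y in right:
                out[frozenset((x, y))] = label
    for child in node:
        classify_pairs(child, out)
    return out


pairs = classify_pairs(TREE)
for pair, label in sorted((sorted(p), l) for p, l in pairs.items()):
    print(pair, label)
```

In a phylogenomic workflow, labels of this kind would be one input for deciding which experimentally characterized sequences are reasonable sources for annotation transfer. Real gene trees additionally require rooting, support values and handling of gene loss, none of which this sketch attempts.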