Codon test: modeling amino acid substitution preferences in coding sequences
Date
2010-08
Authors
Delport, Wayne
Scheffler, Konrad
Botha, Gordon
Gravenor, Mike B.
Muse, Spencer V.
Pond, Sergei L. Kosakovsky
Publisher
PLOS Computational Biology
Abstract
Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models,
however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged.
Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution
offer improved fit over single rate models. However, these approaches have been limited by the necessity for large
alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can
be subdivided into K rate classes, dependent on the information content of the alignment. However, given the
combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic
Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of
K rate classes, where K is estimated from the alignment. Other parameters of the phylogenetic Markov model, including
substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization
procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon
evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene- and
organism-specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.
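To give a sense of why the abstract calls the number of candidate models "combinatorially large", the following minimal sketch counts the ways of partitioning N substitution pairs into K non-empty rate classes (the Stirling number of the second kind). It is an illustration only, not the authors' method: the function name `stirling2` and the value of `n_pairs` are assumptions chosen for the example and are not taken from the paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Count partitions of n labelled items into k non-empty, unlabelled classes."""
    if n == k:
        return 1  # every item in its own class (also covers n == k == 0)
    if k == 0 or k > n:
        return 0  # no valid partition exists
    # The n-th item either joins one of the k existing classes
    # or forms a new class on its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


if __name__ == "__main__":
    n_pairs = 20  # hypothetical number of amino acid pairs, for illustration only
    for k in range(1, 6):
        print(f"K={k}: {stirling2(n_pairs, k)} possible assignments")
```

The count grows so quickly with N and K that exhaustive enumeration is impractical, which is the motivation the abstract gives for a heuristic search such as a genetic algorithm.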
Description
The original publication is available at www.ploscompbiol.org
Keywords
Codon models, Amino acids
Citation
Delport, W. et al. 2010. Codon test: modeling amino acid substitution preferences in coding sequences. PLoS Computational Biology, 6(8):1-17. doi:10.1371/journal.pcbi.1000885.