Browsing by Author "Murrell, Ben"
Now showing 1 - 6 of 6
- Detecting individual sites subject to episodic diversifying selection (Public Library of Science, 2012-07-02) Murrell, Ben; Wertheim, Joel O.; Moola, Sasha; Weighill, Thomas; Scheffler, Konrad; Pond, Sergei L. Kosakovsky. The imprint of natural selection on protein coding genes is often difficult to identify because selection is frequently transient or episodic, i.e. it affects only a subset of lineages. Existing computational techniques, which are designed to identify sites subject to pervasive selection, may fail to recognize sites where selection is episodic: a large proportion of positively selected sites. We present a mixed effects model of evolution (MEME) that is capable of identifying instances of both episodic and pervasive positive selection at the level of an individual site. Using empirical and simulated data, we demonstrate the superior performance of MEME over older models under a broad range of scenarios. We find that episodic selection is widespread and conclude that the number of sites experiencing positive selection may have been vastly underestimated.
- Identification of broadly neutralizing antibody epitopes in the HIV-1 envelope glycoprotein using evolutionary models (BioMed Central, 2013-12-02) Lacerda, Miguel; Moore, Penny L.; Ngandu, Nobubelo K.; Seaman, Michael; Gray, Elin S.; Murrell, Ben; Krishnamoorthy, Mohan; Nonyane, Molati; Madiga, Maphuti; Wibmer, Constantinos K.; Sheward, Daniel; Bailer, Robert T.; Gao, Hongmei; Greene, Kelli M.; Karim, Salim S. A.; Mascola, John R.; Korber, Bette T. M.; Montefiori, David C.; Morris, Lynn; Williamson, Carolyn; Seoighe, Cathal; the CAVD-NSDP Consortium. Background: Identification of the epitopes targeted by antibodies that can neutralize diverse HIV-1 strains can provide important clues for the design of a preventative vaccine. Methods: We have developed a computational approach that can identify key amino acids within the HIV-1 envelope glycoprotein that influence sensitivity to broadly cross-neutralizing antibodies. Given a sequence alignment and neutralization titers for a panel of viruses, the method works by fitting a phylogenetic model that allows the amino acid frequencies at each site to depend on neutralization sensitivities. Sites at which viral evolution influences neutralization sensitivity were identified using Bayes factors (BFs) to compare the fit of this model to that of a null model in which sequences evolved independently of antibody sensitivity. Conformational epitopes were identified with a Metropolis algorithm that searched for a cluster of sites with large Bayes factors on the tertiary structure of the viral envelope. Results: We applied our method to ID50 neutralization data generated from seven HIV-1 subtype C serum samples with neutralization breadth that had been tested against a multi-clade panel of 225 pseudoviruses for which envelope sequences were also available. For each sample, between two and four sites were identified that were strongly associated with neutralization sensitivity (2ln(BF) > 6), a subset of which were experimentally confirmed using site-directed mutagenesis. Conclusions: Our results provide strong support for the use of evolutionary models applied to cross-sectional viral neutralization data to identify the epitopes of serum antibodies that confer neutralization breadth.
- Modeling HIV-1 drug resistance as episodic directional selection (PLOS Computational Biology, 2011-05) Murrell, Ben; De Oliveira, Tulio; Seebregts, Chris; Pond, Sergei L. Kosakovsky; Scheffler, Konrad. The evolution of substitutions conferring drug resistance to HIV-1 is both episodic, occurring when patients are on antiretroviral therapy, and strongly directional, with site-specific resistant residues increasing in frequency over time. While methods exist to detect episodic diversifying selection and continuous directional selection, no evolutionary model combining these two properties has been proposed. We present two models of episodic directional selection (MEDS and EDEPS) which allow the a priori specification of lineages expected to have undergone directional selection. The models infer the sites and target residues that were likely subject to directional selection, using either codon or protein sequences. Compared to its null model of episodic diversifying selection, MEDS provides a superior fit to most sites known to be involved in drug resistance, and neither one test for episodic diversifying selection nor another for constant directional selection is able to detect as many true positives as MEDS and EDEPS while maintaining acceptable levels of false positives. This suggests that episodic directional selection is a better description of the process driving the evolution of drug resistance.
- Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution (PLOS, 2011-12-22) Murrell, Ben; Weighill, Thomas; Buys, Jan; Ketteringham, Robert; Moola, Sasha; Benade, Gerdus; du Buisson, Lise; Kaliski, Daniel; Hands, Tristan; Scheffler, Konrad. Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest and for which only a smaller dataset is available. Thus the alignment-specific model is obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models.
- On the validity of evolutionary models with site-specific parameters (PLoS, 2014-04-10) Scheffler, Konrad; Murrell, Ben; Pond, Sergei L. Kosakovsky. Evolutionary models that make use of site-specific parameters have recently been criticized on the grounds that parameter estimates obtained under such models can be unreliable and lack theoretical guarantees of convergence. We present a simulation study providing empirical evidence that a simple version of the models in question does exhibit sensible convergence behavior and that additional taxa, despite not being independent of each other, lead to improved parameter estimates. Although it would be desirable to have theoretical guarantees of this, we argue that such guarantees would not be sufficient to justify the use of these models in practice. Instead, we emphasize the importance of taking the variance of parameter estimates into account rather than blindly trusting point estimates – this is standardly done by using the models to construct statistical hypothesis tests, which are then validated empirically via simulation studies.
- Social and genetic networks of HIV-1 transmission in New York City (PLoS, 2017-01-09) Wertheim, Joel O.; Kosakovsky Pond, Sergei L.; Forgione, Lisa A.; Mehta, Sanjay R.; Murrell, Ben; Shah, Sharmila; Smith, Davey M.; Scheffler, Konrad; Torian, Lucia V. Background: Sexually transmitted infections spread across contact networks. Partner elicitation and notification are commonly used public health tools to identify, notify, and offer testing to persons linked in these contact networks. For HIV-1, a rapidly evolving pathogen with low per-contact transmission rates, viral genetic sequences are an additional source of data that can be used to infer or refine transmission networks. Methods and Findings: The New York City Department of Health and Mental Hygiene interviews individuals newly diagnosed with HIV and elicits names of sexual and injection drug-using partners. By law, the Department of Health also receives HIV sequences when these individuals enter healthcare and their physicians order resistance testing. Our study used both HIV sequence and partner naming data from 1342 HIV-infected persons in New York City between 2006 and 2012 to infer and compare sexual/drug-use named partner and genetic transmission networks. Using these networks, we determined a range of genetic distance thresholds suitable for identifying potential transmission partners. In 48% of cases, named partners were infected with genetically closely related viruses, compatible with, but not necessarily representing or implying, direct transmission. Partner pairs linked through the genetic similarity of their HIV sequences were also linked by naming in 53% of cases. Persons who reported high-risk heterosexual contact were more likely to name at least one partner with a genetically similar virus than those reporting their risk as injection drug use or men who have sex with men. Conclusions: We analyzed an unprecedentedly large and detailed partner tracing and HIV sequence dataset and determined an empirically justified range of genetic distance thresholds for identifying potential transmission partners. We conclude that genetic linkage provides more reliable evidence for identifying potential transmission partners than partner naming, highlighting the importance and complementarity of both epidemiological and molecular genetic surveillance for characterizing regional HIV-1 epidemics.