Browsing by Author "Weighill, Thomas"
- Bifibrational duality in non-abelian algebra and the theory of databases (Stellenbosch : Stellenbosch University, 2014-12) Weighill, Thomas; Janelidze, Zurab; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences. ENGLISH ABSTRACT: In this thesis we develop a self-dual categorical approach to some topics in non-abelian algebra, which is based on replacing the framework of a category with that of a category equipped with a functor to it. We also make some first steps towards a possible link between this theory and the theory of databases in computer science. Both of these theories are based around the study of Grothendieck bifibrations and their generalisations. The main results in this thesis concern correspondences between certain structures on a category which are relevant to the study of categories of non-abelian group-like structures, and functors over that category. An investigation of these correspondences leads to a system of dual axioms on a functor, which can be considered as a solution to the proposal of Mac Lane in his 1950 paper "Duality for Groups" that a self-dual setting for formulating and proving results for groups be found. The part of the thesis concerned with the theory of databases is based on a recent approach by Johnson and Rosebrugh to views of databases and the view update problem.
- Detecting individual sites subject to episodic diversifying selection (Public Library of Science, 2012-07-02) Murrell, Ben; Wertheim, Joel O.; Moola, Sasha; Weighill, Thomas; Scheffler, Konrad; Pond, Sergei L. Kosakovsky. The imprint of natural selection on protein-coding genes is often difficult to identify because selection is frequently transient or episodic, i.e. it affects only a subset of lineages. Existing computational techniques, which are designed to identify sites subject to pervasive selection, may fail to recognize sites where selection is episodic: a large proportion of positively selected sites. We present a mixed effects model of evolution (MEME) that is capable of identifying instances of both episodic and pervasive positive selection at the level of an individual site. Using empirical and simulated data, we demonstrate the superior performance of MEME over older models under a broad range of scenarios. We find that episodic selection is widespread and conclude that the number of sites experiencing positive selection may have been vastly underestimated.
- Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution (PLOS, 2011-12-22) Murrell, Ben; Weighill, Thomas; Buys, Jan; Ketteringham, Robert; Moola, Sasha; Benade, Gerdus; du Buisson, Lise; Kaliski, Daniel; Hands, Tristan; Scheffler, Konrad. Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest and for which only a smaller dataset is available. Thus the alignment-specific model is obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models.
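
The "weighted sum of basis matrices" step in the last abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say how the weights are fitted, so the sketch assumes a least-squares fit using the standard Lee-Seung multiplicative update with the basis matrices held fixed. The function names `fit_weights` and `combine`, and the toy 2x2 bases, are hypothetical.

```python
def fit_weights(bases, target, iters=500, eps=1e-12):
    """Fit non-negative weights w so that sum_k w[k] * bases[k] approximates target.

    bases  : list of flattened non-negative matrices (equal-length lists of floats)
    target : flattened non-negative matrix of the same length
    Assumption: least-squares objective, Lee-Seung multiplicative update,
    bases held fixed (only the weights are learned).
    """
    k = len(bases)
    w = [1.0] * k  # positive start; the update keeps weights non-negative
    # Precompute B^T v and B^T B, which do not change between iterations.
    btv = [sum(b_i * t_i for b_i, t_i in zip(b, target)) for b in bases]
    btb = [[sum(x * y for x, y in zip(bi, bj)) for bj in bases] for bi in bases]
    for _ in range(iters):
        denom = [sum(btb[i][j] * w[j] for j in range(k)) for i in range(k)]
        w = [w[i] * btv[i] / (denom[i] + eps) for i in range(k)]
    return w


def combine(bases, w):
    """Return the weighted sum of the flattened basis matrices."""
    return [sum(wk * b[i] for wk, b in zip(w, bases)) for i in range(len(bases[0]))]


# Toy example: two 2x2 bases (flattened row by row) and a target built from them.
bases = [[1.0, 0.0, 0.0, 1.0],
         [0.0, 1.0, 1.0, 0.0]]
target = [2.0, 3.0, 3.0, 2.0]
weights = fit_weights(bases, target)
model = combine(bases, weights)
print(weights, model)
```

In the method described in the abstract, the basis matrices are themselves learned by NNMF from a large collection of alignments. Here they are supplied by hand, so the sketch covers only the second, alignment-specific step.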