Browsing by Author "Henn, Brenna M."
- Determining ancestry proportions in complex admixture scenarios in South Africa using a novel proxy ancestry selection method (PLoS, 2013-09) Chimusa, Emile R.; Daya, Michelle; Möller, Marlo; Ramesar, Raj; Henn, Brenna M.; Van Helden, Paul D.; Mulder, Nicola J.; Hoal, Eileen G. Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry and the imputation of missing genotypes in admixed populations are crucial in both understanding variation in disease and identifying novel disease loci. These inferences make use of reference populations, and accuracy depends on the choice of ancestral populations. Using an insufficient or inaccurate ancestral panel can result in erroneously inferred ancestry and affect the detection power of GWAS and meta-analysis when using imputation. Current algorithms are inadequate for multi-way admixed populations. To address these challenges we developed PROXYANC, an approach to select the best proxy ancestral populations. From the simulation of a multi-way admixed population we demonstrate the capability and accuracy of PROXYANC and illustrate the importance of the choice of ancestry in both estimating admixture proportions and imputing missing genotypes. We applied this approach to a complex, uniquely admixed South African population. Using genome-wide SNP data from over 764 individuals, we accurately estimate the genetic contributions from the best ancestral populations: isiXhosa (33% ± 0.226), ǂKhomani SAN (31% ± 0.195), European (16% ± 0.118), Indian (13% ± 0.094), and Chinese (7% ± 0.0488).
We also demonstrate that the ancestral allele frequency differences correlate with increased linkage disequilibrium in the South African population, which originates from admixture events rather than population bottlenecks.
- Exome capture from saliva produces high quality genomic and metagenomic data (BioMed Central, 2014-04) Kidd, Jeffrey M.; Sharpton, Thomas J.; Bobo, Dean; Norman, Paul J.; Martin, Alicia R.; Carpenter, Meredith L.; Sikora, Martin; Gignoux, Christopher R.; Nemat-Gorgani, Neda; Adams, Alexandra; Guadalupe, Moraima; Guo, Xiaosen; Feng, Qiang; Li, Yingrui; Liu, Xiao; Parham, Peter; Hoal, Eileen G.; Feldman, Marcus W.; Pollard, Katherine S.; Wall, Jeffrey D.; Bustamante, Carlos D.; Henn, Brenna M. Background: Targeted capture of genomic regions reduces sequencing cost while generating higher coverage by allowing biomedical researchers to focus on specific loci of interest, such as exons. Targeted capture also has the potential to facilitate the generation of genomic data from DNA collected via saliva or buccal cells. DNA samples derived from these cell types tend to have a lower human DNA yield, may be degraded from age and/or have contamination from bacteria or other ambient oral microbiota. However, thousands of samples have been previously collected from these cell types, and saliva collection has the advantage that it is non-invasive and appropriate for a wide variety of research. Results: We demonstrate successful enrichment and sequencing of 15 South African KhoeSan exomes and 2 full genomes with samples initially derived from saliva. The expanded exome dataset enables us to characterize genetic diversity free from ascertainment bias for multiple KhoeSan populations, including new exome data from six HGDP Namibian San, revealing substantial population structure across the Kalahari Desert region. Additionally, we discover and independently verify thirty-one previously unknown KIR alleles using methods we developed to accurately map and call the highly polymorphic HLA and KIR loci from exome capture data.
Finally, we show that exome capture of saliva-derived DNA yields sufficient non-human sequences to characterize oral microbial communities, including detection of bacteria linked to oral disease (e.g. Prevotella melaninogenica). For comparison, two samples were sequenced using standard full genome library preparation without exome capture and we found no systematic bias of metagenomic information between exome-captured and non-captured data. Conclusions: DNA from human saliva samples, collected and extracted using standard procedures, can be used to successfully sequence high quality human exomes, and metagenomic data can be derived from non-human reads. We find that individuals from the Kalahari carry a higher oral pathogenic microbial load than samples surveyed in the Human Microbiome Project. Additionally, rare variants present in the exomes suggest strong population structure across different KhoeSan populations.
- IMPUTOR: phylogenetically aware software for imputation of errors in next-generation sequencing (Oxford University Press, 2018) Jobin, Matthew; Schurz, Haiko; Henn, Brenna M. We introduce IMPUTOR, software for phylogenetically aware imputation of missing haploid nonrecombining genomic data. Targeted for next-generation sequencing data, IMPUTOR uses the principle of parsimony to impute data marked as missing due to low coverage. Along with efficiently imputing missing variant genotypes, IMPUTOR is capable of reliably and accurately correcting many nonmissing sites that represent spurious sequencing errors. Tests on simulated data show that IMPUTOR is capable of detecting many induced mutations without making erroneous imputations/corrections, with as many as 95% of missing sites imputed and 81% of errors corrected under optimal conditions. We tested IMPUTOR with human Y-chromosomes from pairs of close relatives and demonstrate IMPUTOR’s efficacy in imputing missing and correcting erroneous calls.
- A panel of ancestry informative markers for the complex five-way admixed South African Coloured population (PLoS, 2013-12) Daya, Michelle; Van der Merwe, Lize; Galal, Ushma; Möller, Marlo; Salie, Muneeb; Chimusa, Emile R.; Galanter, Joshua M.; Van Helden, Paul D.; Henn, Brenna M.; Gignoux, Chris R.; Hoal, Eileen. Admixture is a well-known confounder in genetic association studies. If genome-wide data is not available, as would be the case for candidate gene studies, ancestry informative markers (AIMs) are required in order to adjust for admixture. The predominant population group in the Western Cape, South Africa, is the admixed group known as the South African Coloured (SAC). A small set of AIMs that is optimized to distinguish between the five source populations of this population (African San, African non-San, European, South Asian, and East Asian) will enable researchers to cost-effectively reduce false-positive findings resulting from ignoring admixture in genetic association studies of the population. Using genome-wide data to find SNPs with large allele frequency differences between the source populations of the SAC, as quantified by Rosenberg et al.'s I_n statistic, we developed a panel of AIMs by experimenting with various selection strategies. Subsets of different sizes were evaluated by measuring the correlation between ancestry proportions estimated by each AIM subset and ancestry proportions estimated using genome-wide data. We show that a panel of 96 AIMs can be used to assess ancestry proportions and to adjust for the confounding effect of the complex five-way admixture that occurred in the South African Coloured population.
- A post-GWAS analysis of predicted regulatory variants and tuberculosis susceptibility (Public Library of Science, 2017) Uren, Caitlin; Henn, Brenna M.; Franke, Andre; Wittig, Michael; Van Helden, Paul D.; Hoal, Eileen G.; Moller, Marlo. Utilizing data from published tuberculosis (TB) genome-wide association studies (GWAS), we use a bioinformatics pipeline to detect all polymorphisms in linkage disequilibrium (LD) with variants previously implicated in TB disease susceptibility. The probability that these variants had a predicted regulatory function was estimated using RegulomeDB and Ensembl's Variant Effect Predictor. Subsequent genotyping of these 133 predicted regulatory polymorphisms was performed in 400 admixed South African TB cases and 366 healthy controls in a population-based case-control association study to fine-map the causal variant. We detected associations between tuberculosis susceptibility and six intronic polymorphisms located in MARCO, IFNGR2, ASHAS2, ACACA, NISCH and TLR10. Our post-GWAS approach demonstrates the feasibility of combining multiple TB GWAS datasets with linkage information to identify regulatory variants associated with this infectious disease.