Doctoral Degrees (Epidemiology and Biostatistics)
Recent Submissions
- Optimisation and benchmarking of analytical approaches to estimation of population level HIV incidence from survey data (Stellenbosch : Stellenbosch University, 2022-04) Mhlanga, Laurette; Welte, Alex; Grebe, Eduard; Stellenbosch University. Faculty of Medicine and Health Sciences. Dept. of Global Health. Epidemiology and Biostatistics. ENGLISH SUMMARY: Disease prevalence (the proportion of a population with a condition of interest) is conceptually and procedurally much more straightforward to estimate than disease incidence (the rate of occurrence of new cases - for example, infections). For long-lasting conditions, incidence is fundamentally more difficult to estimate than prevalence, but also more interesting, as it sheds light on current epidemiological trends such as the emerging burden on health systems and the impact of recent policy interventions. Progress towards reducing reliance on questionable assumptions in the analysis of large population based surveys (for the estimation of HIV incidence) has been slow. The work of Kassanjee et al. and the work of Mahiane et al., in particular, provide rigorous ways of estimating incidence by using 1) markers of ‘recent infection’, 2) the ‘gradient’ of prevalence, and 3) ‘excess mortality’ associated with HIV infection, without the need for simplifying assumptions to the effect that any particular parameters are constant over ranges of time and/or age. To date, the use of these methods has largely ignored 1) the rich details of the age and time structure of survey data, and 2) the opportunities for combining the two methods. The primary objective of this work was to find stable approaches to applying the Mahiane and Kassanjee methods to large age/time structured population survey data sets which include HIV status, and optionally, ‘recent infection’ status.
In order to evaluate proposed methods, a sophisticated simulation platform was created to simulate HIV epidemics and generate survey data sets that are structured like real population survey data, with the underlying incidence, prevalence, and mortality explicitly known. The first non-trivial step in the analysis of survey data amounts essentially to performing a smoothing procedure from which the (age/time specific) prevalence of HIV infection, the prevalence of ‘recent infection’, and the gradient of prevalence of infection can be inferred without recourse to ‘epidemiological’ assumptions. The second step involves the correct accounting for uncertainty in a context-specific weighted mean of the Mahiane and Kassanjee estimators. These two steps are approached incrementally, as there are numerous details which have not previously been systematically elucidated. The investigation culminates in a proposed generic ‘one size fits most’ algorithm based on: 1) fitting survey data to generalised linear models defined by simple link functions and high order polynomials in age and time; 2) the use of a ‘moving window’ rule for data inclusion into a separate analysis for each age/time point for which incidence is to be estimated; 3) a ‘variance optimal’ weighting scheme for the combination of the Mahiane and Kassanjee estimators (when both are applicable); 4) flexible use of a delta method expansion or bootstrapping to estimate confidence intervals and p values. We find it is relatively easy to obtain estimates with practically negligible bias, but sample-size/sampling-density requirements are always considerable. We also make numerous observations on survey design and the inherent challenges faced by all attempts to estimate HIV incidence using surveys of reasonable size.
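The ‘variance optimal’ weighting mentioned in the summary above can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes two unbiased estimates of the same incidence whose variances and covariance are already known, and it uses the textbook minimum-variance linear combination. The function name `combine_estimates` and the numbers in the usage lines are hypothetical.

```python
def combine_estimates(est_a, var_a, est_b, var_b, cov_ab=0.0):
    """Minimum-variance weighted mean of two unbiased estimates.

    Assumption (not from the thesis): the variances and the covariance
    of the two estimates are known. Returns the combined estimate, its
    variance, and the weight given to the first estimate.
    """
    # Variance of the difference between the two estimates.
    denom = var_a + var_b - 2.0 * cov_ab
    if denom <= 0:
        raise ValueError("variance of the difference must be positive")
    # Weight on estimate A that minimises the variance of the combination.
    w = (var_b - cov_ab) / denom
    combined = w * est_a + (1.0 - w) * est_b
    combined_var = (
        w ** 2 * var_a
        + (1.0 - w) ** 2 * var_b
        + 2.0 * w * (1.0 - w) * cov_ab
    )
    return combined, combined_var, w


# Hypothetical inputs: two incidence estimates (per person-year).
estimate, variance, weight_a = combine_estimates(0.010, 4e-6, 0.014, 1e-6)
```

With zero covariance this reduces to inverse-variance weighting, so the more precise estimate dominates; with a large covariance the weight can fall outside the interval from 0 to 1.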
- The impact of missing data on estimating HIV/AIDS prevalence and incidence in demographic sentinel survey studies (Stellenbosch : Stellenbosch University, 2022-04) Mosha, Neema Ramadhani; Machekano, Rhoderick; Young, Taryn; Todd, Jim; Stellenbosch University. Faculty of Medicine and Health Sciences. Dept. of Global Health. Epidemiology and Biostatistics. ENGLISH SUMMARY: Background: Missing data is a challenge in most research, especially with observational population data such as demographic surveys. These studies often account for survey designs and clustering when estimating disease prevalence or incidence, but do not account for missing data. In other cases, they do not explicitly state how missing data were dealt with during analysis, or they handle them inappropriately in practice. There are many challenges in conceptualising the pattern of missingness, the mechanism by which it occurs, and the complexity of the methods for handling missing data. Ignoring the missingness of survey data can cause biased estimates and invalid conclusions. The primary aim of this PhD was to evaluate the impact of missing data on estimating HIV/AIDS prevalence in demographic sentinel surveillance studies. Methods: A systematic review of HIV studies was conducted to identify and describe the methods used to analyse studies with missing data. A series of simulation studies was conducted to explore the precision and efficiency of the prevalence estimates using complete case analysis (CCA), multiple imputation (MI), inverse probability weighting (IPW) and the double robust estimator (DR), when data are missing at random (MAR) in survey studies.
Descriptive statistics and a complete case analysis were used to determine the incidence and population prevalence estimates, ignoring the missingness, in four different survey rounds of the Magu Health Demographic Sentinel Surveillance (HDSS). The surveys were conducted between 2006 and 2016; they included adults aged 15 years and above, and about 50% of the population was tested for HIV in each survey. This was followed by data exploration assessing the occurrence of missingness and the association between missingness and other study characteristics. Finally, the statistical methods used in the simulation study were applied to re-estimate prevalence from the survey data, taking the missingness into account. Results: The systematic review found 24 eligible articles from population, demographic and cross-sectional surveys that acknowledged the presence of missing data. In these studies, complete case analysis was the standard method of choice (100%), followed by multiple imputation (46%) and Heckman’s selection models (38%). A simulation study generated a hypothetical HIV survey with 32 different scenarios exploring data in which the outcome is 20% and 55% missing. This simulation showed that when data are MAR, complete case analysis produces biased and inefficient estimates. Results showed that the three methods (MI, IPW and DR) were valid and efficient if the missingness or imputation models are correctly specified, but if either of the MI or IPW models is mis-specified, then the DR estimator can still be valid. Regarding the performance of the methods, provided that correct models are used, MI is less biased even when 55% of the data are missing. However, with 55% missingness all estimators are less reliable. In the complete case analysis, the overall population prevalence estimates for HIV decreased from 7.2% in 2006 to 6.6% in 2016. Cox models were used to determine HIV incidence rates and for risk factor analysis by sex.
The incidence rate was 5.5 per 1000 person-years in women compared to 4.6 per 1000 person-years in men. Residence, marital status, mobility, and having two or more partners were associated with increased incidence of HIV in bivariate analysis. The missingness of HIV status was as high as 60.3% (in the 2016 survey), and in all surveys (Sero 5 to 8) it was associated with age, sex, residence, and marital status. Further analysis using MI, IPW and DR, assuming the outcome was MAR, showed that the overall HIV prevalence was not significantly different from the complete case analysis in all four of the surveys. However, there were significant differences in the HIV estimates when stratified by the covariates. In terms of confidence interval width, multiple imputation outperformed IPW and DR by producing narrower intervals. Conclusion: Overall, this dissertation showed that despite the availability of methods to adjust for missing data, many surveys still ignore the missingness. Reporting among articles that adjusted for missingness was below guideline standards. Understanding the mechanism of missingness enhances the proper application of advanced methods to account for the missingness. With data missing at random, IPW, MI, and DR can account for the missingness and produce unbiased and efficient estimates in HIV survey studies. Also, more simplified information and awareness are still needed to allow researchers to make informed choices, specifically on which method to apply and in which situation it works best, for the estimates to be more reliable and representative.
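As a rough illustration of how inverse probability weighting differs from complete case analysis, the sketch below reweights tested individuals by the inverse of the testing proportion in their stratum. It is not the thesis's implementation: it assumes missingness depends only on a single observed stratum variable (a simple MAR setting), and the function name `ipw_prevalence` and the toy records are hypothetical.

```python
from collections import defaultdict


def ipw_prevalence(records):
    """Inverse-probability-weighted prevalence estimate.

    records: list of (stratum, tested, hiv_positive) tuples; the
    hiv_positive value is ignored when tested is False.
    Assumption: within a stratum, being tested is unrelated to HIV
    status. Strata with no tested individuals contribute nothing.
    """
    n_total = defaultdict(int)
    n_tested = defaultdict(int)
    for stratum, tested, _ in records:
        n_total[stratum] += 1
        if tested:
            n_tested[stratum] += 1

    weighted_pos = 0.0
    weighted_n = 0.0
    for stratum, tested, positive in records:
        if not tested:
            continue
        # Weight = 1 / (estimated probability of being tested in the stratum).
        weight = n_total[stratum] / n_tested[stratum]
        weighted_n += weight
        weighted_pos += weight * (1.0 if positive else 0.0)
    return weighted_pos / weighted_n


# Hypothetical toy data: everyone tested in one stratum, half in the other.
toy = (
    [("young", True, True)] + [("young", True, False)] * 3
    + [("old", True, True), ("old", True, False)]
    + [("old", False, None)] * 2
)
estimate = ipw_prevalence(toy)
```

In this toy data the complete case estimate is 2/6, while the weighted estimate is 3/8, because the under-tested stratum has the higher observed prevalence and is weighted up.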
- Model-based inference on the impact of early access to antiretroviral therapy to all on HIV incidence among adolescent girls and young women in Eswatini (Stellenbosch : Stellenbosch University, 2021-04) Chibawara, Trust; Nyasulu, Peter; Kajungu, Dan; Stellenbosch University. Faculty of Medicine and Health Sciences. Dept. of Global Health. Epidemiology and Biostatistics. ENGLISH SUMMARY: Introduction: The introduction of antiretroviral drugs has enabled people living with HIV (PLHIV) to have a much better prognosis. As such, the use of antiretroviral drugs has resulted in the decline of global HIV incidence over the last decade. Whilst this achievement is important, the role of the widespread use of antiretroviral drugs on the HIV epidemic among adolescent girls and young women is still unknown. This study aimed to evaluate the impact of Early Access for all HIV-positive Adults to Antiretroviral (EAAA) on HIV incidence among adolescent girls and young women in Eswatini. Methods: To accomplish our research objectives, this research provided elaborate mathematical concepts that are multidisciplinary in nature and included evidence based systematic review, statistics, data science and public health approaches. Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines were used for the systematic review. Using Simpact, an individual-based, event-driven, stochastic simulation tool, a specially designed HIV transmission model was adopted to simulate the heterosexual transmission of HIV in Eswatini. The application of a simplified model calibration approach combined clinical, biological, and behavioural indicators from the Eswatini implementation study called “Maximizing Antiretroviral Treatment for Better Health and Zero New HIV Infection: Early Access to Antiretroviral Treatment for All (MaxART EAAA)” and Eswatini demographic summary statistics to infer the impact of EAAA on HIV incidence in adolescent girls and young women.
Results: The results of the systematic review showed that globally, there was no published or unpublished research found on the impact of the use of ART by HIV positive adults on HIV incidence in adolescent girls and young women. On the other hand, our model, which aimed to evaluate the impact of EAAA on older men aged 18 years and above in Eswatini, showed a 45% (95% confidence interval (CI): 37-55) reduction in HIV incidence among adolescent girls and young women aged 15-24 years, as opposed to a CD4 cell count threshold for ART eligibility (standard of care). Furthermore, simulated data showed that early access to ART has a similar impact, a 47% (95% CI: 33-59) reduction in HIV incidence, among adolescent boys and young men of the same age group. Conclusion: This study has demonstrated the impact of EAAA as a strategy to reduce new HIV infections among adolescent girls and young women aged 15-24 years in the Eswatini population. These findings reinforce the need to adopt provisions for early initiation of ART treatment among HIV infected adults as a catalyst to minimize transmission of HIV to the adolescent population. Data from this study also highlight the need for other countries in the region that face similar challenges of high HIV prevalence to adopt EAAA, as it has been shown to be an effective approach to reduce HIV/AIDS incidence in the population. While these benefits are laudable, we do recognize that HIV/AIDS treatment on its own is not sufficient; therefore, behavioural changes that guard against age-disparate relationships should be reinforced.
- Assessment of point-of-care testing for prediction of aromatase inhibitor-associated side effects in obese postmenopausal breast cancer patients screened for cardiovascular risk factors (Stellenbosch : Stellenbosch University, 2021-12) Milambo, Jean Paul Muambangu; Akudugu, John M.; Nyasulu, Peter S.; Stellenbosch University. Faculty of Medicine and Health Sciences. Dept. of Global Health. Epidemiology and Biostatistics. ENGLISH SUMMARY: Background: Aromatase inhibitors (AIs) constitute a standard of care for post- and premenopausal patients with estrogen receptor-positive breast cancer (BC). Obesity and mediators of inflammation have been identified as the most important risk and predictive factors in postmenopausal breast cancer survivors (BCS) using AIs. However, data on the feasibility of point-of-care (POC) genotyping using high sensitivity C-reactive protein (hs-CRP) and body mass index (BMI) as predictors of drug toxicity among postmenopausal BCS in African clinical settings are lacking. Aim: The study was conducted to assess the impact of AIs on hs-CRP and BMI, which are used at POC for prediction of therapy-associated side effects among obese postmenopausal breast cancer patients in Africa. Methods: One hundred and twenty-six female BC patients with cancer stages ranging from 0-III were recruited at Tygerberg Hospital (TBH) in the Western Cape Province of South Africa, between August 2014 and February 2017, for the study. A quasi-experimental study was conducted. Patients were initially subjected to AIs and subsequently followed up at months 4, 12, and 24. Baseline clinical and biomedical assessments were conducted at commencement of the study to predict hs-CRP and BMI at months 12 and 24, using a multiple imputation model. A random effects model was used to monitor the changes over time. Statistical analyses were performed using SPSS 18.0 software (SPSS Inc., Chicago, IL, USA) and STATA version 16.
Analyses were two-tailed and a p-value < 0.05 was considered statistically significant. Results: The mean age of the participants was 61 years (SD = 7.11 years; 95% CI: 60-62 years). Linear regression revealed that hs-CRP was associated with waist circumference (OR: 7.5; p=0.0116; 95% CI: 1.45 to 39.61) and BMI (OR: 2.15; p=0.034; 95% CI: 1.02 to 4.56). Waist circumference was associated with hypertension (OR: 3.83; p=0.003; 95% CI: 1.56 to 9.39), and chemotherapy was associated with waist circumference (p=0.016; 95% CI: 0.11 to 0.79). hs-CRP levels were significantly correlated with BMI and total body fat (TBF) among postmenopausal patients using aromatase inhibitors. Random linear effects modelling revealed a stronger statistical association between BMI and homocysteine (p=0.021; 95% CI: 0.0083 to 0.1029). Weight and TBF were strongly associated after 24 months of follow-up. In addition, hs-CRP was associated with BMI (p=0.0001) and other inflammatory markers such as calcium (p=0.021; 95% CI: 0.0083 to 0.1029), phosphate (p=0.039; 95% CI: 0.0083 to 0.1029), and ferritin (p=0.002; 95% CI: 0.0199 to 0.084). Multiple imputation modelling indicated that there were statistically significant variations in TBF, weight, homocysteine, ferritin, and calcium between baseline and after 24 months of follow-up. Mathematical modelling: comparison of genotyping using HyBeacon® probe technology with Sanger sequencing yielded a sensitivity of 99% (95% CI: 94.55 to 99.97%), specificity of 89.44% (95% CI: 87.25 to 91.38%), PPV of 51% (95% CI: 43.77 to 58.26%), and NPV of 99.88% (95% CI: 99.31 to 100.00%). Based on the mathematical model, the assumptions revealed that the incremental cost-effectiveness ratio (ICER) was R7 044.55. Conclusion: This study revealed that hs-CRP and BMI are predictors of CVD-related adverse events in obese postmenopausal patients. Calcium, phosphate, homocysteine, and ferritin should also be incorporated in POCT.
There were statistically significant variations in TBF, weight, hs-CRP, BMI, homocysteine, ferritin, and calcium between baseline and after 24 months of follow-up. HyBeacon® probe technology at POC for AI-associated adverse events may be cost-effective in Africa as an adjunct to standard practice. The appropriate pathways for implementation of POC testing in postmenopausal breast cancer survivors need further investigation in different clinical settings with real data for external validation.
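The four diagnostic accuracy measures reported in the summary above (sensitivity, specificity, PPV, NPV) all derive from a 2x2 table of index test results against a reference standard. The sketch below shows the standard definitions only; the counts are hypothetical and are not the thesis's data, and the function name `diagnostic_accuracy` is an assumption for illustration.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table.

    tp/fn: reference-positive cases the index test called positive/negative.
    fp/tn: reference-negative cases the index test called positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives among reference positives
        "specificity": tn / (tn + fp),  # true negatives among reference negatives
        "ppv": tp / (tp + fp),          # reference positives among test positives
        "npv": tn / (tn + fn),          # reference negatives among test negatives
    }


# Hypothetical counts, for illustration only.
measures = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=870)
```

Because PPV depends on how common the condition is in the tested sample, a test can combine high sensitivity and specificity with a modest PPV when reference-positive cases are rare.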
- Combining sexual behavioural survey data, phylodynamics and agent-based models towards a unified framework for HIV prevention research (Stellenbosch : Stellenbosch University, 2021-12) Niyukuri, David; Nyasulu, Peter Suwirakwenda; Delva, Wim; Stellenbosch University. Faculty of Medicine and Health Sciences. Dept. of Global Health. Epidemiology and Biostatistics. ENGLISH SUMMARY: Background: Sub-Saharan African countries carry a disproportionate burden of Human Immunodeficiency Virus (HIV) infection. Thus, beyond the estimation tools which are used to produce HIV epidemic estimates, there is a need for simulation tools to understand the structure and dynamics of sexual networks and the factors underlying HIV transmission. This can help to design and implement effective interventions. These simulation tools should be able to take advantage of existing multi-source data. Furthermore, with such multi-data generation tools, we can assess new methodologies and the accuracy of different inferences made from available real-world data. Methods: We developed a unified simulation framework which combines in one model world the simulation of a dynamic sexual network, HIV transmission, and between-host viral evolution for infected individuals. We used that simulation framework to run a benchmark study to infer age-mixing patterns in HIV transmission in different sequence missingness scenarios. We used transmission clusters from phylogenetic trees and computed proportions of pairings between men and women who were phylogenetically linked across different age groups. We assessed the usability of our simulation framework through a calibration study. We focused on fitting the simulation framework to summary features from multiple data sources to increase the accuracy of estimates.
The case study was the estimation of determinants of the HIV transmission network, namely age-mixing patterns in sexual partnerships, the distribution of onward transmission, and the temporal trend of HIV incidence. We also used polymerase and protease viral data simulated on the same transmission network with Simpact Cyan to check the phylogenetic results, mainly root-to-tip regression and transmission clusters. Results: The proof of concept of the appropriateness of the modelling framework was determined by the ability to capture HIV transmission dynamics and the temporal trends of branching times of a phylogenetic tree built from simulated viral sequence data. For age-mixing patterns in HIV transmission, the results of the simulation suggested that proportions of men/women linked to women/men across different age groups, together with the mean and standard deviation of the age difference, can unveil age-mixing patterns in HIV transmission networks. For the calibration study, the results showed that the relative errors between true benchmark values and post-calibration values of the determinants of the HIV transmission network were relatively close in the three calibration scenarios. In the post-calibration simulation, age-mixing patterns and the distribution of onward HIV transmission had relatively small error values, but the temporal trend of incidence by age-gender strata was poorly captured. The root-to-tip regression of phylogenetic trees from protease and polymerase data simulated on the same HIV transmission network showed that the dispersion of the genetic distance with branching and sampling times was explained at 95% and 49% for polymerase and protease data, respectively. For transmission clusters, we could still get at least 90% of individuals within the large clusters when using polymerase or protease viral sequence data. This showed that even with short sequences we could still get useful epidemiological data.
Conclusion: The unified framework could be used as a data generation method for benchmark studies. This is so despite the simplistic assumption for HIV viral evolutionary dynamics through consideration of host evolution only. These methods could also help to investigate the effect of the dynamic sexual network on HIV transmission and to estimate age-related individual-level features affecting HIV transmission dynamics. Furthermore, this simulation framework could i) contribute to the advancement of phylogenetic-based inference methodology; and ii) advance epidemiological methods focusing on combining epidemiological data, sexual behaviour data, viral phylodynamics, and agent-based simulation models.