Methods for analysis of deep sequencing data from mixtures of Plasmodium falciparum clones or stage-specific transcriptomes

Lerch, Anita. Methods for analysis of deep sequencing data from mixtures of Plasmodium falciparum clones or stage-specific transcriptomes. 2018, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: http://edoc.unibas.ch/diss/DissB_12703



Malaria is a life-threatening infectious disease caused by Plasmodium parasites, which are transmitted to humans through the bites of infected Anopheles mosquitoes. An estimated 445,000 people die every year from infection with Plasmodium parasites, most of them children living in sub-Saharan Africa. As a result of increased malaria control, mortality has been greatly reduced in recent decades. To develop new tools for elimination and to evaluate the impact of control, a good understanding of the epidemiology and biology of malaria parasites is required.
Studies of infection and transmission dynamics of Plasmodium parasites have been greatly improved by distinguishing individual parasite clones and monitoring their infection dynamics over time. In regions with high transmission of Plasmodium parasites, individuals are often infected with several clones concurrently. Individual parasite clones can be identified by genotyping. The current standard genotyping method is amplification of the highly length-polymorphic merozoite surface protein 2 (msp2) gene or other antigen genes, followed by sizing of the amplicon by capillary electrophoresis (CE). However, msp2-CE genotyping has limited sensitivity for detecting low-abundance clones (minority clones), which results in an underestimation of the multiplicity of infection (MOI). A further shortfall of this method is that the frequency of individual clones within a sample cannot be determined. These limitations motivate the search for new genotyping methods that rely on sequencing of genomic fragments with extensive single nucleotide polymorphism (SNP).
Improvements in next generation sequencing (NGS) technologies have permitted the use of amplicon sequencing (Amp-Seq) in epidemiological studies. Genotyping by amplicon sequencing has a higher sensitivity for detecting minority clones, can quantify the frequency of each clone within a sample, and allows the use of SNP-polymorphic markers. In the frame of this thesis, a new Amp-Seq genotyping assay was developed, comprising known SNP-polymorphic markers and the novel marker ‘cpmp’, together with a bioinformatic analysis workflow. This genotyping assay was applied to field samples from a longitudinal study conducted in Papua New Guinea. A comparison with msp2-CE genotyping confirmed the higher sensitivity of Amp-Seq genotyping for detecting minority clones and showed a significant underestimation of MOI by the classical size-polymorphic marker. However, no significant increase in the molecular force of infection (molFOI), i.e. the number of new infections per individual per year, was observed.
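As an illustration of how within-sample clone frequencies and MOI could be derived from Amp-Seq read counts, the following is a minimal sketch and not the workflow developed in the thesis. The function names, the haplotype labels, the second marker name `marker_b`, and the cut-offs `min_reads` and `min_freq` are all hypothetical. Taking MOI as the largest number of haplotypes seen at any marker is likewise an assumption made for this example.

```python
def clone_frequencies(haplotype_reads, min_reads=10, min_freq=0.01):
    """Within-sample frequency of each haplotype that passes the cut-offs.

    haplotype_reads maps a haplotype label to its read count for one marker.
    min_reads and min_freq are illustrative thresholds, not values from the thesis.
    """
    total = sum(haplotype_reads.values())
    if total == 0:
        return {}
    # Drop haplotypes supported by too few reads (likely sequencing noise).
    kept = {h: n for h, n in haplotype_reads.items()
            if n >= min_reads and n / total >= min_freq}
    kept_total = sum(kept.values())
    # Renormalise so the retained haplotypes sum to 1.
    return {h: n / kept_total for h, n in kept.items()}


def multiplicity_of_infection(freqs_by_marker):
    """MOI taken as the largest haplotype count over all markers (assumption)."""
    return max((len(f) for f in freqs_by_marker.values()), default=0)


# Hypothetical read counts for one sample at two markers.
reads = {
    "cpmp": {"H1": 900, "H2": 95, "H3": 5},
    "marker_b": {"A1": 1000},
}
freqs = {marker: clone_frequencies(r) for marker, r in reads.items()}
moi = multiplicity_of_infection(freqs)
```

In this example the haplotype `H3` falls below the read cut-off and is discarded, so two clones remain at `cpmp` and the sample is assigned an MOI of 2.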
Quantification of the frequency of individual clones in longitudinal samples made it possible to infer multi-locus haplotypes. Multi-locus haplotypes increased the discriminatory power of genotyping and robustly distinguished new infections from those detected earlier in the same individual. To calculate the density of each clone in a multi-clone infection, the within-host clone frequency is multiplied by the parasitaemia of the infection as determined by quantitative PCR. The density of individual parasite clones in multi-clone infections over time is a new parameter for epidemiological studies. It will make it possible to study the dynamics, and thus the fitness, of parasite clones exposed to within-host competition or to naturally acquired immunity.
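The clone density calculation described above (within-host clone frequency multiplied by the qPCR-determined parasitaemia) can be sketched as follows. The function name, the haplotype labels, and the example values are hypothetical, and the parasitaemia unit is assumed to be parasites per microlitre purely for illustration.

```python
def clone_densities(clone_freqs, parasitaemia):
    """Density of each clone = within-host frequency x total parasitaemia.

    clone_freqs maps a clone (haplotype) label to its frequency in the sample;
    parasitaemia is the total parasite density from qPCR (unit assumed here).
    """
    return {clone: freq * parasitaemia for clone, freq in clone_freqs.items()}


# Hypothetical two-clone infection followed over two visits.
visits = [
    {"freqs": {"H1": 0.75, "H2": 0.25}, "parasitaemia": 2000.0},
    {"freqs": {"H1": 0.5, "H2": 0.5}, "parasitaemia": 400.0},
]
densities = [clone_densities(v["freqs"], v["parasitaemia"]) for v in visits]
```

Tracking these per-clone densities across visits gives the time series from which clone dynamics could be studied.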
NGS has also gained great importance in gene expression studies of Plasmodium parasites in patient samples. Transcriptome studies are complicated by the mixture of different developmental stages present concurrently in samples collected from patients. Even in in vitro cultured samples, after tight synchronisation or enrichment of a specific developmental stage, small fractions of other developmental stages are still found. This problem is of particular relevance for P. vivax, as the absence of continuous in vitro culture has so far hampered the study of isolated parasite stages. For example, the transcriptome of P. vivax gametocytes, one of the stages found in peripheral blood and infective to mosquitoes, has not yet been described.
A solution for resolving mixed transcriptomes may come from deconvolution methods, which infer either the stage proportions in samples or stage-specific transcriptome signatures. A large selection of deconvolution methods has been developed for the analysis of heterogeneous tissues, e.g. cancer tissues or hematopoietic cells, but these methods have rarely been applied to mixed stages of malaria parasites. The best-suited combination of normalisation and deconvolution methods for the analysis of RNA sequencing (RNA-Seq) data from mixed-stage samples of Plasmodium parasites was evaluated using experimentally mixed, highly synchronised blood stages. Normalisation by counts per million, and deconvolution with a negative binomial regression model followed by selection of genes with significant fold change, resulted in the best agreement with transcriptomes observed in single stages. This strategy can easily be transferred to Plasmodium field samples with known stage proportions. This analysis, performed in cultured parasites of defined mixed stages, served as a proof of concept and confirmed that identification of stage-specific genes is also feasible in field samples, notably in species that cannot be cultivated, such as P. vivax.
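The idea of normalising counts and then estimating stage-specific expression from samples with known stage proportions can be sketched as below. This is a simplified illustration under stated assumptions, not the method evaluated in the thesis: ordinary least squares is substituted for the negative binomial regression model, only two stages are handled, and the selection of genes by significant fold change is omitted. All function names and example numbers are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts of one sample so that they sum to one million."""
    total = sum(counts.values())
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def deconvolve_two_stages(proportions, expression):
    """Least-squares estimate of stage-specific expression for one gene.

    proportions: list of (p1, p2) known stage proportions, one pair per sample.
    expression:  list of normalised expression values, one per sample.
    Model: expression = p1 * x1 + p2 * x2, solved via the 2x2 normal equations.
    Note: this is plain least squares, swapped in for the negative binomial
    regression named in the text.
    """
    a = sum(p1 * p1 for p1, _ in proportions)
    b = sum(p1 * p2 for p1, p2 in proportions)
    c = sum(p2 * p2 for _, p2 in proportions)
    d = sum(p1 * y for (p1, _), y in zip(proportions, expression))
    e = sum(p2 * y for (_, p2), y in zip(proportions, expression))
    det = a * c - b * b
    if det == 0:
        raise ValueError("stage proportions do not allow a unique solution")
    x1 = (d * c - b * e) / det
    x2 = (a * e - b * d) / det
    return x1, x2


# Hypothetical example: three samples with known two-stage mixing ratios.
cpm = counts_per_million({"gene_a": 30, "gene_b": 70})
mix = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
observed = [100.0, 60.0, 20.0]  # normalised expression of one gene per sample
stage1, stage2 = deconvolve_two_stages(mix, observed)
```

Here the observed values were constructed from stage-specific levels of 100 and 20, which the least-squares fit recovers. A count-based model such as the negative binomial regression named in the text would be the more appropriate choice for real RNA-Seq data.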
NGS permits fundamentally new approaches to the study of Plasmodium parasites. This thesis presents a novel marker and a data analysis platform for highly sensitive P. falciparum genotyping. Furthermore, a best-practice workflow was identified to infer stage-specific gene expression from parasite infections consisting of mixed developmental stages. This provides a crucial tool for the analysis of gene expression data generated from Plasmodium field samples.
Advisors: Felger, Ingrid and Robinson, Mark D.
Faculties and Departments: 05 Faculty of Science
09 Associated Institutions > Swiss Tropical and Public Health Institute (Swiss TPH) > Former Units within Swiss TPH > Molecular Diagnostics (Felger)
UniBasel Contributors: Felger, Ingrid
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 12703
Thesis status: Complete
Number of Pages: 1 online resource (150 pages)
Last Modified: 17 Aug 2018 04:30
Deposited On: 16 Aug 2018 08:17
