Computational models to infer regulators of gene expression for high-throughput data

Katsantoni, Maria. Computational models to infer regulators of gene expression for high-throughput data. 2023, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: https://edoc.unibas.ch/96092/



Ever since the formulation of the central dogma of biology, attention has shifted to how the various steps leading up to protein formation are regulated. RNA-binding proteins (RBPs) have been shown to be instrumental in a vast number of post-transcriptional processes. Crosslinking immunoprecipitation (CLIP) is a mainstay among the experimental approaches used to detect the binding partners of RBPs. Many variations of these protocols have been developed, along with multiple software solutions for their analysis. However, most existing methods do not include efficient pre-processing, and their statistical models still do not efficiently remove all sources of noise. To address these issues, we developed RCRUNCH, an automated workflow that handles pre-processing, applies its own statistical approach to efficiently detect significant binding events, and reliably infers motifs. Additional features include the inclusion of multi-mappers, selective removal of ncRNAs, and a transcriptomic approach for considering binding events that span splice junctions. RCRUNCH performed reliably in comparison with the most widely used methods across a number of metrics. ENCODE eCLIP data were analysed with RCRUNCH, which reproduced many known interactions and motifs and also yielded some interesting new findings. This interest in high-throughput data analysis led to a more collaborative project, ZARP, a workflow for high-throughput analysis of RNA-seq data that was developed in the Zavolan group with the FAIR principles in mind and with a good roadmap on how to apply best practices in the specific context of a bioinformatics analysis. In this project, we focused more on the flexibility of the workflow and on its usability even with minimal bioinformatics expertise.
Advisors: Zavolan, Mihaela
Committee Members: van Nimwegen, Erik and König, Julian
Faculties and Departments: 05 Faculty of Science > Departement Biozentrum > Computational & Systems Biology > Bioinformatics (Zavolan)
UniBasel Contributors: Zavolan, Mihaela and van Nimwegen, Erik
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 15241
Thesis status: Complete
Number of Pages: ix, 82
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss152416
edoc DOI:
Last Modified: 20 Jan 2024 05:30
Deposited On: 19 Jan 2024 15:10
