Development of methods for the analysis of deep sequencing data; applications to the discovery of functions of RNA-binding proteins

Bilen, Biter. Development of methods for the analysis of deep sequencing data; applications to the discovery of functions of RNA-binding proteins. 2014, Doctoral Thesis, University of Basel, Faculty of Science.

Available under License CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives).


Official URL: http://edoc.unibas.ch/diss/DissB_10772



With the recent advances in nucleotide sequencing technologies, it has become easy to generate tens of millions of reads with genome- or transcriptome-wide distribution at reduced cost and with high accuracy. One of the applications of deep sequencing is the determination of the repertoire of targets of RNA-binding proteins. The method, called CLIP (for UV crosslinking and immunoprecipitation), is now widely used to characterize a variety of proteins with regulatory as well as enzymatic functions. Here we focus on the statistical analysis of data obtained through a variant of CLIP, called PAR-CLIP (Photoactivatable-Ribonucleoside-Enhanced CLIP), which was applied to three different RNA-binding proteins whose function was previously not well characterized: PAPD5 (PAP associated domain containing 5), DIS3L2 (DIS3 mitotic control homolog (S. cerevisiae)-like 2), and EWSR1 (Ewing sarcoma breakpoint region 1). Our computational analysis was instrumental in defining the main in vivo substrates of these proteins, which were confirmed by additional experiments. In the analysis, we also made extensive use of publicly available high-throughput data sets, which enabled us to make inferences about the function of the proteins. The main results of biological significance were as follows. We determined that ribosomal RNAs are the main targets of PAPD5 and that the main substrates of the DIS3L2 nuclease are tRNAs, and we found that the tRNA-derived fragments processed by DIS3L2 could be loaded into the RNA silencing complex and be involved in gene silencing. Finally, we determined that EWSR1 preferentially binds to RNAs that originate from instability-prone regions like sub-telomeres, known to be hotspots of genomic rearrangements, as well as to RNAs from other genes, located in internal regions of chromosomes, that have been implicated in genomic translocations. These include EWSR1's own pre-mRNA.
Altogether, this dissertation illustrates that, when coupled with proper statistical analysis, CLIP is able to reveal targets of RNA-binding proteins that were difficult to study with other methods, and that integration of public-domain datasets is very powerful in deciphering the complex RNA-protein and regulatory RNA networks implicated in post-transcriptional gene regulation.
Advisors: Zavolan, Mihaela
Committee Members: Vaňáčová, Štěpánka
Faculties and Departments: 05 Faculty of Science > Departement Biozentrum > Computational & Systems Biology > Bioinformatics (Zavolan)
UniBasel Contributors: Zavolan, Mihaela
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 10772
Thesis status: Complete
Number of Pages: 153 S.
Last Modified: 22 Jan 2018 15:51
Deposited On: 12 May 2014 14:56
