Garofoli, Andrea. Computational approaches to improve precision oncology. 2021, Doctoral Thesis, University of Basel, Faculty of Science.
Official URL: https://edoc.unibas.ch/82897/
Abstract
The word “cancer” identifies a remarkably diverse collection of diseases whose common trait is accelerated and unregulated cell proliferation that escalates into the development of so-called “tumoral tissue”. Molecular profiling has uncovered vast diversity between cancers, laying the foundations for clinical decisions defined case by case. The philosophy of precision oncology is based on the idea that patient care must take into account each patient's molecular characteristics in order to define the best possible therapy. The rise of big data, and of the computational approaches able to dissect it, has enabled the profiling of an extraordinary number of diseases, whose characterization can be the stepping stone of precision oncology itself.
The aim of this project is the development of computational methodologies to support modern precision oncology and to help expand its implementations. Results are divided into two sections, Chapter I and Chapter II.
In Chapter I we present PipeIT, a somatic variant caller we developed to help researchers and clinicians detect potential driver mutations in patients. PipeIT was specifically designed to process data obtained from Ion Torrent, a sequencing platform frequently used in diagnostic settings but for which, compared to other sequencing platforms, few analysis tools are available. The novelty of PipeIT is its packaging as a Singularity container, which ensures reproducibility of its analyses and enhances its ease of use. Two PipeIT versions were developed. The original PipeIT was designed to perform variant calling on matched tumor-germline data. PipeIT2 was later developed to enable variant calling on tumor-only data, broadening its use in the typical clinical setting. PipeIT2 takes advantage of publicly accessible databases and panels of unmatched normals to account for the absence of a matched germline control. Both PipeIT pipelines were able to detect important driver genomic variants, proving to be powerful tools for modern precision oncology.
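The tumor-only idea described above can be illustrated with a minimal sketch. It is not PipeIT2's actual implementation: the `Variant` class, the `filter_tumor_only` function, both thresholds, and all coordinates and frequencies are hypothetical, chosen only to show how population allele frequencies and a panel of unmatched normals might stand in for a matched germline sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A toy variant record (hypothetical; not PipeIT2's data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def key(self):
        return (self.chrom, self.pos, self.ref, self.alt)


def filter_tumor_only(calls, population_af, pon_counts,
                      max_pop_af=0.01, max_pon_samples=1):
    """Keep calls that look somatic when no matched germline exists.

    population_af: variant key -> allele frequency in a public database.
    pon_counts: variant key -> number of unmatched normals carrying it.
    Both thresholds are illustrative assumptions.
    """
    kept = []
    for v in calls:
        # Common in the population: likely a germline polymorphism.
        if population_af.get(v.key, 0.0) > max_pop_af:
            continue
        # Recurrent in unmatched normals: likely germline or an artifact.
        if pon_counts.get(v.key, 0) > max_pon_samples:
            continue
        kept.append(v)
    return kept


# Toy input: coordinates and frequencies are made up for illustration.
calls = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 200, "C", "T"),
    Variant("chr2", 300, "G", "A"),
]
population_af = {("chr1", 200, "C", "T"): 0.25}
pon_counts = {("chr2", 300, "G", "A"): 5}

kept = filter_tumor_only(calls, population_af, pon_counts)
print(kept)
```

In this toy run only the first call survives: the second is common in the hypothetical population database and the third recurs in the hypothetical panel of normals.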
In Chapter II we investigated the role of gene expression data as an alternative to DNA biomarkers for detecting the presence of oncogenic molecular processes in cancer patients. Based on the assumption that the activation of oncogenic pathways caused by driver mutations can produce a specific transcriptional profile, we designed a machine learning classifier able to extract this profile from patients with driver hotspot mutations and infer its presence in patients who do not have the same hotspot mutations. The classifier was first tested on one of the most frequently mutated oncogenes, PIK3CA, using publicly accessible TCGA pan-cancer data. The classifier was able to detect the presence of PIK3CA hotspot driver mutations in a test dataset, obtaining a ROC score of 0.87. The approach was further tested on 15 different oncogenes, showing good results for the more commonly mutated oncogenes and underperforming for the more rarely mutated ones. Finally, the PIK3CA model was applied to an external set of TCGA samples to determine whether the classifier was also able to infer the presence of additional PIK3CA oncogenic mutations. This project highlighted the importance of novel AI-based approaches to cancer data and the potential application of transcriptomic data as a biomarker to further improve precision oncology.
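As an aid to reading the reported score, the sketch below computes the area under the ROC curve for a binary classifier, assuming that the "ROC score" above refers to this quantity. The function `roc_auc` and the labels and scores are hypothetical toy values, not data or code from the thesis.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    (label 1, e.g. hotspot-mutated) scores higher than a randomly
    chosen negative sample (label 0); ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = mutated, 0 = wild type; scores are classifier outputs.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"ROC AUC: {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so under this reading a score of 0.87 means a mutated sample outranks a non-mutated one in roughly 87% of such pairs.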
Advisors: | Terracciano, Luigi M. and Hall, Michael N. and Vogt, Julia E. |
---|---|
Faculties and Departments: | 03 Faculty of Medicine > Bereich Querschnittsfächer (Klinik) > Pathologie USB > Molekulare Pathologie (Terracciano) 03 Faculty of Medicine > Departement Klinische Forschung > Bereich Querschnittsfächer (Klinik) > Pathologie USB > Molekulare Pathologie (Terracciano) |
UniBasel Contributors: | Garofoli, Andrea and Terracciano, Luigi M. and Hall, Michael N. and Vogt, Julia |
Item Type: | Thesis |
Thesis Subtype: | Doctoral Thesis |
Thesis no: | 14215 |
Thesis status: | Complete |
Number of Pages: | 123 |
Language: | English |
Identification Number: | |
edoc DOI: | |
Last Modified: | 12 Aug 2021 04:30 |
Deposited On: | 22 Jul 2021 15:00 |