Bendel, Alexandra Michaela. From mutations to meanings: using deep mutational scanning to understand sequence-function relationships through protein-protein interaction specificity. 2024, Doctoral Thesis, University of Basel, Associated Institution, Associated Institutions.
PDF (18Mb). Restricted to Repository staff only until 13 February 2025.
Official URL: https://edoc.unibas.ch/96296/
Abstract
One of the fundamental questions of biological and genetic research is how genotypes are translated into phenotypes or, more specifically, how sequence is translated into molecular function. Protein-protein interactions (PPIs) form the functional backbone of a cell by assembling into large protein interaction networks (PINs) that coordinate molecular processes. To establish these networks within the crowded cellular environment and prevent unwanted crosstalk, most PPIs are highly specific. PPI specificity is encoded in sequence, and mutations can alter specificity, rewire PINs, and cause disease. Therefore, understanding how sequence determines interaction specificity is essential for understanding both the regulation of molecular processes and the mechanisms of disease. However, finding determinants of specificity for PPIs of interest, let alone predicting them computationally, remains a long-standing challenge.
Ground-breaking advances like AlphaFold2 have brought forward the challenge of predicting PPIs, yet they still cannot pin down specific determinants of specificity or account for quantitative changes in, or loss of, interactions upon mutation. Overcoming these limitations requires other datasets on which to train computational models.
These datasets must contain quantitative measurements of how perturbations (mutations) alter PPIs in order to reveal how individual positions within a protein contribute to its specificity. A family of assays well suited to generating such datasets at large scale is deep mutational scanning (DMS), in which the effects of hundreds of thousands of sequence alterations on a protein function of interest can be measured in parallel.
In this work, a combined assay of DMS and a split-DHFR enzyme-protein complementation assay (ddPCA) was applied. ddPCA allows parallel assessment of the effect of thousands of mutations on PPIs. The family of human basic leucine zipper (bZIP) interaction domains was used as a model system. They display highly diverse specificities while being conserved in sequence and thus represent an appropriate model for the assessment of determinants of specificity at the network level.
To establish the assay, a first screen measured how every single point mutation in the JUN zipper altered JUN’s interaction with all 54 wildtype bZIPs. With the aid of thermodynamic modeling, this provided the first comprehensive map of global effects on all of a protein’s binding partners and further revealed determinants of specificity for individual interactions within the network.
The assay was then optimized to identify potential sources of non-linearity that might introduce biases when the screen is scaled up further.
Finally, ddPCA was used to perform the most extensive DMS to date, in which all single point mutants of the entire bZIP family were assayed for their interaction specificities and abundance. The dataset of more than two million pairwise interactions revealed network-wide mutation effects that can inform our understanding of bZIP network properties and the mechanisms of disease. These data were then used to develop a deep learning model that can predict bZIP interactions from sequence. The model holds promise to pioneer a new generation of models that go beyond the limits of models such as AlphaFold2 and capture quantitative aspects of the sequence-function relationship.
In this thesis, an experimental and analytical framework to solve sequence-function relationships was developed. It makes it possible to reveal the genetic architecture of PPI specificity in the bZIP family and to build a domain-specific sequence-function model. Further, when applied more generally to different domain families with different architectures, it can help generalize the model to all PPIs.
Advisors: | Diss, Guillaume |
---|---|
Committee Members: | Bühler, Marc and van Leeuwen, Jolanda |
Faculties and Departments: | 09 Associated Institutions > Friedrich Miescher Institut FMI > Epigenetics > Non-coding RNAs and chromatin (Bühler); 05 Faculty of Science |
UniBasel Contributors: | Bühler, Marc |
Item Type: | Thesis |
Thesis Subtype: | Doctoral Thesis |
Thesis no: | 15317 |
Thesis status: | Complete |
Number of Pages: | 250 |
Language: | English |
Identification Number: | |
edoc DOI: | |
Last Modified: | 05 Apr 2024 14:15 |
Deposited On: | 04 Apr 2024 08:46 |