Quantum Machine Learning Applied to Chemical Reaction Space

Heinen, Stefan. Quantum Machine Learning Applied to Chemical Reaction Space. 2021, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: https://edoc.unibas.ch/89359/

The scope of this thesis is the application of quantum machine learning (QML) methods to problems in quantum chemistry and chemical compound space, especially chemical reactions.
First, QML models were introduced to improve job scheduling of quantum chemistry tasks on small university clusters and large supercomputing clusters.
Using QML-based wall-time predictions to optimally distribute the workload on a cluster resulted in a significant reduction of the time to solution by up to 90%, depending on the type of calculation studied; these ranged from single-point calculations through geometry optimizations to transition state searches at a variety of levels of theory and basis sets.
The main focus of this thesis, however, is the navigation of chemical reaction space using QML models.
To train and test these models, large, consistent, and carefully evaluated data sets are required.
While extensive data sets with experimental results are available, consistent quantum chemical data sets, especially for reactions, are rare in the literature.
Thus, a data set for two competing textbook reactions, E2 and SN2, was generated, reporting thousands of reactant complexes and transition states with different nucleophiles (-H$^{-}$, -F$^{-}$, -Cl$^{-}$, -Br$^{-}$), leaving groups (-F, -Cl, -Br), and functional groups (-H, -NO$_2$, -CN, -CH$_3$, -NH$_2$) on an ethane scaffold.
The geometries were obtained at the MP2/6-311G(d) level of theory with subsequent DF-LCCSD/cc-pVTZ single-point calculations.
However, limited by computational resources, the data set was incomplete.
Therefore, reactant-to-barrier (R2B) machine learning models were introduced to support the data generation and complete the data set by predicting ~11'000 activation barriers using solely the reactant geometry as input.
Using R2B predictions, design rules for chemical reaction channels were derived by constructing decision trees.
Furthermore, Hammond's postulate was investigated, showing the limits of its applicability to reactants far away from the transition state, e.g. conformers.
Finally, geometry relaxation and transition state search using solely machine-learned energies and forces were investigated.
Trained on 200 reactions, the QML model was able to find 300 transition states, reaching out-of-sample RMSDs of 0.14 Å and 0.4 Å for reactant geometries and transition states, respectively.
Although relatively large RMSDs for the geometries remain, the out-of-sample MAE of 26.06 $\mathrm{cm^{-1}}$ for the transition state frequencies shows a well-described curvature of the transition state normal modes, in agreement with the MP2 reference.
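
The two error measures quoted in the abstract, RMSD for geometries and MAE for frequencies, can be illustrated with a minimal sketch. It is not taken from the thesis: the function names `rmsd` and `mae` and all numerical values are hypothetical, and the RMSD here assumes the two geometries list the same atoms in the same order and are already superimposed (no alignment step is performed).

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two geometries.

    Assumption: both geometries contain the same atoms in the same order
    and are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("geometries must be non-empty and of equal size")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def mae(predicted, reference):
    """Mean absolute error between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal size")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical toy geometries (coordinates in Å), for illustration only.
reference_geometry = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted_geometry = [(0.0, 0.0, 0.1), (1.0, 0.0, -0.1)]
geometry_rmsd = rmsd(predicted_geometry, reference_geometry)

# Hypothetical toy frequencies (in cm^-1), for illustration only.
reference_frequencies = [500.0, 450.0]
predicted_frequencies = [520.0, 440.0]
frequency_mae = mae(predicted_frequencies, reference_frequencies)

print(f"RMSD: {geometry_rmsd:.2f} Å, MAE: {frequency_mae:.2f} cm^-1")
```

In practice, a geometry RMSD is usually computed after optimally superimposing the two structures; that step is omitted here to keep the sketch short.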
Advisors: von Lilienfeld, Anatole and Meuwly, Markus and Kästner, Johannes
Faculties and Departments: 05 Faculty of Science > Departement Chemie > Former Organization Units Chemistry > Physikalische Chemie (Lilienfeld)
UniBasel Contributors: von Lilienfeld, Anatole and Meuwly, Markus
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 14778
Thesis status: Complete
Number of Pages: 121
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss147789
Last Modified: 02 Sep 2022 04:30
Deposited On: 01 Sep 2022 12:54
