edoc

Learning Invariant Representations for Deep Latent Variable Models

Wieser, Mario. Learning Invariant Representations for Deep Latent Variable Models. 2020, Doctoral Thesis, University of Basel, Faculty of Science.

PDF
Available under License CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives).

7Mb

Official URL: https://edoc.unibas.ch/79859/

Abstract

Deep latent variable models introduce a new class of generative models which are able to handle unstructured data and encode non-linear dependencies. Despite their known flexibility, these models are frequently not invariant against target-specific transformations. Therefore, they suffer from model mismatches and are challenging to interpret or control. We employ the concept of symmetry transformations from physics to formally describe these invariances. In this thesis, we investigate how we can model invariances when a symmetry transformation is either known or unknown. As a consequence, we make contributions in the domain of variable compression under side information and generative modelling. In our first contribution, we investigate the problem where a symmetry transformation is known yet not implicitly learned by the model. Specifically, we consider the task of estimating mutual information in the context of the deep information bottleneck which is not invariant against monotone transformations. To address this limitation, we extend the deep information bottleneck with a copula construction. In our second contribution, we address the problem of learning target-invariant subspaces for generative models. In this case, the symmetry transformation is unknown and has to be learned from data. We achieve this by formulating a deep information bottleneck with a target and a target-invariant subspace. To ensure invariance, we provide a continuous mutual information regulariser based on adversarial training. In our last contribution, we introduce an improved method for learning unknown symmetry transformations with cycle-consistency. To do so, we employ the equivalent deep information bottleneck method with a partitioned latent space. However, we ensure target-invariance by utilizing a cycle-consistency loss in the latent space. As a result, we overcome potential convergence issues introduced by adversarial training and are able to deal with mixed data. 
In summary, each of our presented models is an attempt to better control and understand deep latent variable models by learning symmetry transformations. We demonstrate the effectiveness of our contributions with an extensive evaluation on both artificial and real-world experiments.
Advisors: Roth, Volker
Committee Members: Vetter, Thomas
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Biomedical Data Analysis (Roth)
UniBasel Contributors: Wieser, Mario and Roth, Volker and Vetter, Thomas
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 14000
Thesis status: Complete
Number of Pages: xv, 83
Language: English
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss140006
edoc DOI:
Last Modified: 02 Mar 2021 05:30
Deposited On: 01 Mar 2021 12:14
