Vogt, Julia. Supervised & unsupervised transfer learning. 2013, Doctoral Thesis, University of Basel, Faculty of Science.
PDF (3298 Kb)
Official URL: http://edoc.unibas.ch/diss/DissB_10347
Abstract
This thesis investigates transfer learning in two areas of data analysis, supervised
and unsupervised learning. We study multi-task learning on vectorial
data in a supervised setting and multi-view clustering on pairwise distance
data in a Bayesian unsupervised approach. The aim in both areas is to transfer
knowledge over different related data sets as opposed to learning on single
data sets separately.
In supervised learning, not only the input vectors but also the corresponding target vectors are observed. The aim is to learn a mapping from the input
space to the target space to predict the target values for new samples. In
standard classification or regression problems, one data set at a time is considered
and the learning problem for every data set is solved separately. In
this work, we are looking at the non-standard case of learning by exploiting
the information given by multiple related tasks. Multi-task learning is based
on the assumption that multiple tasks share some features or structures. One
well-known technique solving multi-task problems is the Group-Lasso with
2-norm regularization. The motivation for using the Group-Lasso is to couple
the individual tasks via the group-structure of the constraint term. Our main
contribution in the supervised learning part consists in deriving a complete
analysis of the Group-Lasso for all p-norm regularizations, including results
about uniqueness and completeness of solutions and coupling properties of
different p-norms. In addition, a highly efficient active set algorithm for all
p-norms is presented which is guaranteed to converge and which is able to
operate on extremely high-dimensional input spaces. For the first time, this
allows a direct comparison and evaluation of all possible Group-Lasso methods
for all p-norms in large scale experiments. We show that in a multi-task
setting, both tight coupling norms with p >> 2 and loose coupling norms
with p << 2 significantly degrade the prediction performance. Moderate coupling
norms seem to be the best compromise between coupling
strength and robustness against systematic differences between the tasks.
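The group-wise coupling described above can be illustrated with a minimal sketch. It assumes a coefficient matrix B whose rows are features and whose columns are tasks, so that each row forms one group. This layout, the function name group_penalty, and the example values are illustrative assumptions, not taken from the thesis. The penalty sums the p-norm of each row: with p = 1 every entry is charged independently (no coupling across tasks), while with p = infinity only the largest entry in each row is charged (tight coupling).

```python
import numpy as np

def group_penalty(B, p=2.0):
    """Sum over rows (groups) of the p-norm taken across columns (tasks).

    Hypothetical helper for illustration; rows = features, columns = tasks.
    """
    B = np.asarray(B, dtype=float)
    if np.isinf(p):
        # Max-norm: only the largest absolute coefficient per row counts.
        row_norms = np.max(np.abs(B), axis=1)
    else:
        row_norms = np.sum(np.abs(B) ** p, axis=1) ** (1.0 / p)
    return float(np.sum(row_norms))

# Toy coefficients: 3 features, 2 tasks (made-up values).
B = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

penalty_l1 = group_penalty(B, p=1)         # 3 + 4 + 0 + 1 = 8
penalty_l2 = group_penalty(B, p=2)         # 5 + 0 + 1 = 6
penalty_linf = group_penalty(B, p=np.inf)  # 4 + 0 + 1 = 5
```

In this toy case the all-zero row contributes nothing under any p, which reflects how a group penalty can switch a feature off for all tasks at once.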
The second area of data analysis we look at is unsupervised learning. In unsupervised
learning, the training data consists of input vectors without any
corresponding target vectors. Classical problems in unsupervised learning
are clustering, density estimation or dimensionality reduction. As in the supervised
scenario, we are not only considering single data sets independently
of each other, but we want to learn over two or more data sets simultaneously.
A problem that arises frequently is that the data is only available as
pairwise distances between objects (e.g. pairwise string alignment scores from
protein sequences) and a loss-free embedding into a vector space is usually
not possible. We propose a Bayesian clustering model that is able to operate
on this kind of distance data without explicitly embedding it into a vector
space. Our main contribution in the unsupervised learning part is twofold.
Firstly, we derive a fully probabilistic clustering method based on pairwise
Euclidean distances that is rotation-, translation-, and scale-invariant and
uses the Wishart distribution in the likelihood term. On the algorithmic
side, a highly efficient sampling algorithm is presented. Experiments indicate
the advantage of encoding the translation invariance into the likelihood,
and our clustering algorithm clearly outperforms several hierarchical clustering
methods. Secondly, we extend this clustering method to a novel Bayesian
multi-view clustering approach based on distance data. We show that the
multi-view clustering method reveals shared information between different
views of a phenomenon and we obtain an improved clustering compared to
clustering on every view separately.
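Why pairwise distance data suits an invariant model can be checked numerically. The following minimal sketch, which is an illustration and not the thesis's clustering method, verifies that squared Euclidean distances are unchanged when the underlying points are rotated and translated. The helper name pairwise_sq_dists and the random test data are assumptions made for this example. Scale invariance is a property of the proposed model and is not demonstrated here.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Matrix of squared Euclidean distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.maximum(D, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                   # 5 made-up points in 3 dimensions

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
shift = rng.normal(size=(1, 3))               # random translation
X_moved = X @ Q + shift                       # rotate/reflect, then translate

D = pairwise_sq_dists(X)
D_moved = pairwise_sq_dists(X_moved)
same = np.allclose(D, D_moved)                # distances ignore the transformation
```

Because the distance matrix carries no information about position or orientation, a likelihood defined on distances does not need to commit to one particular vector-space embedding.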
Advisors: Roth, Volker
Committee Members: Buhmann, Joachim
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Biomedical Data Analysis (Roth)
UniBasel Contributors: Vogt, Julia and Roth, Volker
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 10347
Thesis status: Complete
Number of Pages: 120
Language: English
Last Modified: 22 Jan 2018 15:51
Deposited On: 07 May 2013 10:37