Temporal multimodal video and lifelog retrieval

Heller, Silvan. Temporal multimodal video and lifelog retrieval. 2023, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: https://edoc.unibas.ch/94776/



The past decades have seen exponential growth in both the consumption and production of data, with multimedia such as images and videos contributing significantly to that growth. The widespread proliferation of smartphones has given everyday users the ability to consume and produce such content easily. As the complexity and diversity of multimedia data have grown, so has the need for more complex retrieval models that address the information needs of users. Finding relevant multimedia content is central in many scenarios, from internet search engines and medical retrieval to querying one's personal multimedia archive, also called a lifelog. Traditional retrieval models have often focused on queries targeting small units of retrieval, yet users usually remember temporal context and expect results to include it. However, there is little research into supporting such information needs in interactive multimedia retrieval.
In this thesis, we aim to close this research gap by making several contributions to multimedia retrieval with a focus on two scenarios, namely video and lifelog retrieval. We provide a retrieval model for complex information needs with temporal components, including a data model for multimedia retrieval, a query model for complex information needs, and a modular and adaptable query execution model which includes novel algorithms for result fusion. The concepts and models are implemented in vitrivr, an open-source multimodal multimedia retrieval system, which covers all aspects from extraction to query formulation and browsing. vitrivr has proven its usefulness in evaluation campaigns and is now used in two large-scale interdisciplinary research projects. We show the feasibility and effectiveness of our contributions in two ways. First, we present results from user-centric evaluations which pit different user-system combinations against one another. Second, we perform a system-centric evaluation by creating a new dataset for temporal information needs in video and lifelog retrieval with which we quantitatively evaluate our models.
The results show significant benefits for systems that enable users to specify more complex information needs with temporal components. Participation in interactive retrieval evaluation campaigns over multiple years provides insight into possible future developments and challenges of such campaigns.
Advisors: Schuldt, Heiko
Committee Members: Helmert, Malte and Crestani, Fabio
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Artificial Intelligence (Helmert)
05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Databases and Information Systems (Schuldt)
UniBasel Contributors: Schuldt, Heiko and Helmert, Malte
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 15036
Thesis status: Complete
Number of Pages: xxi, 194
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss150360
Last Modified: 24 Jun 2023 04:30
Deposited On: 23 Jun 2023 08:35
