Data Management for Dynamic Multimedia Analytics and Retrieval

Gasser, Ralph Marc Philipp. Data Management for Dynamic Multimedia Analytics and Retrieval. 2023, Doctoral Thesis, University of Basel, Faculty of Science.

Available under License CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives).


Official URL: https://edoc.unibas.ch/92896/



Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially if search, analysis and analytics in large data corpora are considered. The inherently unstructured nature of the data itself and the curse of dimensionality that afflicts the representations we typically work with in its stead cause a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge corpus of research that focuses on techniques that allow for effective and efficient multimedia search and exploration. Many of these contributions have led to an array of purpose-built multimedia search systems.
However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search, and since the concrete requirement cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread notion of staticity of data collections does not hold if one considers analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries.
Guided by these shortcomings and motivated by the fact that similar questions have once been answered for structured data in classical database research, this thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing. We complement this with a cost model that makes the often implicit trade-off between query execution speed and result quality transparent to the system and the user. And we describe a model for the transactional and durable maintenance of high-dimensional index structures.
All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest for converging the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other.
Advisors: Schuldt, Heiko
Committee Members: Tschudin, Christian F and Þór Jónsson, Björn
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Databases and Information Systems (Schuldt)
UniBasel Contributors: Gasser, Ralph and Schuldt, Heiko and Tschudin, Christian F
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 14934
Thesis status: Complete
Number of Pages: xxxv, 264
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss149345
Last Modified: 07 Feb 2023 05:30
Deposited On: 06 Feb 2023 15:35
