Hybrid human-machine information systems for data classification

Shabani, Shaban. Hybrid human-machine information systems for data classification. 2023, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: https://edoc.unibas.ch/96236/



Over the last decade, we have seen an intense development of machine learning approaches for solving various tasks in diverse domains. Despite the remarkable advancements in this field, there are still task categories for which machine learning models fall short of the required accuracy. This is the case with tasks that require human cognitive skills, such as sentiment analysis and emotional or contextual understanding. On the other hand, human-based computation approaches, such as crowdsourcing, are popular for solving such tasks. Crowdsourcing enables access to a vast number of groups with different expertise and, if managed properly, generates high-quality results. However, crowdsourcing as a standalone approach is not scalable due to the latency and cost it incurs.
Addressing the distinct challenges and limitations of the human-based and machine-based approaches requires bridging the two fields into a hybrid intelligence, which is seen as a promising approach to solving critical and complex real-world tasks. This thesis focuses on hybrid human-machine information systems, combining machine and human intelligence and leveraging their complementary strengths: the data processing efficiency of machine learning and the data quality generated by crowdsourcing.
In this thesis, we present hybrid human-machine models to address challenges along three dimensions: accuracy, latency, and cost. Data classification tasks in different domains have different requirements concerning these three criteria. Motivated by this fact, we introduce a master component that evaluates these criteria to find a suitable model as a trade-off solution. In hybrid human-machine information systems, incorporating human judgments is expected to improve the accuracy of the system. To ensure this, we focus on the human intelligence component, integrating profile-aware crowdsourcing for task assignment and data quality control mechanisms into the hybrid pipelines.
The proposed conceptual hybrid human-machine models are realized in experiments. Motivated by challenging scenarios and using real-world datasets, we implement the hybrid models in three experiments. Evaluations show that the implemented hybrid human-machine architectures for data classification tasks lead to better results than either of the two approaches individually, improving the overall accuracy at an acceptable cost and latency.
Advisors: Sokhn, Maria
Committee Members: Schuldt, Heiko and Genoud, Dominique
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Databases and Information Systems (Schuldt)
UniBasel Contributors: Shabani, Shaban and Schuldt, Heiko
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 15275
Thesis status: Complete
Number of Pages: xvi, 191
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss152751
Last Modified: 07 Feb 2024 05:30
Deposited On: 06 Feb 2024 09:53
