Machine Learning for Classical Planning: Neural Network Heuristics, Online Portfolios, and State Space Topologies

Ferber, Patrick Christoph. Machine Learning for Classical Planning: Neural Network Heuristics, Online Portfolios, and State Space Topologies. 2022, Doctoral Thesis, University of Basel, Faculty of Science.


Official URL: https://edoc.unibas.ch/91787/



State space search solves navigation tasks and many other real-world problems. Heuristic search, especially greedy best-first search, is one of the most successful algorithms for state space search. We improve the state of the art in heuristic search in three directions.
In Part I, we present methods to train neural networks as powerful heuristics for a given state space. We present a universal approach to generate training data using random walks from a (partial) state. We demonstrate that heuristics trained for a specific task are often better than heuristics trained for a whole domain, and that the performance of all trained heuristics is highly complementary: there is no clear pattern indicating which trained heuristic to prefer for a specific task. In general, model-based planners still outperform planners with trained heuristics, but our approaches exceed the model-based algorithms in the Storage domain. To our knowledge, only once before, in the Spanner domain, has a learning-based planner exceeded the state-of-the-art model-based planners.
A priori, it is unknown whether a heuristic, or more generally a planner, performs well on a task. Hence, we train online portfolios to select the best planner for a task. To date, all online portfolios have been based on handcrafted features. In Part II, we present new online portfolios based on neural networks that receive the complete task as input rather than just a few handcrafted features. Additionally, our portfolios can reconsider their choices. Both extensions greatly improve the state of the art of online portfolios. Finally, we show that explainable machine learning techniques, as an alternative to neural networks, also make good online portfolios, and we present methods to improve our trust in their predictions.
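The random-walk data-generation idea can be sketched in a few lines. The following is a minimal, hypothetical illustration, not the thesis implementation: the `successors(state)` callback and the labeling of each visited state with the number of steps taken so far (as a noisy cost estimate) are assumptions made for the sketch.

```python
import random

def random_walk_samples(start_state, successors, walk_length, num_walks):
    """Generate (state, cost_estimate) training pairs via random walks.

    Illustrative sketch: performs `num_walks` random walks of up to
    `walk_length` steps from `start_state`, labeling each visited state
    with the number of steps taken to reach it. `successors(state)` is
    an assumed callback returning the applicable successor states.
    """
    samples = []
    for _ in range(num_walks):
        state = start_state
        for step in range(1, walk_length + 1):
            succ = successors(state)
            if not succ:  # dead end: abandon this walk
                break
            state = random.choice(succ)
            samples.append((state, step))
    return samples
```

A neural network heuristic could then be fit on such pairs by regressing the step label from a feature encoding of the state.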
Even if we select the best search algorithm, we cannot solve some tasks in reasonable time. We can speed up the search if we know how it will behave in the future. In Part III, we inspect the behavior of greedy best-first search with a fixed heuristic on simple tasks of a domain to learn its behavior on any task of the same domain. Once greedy best-first search has expanded a progress state, it expands only states with lower heuristic values. We learn to identify progress states and present two methods to exploit this knowledge. Building upon this, we extract the bench transition system of a task and generalize it such that we can apply it to any task of the same domain. We can use this generalized bench transition system to split a task into a sequence of simpler searches.
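The progress-state property can be made concrete: given the heuristic values of states in expansion order, a state is a progress state exactly when every state expanded after it has a strictly lower heuristic value. The function below is a retrospective check written for illustration; it is not the learning method from the thesis.

```python
def progress_positions(h_values):
    """Return indices of progress states in an expansion sequence.

    Index i is a progress state iff every heuristic value expanded
    after position i is strictly lower than h_values[i]. Scans the
    sequence right to left, tracking the maximum value seen so far.
    """
    positions = []
    suffix_max = float("-inf")  # max heuristic value after index i
    for i in range(len(h_values) - 1, -1, -1):
        if suffix_max < h_values[i]:
            positions.append(i)
        suffix_max = max(suffix_max, h_values[i])
    return sorted(positions)
```

Note that the last expanded state is trivially a progress state, since no states follow it; the interesting progress states are those identified early in the search.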
In all three research directions, we contribute new approaches and insights to the state of the art, and we indicate interesting topics for future work.
Advisors: Helmert, Malte
Committee Members: Hoffmann, Jörg and Schuldt, Heiko and Thiébaux, Sylvie and Sanner, Scott
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Artificial Intelligence (Helmert)
UniBasel Contributors: Helmert, Malte and Schuldt, Heiko
Item Type: Thesis
Thesis Subtype: Doctoral Thesis
Thesis no: 14907
Thesis status: Complete
Number of Pages: x, 163
Identification Number:
  • urn: urn:nbn:ch:bel-bau-diss149074
edoc DOI:
Last Modified: 05 Jan 2023 05:30
Deposited On: 04 Jan 2023 09:55
