Trial-based heuristic tree search for finite horizon MDPs

Date Issued
2013-01-01
Author(s)
Keller, Thomas
Helmert, Malte  
Abstract
Dynamic programming is a well-known approach for solving MDPs. In large state spaces, asynchronous versions like Real-Time Dynamic Programming have been applied successfully. If MDPs are unfolded into equivalent trees, Monte-Carlo Tree Search algorithms are a valid alternative. UCT, the most popular representative, obtains good anytime behavior by guiding the search towards promising areas of the search tree. The heuristic search algorithm AO* finds optimal solutions for MDPs that can be represented as acyclic AND/OR graphs. We introduce a common framework, Trial-based Heuristic Tree Search, that subsumes these approaches and distinguishes them based on five ingredients: heuristic function, backup function, action selection, outcome selection, and trial length. Using this framework, we describe three new algorithms which mix these ingredients in novel ways in an attempt to combine their different strengths. Our evaluation shows that two of our algorithms not only provide superior theoretical properties to UCT, but also outperform state-of-the-art approaches experimentally.
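To make the five ingredients named in the abstract concrete, the following is a minimal illustrative sketch of a trial-based tree search in which each ingredient is a swappable function. It is not the authors' implementation. The toy MDP (`outcomes`, `ACTIONS`), the `Node` class, the `cfg` dictionary, and the particular choices made here (zero heuristic, full Bellman backup, UCB1 action selection, probability-weighted outcome sampling, trials that run to the horizon) are all assumptions made for illustration.

```python
import math
import random

ACTIONS = ("safe", "risky")  # hypothetical toy actions, not from the paper

def outcomes(state, action):
    """Toy model (assumption): list of (probability, next_state, reward)."""
    if action == "safe":
        return [(1.0, state + 1, 1.0)]
    return [(0.5, state + 2, 4.0), (0.5, state, 0.0)]

class Node:
    def __init__(self, state, steps_left, heuristic):
        self.state, self.steps_left = state, steps_left
        self.visits = 0
        # Ingredient 1 (heuristic function): initial estimate for non-terminal nodes.
        self.value = heuristic(state, steps_left) if steps_left > 0 else 0.0
        self.q, self.n, self.children = {}, {}, {}

def expected_q(node, action):
    """Expected reward plus successor value over all outcomes of an action."""
    return sum(p * (r + node.children[(action, s2)].value)
               for p, s2, r in outcomes(node.state, action))

def expand(node, heuristic):
    """Create all successors and initialise the action-value estimates."""
    for a in ACTIONS:
        for _, s2, _ in outcomes(node.state, a):
            node.children[(a, s2)] = Node(s2, node.steps_left - 1, heuristic)
        node.n[a] = 0
        node.q[a] = expected_q(node, a)

def full_bellman_backup(node, action):
    # Ingredient 2 (backup function): dynamic-programming style update.
    node.q[action] = expected_q(node, action)
    node.value = max(node.q.values())

def ucb1(node, c=2.0):
    # Ingredient 3 (action selection): UCB1, untried actions first.
    def score(a):
        if node.n[a] == 0:
            return math.inf
        return node.q[a] + c * math.sqrt(math.log(node.visits) / node.n[a])
    return max(ACTIONS, key=score)

def make_sampler(rng):
    # Ingredient 4 (outcome selection): sample according to probability.
    def sample(outs):
        return rng.choices(outs, weights=[p for p, _, _ in outs])[0]
    return sample

def run_trial(node, cfg):
    """One trial: descend through the tree, then back up along the same path."""
    if node.steps_left == 0:
        return
    first_visit = node.visits == 0
    node.visits += 1
    if first_visit:
        expand(node, cfg["heuristic"])
    action = cfg["select_action"](node)
    node.n[action] += 1
    _, next_state, _ = cfg["select_outcome"](outcomes(node.state, action))
    # Ingredient 5 (trial length): decide whether to descend further.
    if cfg["continue_trial"](first_visit):
        run_trial(node.children[(action, next_state)], cfg)
    cfg["backup"](node, action)

cfg = {
    "heuristic": lambda state, steps_left: 0.0,
    "backup": full_bellman_backup,
    "select_action": ucb1,
    "select_outcome": make_sampler(random.Random(0)),
    "continue_trial": lambda first_visit: True,  # always run to the horizon
}
root = Node(0, 3, cfg["heuristic"])
for _ in range(500):
    run_trial(root, cfg)
best = max(root.q, key=root.q.get)
print(best, root.value)
```

Swapping a single entry of `cfg` changes the flavour of the search. For example, setting `continue_trial` to `lambda first_visit: not first_visit` ends each trial as soon as a new node is expanded, and replacing `full_bellman_backup` with a running average of sampled returns would give a Monte-Carlo style backup instead.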
File(s)
Name
6026-30088-1-PB.pdf
Size
384.7 KB
Format
Adobe PDF
Checksum (MD5)
8eff4390259a860c062a49a6b253c811

University of Basel

edoc
Open Access Repository University of Basel