Debugging a Policy: Automatic Action-Policy Testing in AI Planning

Steinmetz, Marcel and Fišer, Daniel and Enişer, Hasan Ferit and Ferber, Patrick and Gros, Timo and Heim, Philippe and Höller, Daniel and Schuler, Xandra and Wüstholz, Valentin and Christakis, Maria and Hoffmann, Jörg. (2022) Debugging a Policy: Automatic Action-Policy Testing in AI Planning. In: Proceedings of the Thirty-Second International Conference on Automated Planning and Scheduling (ICAPS2022). pp. 353-361.

PDF - Accepted Version

Official URL: https://edoc.unibas.ch/93506/


Testing is a promising way to gain trust in neural action policies π. Previous work on policy testing in sequential decision making targeted environment behavior leading to failure conditions. But if the failure is unavoidable given that behavior, then π is not actually to blame. For a situation to qualify as a "bug" in π, there must be an alternative policy π' that does better. We introduce a generic policy testing framework based on that intuition. This raises the bug confirmation problem, deciding whether or not a state is a bug. We analyze the use of optimistic and pessimistic bounds for the design of test oracles approximating that problem. We contribute an implementation of our framework in classical planning, experimenting with several test oracles and with random-walk methods generating test states biased to poor policy performance and/or state novelty. We evaluate these techniques on policies π learned with ASNets. We find that they are able to effectively identify bugs in these π, and that our random-walk biases improve over uninformed baselines.
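
The bug notion and the role of optimistic and pessimistic bounds can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a deterministic toy task, assumes that policy quality is measured as cost to reach the goal, and assumes that a lower bound and an upper bound on the optimal cost from the test state are already available. All names (`policy_cost`, `classify`, `TRANSITIONS`, `POLICY`) are hypothetical.

```python
import math

# Hypothetical toy task: (state, action) -> (successor, action cost).
TRANSITIONS = {
    ("A", "direct"): ("B", 1),
    ("A", "detour"): ("C", 1),
    ("C", "back"): ("B", 1),
    ("B", "finish"): ("G", 1),
}
GOALS = {"G"}

# Hypothetical policy under test: it takes the detour in state A.
POLICY = {"A": "detour", "C": "back", "B": "finish"}


def policy_cost(state, policy, transitions, goals, max_steps=100):
    """Cost of running `policy` from `state`; math.inf if no goal is reached."""
    total = 0
    for _ in range(max_steps):
        if state in goals:
            return total
        action = policy.get(state)
        if action is None or (state, action) not in transitions:
            return math.inf  # policy is stuck: counts as failure
        state, cost = transitions[(state, action)]
        total += cost
    return math.inf  # step limit hit: treated as failure


def classify(pi_cost, lower, upper):
    """Three-valued oracle, assuming lower <= optimal cost <= upper."""
    if pi_cost > upper:
        return "bug"  # some alternative policy is provably cheaper
    if pi_cost <= lower:
        return "not a bug"  # no policy can beat the lower bound
    return "unknown"  # bounds too loose to decide


cost_a = policy_cost("A", POLICY, TRANSITIONS, GOALS)  # A -> C -> B -> G
verdict_a = classify(cost_a, lower=2, upper=2)         # plan A -> B -> G costs 2
cost_d = policy_cost("D", POLICY, TRANSITIONS, GOALS)  # D is a dead end
verdict_d = classify(cost_d, lower=math.inf, upper=math.inf)
```

In this sketch, state A is reported as a bug because a cheaper plan exists, while the dead-end state D is not, since failure there is unavoidable for every policy. With loose bounds (for example `lower=1, upper=math.inf`) the oracle returns "unknown", which is the sense in which such oracles only approximate bug confirmation.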
Faculties and Departments: 05 Faculty of Science > Departement Mathematik und Informatik > Informatik > Artificial Intelligence (Helmert)
UniBasel Contributors: Ferber, Patrick
Item Type: Conference or Workshop Item, refereed
Conference or Workshop Item Subtype: Conference Paper
Publisher: AAAI Press
Note: Publication type according to Uni Basel Research Database: Conference paper
Last Modified: 29 Mar 2023 12:14
Deposited On: 15 Feb 2023 11:12
