# Bachelor's and master's theses

## Completed theses

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Increasing Cost Tree Search is a promising approach to multi-agent pathfinding problems, but like all approaches it has to deal with a huge number of possible joint paths, growing exponentially with the number of agents. We explore the possibility of reducing this by introducing a value abstraction to the Multi-valued Decision Diagrams used to represent sets of joint paths. To that end we introduce a heat map to heuristically judge how collision-prone agent positions are, and present how to use and possibly refine abstract positions in order to still find valid paths.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Estimating cheapest plan costs with the help of network flows is an established technique. Plans and network flows are already very similar, but network flows can differ from plans in the presence of cycles. If a transition system contains cycles, flows might be composed of multiple disconnected parts. This discrepancy can make the cheapest plan estimation worse. One idea to get rid of the cycles works by introducing time steps. For every time step the states of a transition system are copied. Transitions are changed so that they connect states only with states of the next time step, which ensures that there are no cycles. It turned out that by applying this idea to multiple transition systems, network flows of the individual transition systems can be synchronized via the time steps to get a new kind of heuristic, which is also discussed in this thesis.
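The time-step construction described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names `time_expand` and `is_acyclic`, the tuple encoding of transitions, and the fixed `horizon` parameter are all assumptions made for the example.

```python
from typing import List, Tuple

State = str
Transition = Tuple[State, str, State]  # (source, label, target); hypothetical encoding


def time_expand(states: List[State], transitions: List[Transition], horizon: int):
    """Copy every state once per time step and redirect each transition
    so that it leads from time step t to time step t + 1."""
    timed_states = [(s, t) for t in range(horizon + 1) for s in states]
    timed_transitions = [
        ((src, t), label, (dst, t + 1))
        for t in range(horizon)
        for (src, label, dst) in transitions
    ]
    return timed_states, timed_transitions


def is_acyclic(nodes, edges) -> bool:
    """Kahn's algorithm: a graph is acyclic iff every node can be removed
    once all of its predecessors have been removed."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, _, dst in edges:
        indegree[dst] += 1
        successors[src].append(dst)
    queue = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for succ in successors[node]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)
    return removed == len(nodes)


# A two-state transition system with a cycle A -> B -> A.
states = ["A", "B"]
transitions = [("A", "go", "B"), ("B", "back", "A")]
timed_states, timed_transitions = time_expand(states, transitions, horizon=3)
```

Because every copied transition strictly increases the time index, the expanded system cannot contain a cycle, at the price of multiplying the number of states by the number of time steps.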

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Probabilistic planning expands on classical planning by tying probabilities to the effects of actions. Due to the exponential size of the state space, probabilistic planners have to come up with a strong policy in a very limited time. One approach to optimising the policy that can be found in the available time is called metareasoning, a technique that aims to allocate more deliberation time to steps where additional planning time improves the policy and less deliberation time to steps where such an improvement is unlikely.

This thesis aims to adapt a recent proposal of a formal metareasoning procedure by Lin et al. for the search algorithm BRTDP to work with the UCT algorithm in the Prost planner, and to compare its viability to the current standard and a number of less informed time management methods in order to find a potential improvement over the current uniform distribution of deliberation time.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

A planner tries to produce a policy that leads to a desired goal given the available range of actions and an initial state. A traditional approach for planning algorithms is to use abstraction. In this thesis we implement the algorithm described in the paper "ASAP-UCT: Abstraction of State-Action Pairs in UCT" by Ankit Anand, Aditya Grover, Mausam and Parag Singla.

The algorithm combines state and state-action abstraction with a UCT algorithm. We come to the conclusion that the algorithm needs to be improved because the abstraction of state-action pairs often cannot detect a similarity that a reasonable action abstraction could find.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Classical domain-independent planning is about finding a sequence of actions that leads from an initial state to a goal state. A popular approach for solving planning problems efficiently is to utilize heuristic functions. A possible heuristic function is the perfect heuristic of a delete-relaxed planning problem, denoted as h+. Delete relaxation simplifies the planning problem, thus making it easier to find a perfect heuristic. However, computing h+ is still an NP-hard problem.

In this thesis we discuss a promising-looking approach to compute h+ in practice. Inspired by the paper from Gnad, Hoffmann and Domshlak about star-shaped planning problems, we implemented the Flow-Cut algorithm. The basic idea behind Flow-Cut is to divide a problem that is unsolvable in practice into smaller subproblems that can be solved. We further tested the Flow-Cut algorithm on the domains provided by the International Planning Competition benchmarks, resulting in the following conclusion: a divide-and-conquer approach can successfully be used to solve classical planning problems, but it is not trivial to design such an algorithm to be more efficient than state-of-the-art search algorithms.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Probabilistic planning is a research field that became popular in the early 1990s. It aims at finding an optimal policy which maximizes the outcome of applying actions to states in an environment that features unpredictable events. Such environments can consist of a large number of states and actions, which makes finding an optimal policy intractable using classical methods. Using a heuristic function for a guided search allows for tackling such problems. Designing a domain-independent heuristic function requires complex algorithms which may be expensive when it comes to time and memory consumption.

In this thesis, we apply supervised learning techniques to learn two domain-independent heuristic functions. We use three types of gradient descent methods: stochastic, batch and mini-batch gradient descent, and their improved versions using momentum, learning rate decay and early stopping. Furthermore, we apply the concept of feature combination in order to better learn the heuristic functions. The learned functions are provided to Prost, a domain-independent probabilistic planner, and benchmarked against the winning algorithms of the International Probabilistic Planning Competition held in 2014. The experiments show that learning an offline heuristic improves the overall score of the search for some of the domains used in the aforementioned competition.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

This thesis deals with the algorithm presented in the paper "Landmark-based Meta Best-First Search Algorithm: First Parallelization Attempt and Evaluation" by Simon Vernhes, Guillaume Infantes and Vincent Vidal. Their idea was to reconsider the approach to landmarks as a tool in automated planning, but in a markedly different way than previous work had done. Their result is a meta-search algorithm which explores landmark orderings to find a series of subproblems that reliably lead to an effective solution. Any complete planner may be used to solve the subproblems. While the referenced paper also deals with an attempt to effectively parallelize the Landmark-based Meta Best-First Search Algorithm, this thesis is concerned mainly with the sequential implementation and evaluation of the algorithm in the Fast Downward planning system.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Heuristics play an important role in classical planning. Using heuristics during state space search often reduces the time required to find a solution, but constructing heuristics and using them to calculate heuristic values takes time, reducing this benefit. Constructing heuristics and calculating heuristic values as quickly as possible is very important to the effectiveness of a heuristic. In this thesis we introduce methods to bound the construction of merge-and-shrink to reduce its construction time and increase its accuracy for small problems and to bound the heuristic calculation of landmark cut to reduce heuristic value calculation time. To evaluate the performance of these depth-bound heuristics we have implemented them in the Fast Downward planning system together with three iterative-deepening heuristic search algorithms: iterative-deepening A* search, a new breadth-first iterative-deepening version of A* search and iterative-deepening breadth-first heuristic search.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

Greedy best-first search has proven to be a very efficient approach to satisficing planning but can potentially lose some of its effectiveness when the heuristic function misleads it into a local minimum or plateau. This is where exploration with additional open lists comes in, to assist greedy best-first search with solving satisficing planning tasks more effectively. Building on the idea of exploration by clustering similar states together as described by Xie et al. [2014], where states are clustered according to heuristic values, we propose in this paper to instead cluster states based on the Hamming distance of the binary representation of states [Hamming, 1950]. The resulting open list maintains k buckets and inserts each given state into the bucket with the smallest average Hamming distance between the already clustered states and the new state. Additionally, our open list is capable of reclustering all states periodically with the use of the k-means algorithm. We were able to achieve promising results concerning the number of expansions necessary to reach a goal state, despite not achieving a higher coverage than fully random exploration due to slow performance. This was caused by the number of calculations required to identify the best-fitting cluster when inserting a new state.
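The bucket insertion rule described above can be illustrated with a short sketch. It is not the thesis code: the names `hamming`, `average_distance` and `insert_state` are hypothetical, states are assumed to be tuples of bits, and filling empty buckets first is an assumption made here because the average distance to an empty bucket is undefined. The periodic k-means reclustering is omitted.

```python
from typing import List, Tuple

State = Tuple[int, ...]  # assumed binary representation of a state


def hamming(a: State, b: State) -> int:
    """Number of positions in which two equally long bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def average_distance(bucket: List[State], state: State) -> float:
    """Average Hamming distance between a state and all states in a bucket."""
    return sum(hamming(s, state) for s in bucket) / len(bucket)


def insert_state(buckets: List[List[State]], state: State) -> int:
    """Insert the state and return the index of the chosen bucket.

    Assumption: empty buckets are filled first. Otherwise the bucket with
    the smallest average Hamming distance to the new state is chosen.
    """
    for index, bucket in enumerate(buckets):
        if not bucket:
            bucket.append(state)
            return index
    best = min(range(len(buckets)),
               key=lambda i: average_distance(buckets[i], state))
    buckets[best].append(state)
    return best


buckets: List[List[State]] = [[], []]  # k = 2
chosen = [insert_state(buckets, s)
          for s in [(0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 0, 1), (1, 1, 1, 0)]]
```

Every insertion compares the new state against all states already stored, which is consistent with the abstract's remark that identifying the best-fitting cluster is the performance bottleneck.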

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

Monte Carlo Tree Search (MCTS) algorithms are an efficient method of solving probabilistic planning tasks that are modeled by Markov Decision Problems. MCTS uses two policies, a tree policy for iterating through the known part of the decision tree and a default policy to simulate the actions and their reward after leaving the tree. MCTS algorithms have been applied with great success to computer Go. To make the two policies fast, many enhancements based on online knowledge have been developed. The goal of All Moves as First (AMAF) enhancements is to improve the quality of a reward estimate in the tree policy. In this thesis, the α-AMAF, Cutoff-AMAF and Rapid Action Value Estimation enhancements, which are very efficient in the field of computer Go, are implemented in the probabilistic planner PROST. To obtain a better default policy, Move Average Sampling is implemented in PROST and benchmarked against its current default policies.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

The merge-and-shrink heuristic is a state-of-the-art admissible heuristic that is often used for optimal planning. Recent studies showed that the merge strategy is an important factor for the performance of the merge-and-shrink algorithm. There are many different merge strategies and improvements for merge strategies described in the literature. One of these merge strategies is MIASM by Fan et al. MIASM tries to merge transition systems that produce unnecessary states in their product which can be pruned. Another merge strategy is the symmetry-based merge-and-shrink framework by Sievers et al. This strategy tries to merge transition systems that cause factored symmetries in their product. It can be combined with other merge strategies and often improves their performance. However, the current combination of MIASM with factored symmetries performs worse than MIASM. We implement a different combination of MIASM that uses factored symmetries during the subset search of MIASM. Our experimental evaluation shows that our new combination of MIASM with factored symmetries solves more tasks than the existing MIASM and the previously implemented combination of MIASM with factored symmetries. We also evaluate different combinations of existing merge strategies that were not evaluated before and find combinations that perform better than their basic versions.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

In classical planning the objective is to find a sequence of applicable actions that leads from the initial state to a goal state. In many cases the given problem can be of enormous size. To deal with these cases, a prominent method is to use heuristic search, which uses a heuristic function to evaluate states and can focus on the most promising ones. In addition to applying heuristics, the search algorithm can apply additional pruning techniques that exclude applicable actions in a state because applying them at a later point in the path would result in a path consisting of the same actions but in a different order. The question remains as to how these actions can be selected without generating so much additional work that the pruning is no longer useful for the overall search. In this thesis we implement and evaluate the partition-based path pruning method, proposed by Nissim et al. [1], which tries to decompose the set of all actions into partitions. Based on this decomposition, actions can be pruned with very little additional information. With some alterations to the A* search algorithm, the partition-based pruning method is guaranteed to preserve its optimality. The evaluation confirms that in several standard planning domains, the pruning method can reduce the size of the explored state space.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Validating real-time systems is an important and complex task which becomes exponentially harder with increasing sizes of systems. Therefore, finding an automated approach to check real-time systems for possible errors is crucial. The behaviour of such real-time systems can be modelled with timed automata. This thesis adapts and implements the under-approximation refinement algorithm proposed by Heusner et al. for search-based planners to find error states in timed automata via the directed model checking approach. The evaluation compares the algorithm to already existing search methods and shows that a basic under-approximation refinement algorithm yields a competitive search method for directed model checking which is both fast and memory efficient. Additionally, we illustrate that with the introduction of some minor alterations the proposed under-approximation refinement algorithm can be further improved.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

In this thesis we attempt to learn a heuristic. For a heuristic to be learnable, it must have parameters that determine it. Potential heuristics offer such a possibility, and their parameters are called potentials. Pattern databases can detect properties of a state space with comparatively little effort and can therefore serve as a basis for learning potentials. This thesis examines two different approaches to learning the potentials based on the information from pattern databases. The two approaches are studied in more detail in experiments and finally compared with the FF heuristic.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

We consider real-time strategy (RTS) games which have temporal and numerical aspects and pose challenges which have to be solved within limited search time. These games are interesting for AI research because they are more complex than board games. Current AI agents cannot consistently defeat average human players, while even the best players make mistakes we think an AI could avoid. In this thesis, we will focus on StarCraft Brood War. We will introduce a formal definition of the model Churchill and Buro proposed for StarCraft. This allows us to focus on Build Order optimization only. We have implemented a base version of the algorithm Churchill and Buro used for their agent. Using the implementation we are able to find solutions for Build Order Problems in StarCraft Brood War.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

In the field of automated planning, symbolic search is one of the most promising techniques in use. Implementing symbolic search on finite state spaces requires a suitable data structure for logical formulas. This thesis explores the use of Sentential Decision Diagrams (SDDs) instead of the commonly used Binary Decision Diagrams (BDDs) for this purpose. SDDs are a generalization of BDDs. We test empirically how an implementation of symbolic search with SDDs in the Fast Downward planner behaves with different vtrees. In particular, the performance of balanced vtrees, with which the strengths of SDDs often show to good advantage, is compared with that of right-linear vtrees, with which SDDs behave like BDDs.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

The question of whether there are valid Sudokus, i.e. Sudokus with exactly one solution, that have only 16 clues was answered in the negative in December 2011 by McGuire et al. using an exhaustive brute-force method. The difficulty of this task lies in the sprawling search space of the problem and the resulting need for an efficient proof idea and faster algorithms. In this thesis, the proof method of McGuire et al. is confirmed and implemented in C++ for 2^{2} × 2^{2} and 3^{2} × 3^{2} Sudokus.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Tree Cache is a pathfinding algorithm that selects one vertex as a root and constructs a tree with cheapest paths to all other vertices. A path is found by traversing up the tree from both the start and goal vertices to the root and concatenating the two parts. This is fast, but as all paths constructed this way pass through the root vertex they can be highly suboptimal.

To improve this algorithm, we consider two simple approaches. The first is to construct multiple trees, and save the distance to each root in each vertex. To find a path, the algorithm first selects the root with the lowest total distance. The second approach is to remove redundant vertices, i.e. vertices that are between the root and the lowest common ancestor (LCA) of the start and goal vertices. The performance and space requirements of the resulting algorithm are then compared to the conceptually similar hub labels and differential heuristics.
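The basic tree lookup and the LCA-based removal of redundant vertices can be sketched as below. This is a hedged illustration under several assumptions: the graph is undirected and stored as a nested dictionary, the tree is built with Dijkstra's algorithm, and the names `build_tree`, `path_to_root` and `tree_path` are invented for the example. The multiple-tree variant is not shown.

```python
import heapq
from typing import Dict, List, Optional

Graph = Dict[str, Dict[str, float]]  # assumed undirected weighted adjacency


def build_tree(graph: Graph, root: str) -> Dict[str, Optional[str]]:
    """Dijkstra from the root; returns each vertex's parent in the
    cheapest-path tree (the root has parent None)."""
    dist = {root: 0.0}
    parent: Dict[str, Optional[str]] = {root: None}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent


def path_to_root(parent: Dict[str, Optional[str]], v: str) -> List[str]:
    """Vertices from v up to the root, inclusive."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_path(parent, start: str, goal: str, trim_at_lca: bool = True) -> List[str]:
    up = path_to_root(parent, start)   # start ... root
    down = path_to_root(parent, goal)  # goal ... root
    if trim_at_lca:
        # Drop the shared segment between the root and the lowest common ancestor.
        while len(up) > 1 and len(down) > 1 and up[-2] == down[-2]:
            up.pop()
            down.pop()
    # Both lists now end in the same vertex; append the goal side reversed
    # without repeating that shared vertex.
    return up + down[-2::-1]


graph: Graph = {
    "R": {"A": 1.0},
    "A": {"R": 1.0, "C": 1.0, "D": 1.0},
    "C": {"A": 1.0},
    "D": {"A": 1.0},
}
parent = build_tree(graph, "R")
```

In this small example the untrimmed path from `C` to `D` detours through the root `R`, while the trimmed path turns around at their lowest common ancestor `A`.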

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Greedy Best-First Search (GBFS) is a prominent search algorithm to find solutions for planning tasks. GBFS chooses nodes for further expansion based on a distance-to-goal estimator, the heuristic. This makes GBFS highly dependent on the quality of the heuristic. Heuristics often face the problem of producing Uninformed Heuristic Regions (UHRs). GBFS additionally suffers from the possibility of simultaneously expanding nodes in multiple UHRs. In this thesis we change the heuristic approach in UHRs. Since the heuristic is unable to guide the search there, we try to expand novel states to escape the UHRs. Novelty measures how “new” a state is in the search. The result is a combination of heuristic and novelty guided search, which is indeed able to escape UHRs more quickly and solve more problems in reasonable time.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Finding a shortest path between two points is a fundamental problem in graph theory. In practice, it is often important to keep the resource consumption for determining such a path minimal, which can be achieved with a compressed path database. In this thesis we define three methods with which a path database can be set up as space-efficiently as possible, and evaluate the effectiveness of these methods on problem instances of varying size and complexity.

**Download:**(PDF) (slides; PDF)

In classical AI planning, the state explosion problem is a recurring subject: although the problem descriptions are compact, often a huge number of states needs to be considered. One way to tackle this problem is to use static pruning methods which reduce the number of variables and operators in the problem description before planning.

In this work, we discuss the properties and limitations of three existing static pruning techniques with a focus on satisficing planning. We analyse these pruning techniques and their combinations, and identify synergy effects between them and the domains and problem structures in which they occur. We implement the three methods into an existing propositional planner, and evaluate the performance of different configurations and combinations in a set of experiments on IPC benchmarks. We observe that static pruning techniques can increase the number of solved problems, and that the synergy effects of the combinations also occur on IPC benchmarks, although they do not lead to a major performance increase.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

In planning, the aim is to get from an initial state to a goal state. A state can be described by a finite number of Boolean-valued variables. To transition from one state to another we have to apply an action, and this, at least in probabilistic planning, leads to a probability distribution over a set of possible successor states. From each transition the agent gains a reward that depends on the current state and its action. In this setting the number of possible states grows exponentially with the number of variables. We assume that the value of each variable is determined independently in a probabilistic fashion, so these variables influence the number of possible successor states in the same way as they influence the size of the state space. As a consequence, it is almost impossible to obtain an optimal amount of reward by approaching this problem with a brute-force technique. One way around this is to abstract the problem and then solve a simplified version of it. That is, in general, the idea proposed by Boutilier and Dearden [1]. They introduced a method to create an abstraction which depends on the reward formula and the dependencies contained in the problem. With this idea as a basis, we create a heuristic for a trial-based heuristic tree search (THTS) algorithm [5] and a standalone planner using the framework PROST (Keller and Eyerich, 2012). These are then tested on all the domains of the International Probabilistic Planning Competition (IPPC).

**Download:**(PDF) (slides; PDF) (sources; ZIP)

The goal of classical domain-independent planning is to find a sequence of actions that leads from a given initial state to a goal state that satisfies some goal criteria. Most planning systems use heuristic search algorithms to find such a sequence of actions. A critical part of heuristic search is the heuristic function. In order to find a sequence of actions from an initial state to a goal state efficiently, this heuristic function has to guide the search towards the goal. It is difficult to create such an efficient heuristic function. Arfaee et al. show that it is possible to improve a given heuristic function by applying machine learning techniques on a single domain in the context of heuristic search. To achieve this, they propose a bootstrap learning approach which iteratively improves the heuristic function.

In this thesis we will introduce a technique to learn heuristic
functions that can be used in classical domain-independent
planning based on the bootstrap-learning approach introduced by
Arfaee et al. In order to evaluate the performance of the
learned heuristic functions, we have implemented a learning
algorithm for the Fast Downward planning system. The
experiments have shown that a learned heuristic function
generally decreases the number of explored states compared to
blind search. The total time to solve a single problem
increases because the heuristic function has to be learned
before it can be applied.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

A planning task is about transforming a given state, by sequentially applying actions, into a state that satisfies the required goal properties. Efficiency matters when solving planning tasks. To save time and memory, many planners use heuristic search, in which a heuristic estimates which action should be applied next in order to reach a desired state as quickly as possible.

This thesis implements the P^{m} compilation for planning tasks proposed by Haslum and tests the h^{max} heuristic on the compiled problem against the h^{m} heuristic on the original problem. The implementation is an extension of the Fast Downward planning system. The test results show that the compilation can increase the number of solved problems. At the same level of information, solving a compiled problem with the h^{max} heuristic is generally faster than solving the original problem with the h^{m} heuristic. This gain in time comes at the price of higher memory consumption.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

The objective of classical planning is to find a sequence of actions which begins in a given initial state and ends in a state that satisfies a given goal condition. A popular approach to solve classical planning problems is based on heuristic forward search algorithms. In contrast, regression search algorithms apply actions “backwards” in order to find a plan from a goal state to the initial state. Currently, regression search algorithms are somewhat unpopular, as the generation of partial states in a basic regression search often leads to a significant growth of the explored search space. To tackle this problem, state subsumption is a pruning technique that additionally discards newly generated partial states for which a more general partial state has already been explored.

In this thesis, we discuss and evaluate techniques of regression and state subsumption. In order to evaluate their performance, we have implemented a regression search algorithm for the planning system Fast Downward, supporting both a simple subsumption technique as well as a refined subsumption technique using a trie data structure. The experiments have shown that a basic regression search algorithm generally increases the number of explored states compared to uniform-cost forward search. Regression with pruning based on state subsumption with a trie data structure significantly reduces the number of explored states compared to basic regression.
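The subsumption test behind this pruning can be illustrated with a naive sketch. It assumes partial states are dictionaries from variable names to values and checks a new partial state against a plain list of explored ones; the names `subsumes` and `is_pruned` are hypothetical, and the trie-based refinement mentioned above is not implemented here.

```python
from typing import Dict, List

PartialState = Dict[str, int]  # assumed encoding: only assigned variables appear


def subsumes(general: PartialState, specific: PartialState) -> bool:
    """A partial state is more general than another if every assignment it
    makes is also made by the other one."""
    return all(specific.get(var) == val for var, val in general.items())


def is_pruned(new_state: PartialState, explored: List[PartialState]) -> bool:
    """Discard a newly generated partial state if some explored partial
    state is at least as general (linear scan over all explored states)."""
    return any(subsumes(old, new_state) for old in explored)


explored = [{"a": 1}, {"b": 0, "c": 2}]
```

The linear scan makes every check proportional to the number of explored partial states, which is the kind of cost an index structure such as a trie is meant to reduce.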

**Download:**(PDF) (slides; PDF) (sources; ZIP)

This thesis discusses the Traveling Tournament Problem and how it can be solved with heuristic search. The Traveling Tournament Problem is a sports scheduling problem where one tries to find a schedule for a league that meets certain constraints while minimizing the overall distance traveled by the teams in this league. It is hard to solve for leagues with many teams involved since its complexity grows exponentially in the number of teams. The largest instances solved to date are leagues of up to 10 teams.

Previous related work has shown that it is a reasonable approach to solve the Traveling Tournament Problem with an IDA*-based tree search. In this thesis I implemented such a search and extended it with several enhancements to examine whether they improve performance of the search. The heuristic that I used in my implementation is the Independent Lower Bound heuristic. It tries to find lower bounds to the traveling costs of each team in the considered league. With my implementation I was able to solve problem instances with up to 8 teams. The results of my evaluation have mostly been consistent with the expected impact of the implemented enhancements on the overall performance.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

Classical planning is a major topic in Artificial Intelligence. It is the process of finding a plan, i.e. a sequence of actions that leads from an initial state to a goal state for a specified problem. In problems with a huge number of states it is very difficult and time-consuming to find a plan. There are different pruning methods that attempt to lower the time needed to find a plan by reducing the number of states to explore. In this work we take a closer look at two of these pruning methods. Both rely on the last action that led to the current state. The first is so-called tunnel pruning, a generalisation of the tunnel macros that are used to solve Sokoban problems. The idea is to find actions that allow a tunnel and then prune all actions that are not in the tunnel of this action. The second method is partition-based path pruning. In this method all actions are distributed into different partitions. These partitions can then be used to prune actions that do not belong to the current partition.

The evaluation of these two pruning methods shows that they can reduce the number of explored states for some problem domains, but the difference between pruned search and normal search gets smaller when we use heuristic functions. It also shows that the two pruning rules affect different problem domains.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

Essential for estimating the performance of an algorithm in satisficing planning is its ability to solve benchmark problems. Published results cannot be compared directly as they originate from different implementations and different machines. We implemented some of the most promising algorithms for greedy best-first search published in recent years and evaluated them on the same set of benchmarks. All algorithms are based on randomised search, localised search or a combination of both. Our evaluation demonstrates the potential of those algorithms.

**Download:**(PDF) (slides; PDF) (sources; ZIP)

The aim of classical planning is to solve given planning problems as efficiently as possible. The solution, or plan, of a planning problem is a sequence of operators that leads from an initial state to a goal state. To find a goal state in a more targeted way, some search algorithms use additional information about the state space: the heuristic. Starting from a state, it estimates the distance to the goal state. Ideally, every newly visited state would therefore have a smaller heuristic value than the previously visited state. There are, however, search scenarios in which the heuristic does not help to get closer to a goal. This is the case in particular when the heuristic value of neighbouring states does not change. For greedy best-first search this means that the search proceeds on plateaus and thus blindly, because this search algorithm relies exclusively on the heuristic. Algorithms that use the heuristic as a guide belong to the class of heuristic search algorithms.

This thesis is about retaining an orientation in the state space even in cases such as plateaus, by subjecting states to a further prioritisation in addition to the heuristic. The method presented here exploits dependencies between operators and extends greedy best-first search. We measure how strongly operators depend on each other using a distance measure that is computed before the actual search. The basic idea is to prefer states whose operators benefited from each other beforehand. The heuristic acts only afterwards as a tie-breaker, so that we can first follow a promising path without the heuristic making us search in another, less promising place.

The results show that, depending on the heuristic, our approach can perform better in pure search time than relying exclusively on the heuristic. With very informative heuristics, however, our approach may instead disturb the search. In addition, many problems are not solved because computing the distances is too time-consuming.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

In classical planning, heuristic search is a popular approach to solving problems very efficiently. The objective of planning is to find a sequence of actions that can be applied to a given problem and that leads to a goal state. For this purpose, there are many heuristics. They are often a big help if a problem has a solution, but what happens if a problem does not have one? Which heuristics can help prove unsolvability without exploring the whole state space? How efficient are they? Admissible heuristics can be used for this purpose because they never overestimate the distance to a goal state and are therefore able to safely cut off parts of the search space. This makes it potentially easier to prove unsolvability.

In this project we developed a problem generator to automatically create unsolvable problem instances and used those generated instances to see how different admissible heuristics perform on them. We used the Japanese puzzle game Sokoban as the first problem because it has a high complexity but is still easy for humans to understand and to imagine. As the second problem, we used a logistics problem called NoMystery because, unlike Sokoban, it is a resource-constrained problem and therefore a good supplement to our experiments. Furthermore, unsolvability occurs rather 'naturally' in these two domains and does not seem forced.

**Download:**(PDF) (slides; PDF) (sources; TAR.GZ)

Heuristic search with admissible heuristics is the leading approach to cost-optimal, domain-independent planning. Pattern database heuristics - a type of abstraction heuristics - are state-of-the-art admissible heuristics. Two recent pattern database heuristics are the iPDB heuristic by Haslum et al. and the PhO heuristic by Pommerening et al.

The iPDB procedure performs a hill climbing search in the space of pattern collections and evaluates selected patterns using the canonical heuristic. We apply different techniques to the iPDB procedure, improving its hill climbing algorithm as well as the quality of the resulting heuristic. The second recent heuristic - the PhO heuristic - obtains strong heuristic values through linear programming. We present different techniques to influence and improve on the PhO heuristic.

We evaluate the modified iPDB and PhO heuristics on the IPC benchmark suite and show that these abstraction heuristics can compete with other state-of-the-art heuristics in cost-optimal, domain-independent planning.

**Download:**(PDF)

Greedy best-first search (GBFS) is a prominent search algorithm for satisficing planning - finding good enough solutions to a planning task in reasonable time. GBFS always selects the node that a heuristic function estimates to be most promising as the next node to consider. However, this behaviour makes GBFS depend heavily on the quality of the heuristic estimator. Inaccurate heuristics can lead GBFS into regions far away from a goal. Additionally, if the heuristic ranks several nodes the same, GBFS has no information on which node it should follow. Diverse best-first search (DBFS) is a new algorithm by Imai and Kishimoto [2011] which has a local search component to emphasise exploitation. To enable exploration, DBFS deploys probabilities to select the next node.

In two problem domains, we analyse GBFS' search behaviour and present theoretical results. We evaluate these results empirically and compare DBFS and GBFS on constructed as well as on provided problem instances.
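The node selection rule of GBFS described above can be illustrated with a minimal Python sketch. It is not the implementation analysed in the thesis: the function name and the callback parameters (`is_goal`, `successors`, `h`) are hypothetical, and ties between equal heuristic values are broken by insertion order, which is only one of several possible choices.

```python
import heapq
from itertools import count


def greedy_best_first_search(start, is_goal, successors, h):
    """Return a path from start to a goal state, or None if none is found."""
    tie = count()  # insertion order breaks ties between equal h-values
    open_list = [(h(start), next(tie), start)]
    parent = {start: None}  # also serves as the set of generated states
    while open_list:
        # Always expand the state with the lowest heuristic value.
        _, _, state = heapq.heappop(open_list)
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for succ in successors(state):
            if succ not in parent:
                parent[succ] = state
                heapq.heappush(open_list, (h(succ), next(tie), succ))
    return None
```

On a plateau all entries in the open list carry the same heuristic value, so the order of expansion is decided purely by the tie-breaking counter, which is the uninformed behaviour the abstract refers to.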

**Download:**(PDF)

State-of-the-art planning systems use a variety of control knowledge in order to enhance the performance of heuristic search. Unfortunately, most forms of control knowledge use a specific formalism, which makes them hard to combine. There have been several approaches which describe control knowledge in Linear Temporal Logic (LTL). We build upon this work and propose a general framework for encoding control knowledge in LTL formulas. The framework includes a criterion that any LTL formula used in it must fulfill in order to preserve optimal plans when used for pruning the search space; this way the validity of new LTL formulas describing control knowledge can be checked. The framework is implemented on top of the Fast Downward planning system and is tested with a pruning technique called Unnecessary Action Application, which detects if a previously applied action achieved no useful progress.

**Download:**(PDF)

Sokoban is a computer game where each level consists of a two-dimensional grid of fields. There are walls as obstacles, moveable boxes and goal fields. The player controls the warehouse worker (Sokoban in Japanese) to push the boxes to the goal fields. The problem is very complex and that is why Sokoban has become a domain in planning.

Phase transitions mark a sudden change in solvability when traversing through the problem space. They occur in the region of hard instances and have been found for many domains. In this thesis we investigate phase transitions in the Sokoban puzzle. For our investigation we generate and evaluate random instances. We declare the defining parameters for Sokoban and measure their influence on the solvability. We show that phase transitions in the solvability of Sokoban can be found and their occurrence is measured. We attempt to unify the parameters of Sokoban to get a prediction on the solvability and hardness of specific instances.

Landmarks are known to yield powerful heuristics for informed search. In this thesis, we explain and evaluate a novel algorithm to find ordered landmarks of delete-free tasks by intersecting solutions in the relaxation. The proposed algorithm efficiently finds landmarks and natural orders of delete-free tasks, such as delete relaxations or Pi-m compilations.

**Download:**(PDF)

Planning as heuristic search is the prevalent technique to solve planning problems in any kind of domain. Heuristics estimate distances to goal states in order to guide a search through large state spaces. However, this guidance is sometimes moderate, since many states still lie on plateaus of equally prioritized states in the search space topology. Additional techniques that ignore or prefer some actions for solving a problem successfully support the search in such situations. Nevertheless, some action pruning techniques lead to incomplete searches.

We propose an under-approximation refinement framework for adding actions to under-approximations of planning tasks during a search in order to find a plan. For this framework, we develop a refinement strategy. Starting a search on an initial under-approximation of a planning task, the strategy adds actions determined at states close to a goal, whenever the search does not progress towards a goal, until a plan is found. Key elements of this strategy consider helpful actions and relaxed plans for refinements. We have implemented the under-approximation refinement framework in the greedy best-first search algorithm. Our results show considerable speedups for many classical planning problems. Moreover, we are able to plan with fewer actions than standard greedy best-first search.

**Download:**(PDF)

The main approach for classical planning is heuristic search. Many cost heuristics are based on the delete relaxation. The optimal heuristic of a delete-free planning problem is called h+.

The two algorithms are used to compute a cost heuristic for an A* search. As both approaches compute the optimal heuristic for delete-free planning tasks, the algorithms can also be used to find a solution for relaxed planning tasks.

**Download:**(PDF)

In planning, we address the problem of automatically finding a sequence of actions that leads from a given initial state to a state that satisfies some goal condition. In satisficing planning, our objective is to find plans with preferably low, but not necessarily the lowest possible costs while keeping in mind our limited resources like time or memory. A prominent approach for satisficing planning is based on heuristic search with inadmissible heuristics. However, depending on the applied heuristic, plans found with heuristic search might be of low quality, and hence, improving the quality of such plans is often desirable. In this thesis, we adapt and apply iterative tunneling search with A* (ITSA*) to planning. ITSA* is an algorithm for plan improvement which was originally proposed by Furcy et al. for search problems. ITSA* searches the local space around a given solution path in order to find "short cuts" which allow us to improve our solution. In this thesis, we provide an implementation and systematic evaluation of this algorithm on the standard IPC benchmarks. Our results show that ITSA* also works successfully in the planning area.

**Download:**(PDF)

In action planning, greedy best-first search (GBFS) is one of the standard techniques if suboptimal plans are accepted. GBFS uses a heuristic function to guide the search towards a goal state. To achieve generality, in domain-independent planning the heuristic function is generated automatically. A well-known problem of GBFS is search plateaus, i.e., regions in the search space where all states have equal heuristic values. In such regions, heuristic search can degenerate to uninformed search. Hence, techniques to escape from such plateaus are desired to improve the efficiency of the search. A recent approach to avoid plateaus is based on diverse best-first search (DBFS) proposed by Imai and Kishimoto. However, this approach relies on several parameters. This thesis presents an implementation of DBFS in the Fast Downward planner. Furthermore, this thesis presents a systematic evaluation of DBFS for several parameter settings, leading to a better understanding of the impact of the parameter choices on the search performance.

**Download:**(PDF)

Risk is a popular board game where players conquer each other's countries. In this project, I created an AI that plays Risk and is capable of learning. For each decision it makes, it performs a simple search one step ahead, looking at the outcomes of all possible moves it could make, and picks the most beneficial. It judges the desirability of outcomes by a series of parameters, which are modified after each game using the TD(λ) algorithm, allowing the AI to learn.

The *Canadian Traveler's Problem* (CTP) is a path finding problem where due to unfavorable weather, some of the roads are impassable. At the beginning, the agent does not know which roads are traversable and which are not. Instead, it can observe the status of roads adjacent to its current location. We consider the stochastic variant of the problem, where the blocking status of a connection is randomly defined with known probabilities. The goal is to find a policy which minimizes the expected travel costs of the agent.

We discuss several properties of the stochastic CTP and present an efficient way to calculate state probabilities. With the aid of these theoretical results, we introduce an uninformed algorithm to find optimal policies.

**Download:**(PDF)

Finding optimal solutions for general search problems is a challenging task. A powerful approach for solving such problems is based on heuristic search with pattern database heuristics. In this thesis, we present a domain-specific solver for the TopSpin Puzzle problem. This solver is based on the above-mentioned pattern database approach. We investigate several pattern databases, and evaluate them on problem instances of different sizes.

**Download:**(PDF)

Multi-Agent-Path-Finding (MAPF) is a common problem in robotics and memory management. *Pebbles in Motion* is an implementation of a polynomial-time problem solver for MAPF, based on a work by Daniel Kornhauser from 1984. Recently a lot of research papers have been published on MAPF in the research community of Artificial Intelligence, but the work by Kornhauser seems hardly to be taken into account. We assumed that this might be related to the fact that said paper is rather mathematical and hardly describes algorithms intuitively. This work aims at filling this gap by providing an easily understandable description of the implementation steps for programmers and a new detailed description for researchers in Computer Science.

**Download:**(PDF)

Merge-and-shrink abstractions are a popular approach to generate abstraction heuristics for planning. The computation of merge-and-shrink abstractions relies on a merging and a shrinking strategy. A recently investigated shrinking strategy is based on using bisimulations. Bisimulations are guaranteed to produce perfect heuristics. In this thesis, we investigate an efficient algorithm proposed by Dovier et al. for computing coarsest bisimulations. The algorithm, however, cannot directly be applied to planning and needs some adjustments. We show how this algorithm can be adapted to work with planning problems. In particular, we show how an edge-labelled state space can be translated to a state-labelled one and what other changes are necessary for the algorithm to be usable for planning problems. This includes a custom data structure that fulfils all requirements needed to meet the worst-case complexity. Furthermore, the implementation is evaluated on planning problems from the International Planning Competitions. We will see that the resulting algorithm can often not compete with the currently implemented algorithm in Fast Downward. We discuss the reasons why this is the case and propose possible solutions to resolve this issue.

**Download:**(PDF)

In order to understand an algorithm, it is always helpful to have a visualization that shows step by step what the algorithm is doing. Under this presumption, this Bachelor project explains and visualizes two AI techniques, Constraint Satisfaction Processing and SAT Backbones, using the game Gnomine as an example.

CSP techniques build up a network of constraints and infer information by propagating through a single or several constraints at a time, reducing the domain of the variables in the constraint(s). SAT Backbone Computations find literals in a propositional formula, which are true in every model of the given formula.

By showing how to apply these algorithms to the problem of solving a Gnomine game, I hope to give better insight into how the chosen algorithms work.
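The definition of a backbone given above (literals that are true in every model) can be made concrete with a small Python sketch. It enumerates all assignments by brute force, so it is only meant for tiny formulas and is not the computation used in the project. The function name `backbone` and the DIMACS-style clause encoding are assumptions made for this illustration.

```python
from itertools import product


def backbone(clauses, num_vars):
    """Return the set of literals that hold in every model of a CNF formula.

    Clauses are lists of non-zero ints: 3 means x3, -3 means "not x3".
    For an unsatisfiable formula the empty set is returned by convention.
    """
    common = None
    for values in product([False, True], repeat=num_vars):
        def holds(lit):
            return values[abs(lit) - 1] == (lit > 0)

        if all(any(holds(lit) for lit in clause) for clause in clauses):
            # Represent the model as the set of literals it makes true.
            model = {v if values[v - 1] else -v for v in range(1, num_vars + 1)}
            common = model if common is None else common & model
    return common if common is not None else set()
```

For example, in the formula `x1 ∧ (¬x1 ∨ x2) ∧ (x2 ∨ x3)` the variables `x1` and `x2` are forced to be true while `x3` is free, so `backbone([[1], [-1, 2], [2, 3]], 3)` yields the literals 1 and 2.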

**Download:**(PDF)

Planning as heuristic search is a powerful approach to solve domain-independent planning problems. An important class of heuristics is based on abstractions of the original planning task. However, abstraction heuristics usually come with a loss in precision. The contribution of this thesis is the investigation of constrained abstraction heuristics in general, and the application of this concept to pattern database and merge-and-shrink abstractions in particular. The idea is to use a subclass of mutexes, which represent sets of variable-value pairs of which at most one can be true at any given time, to regain some of the precision which is lost in the abstraction without increasing its size. By removing states and operators in the abstraction which conflict with such a mutex, the abstraction is refined and hence, the corresponding abstraction heuristic can become more informed. We have implemented the refinements of these heuristics in the Fast Downward planner and evaluated the different approaches using standard IPC benchmarks. The results show that the concept of constrained abstraction heuristics can improve planning as heuristic search in terms of time and coverage.
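The state-removal step described above can be sketched in a few lines of Python. This is an illustration only, not the Fast Downward implementation: it assumes that abstract states are dictionaries from variable names to values and that a mutex is a set of (variable, value) pairs. Both function names are hypothetical.

```python
def violates_mutex(state, mutex):
    """True if more than one variable-value pair of the mutex holds in the state."""
    return sum(1 for var, val in mutex if state.get(var) == val) > 1


def prune_abstract_states(states, mutexes):
    """Keep only the abstract states that are consistent with every mutex."""
    return [s for s in states if not any(violates_mutex(s, m) for m in mutexes)]


# Hypothetical Sokoban-style mutex: two boxes can never occupy the same cell.
mutexes = [{("box1", "cell3"), ("box2", "cell3")}]
states = [
    {"box1": "cell3", "box2": "cell3"},  # conflicts with the mutex
    {"box1": "cell3", "box2": "cell4"},  # consistent
]
remaining = prune_abstract_states(states, mutexes)
```

Variables that the abstraction has projected away are simply absent from a state and therefore never count towards a violation, so the check stays sound for partial assignments.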

**Download:**(PDF)

A permutation problem considers the task where an initial order of objects (i.e., an initial mapping of objects to locations) must be reordered into a given goal order by using permutation operators. Permutation operators are 1:1 mappings of the objects from their locations to (possibly other) locations. Well-known examples of permutation problems are Rubik's Cube and the TopSpin Puzzle. Permutation problems have been a research area for a while, and several methods for solving such problems have been proposed in the last two centuries. Most of these methods focused on finding optimal solutions, causing an exponential runtime in the worst case.

In this work, we consider an algorithm for solving permutation problems that was originally proposed by M. Furst, J. Hopcroft and E. Luks in 1980. This algorithm was introduced on a theoretical level within a proof for "Testing Membership and Determining the Order of a Group", but has not been implemented and evaluated on practical problems so far. In contrast to the other above-mentioned solving algorithms, it only finds suboptimal solutions, but is guaranteed to run in polynomial time. The basic idea is to iteratively reach subgoals and then to keep them fixed while proceeding to the next subgoals. We have implemented this algorithm and evaluated it on different models, such as the Pancake Problem and the TopSpin Puzzle.
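The notion of a permutation operator as a 1:1 mapping of locations can be illustrated with a short Python sketch. It only models operators and plans and does not implement the algorithm by Furst, Hopcroft and Luks. The encoding (a list where entry `i` is the target location of the object at location `i`) and all function names are assumptions chosen for this example.

```python
def is_permutation(operator):
    """Check that the operator maps the locations 0..n-1 one-to-one onto themselves."""
    return sorted(operator) == list(range(len(operator)))


def apply_operator(operator, arrangement):
    """Move the object at location i to location operator[i]."""
    result = [None] * len(arrangement)
    for source, target in enumerate(operator):
        result[target] = arrangement[source]
    return result


def run_plan(plan, arrangement):
    """Apply a sequence of operators and return the final arrangement."""
    for operator in plan:
        arrangement = apply_operator(operator, arrangement)
    return arrangement


shift = [1, 2, 0]  # hypothetical operator: cyclic shift by one location
swap = [1, 0, 2]   # hypothetical operator: exchange the first two objects
```

A plan for a permutation problem is then a sequence of such operators whose combined effect turns the initial arrangement into the goal arrangement, which `run_plan` can be used to verify.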

**Download:**(PDF)

Pattern databases (Culberson & Schaeffer, 1998), or PDBs, have proven very effective in creating admissible heuristics for single-agent search algorithms such as A*. Haslum et al. proposed that a hill-climbing algorithm can be used to construct the PDBs, using the canonical heuristic. A different approach is to change the action costs in the pattern-related abstractions in order to obtain an admissible heuristic. This is the so-called cost partitioning.

The aim of this project was to implement cost partitioning inside the hill-climbing algorithm by Haslum et al. and to compare the results with the standard approach, which uses the canonical heuristic.

**Download:**(PDF)

UCT ("upper confidence bounds applied to trees") is a state-of-the-art algorithm for acting under uncertainty, e.g. in probabilistic environments. In recent years it has been applied very successfully in numerous contexts, including two-player board games like Go and Mancala and stochastic single-agent optimization problems such as path planning under uncertainty and probabilistic action planning.

In this project the UCT algorithm was implemented, adapted and evaluated for the classical arcade game "Ms Pac-Man". The thesis introduces Ms Pac-Man and the UCT algorithm, discusses some critical design decisions for developing a strong UCT-based algorithm for playing Ms Pac-Man, and experimentally evaluates the implementation.
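The "upper confidence bounds" in UCT's name refer to the rule used to pick an action inside the search tree. The following Python sketch shows the common UCB1-style rule (mean reward plus an exploration bonus) and is not taken from the Ms Pac-Man implementation. The function name, the `(visits, total_reward)` representation and the default exploration constant are assumptions, and the parent's visit count is approximated by the sum of the children's visits.

```python
import math


def ucb1_select(children, exploration=math.sqrt(2)):
    """Return the index of the child with the highest upper confidence bound.

    children: list of (visits, total_reward) pairs for one tree node.
    """
    for i, (visits, _) in enumerate(children):
        if visits == 0:
            return i  # try every action once before comparing bounds
    parent_visits = sum(visits for visits, _ in children)

    def score(i):
        visits, total = children[i]
        mean = total / visits
        bonus = exploration * math.sqrt(math.log(parent_visits) / visits)
        return mean + bonus

    return max(range(len(children)), key=score)
```

The bonus shrinks as a child is visited more often, so rarely tried actions are revisited occasionally even if their average reward is currently lower. Larger values of `exploration` shift the balance further towards such exploration.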