Planning is one of the classic problems of Artificial Intelligence. Here we just touch on the most basic concepts in the simplest way, and show how planning can be approached using a deductive reasoner.
The Planning Problem
Here's a simplified statement of the planning problem.
- You have a current state of the world.
- You have a desired state of the world.
- Find a sequence of actions that, if executed, should change the world from its current state into the desired state.
A planner is a program that can solve planning problems.
Military planning is an obvious, and long-studied, example. It's particularly challenging because it combines planning, which is hard enough, with game playing against an uncontrollable opponent.
To keep things simple, early work in AI focused on planning where no opponent is involved, the world is fairly simple, and things don't change unless the actor, e.g., a robot, causes them to change when executing the plan. We'll make those assumptions here.
In real planning research, the goal is to find very general rules for selecting steps that can be used in every problem. These rules look at the states and the plan so far. We're going to program in specific deductive rules to solve each problem. The rules will be able to handle different problems, but we won't have a general planner.
Deductive Rules for Planning
Using just our simple backward chainer, we can define a general planning framework that involves a few basic concepts and two core deductive rules.
There are three core predicates that we need to define rules for.
(PLAN plan start-state goal-state): This asserts that executing plan would lead from start-state to goal-state.
(RESULTS action current-state result-state): This asserts that doing action in current-state leads to result-state.
(STEP action current-state goal-state): This asserts that action is a good step in a plan to get from current-state to goal-state.
Actions and states are things you need to design when creating a planner. Logically, they will be functional terms. You can define them any way you want but there are some design constraints.
States should be designed so that there is just one way to describe any given state of the world for some problem domain. If there are multiple ways to describe the same state, backward chaining will have to explore all of them, even though the answer will be the same in every case.
For example, in Blocks World, we could represent a stack of blocks with the term (and (on a b) (on b c) (on c table)). This is a bad representation, because the three conjuncts can be ordered in six different ways, giving six descriptions of the same state. If instead we represent the stack of blocks with the term (cons a (cons b (cons c nil))) -- cons here is NOT the Lisp function, just a functional term -- then there is only one way to describe that state.
Actions and results
Actions are relatively simple to design. They can be simple names, e.g., JUMP, if the actor is implicitly understood to be the robot, or they can be terms specifying parameters, e.g., (PUT-ON x y). For each action, you have to define rules that say how that action changes the state of the world. In simple domains, these rules are normally very simple, as the examples below show.
Action result rules are not enough. The planner can't just try every action and see what happens. That will usually lead to endless loops. For example, a basic action in the Monkey and Bananas problem is (MOVE loc), which means "monkey moves to location loc." Suppose the monkey begins by the door and the planner generates the action (MOVE WINDOW) for "monkey moves to window." One of the next possible actions is (MOVE DOOR), which moves the monkey right back. Without some rules to guide it, the planner will generate, among other things, a plan that goes back and forth forever.
STEP rules are needed to pick the best action to do in a given state of the world. This is where the intelligence of your planner resides. At the least, these rules must avoid endless loops while still finding solutions, if there are any.
Finally, there are PLAN rules. These say how to use the STEP and RESULTS rules to find plans. Surprisingly, just two general rules can do basic planning. You normally don't need to define any others. The first rule is trivial. It says "the empty plan works if the current state is the desired state." To represent the empty plan, we'll use the symbol NIL. We could have picked DONE or something else.
(<- (plan nil ?goal ?goal))
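With this one rule alone, a query whose start and goal states are already equal succeeds immediately. A sketch, assuming the ASK interface used with this retriever:

```lisp
;; Base case: already at the goal, so the empty plan works.
(ask '(plan ?plan (cons a nil) (cons a nil))
     '((<- (plan nil ?goal ?goal))))
;; should succeed, binding ?plan to NIL
```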
The second rule is where most of the work occurs. It needs to say "a plan from current state to the desired state is one that has an action for current state, followed by a plan that gets from the result of that action to the desired state."
To represent a sequence of actions, we'll use Lisp list notation.
(CONS action1 (CONS action2 ... NIL)) represents
"first do action1 then do action2 then ...."
(<- (plan (cons ?action ?actions) ?current ?goal) (step ?action ?current ?goal) (results ?action ?current ?result) (plan ?actions ?result ?goal))
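To see how this rule unfolds, consider proving a one-step plan. Unifying a PLAN query with the rule head reduces it to three subgoals, tried left to right:

```lisp
;; Query: (plan ?plan s0 s1), where s0 and s1 stand for state terms.
;; The head binds ?plan = (cons ?action ?actions) and generates:
;;   (step ?action s0 s1)         ; pick a sensible action for s0
;;   (results ?action s0 ?result) ; compute the state it produces
;;   (plan ?actions ?result s1)   ; recursively plan from there
;; If ?result equals s1, the base-case rule binds ?actions to NIL,
;; so the whole plan is (cons ?action nil).
```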
This may take a bit of study. In a typical planning query, ?goal will be a constant, but the first argument, the plan, will be an unbound variable.
The goal of the deductive planner is to prove the query
by constructing a plan that makes it true, if possible.
The following two examples are classic early toy examples in AI planning. Cordell Green demonstrated how deductive retrieval could solve such problems in his technical report Application of Theorem Proving to Problem Solving. It turned out to be a much harder problem than first realized, and much logical study resulted.
The Blocks World Stacking Problem
At MIT, a lot of early work used the blocks world domain as a testbed for planning ideas. In the blocks world, a robot with a camera and one arm with a grasper is given the task of stacking blocks on a table in some specified arrangement. Since the robot can only move one block at a time, and blocks may already be in stacks, the problem is one of unstacking and stacking. Ideally the planner should create efficient plans with no wasted moves.
We'll make this problem as simple as possible. The robot has just one stack of blocks to worry about. Blocks can be stacked or put in a box.
We'll represent the stack of blocks as a logical list term. E.g., the term (CONS A (CONS B (CONS C NIL))) represents the stack with A on B on C. So a problem to create a stack can be represented as a query with an initial stack, a desired stack, and a variable for the plan.
(plan ?plan (cons a (cons b (cons c nil))) (cons a (cons b (cons d (cons c nil)))))
Two actions are sufficient for this problem:
POP removes the block on the top of the stack and puts it in the box of blocks.
(PUSH x) takes the block x from the box and puts it on the top of the stack.
The action result rules are easy to write:
(<- (results pop (cons ?x ?stack) ?stack))
(<- (results (push ?x) ?stack (cons ?x ?stack)))
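A quick sanity check of these rules, assuming the ASK interface used with this retriever (the knowledge base name *blocks-kb* is hypothetical):

```lisp
;; Popping A off the stack A-on-B should leave just B:
(ask '(results pop (cons a (cons b nil)) ?stack) *blocks-kb*)
;; => ?stack bound to (cons b nil)

;; Pushing A onto the stack B should build A-on-B:
(ask '(results (push a) (cons b nil) ?stack) *blocks-kb*)
;; => ?stack bound to (cons a (cons b nil))
```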
Now comes the hard part -- writing the rules to pick whether to POP or PUSH. It would be easy to get into an endless loop here. Plus, the problem in the example query introduces a new block, D -- how can the planner know what blocks there are to push onto the stack?
At this point, it pays to do a number of examples, from one or two blocks to
three or four, in different arrangements, to get a feel for what the plans
look like. Eventually, the following rule for
POP may occur to you:
- You should pop blocks off the current stack until it matches
the block(s) at the bottom of the goal stack.
- In the example, we should pop until we get to block C.
- If two stacks are the same, we should not pop anything.
- If the bottom blocks of the two stacks are different, we need to pop everything.
The rule for
(PUSH x) is a bit more of a leap of faith.
- If the current stack does match the bottom of the desired stack, you should push the block x that would make a new stack that matches the bottom of the desired stack.
These two rules are both based on the concept of the current stack being a substack of the desired stack. So let's assume we have such a predicate and write the above step selection rules.
(<- (step pop ?current ?goal) (not (substack ?current ?goal)))
(<- (step (push ?x) ?current ?goal) (substack (cons ?x ?current) ?goal))
Study these for a while. The first makes use of
the NOT logical operator.
The second uses variable binding and backward chaining
to find the binding for
?x that we need.
All that remains is to define the SUBSTACK predicate.
Fortunately, it's fairly intuitive, once you understand backward chaining.
(<- (substack ?stack ?stack))
(<- (substack ?s1 (cons ?y ?s2)) (substack ?s1 ?s2))
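For example, to show that the one-block stack C is a substack of the goal stack A-B-D-C, backward chaining strips blocks off the larger stack with the second rule until the first rule applies:

```lisp
;; Goal: (substack (cons c nil) (cons a (cons b (cons d (cons c nil)))))
;; Rule 2 strips A: (substack (cons c nil) (cons b (cons d (cons c nil))))
;; Rule 2 strips B: (substack (cons c nil) (cons d (cons c nil)))
;; Rule 2 strips D: (substack (cons c nil) (cons c nil))
;; Rule 1 matches:  both arguments are the same term -- proved.
```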
These are all the rules we need for the one-stack Blocks World planner, along with the two general PLAN rules defined earlier. You can try them out by loading them into the deductive retriever.
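Tracing the example query by hand shows what plan these rules produce. Starting from the stack A-B-C with the goal A-B-D-C:

```lisp
;; (plan ?plan (cons a (cons b (cons c nil)))
;;             (cons a (cons b (cons d (cons c nil)))))
;;
;; (a b c) is not a substack of (a b d c) => POP, leaving (b c)
;; (b c)   is not a substack either       => POP, leaving (c)
;; (c) is a substack, and so is (d c)     => (PUSH D), giving (d c)
;; (b d c) is a substack                  => (PUSH B), giving (b d c)
;; (a b d c) is the goal itself           => (PUSH A), done
;;
;; ?plan = (cons pop (cons pop (cons (push d)
;;           (cons (push b) (cons (push a) nil)))))
```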
The Monkey and Bananas Problem
While MIT had Blocks World, John McCarthy at Stanford had the Monkey and Bananas problem. A monkey enters a room. Hanging from the ceiling is a bunch of bananas, just out of reach. Over by the window is a box. The planning problem is to generate the plan that gets the bananas by moving the box under the bananas and climbing on top of the box.
The only things we really care about in this problem are the locations of the monkey, the box, and the bananas. To make things really simple, we'll put the bananas in the center of the room. So a simple state representation that has only one way to describe any state is
(mb-state monkey-location box-location)
The box can be at the DOOR, WINDOW, or CENTER. The monkey can be at any of those places, or at BOX-TOP.
If the monkey starts at the door and the box is by the window, then the basic problem query is
(plan ?plan (mb-state door window) (mb-state box-top center))
I.e., the box has to end up under the bananas and the monkey has to be on the box.
There are three actions the monkey can do. It can walk to a location, push the box to a location, or climb on the box. The action result rules are pretty simple, but unlike the Blocks World, there are some preconditions that have to be true for some actions to be possible. These preconditions are not the same as the step selection rules.
The monkey can climb on the box if the monkey is in the same location as the box.
(<- (results climb-box (mb-state ?bloc ?bloc) (mb-state box-top ?bloc)))
Notice how we use repeated variables to require the monkey and box to be in the same location.
The monkey can push the box to a location if the monkey is in the same location as the box. The location of both changes.
(<- (results (push-box ?bloc1 ?bloc2) (mb-state ?bloc1 ?bloc1) (mb-state ?bloc2 ?bloc2)))
The monkey can walk to any location.
(<- (results (walk-to ?mloc2) (mb-state ?mloc1 ?bloc) (mb-state ?mloc2 ?bloc)))
Now we need to guide the monkey to choose actions that lead toward getting the bananas, and never fall into an endless loop.
For example, if the monkey is not where the box is, then the monkey should walk to the box. To say this, we'll need a predicate for saying "location1 is different than location2." Simply using different variables doesn't do that.
(<- (step (walk-to ?bloc) (mb-state ?mloc ?bloc) (mb-state ?mloc-2 ?gloc)) (different ?mloc ?bloc))
We also need to assert the different locations.
(<- (different window center)) (<- (different center window))
(<- (different window door)) (<- (different door window))
(<- (different center door)) (<- (different door center))
Similarly, if the box is not where the bananas are -- the goal location -- then the monkey should push the box to that location.
(<- (step (push-box ?bloc ?gloc) (mb-state ?bloc ?bloc) (mb-state ?mloc ?gloc)) (different ?bloc ?gloc))
Finally, if the box and monkey are at the desired location, the monkey should climb on the box.
(<- (step climb-box (mb-state ?gloc ?gloc) (mb-state box-top ?gloc)))
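Tracing the example query with these step rules -- monkey at DOOR, box at WINDOW, bananas (the goal location) at CENTER:

```lisp
;; (plan ?plan (mb-state door window) (mb-state box-top center))
;;
;; monkey (door) /= box (window) => (WALK-TO WINDOW)
;;   results: (mb-state window window)
;; box (window) /= goal (center) => (PUSH-BOX WINDOW CENTER)
;;   results: (mb-state center center)
;; monkey and box both at center => CLIMB-BOX
;;   results: (mb-state box-top center) -- the goal
;;
;; ?plan = (cons (walk-to window)
;;           (cons (push-box window center) (cons climb-box nil)))
```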
These rules, plus the two general PLAN rules, will solve the problem. You can try them out by loading them into the deductive retriever.
A Universal Planner
Well, not really, but certainly more general than what we've written so far.
Right now, our planners need two sets of rules:
- Rules about how the world works, e.g., if you push a box to some location, then both you and the box end up in the new location
- Rules about how to select the best action in a given state, given some goal state
What most AI researchers wanted to find was a planner that needs only the first kind of rule. A universal planner, AKA general problem solver, should be able to figure out for itself what actions to do in what order, using just knowledge about what actions there are and what they do.
Let's take our basic PLAN rules and remove the part that uses the STEP predicate to find the best action.
(defparameter *universal-plan-kb*
  '((<- (plan nil ?goal ?goal))
    (<- (plan (cons ?action ?actions) ?current ?goal)
        (results ?action ?current ?result)
        (plan ?actions ?result ?goal))))
Try a Monkey and Bananas problem with these new rules:
(ask '(plan ?plan (mb-state center center) (mb-state box-top center))
     (append *universal-plan-kb* *monkey-kb* *room-kb*))
You should get a stack overflow, not NIL. The rules are running -- but they get into an endless loop trying all possible sequences of actions. Even though they may immediately find the action of climbing on the box, they also look for other plans, such as walking somewhere else, or pushing the box somewhere else, and then later coming back, and so on.
This doesn't have to happen. It should be possible to write rules that only allow an action to be selected if it results in a new state, i.e., one not already in the plan being developed. To do that, we would need to change the PLAN rules to keep a list of the result states seen so far, and check that the result state of an action is not a member of that list before planning with that action.
Doing this is left as an exercise.
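As a starting hint (not the full solution), a list-membership predicate can be written in the same style as SUBSTACK:

```lisp
;; X is a member of a list whose first element is X...
(<- (member ?x (cons ?x ?l)))
;; ...or a member of the rest of the list.
(<- (member ?x (cons ?y ?l)) (member ?x ?l))
;; Combined with NOT, a modified PLAN rule could require
;; (not (member ?result ?seen-states)) before recursing.
```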
Note that this will lead to a more general planner, but also a much less efficient one. Also, it will only work if there are a finite number of possible states. That's true for the two example problems we have here, but would not be true if, for example, something like counting was allowed, e.g., the Peano's axiom rules that can generate "successor of successor of ..." forever.