Deductive Planning

Planning is one of the classic problems of Artificial Intelligence. Here we just touch on the most basic concepts in the simplest way, and show how planning can be approached using a deductive reasoner.

The Planning Problem

Here's a simplified statement of the planning problem.

A planner is a program that can solve planning problems.

Military planning is an obvious, and long-studied, example of planning. It's particularly challenging because it mixes not only planning, which is hard enough, but game playing against an uncontrollable opponent.

To keep things simple, early work in AI focused on planning where no opponent is involved, the world is fairly simple, and things don't change unless the actor, e.g., a robot, causes them to change when executing the plan. We'll make those assumptions here.

In real planning research, the goal is to find very general rules for selecting steps that can be used in every problem. These rules look at the states and the plan so far. We're going to program in specific deductive rules to solve each problem. The rules will be able to handle different problems, but we won't have a general planner.

Deductive Rules for Planning

Using just our simple backward chainer, we can define a general planning framework, that involves a few basic concepts, and two core deductive rules.

There are three core predicates that we need to define rules for.

Actions and states are things you need to design when creating a planner. Logically, they will be functional terms. You can define them any way you want but there are some design constraints.


States are functional terms. They should be designed so that there is just one way to describe any given state of the world for some problem domain. If there are multiple ways to describe the same state, backward chaining will have to explore all of them, even though the answer will be the same in every case.

For example, in Blocks World, we could represent a stack of blocks with the term (and (on a b) (on b c) (on c table)). This is a bad representation, because there are six different ways to describe the same state. If we represent the stack of blocks with the term (cons a (cons b (cons c nil)))) -- remember, cons here is NOT the Lisp function -- then there is only one way to describe that state.


Actions are also functional terms. They are relatively simple to design. They can be simple names, like JUMP, if the actor is implicitly understood to be the robot, or they can be terms specifying parameters, e.g., (PUT-ON x y).


For each action, you have to define one or more rules that say how that action changes the state of the world. These rules use the predicate RESULTS to make assertions of the form (results action current-state result-state) to describe what happens. In simple domains, these rules are normally pretty simple. See the examples for, um, examples.

Step selection

Action result rules are not enough. The planner can't just try every action and see what happens. That is very inefficient, and can easily lead to endless loops. For example, a basic action in the Monkey and Bananas problem is (MOVE loc) which means "monkey moves to location loc." Suppose the monkey begins by the door and the planner generates the action (MOVE WINDOW) for "monkey moves to window." One of the next possible actions is (MOVE DOOR). Without some rules to guide it, the planner generate, among other things, a plan with an endless list of steps going back and forth.

STEP rules are used to pick what action to do in a given state of the world. This is where the intelligence of your planner resides. These rules at the least must avoid endless loops, while still finding solutions, if there are any.

Plan construction

Finally there are PLAN rules. These say how to use the STEP and RESULTS rules to find plans. Surprisingly, just two general rules can do basic planning. You normally don't need to define any more PLAN rules.

The first rule is trivial. It says "the empty plan works if the current state is the desired state." To represent the empty plan, we'll use the symbol NIL. We could have picked DONE or something else.

(<- (plan nil ?goal ?goal))

The second rule is where most of the work occurs. It needs to say "a plan from current state to the desired state is one that has an action for current state, followed by a plan that gets from the result of that action to the desired state."

To represent a sequence of actions, we'll use Lisp list notation. (CONS action1 (CONS action2 ... NIL)) represents the plan "first do action1 then do action2 then ...."

(<- (plan (cons ?action ?actions) ?current ?goal)
    (step ?action ?current ?goal)
    (results ?action ?current ?result)
    (plan ?actions ?result ?goal))

This make take a bit of study. In a typical planning query, ?current and ?goal will be constant state descriptions, but the first argument, the plan, will be an unbound variable. The goal of the deductive planner is to prove the query by constructing a plan that makes it true, if possible.


The following two examples are classic early toy examples in AI planning. Cordell Green demonstrated how deductive retrieval could solve such problems in his technical report Application of Theorem Proving to Problem Solving. It turned out to be a much harder problem than first realized, and much logical study resulted.

The Blocks World Stacking Problem

At MIT, a lot of early work used the blocks world domain as a testbed for planning ideas. In the blocks world, a robot with a camera and one arm with a grasper is given the task of stacking blocks on a table in some specified arrangement. Since the robot can only move one block at a time, and blocks may already be in stacks, the problem is one of unstacking and stacking. Ideally the planner should create efficient plans with no wasted moves.

We'll make this problem as simple as possible. The robot has just one stack of blocks to worry about. Blocks not in the stack sit on the table. We won't include these in our state here, but could.

We'll represent the stack of blocks as a list, using the rules for append E.g., the functional term (CONS A (CONS B (CONS C NIL))) will represent the stack with A on B on C.

So a problem to create a stack can be represented as a query with a variable for the desired pan, the initial stack, and the goal stack. For example, given this problem to solve

we would pose this query, using the form (plan plan initial goal).

(plan ?plan (cons a (cons b (cons c nil)))
            (cons a (cons b (cons d (cons c nil)))))

Two actions are sufficient for this problem. Again, for brevity, we assume these actions apply to the one stack, so the stack is not included in the functional terms describing the actions.

The action result rules are easy to write, using the form (results action current-state result-state).

(<- (results pop (cons ?x ?stack) ?stack))

(<- (results (push ?x) ?stack (cons ?x ?stack)))

No antecedent queries are needed to check preconditions for these rules. POP needs a non-empty stack, and the (cons ?x ?stack) makes sure that is true. PUSH has no preconditions because we are not including the blocks on the table in our state. If we did, a precondition for pushing a block X would be that X is on the table.

Now comes the critical part: writing the rules to pick whether to POP or PUSH. It would be easy to get into an endless loop here. Plus the problem in the example query introduces a new block, D -- how can the planner know what blocks are available to push onto the stack?

At this point, it pays to do a number of examples by hand, from one or two blocks to three or four, in different arrangements, to get a feel for what the plans look like. Eventually, the following rule for POP may occur to you:

If the current stack does not match the block(s) at the bottom of the goal stack, pop a block.

In the example above, we should pop until the stack has just block C. This rule implies that

The rule for (PUSH x) is basically "push what you need to".

If the current stack matches the bottom of the desired stack, push the block x that makes a new stack that matches the bottom blocks of the goal stack.

These two rules both ask if the current stack matches the bottom of the desired stack. Since we're using lists to represent stacks, that's equivalent to asking if there is some (possibly empty) list that appended to the current stack would yield the goal stack. I.e., if the current stack is in the variable ?current and the goal stack is in the variable ?goal, it's the query (append ?top ?current ?goal). So our selection rules for pop and push become:

(<- (step pop ?current ?goal)
    (not (append ?top ?current ?goal)))
(<- (step (push ?x) ?current ?goal)
    (append ?top (cons ?x ?current) ?goal))

Study these for a while. The first makes use of the NOT logical operator. The second uses variable binding and backward chaining to find the binding for ?x that we need. Ask yourself, "How does it figure out it needs to stack D?"

These are the rules we need for the one-stack blocks world planner, along with the two PLAN rules. You can try them out by loading sddr-plan.lisp.

The Monkey and Bananas Problem

While MIT had Blocks World, John McCarthy at Stanford had the Monkey and Bananas problem. A monkey enters a room. Hanging from the ceiling is a bunch of bananas, just out of reach. Over by the window is a box. The planning problem is to generate the plan that gets the bananas by moving the box under the bananas and climbing on top of the box.

The only thing we really care about in this problem are the locations of the monkey, box, and bananas. To make things really simple, we'll put the bananas in the center of the room. So a simple state representation that has only one way to describe any state is

(mb-state monkey-location box-location)

The box can be at the DOOR, WINDOW or CENTER. The Monkey can be those places or BOX-TOP.

If the monkey starts at the door and the box is by the window, then the basic problem query is

(plan ?plan (mb-state door window)
            (mb-state box-top center))

I.e., the box has to end up under the bananas and the monkey has to be on the box.

There are three actions the monkey can do. It can walk to a location, push the box to a location, or climb on the box. The action result rules are pretty simple, but unlike the Blocks World, there are some preconditions that have to be true for some actions to be possible. These preconditions are not the same as the step selection rules.

The monkey can climb on the box if the monkey is in the same location as the box.

(<- (results climb-box
             (mb-state ?bloc ?bloc)
             (mb-state box-top ?bloc)))

Notice how we use repeated variables to require the monkey and box to be in the same location.

The monkey can push the box to a location if the monkey is in the same location as the box. The location of both changes.

(<- (results (push-box ?bloc1 ?bloc2)
             (mb-state ?bloc1 ?bloc1)
             (mb-state ?bloc2 ?bloc2)))

The monkey can walk to any location.

(<- (results (walk-to ?mloc2)
             (mb-state ?mloc1 ?bloc)
             (mb-state ?mloc2 ?bloc)))

Now we need to guide the monkey to choose actions that lead toward getting the bananas, and never fall into an endless loop.

For example, if the monkey is not where the box is, then the monkey should walk to the box. To say this, we'll need a predicate for saying "location1 is different than location2." Simply using different variables doesn't do that.

(<- (step (walk-to ?bloc)
          (mb-state ?mloc ?bloc)
          (mb-state ?mloc-2 ?gloc))
    (different ?mloc ?bloc))

We also need to assert the different locations.

(<- (different window center))
(<- (different center window))
(<- (different window door))
(<- (different door window))
(<- (different center door))
(<- (different door center))

Similarly, if the box is not where the bananas are -- the goal location -- then the monkey should push the box to that location.

(<- (step (push-box ?bloc ?gloc)
          (mb-state ?bloc ?bloc)
          (mb-state ?mloc ?gloc))
    (different ?bloc ?gloc))

Finally, if the box and monkey are at the desired location, the monkey should climb on the box.

(<- (step climb-box
          (mb-state ?gloc ?gloc)
          (mb-state box-top ?gloc)))

These rules, plus the two general PLAN rules, will solve the problem. You can try them out by loading sddr-plan.lisp.

A Universal Planner

Right now, our planners need two sets of rules:

What most AI researchers want is a planner that only needs the first kind of rule. A universal planner, AKA general problem solver, should be able to figure out for itself what actions to do in what order, using just knowledge about what actions there are and what they do.

Try removing the STEP query in our basic PLAN rules:

(defparameter *universal-plan-kb*
  '((<- (plan nil ?goal ?goal))
    (<- (plan (cons ?action ?actions) ?current ?goal)
        (results ?action ?current ?result)
        (plan ?actions ?result ?goal))))

Now try a Monkey and Bananas problem with these new rules:

(ask '(plan ?plan
            (mb-state center center)
            (mb-state box-top center))
     (append *universal-plan-kb* *monkey-kb* *room-kb*))

You should get a stack overflow. If you get NIL, you did something incorrectly. Stack overflow means that the rules are running, but they get into an endless loop trying all possible sequences of actions. That includes plans, such as walking somewhere then walking back, or pushing the box somewhere, then pushing it back, and so on.

This doesn't have to be. It should be possible, at least in simple finite domains like stacking blocks and moving boxes in a room, to write rules that try all valid actions but avoid obvious loops. To do this, you need a new set of PLAN rules that:

POSSIBLE rules are like STEP rules but much less intelligent. For example,

Doing this is left as an exercise.

Note that this will lead to a more general planner, but also a much less efficient one. Also, it will only work if there are a finite number of possible states. That's true for the two example problems we have here, but would not be true if, for example, something like counting was allowed, e.g., the Peano's axiom rules that can generate "successor of successor of ..." forever.

Faculty: Chris Riesbeck
Time: Monday, Wednesday, Friday: 11am - 11:50am
Location: Tech LR5


Important Links