Deductive Planning

Planning is one of the classic problems of Artificial Intelligence. Here we just touch on the most basic concepts in the simplest way, and show how planning can be approached using a deductive reasoner.

The Planning Problem

Here's a simplified statement of the planning problem: given a current state of some world, a desired state for that world, and a set of actions that change the state of the world, find a sequence of actions (a plan) that transforms the current state into the desired state.

A planner is a program that can solve planning problems.

Military planning is an obvious, and long-studied, example of planning. It's particularly challenging because it involves not only planning, which is hard enough, but also game playing against an uncontrollable opponent.

To keep things simple, early work in AI focused on planning where no opponent is involved, the world is fairly simple, and things don't change unless the actor, e.g., a robot, causes them to change when executing the plan. We'll make those assumptions here.

In real planning research, the goal is to find very general rules for selecting steps, rules that can be used in any problem. These rules look at the states and the plan so far. Here, we're going to write specific deductive rules to solve each problem. Each set of rules will handle many variations of its problem, but we won't have a general planner.

Deductive Rules for Planning

Using just our simple backward chainer, we can define a general planning framework that involves a few basic concepts and two core deductive rules.

There are three core predicates that we need to define rules for:

(PLAN plan current goal) -- plan is a sequence of actions that transforms state current into state goal
(STEP action current goal) -- action is a reasonable next action to take in state current to reach state goal
(RESULTS action current result) -- doing action in state current produces state result

Actions and states are things you need to design when creating a planner. Logically, they will be functional terms. You can define them any way you want, but there are some design constraints.

States

States should be designed so that there is just one way to describe any given state of the world for some problem domain. If there are multiple ways to describe the same state, backward chaining will have to explore all of them, even though the answer will be the same in every case.

For example, in Blocks World, we could represent a stack of blocks with the term (and (on a b) (on b c) (on c table)). This is a bad representation, because the conjuncts can be ordered in six different ways, so there are six different ways to describe the same state. If we represent the stack of blocks with the term (cons a (cons b (cons c nil))) -- remember, cons here is NOT the Lisp function -- then there is only one way to describe that state.

Actions and results

Actions are relatively simple to design. They can be simple names, like JUMP, if the actor is implicitly understood to be the robot, or they can be terms specifying parameters, e.g., (PUT-ON x y).

For each action, you have to define RESULTS rules that say how that action changes the state of the world. In simple domains, these rules are normally very simple. See the examples for, um, examples.

Step selection

Action result rules are not enough. The planner can't just try every action and see what happens; that will usually lead to endless loops. For example, a basic action in the Monkey and Bananas problem is (MOVE loc), which means "monkey moves to location loc." Suppose the monkey begins by the door and the planner generates the action (MOVE WINDOW) for "monkey moves to window." One of the next possible actions is (MOVE DOOR). Without some rules to guide it, the planner will generate, among other things, a plan that goes back and forth forever.

STEP rules are needed to pick the best action to take in a given state of the world. This is where the intelligence of your planner resides. At the least, these rules must avoid endless loops while still finding solutions, if any exist.

Plan construction

Finally there are PLAN rules. These say how to use the STEP and RESULTS rules to find plans. Surprisingly, just two general rules can do basic planning. You normally don't need to define any more PLAN rules.

The first rule is trivial. It says "the empty plan works if the current state is the desired state." To represent the empty plan, we'll use the symbol NIL. We could have picked DONE or something else.

(<- (plan nil ?goal ?goal))

The second rule is where most of the work occurs. It needs to say "a plan from current state to the desired state is one that has an action for current state, followed by a plan that gets from the result of that action to the desired state."

To represent a sequence of actions, we'll use Lisp list notation. (CONS action1 (CONS action2 ... NIL)) represents the plan "first do action1 then do action2 then ...."

(<- (plan (cons ?action ?actions) ?current ?goal)
    (step ?action ?current ?goal)
    (results ?action ?current ?result)
    (plan ?actions ?result ?goal))

This may take a bit of study. In a typical planning query, ?current and ?goal will be constant state descriptions, but the first argument, the plan, will be an unbound variable. The goal of the deductive planner is to prove the query by constructing a plan that makes it true, if possible.

Examples

The following two examples are classic early toy examples in AI planning. Cordell Green demonstrated how deductive retrieval could solve such problems in his technical report Application of Theorem Proving to Problem Solving. It turned out to be a much harder problem than first realized, and much logical study resulted.

The Blocks World Stacking Problem

At MIT, a lot of early work used the blocks world domain as a testbed for planning ideas. In the blocks world, a robot with a camera and one arm with a grasper is given the task of stacking blocks on a table in some specified arrangement. Since the robot can only move one block at a time, and blocks may already be in stacks, the problem is one of unstacking and stacking. Ideally the planner should create efficient plans with no wasted moves.

We'll make this problem as simple as possible. The robot has just one stack of blocks to worry about. Blocks can be stacked or put in a box.

We'll represent the stack of blocks as a logical list term. E.g., the term (CONS A (CONS B (CONS C NIL))) represents the stack with A on B on C. So a problem to create a stack can be represented as a query with an initial stack and a desired stack, and a variable for the plan.

(plan ?plan (cons a (cons b (cons c nil)))
            (cons a (cons b (cons d (cons c nil)))))

Two actions are sufficient for this problem: POP, which removes the top block from the stack and puts it in the box, and (PUSH x), which takes block x from the box and puts it on top of the stack.

The action result rules are easy to write:

(<- (results pop (cons ?x ?stack) ?stack))

(<- (results (push ?x) ?stack (cons ?x ?stack)))
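
As a quick sanity check -- these queries are hypothetical, for illustration -- here are two RESULTS queries and the bindings they should produce:

(results pop (cons a (cons b nil)) ?s)        ; ?s = (cons b nil)
(results (push d) (cons c nil) ?s)            ; ?s = (cons d (cons c nil))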

Now comes the hard part -- writing the rules to pick whether to POP or PUSH. It would be easy to get into an endless loop here. Plus the problem in the example query introduces a new block, D -- how can the planner know what blocks there are to push onto the stack?

At this point, it pays to do a number of examples, from one or two blocks to three or four, in different arrangements, to get a feel for what the plans look like. Eventually, the following rule for POP may occur to you: if the current stack is not a substack of the goal stack, pop it.

The rule for (PUSH x) is a bit more of a leap of faith: if pushing x onto the current stack yields a substack of the goal stack, push x.

These two rules are both based on the concept of the current stack being a substack of the desired stack. So let's assume we have such a predicate and write the above step selection rules.

(<- (step pop ?current ?goal)
    (not (substack ?current ?goal)))
    
(<- (step (push ?x) ?current ?goal)
    (substack (cons ?x ?current) ?goal))

Study these for a while. The first makes use of the NOT logical operator. The second uses variable binding and backward chaining to find the binding for ?x that we need.

All that remains is to define that predicate SUBSTACK. Fortunately, it's fairly intuitive, once you understand backward chaining.

(<- (substack ?stack ?stack))

(<- (substack ?s1 (cons ?y ?s2))
    (substack ?s1 ?s2))
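
To see how these chain, trace the query (substack (cons c nil) (cons b (cons c nil))). The first rule fails to unify the two stacks, so the second rule strips the top of the larger stack and recurses:

(substack (cons c nil) (cons b (cons c nil)))   ; second rule, ?y = b
(substack (cons c nil) (cons c nil))            ; first rule -- proved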

These are the rules we need for the one-stack blocks world planner, along with the two PLAN rules. You can try them out by loading sddr-plan.lisp.
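
For example, assuming the rules above are collected into one knowledge base -- *blocks-kb* is a hypothetical name here; check sddr-plan.lisp for the actual one -- the example query and the plan it should produce are:

(ask '(plan ?plan (cons a (cons b (cons c nil)))
            (cons a (cons b (cons d (cons c nil)))))
     *blocks-kb*)

;; expected: ?plan = (cons pop (cons pop (cons (push d)
;;                     (cons (push b) (cons (push a) nil)))))
;; i.e., pop A and B, then push D, B, and A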

The Monkey and Bananas Problem

While MIT had Blocks World, John McCarthy at Stanford had the Monkey and Bananas problem. A monkey enters a room. Hanging from the ceiling is a bunch of bananas, just out of reach. Over by the window is a box. The planning problem is to generate the plan that gets the bananas by moving the box under the bananas and climbing on top of the box.

The only thing we really care about in this problem are the locations of the monkey, box, and bananas. To make things really simple, we'll put the bananas in the center of the room. So a simple state representation that has only one way to describe any state is

(mb-state monkey-location box-location)

The box can be at the DOOR, WINDOW, or CENTER. The monkey can be at those places or on BOX-TOP.

If the monkey starts at the door and the box is by the window, then the basic problem query is

(plan ?plan (mb-state door window)
            (mb-state box-top center))

I.e., the box has to end up under the bananas and the monkey has to be on the box.

There are three actions the monkey can do. It can walk to a location, push the box to a location, or climb on the box. The action result rules are pretty simple, but unlike the Blocks World, there are some preconditions that have to be true for some actions to be possible. These preconditions are not the same as the step selection rules.

The monkey can climb on the box if the monkey is in the same location as the box.

(<- (results climb-box
             (mb-state ?bloc ?bloc)
             (mb-state box-top ?bloc)))

Notice how we use repeated variables to require the monkey and box to be in the same location.
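
For example, here are two hypothetical queries against this rule:

(results climb-box (mb-state center center) ?s)   ; ?s = (mb-state box-top center)
(results climb-box (mb-state door window) ?s)     ; fails: ?bloc can't be both door and window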

The monkey can push the box to a location if the monkey is in the same location as the box. The location of both changes.

(<- (results (push-box ?bloc1 ?bloc2)
             (mb-state ?bloc1 ?bloc1)
             (mb-state ?bloc2 ?bloc2)))

The monkey can walk to any location.

(<- (results (walk-to ?mloc2)
             (mb-state ?mloc1 ?bloc)
             (mb-state ?mloc2 ?bloc)))

Now we need to guide the monkey to choose actions that lead toward getting the bananas, and never fall into an endless loop.

For example, if the monkey is not where the box is, then the monkey should walk to the box. To say this, we'll need a predicate for saying "location1 is different than location2." Simply using different variables doesn't do that.

(<- (step (walk-to ?bloc)
          (mb-state ?mloc ?bloc)
          (mb-state ?mloc-2 ?gloc))
    (different ?mloc ?bloc))

We also need facts asserting which locations are different from each other. We assert both orders explicitly because a symmetry rule like (<- (different ?x ?y) (different ?y ?x)) would send the backward chainer into an endless loop.

(<- (different window center))
(<- (different center window))
(<- (different window door))
(<- (different door window))
(<- (different center door))
(<- (different door center))

Similarly, if the box is not where the bananas are -- the goal location -- then the monkey should push the box to that location.

(<- (step (push-box ?bloc ?gloc)
          (mb-state ?bloc ?bloc)
          (mb-state ?mloc ?gloc))
    (different ?bloc ?gloc))

Finally, if the box and monkey are at the desired location, the monkey should climb on the box.

(<- (step climb-box
          (mb-state ?gloc ?gloc)
          (mb-state box-top ?gloc)))

These rules, plus the two general PLAN rules, will solve the problem. You can try them out by loading sddr-plan.lisp.
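
For example, assuming the two general PLAN rules are in a knowledge base -- *plan-kb* is a hypothetical name; *monkey-kb* and *room-kb* follow the naming used in the next section -- the basic problem query and the answer it should produce are:

(ask '(plan ?plan (mb-state door window)
            (mb-state box-top center))
     (append *plan-kb* *monkey-kb* *room-kb*))

;; expected: ?plan = (cons (walk-to window)
;;                     (cons (push-box window center)
;;                       (cons climb-box nil)))
;; i.e., walk to the window, push the box to the center, climb on it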

A Universal Planner

Well, not really, but certainly more general than what we've written so far.

Right now, our planners need two sets of rules: RESULTS rules that say what each action does to the state of the world, and STEP rules that say which action to choose in a given state.

What most AI researchers wanted to find was a planner that needs only the first kind of rule. A universal planner, AKA general problem solver, should be able to figure out for itself what actions to do in what order, using just knowledge about what actions there are and what they do.

Let's take our basic PLAN rules and remove the part that uses the STEP predicate to find the best action.

(defparameter *universal-plan-kb*
  '((<- (plan nil ?goal ?goal))
    
    (<- (plan (cons ?action ?actions) ?current ?goal)
        (results ?action ?current ?result)
        (plan ?actions ?result ?goal))))

Try a Monkey and Bananas problem with these new rules:

(ask '(plan ?plan
            (mb-state center center)
            (mb-state box-top center))
     (append *universal-plan-kb* *monkey-kb* *room-kb*))

You should get a stack overflow, not NIL. The rules are running -- but they get into an endless loop trying all possible sequences of actions. Even though they may immediately find the action of climbing on the box, they also look for other plans, such as walking somewhere else, or pushing the box somewhere else, and then coming back later, and so on.

This doesn't have to be. It should be possible to write rules that only allow an action to be selected if it results in a new state, i.e., one not already in the current plan being developed. To do that, we would need to change the PLAN rules to keep a list of the result states and check that the result state of an action is not a member of that list before trying to plan with that action.

Doing this is left as an exercise.
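
If you want a starting point, here is one possible shape for the modified rules. This is a sketch only, untested against the sddr code: PLAN-SAFE and MEMBER are names invented here, the initial query has to seed the visited list with the starting state, and NOT is assumed to work as in the step rules earlier. It also assumes action results come out fully bound; for an action like (walk-to ?loc) with a free variable in its result, you'd need to enumerate the possible bindings (say, with location facts) before the member check.

(<- (plan-safe nil ?goal ?goal ?visited))

(<- (plan-safe (cons ?action ?actions) ?current ?goal ?visited)
    (results ?action ?current ?result)
    (not (member ?result ?visited))       ; skip states already reached
    (plan-safe ?actions ?result ?goal (cons ?result ?visited)))

(<- (member ?x (cons ?x ?rest)))
(<- (member ?x (cons ?y ?rest))
    (member ?x ?rest))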

Note that this will lead to a more general planner, but also a much less efficient one. Also, it will only work if there are a finite number of possible states. That's true for the two example problems we have here, but it would not be true if, for example, something like counting were allowed, e.g., Peano axiom rules that can generate "successor of successor of ..." forever.
