Deductive Retrieval

Introduction

Deductive retrieval is a generalization of simple data retrieval. Though the extensions appear relatively small, the result is in fact a full-fledged alternative to the declarative approach seen in C++, Java, PHP, ..., and the functional approach seen in Lisp, Scheme, Haskell,.... This alternative is called declarative programming

The simplest deductive retriever code I've been able to write is in the Simple Deductive Data Retriever (SDDR) It does unification and backward chaining and can handle all the key examples.

There is also the Deductive Data Retriever (DDR). This has a bit more code in order to be more efficient. It gets the same answers the same way, but the API is a little different. Examples on this page will use SDDR.

A data retriever has a database of assertions. There are two kinds of assertions. There are simple facts. A fact is a list that begins with the name of a logical predicate, followed by any number of arguments. For example, (married john mary) might assert that John is married to Mary, and (in chicago illinois) might assert that Chicago is in Illinois.

There are no predefined predicates. You implement a knowledge base by deciding what predicates and arguments you want to have.

The other kind of assertions are inference rules. An inference rule has the form

(<- consequent antecedent1 antecedent2 ...)

Both the consequent and antecedents are facts. The above rule says that consequent is true if you can prove that every antecedent is true.

In SDDR, every assertion is written as a rule. A fact is an assertion with a consequent and no antecedents. That means nothing needs to be proved to know that the consequent is true.

The final key idea is that the arguments in any assertion can be a variable. A variable is a symbol beginning with a question mark. This lets us write general inference rules. For example, there is a classic example in syllogistic logic:

All humans are mortal.
Socrates is a human.
Therefore, Socrates is mortal.

In SDDR, the first two rules would be represented with

(defparameter *rules*
  '(
    (<- (mortal ?x) (human ?x))
    (<- (human socrates))
  ))

You can ask SDDR to infer if Socrates is mortal with the function ask.

> (ask '(mortal socrates) *rules*)
((MORTAL SOCRATES))

SDDR looks through all the rules and facts returns all the assertions that match the input query pattern that can be inferred.

Deductive retrieval has these major processes:

unification -- generalized pattern matching
backward chaining -- linking rules together to form a conclusion
backtracking -- trying all possibilities until one or all answers are found

Pattern Matching

A pattern can be any form, e.g., (in tiger savannah), but usually a pattern will have variables, e.g., (in ?x savannah). A form without variables is sometimes called a constant form.

A form matches a pattern all corresponding elements of the form and pattern can be matched. Constant elements match each other if they are equal. Variable elements can match anything, but if multiple occurrences of the same variable must match the same thing.

The result of a match is a list of binding lists. A binding list is a list of (variable value) pairs, recording each pattern variable and the corresponding form element.

For example, the result of matching (in ?x savannah) with (in tiger savannah) is (((?x tiger))).

As SDDR searches, it will often find multiple rules and facts that can match, leading to multiple bindings lists. Each binding list is an answer to the query.

At the end, ask returns a list of copies of the original query, one for each binding list found, if any. Any variables in the query are replaced by the bindings for those variables in the binding list. So, if the rules were

(defparameter *rules*
  '(
    (<- (mortal ?x) (human ?x))
    (<- (human plato))
    (<- (human socrates))
  ))

then SDDR can be asked to return all mortals:

>(ask '(mortal ?x) *rules*)
((MORTAL PLATO) (MORTAL SOCRATES))

Unification

Unification introduces one additional feature to normal pattern matching. It allows a pattern to match a pattern. This raises a number of complexities. That means that variables may be matched against variables or forms containing variables. If the patterns share variables, a variable may be matched against itself! The definition of unification is surprisingly short:

A pattern P unifies with a pattern Q if there exists any assignment of values to variables that makes P the same as Q.

Unification is symmetric. If P unifies with Q, then Q unifies with P. Here are some examples of patterns that do and do not unify:

P	Q	Unify?	Why / Why not?
A	A	Yes	Already equal
A	B	No	No variable assignment can fix this
?X	A	Yes	Assign A to ?X
A	?X	Yes	Assign A to ?X
?X	?Y	Yes	Assign A (say) to ?X and ?Y
?X	?X	Yes	Already equal
(A ?X ?Y)	(?X ?Y A)	Yes	Assign A to ?X and ?Y
?X	(A ?Y)	Yes	Assign B (say) to ?Y and (A B) to ?X
?X	(A ?X)	No	Nothing can be assigned to ?X to make these equal
(?X ?Y)	(?Y (F ?X))	No	Nothing can be assigned to ?X and ?Y to make these equal
(?X A)	((F ?Y) ?Y)	Yes	Assign A to ?Y and (F A) to ?X

The definition of unification is not constructive. It doesn't say how to find an assignment of variables. Fortunately, a constructive algorithm was worked out long ago. A function unify can be defined that takes two patterns and, as with the regular matcher, returns a list of bindings lists if, and only if, the two patterns unify. Like pattern matching with one pattern, unify binds variables as it sees them, but there are additional checks when binding a variable to a variable. There are a number of pitfalls to watch out for, so it's important to have many unit tests in place to catch the tricky cases. For the Lisp version used in this class, see unify.lisp

Backward Chaining

Backward chaining is the process that SDDR uses to find answers to queries. Given a query, SDDR finds all assertions whose consequent unifies with the assertion. Then it goes "backward" and recursively asks if the all of the antecedents of each consequent are true. Facts, i.e., assertions with no antecedents, are true right way.

Backward chaining, i.e., recursively asking about every antedent, eventually either succeeds, generating a binding list of any variables used, or fails.

Deductive Retrieval versus Logic

Deductive retrieval is based on logical principles, but a deductive retriever is different from a theorem prover in a number of significant ways.

First, simple deductive retrievers are very limited in the logic that they can represent and deduce:

Rules are limited to the form (<- predication pattern pattern ...).
- You can't directly represent disjunctive assertions, such as "Either Mary or John will come to the party."
All assertions are universally quantified. That is, the assertion (p ?x) says "for all x, p(x) is true," or in logic notation, "∀x p(x)." The assertion (<- (q ?x) (p ?x)) says "for all x, p(x) implies q(x)," or in logic notation "∀x p(x) ⇒ q(x)."
- You can't assert something like "There exists a smallest prime."
All queries are existentially quantified. That is, the query (p ?x) asks "does there exist an x such that p(x) is true?," or, in logic, "∃x p(x)." The retriever returns all such values.
- You can't ask something like "Are all birds mammals?"
You can't make assertions or queries with variables for predicates. These systems implement first-order logic only.
- You can't assert something like "All predicates about mammals take two arguments.

Deductive systems can be extended to overcome many of these limitations, and others, but this has to be done carefully. The limitations to what's called Horn clause logic allow for very simple and fast deductive retrieval. Adding more power can lead to slower processing, or to queries that can't be answered.

Deductive retrieval also uses techniques not available in logic. Prolog, for example, applies rules in the order in which they were defined, so that different proofs are possible, depending on the order in which rules are specified. Prolog also has a "cut" operation that allows a rule to eliminate other proof paths from the search process. Our retriever has a mechanism for allowing arbitrary Lisp code to be executed during retrieval.

Negation in the deductive retriever

One of the most common places where deductive retrieval behaves differently than logic is when (NOT p) is implemented as "true if and only if p can not be proved." While this works as expected in many cases, consider these rules for a blocks world robot arm.

(<- (can-move ?x) (block ?x) (clear ?x))
(<- (clear ?x) (not (on ?y ?x)))
(block a)
(block b)
(block c)
(on b a)

The first rule says "you can move a block if it is clear." The second rule says that something is clear if nothing is on it. The remaining assertions describe a scene with blocks A, B, and C, where block B is stacked on block A.

Here's what the deductive retriever returns for the following queries.

Query	Results
(clear a)	NIL
(clear b)	((CLEAR B))
(clear c)	((CLEAR C))
(can-move a)	NIL
(can-move b)	((CAN-MOVE B))
(can-move c)	((CAN-MOVE C))
(can-move ?x)	((CAN-MOVE B) (CAN-MOVE C))
(clear ?x)	NIL

Everything is just what you expect until the last row. Even though the retriever can respond correctly to "what blocks can I move?" it can't respond correctly to "what blocks are clear?" Why the difference?

The query (clear ?x) leads to the query (not (on ?y ?x)). (on ?y ?x) returns ((ON B A)), so (not (on ?y ?x)) query fails.

In contrast, the query (can-move ?x) leads to the query (block ?x), which has three answers, A, B, and C. This leads to the queries (not (on ?y A)), (not (on ?y B)), and (not (on ?y C)). The queries (on ?y B), and (on ?y C) fail, so the corresponding not queries succeed and hence the can-move succeeds for those cases.

We can fix this specific problem by changing the second rule to

(<- (clear ?x) (block ?x) (not (on ?y ?x)))

We will need a version of this rule for every kind of shape we want to use.

The failure to prove (clear ?x) with the original version of the rule is not a bug in the code, but a fundamental limitation in what Horn clause logic can represent. The query (clear ?x) logically asks "∃x clear(x)" which, in the original form of the rule, asks "∃x ¬∃y on(y, x)." Now we're in trouble. In logic, "∃x ¬∃y on(y, x)" is equivalent to "∃x ∀y ¬on(y, x)," i.e., "is there an x such that for all possible y, on(y, x) is not true?" But Horn clause logic doesn't support universal queries, only existential ones.

Backtracking

Let's define a set of rules about tigers and what they can eat.

(defparameter *tiger-kb*
  '(
    (<- (eats antelope grass))
    (<- (eats antelope ferns))
    (<- (predator-of antelope tiger))
    (<- (predator-of zebra tiger))

    (<- (at antelope savannah))
    (<- (at tiger savannah))
    (<- (at grass savannah))

    (<- (can-survive ?x)
        (eats ?x ?y)
        (can-catch ?x ?y))

    (<- (eats ?y ?x)
        (predator-of ?x ?y))

    (<- (can-catch ?x ?y)
        (near ?x ?y))

    (<- (near ?x ?y)
        (at ?x ?loc)
        (at ?y ?loc))
    ))

Now, let's ask how our tiger can satisfy its hunger:

> (ask '(can-survive tiger) *tiger-kb*)

The retriever will find that it has to choose something the tiger eats. To do that, it has to choose something a tiger is a predator of. Suppose it picks zebra. Now, the retriever will ask (near tiger zebra). This query will fail, because there is no zebra in the same place as the tiger.

The fact that this query has failed doesn't mean that the tiger can't eat something. It means that the retriever has to "back up" and look for another answer to (predator-of tiger ...). The other possible answer is antelope. This leads to the query (near tiger antelope) which eventually succeeds.

The backing up process, undoing one possible match when it doesn't pan out, is called backtracking. It's one of the classic weak methods of AI, so called because it always works, even when the system is knowledge poor, but it may not work very efficiently. Basically, backtracking means writing your code to try all possibilities, because your code isn't smart enough to know the right thing to do.

Functional terms

The arguments of predicates are called terms. Terms in logic can be

Simple constants, e.g., tiger
Variables, e.g., ?x
Functional terms, e.g., (father-of john)

Functional terms may look like assertions but they most definitely are not. An assertion is like a sentence, e.g., (at tiger savannah) is "The tiger is in the savannah." An assertion can be true or false.

A functional term is like a noun phrase, e.g., (father-of john) is "the father of John." A functional term refers to something, but is neither true nor false. It's like the difference between "the dog" and "that is a dog." The former is a reference, the latter is an assertion.

Functional terms in logic are not like functions in Lisp. They are not executable. They don't return values. They are simply ways to construct a representation of something, e.g., (date 2015 10 13) for "the date October 13, 2015."

Functional terms turn out to give deductive retrieval super powers.

Arithmetic in logic

Implementing a backward chaining deductive retriever takes surprisingly little code, if we don't worry about efficiency. sddr.lisp is 80 lines of actual code. Only about half of that is for backward chaining. The rest is for unification.

Even more surprising is that such a small system is capable not only of retrieving implied logical facts, such as "Socrates is mortal", but of actually constructing new results.

The secret is functional terms. Consider the following two rules, from sddr-tests.lisp.

(defparameter *peano-kb*
  '(
    (<- (add 0 ?x ?x))
    (<- (add (succ ?x) ?y (succ ?z))
        (add ?x ?y ?z))
    ))

The first rule is an unconditional assertion that the (add x y z) is always true if x is 0, and y and z are the same.

The second rule is more easily understood in the forward direction: if (a x y z) is true, then (a (succ x) y (succ z) is also true.

What these are are rules for addition, where 0 is 0, and (succc x) means "the successor number of x", i.e., x + 1. So the first rule says "0 + x = x", and the second rule says "if x + y = z, then (x + 1) + y = (z + 1)".

All of which seems true but pointless. What can we do with this? A lot, it turns out.

We can verify addition, e.g., that 2 + 3 = 5.

(ask '(add (s (s (s 0)) (s (s (s 0))) (s (s (s (s (s (s 0)))))))) *peano-kb*)

This will succeed for valid sums and fail for others. But we can also ask the retriever, with no additional machinery, to calculate the sum for us.

(ask '(add (s (s (s 0)) (s (s (s 0))) ?sum)) *peano-kb*)

This will return one answer, where ?sum is 5, written with successor terms.

We can actually use this to solve any simple addition, e.g., "x + 4 = 5".

(ask '(add ?x (s (s (s 0))) (s (s (s (s (s (s 0)))))))) *peano-kb*)

We can even ask for all solutions to "x + y = 5".

(ask '(add ?x ?y (s (s (s (s (s (s 0)))))))) *peano-kb*)

To understand how this works, it helps to hand-execute a simple query, such as "1 + 1 = x", to see how just unification and backward chaining will eventually bind ?x to (succ (succ 0)).

Append and cons in logic

Clearly, we would not want to really implement arithmetic in logic. But almost exactly the same kind of rules can be used to implement concepts in Lisp, and that is a reasonable thing to do.

(defparameter *append-kb*
  '(
    (<- (append nil ?x ?x))
    (<- (append (cons ?x ?l1) ?l2 (cons ?x ?l3))
        (append ?l1 ?l2 ?l3))
    ))

The predicate (append x y z) says "x appended to y is z". The functional term (cons x y) represents a CONS pair (not the CONS function!). For example, the functional term (cons 1 (cons 2 nil)) represents the list (1 2). We use the constant nil for the empty list, for our convenience. To the deductive retriever, it's just another symbol. We could use empty or nothing, or whatever we want.

The first rule says "nil appended to any list is the same list". This is like saying zero plus a number is the number.

The second rule, read forward, says "if l1 appended to l2 is l3, then the list (cons x l1) appended to l2 is the list (cons x l3)". As with addition, the rules seem to be true, but what can we do with them?

Well, we can append. Here we ask what we get if we append (1 2) with (3 4).

(ask '(append (cons 1 (cons 2 nil)) (cons 3 (cons 4 nil)) ?lst) *append-kb*)

This will return (cons 1 (cons 2 (cons 3 (cons 4 nil)))) for ?lst.

But we can also ask for solutions to problems such as "what appended to (1 2) would give (1 2 3 4)".

(ask '(append (cons 1 (cons 2 nil)) ?lst (cons 1 (cons 2 (cons 3 (cons 4 nil))))) *append-kb*)

We can even ask for all the ways to split the list (1 2 3 4).

(ask '(append ?head ?tail (cons 1 (cons 2 (cons 3 (cons 4 nil))))) *append-kb*)

Unlike arithmetic, this is gives our logic engine a very general tool for handling data. First, it can construct lists. Second, we can write backchaining rules that "loop over a list" by defining rules so that (member x lst) is true if x is an element of a list lst represented with CONS functional terms. This is in fact what Prolog does.

Now that we have a way to construct and map over lists, we can do things like planning using deduction.