Deductive retrieval is a generalization of simple data
retrieval. A data retriever has a database of assertions, e.g.,
(married john mary), (in chicago illinois), and
so on. The main operations are
The retriever looks through the database and returns all the assertions that match the input query pattern. Assertions are usually indexed and organized for fast retrieval, but the semantics is as if all assertions are matched against the query.
A deductive retriever adds the following operations:
A simple deductive retriever in Lisp is ddr.lisp. The general concepts are described in this document, and functions for using it are described here.
Deductive retrieval has these major processes:
A pattern can be any form, e.g., (in tiger savannah),
but usually a pattern will have variables, e.g., (in ?x
savannah). A form without variables is sometimes called a
constant form.
A form matches a pattern all corresponding elements of the form and pattern can be matched. Constant elements match each other if they are equal. Variable elements can match anything, but if multiple occurrences of the same variable must match the same thing.
The result of a match is a list of binding lists. A
binding list is a list of (variable value)
pairs, recording each pattern variable and the corresponding form
element.
For example, the result of matching (in ?x savannah)
with (in tiger savannah) is (((?x tiger))).
With simple patterns, the result of a match will be either
NIL or a list of one binding list. It is possible to
extend patterns in such a way that multiple bindings can occur,
i.e., there is more than one way the pattern can match the
form.
In particular, matching (in tiger savannah) with
(in tiger savannah) will produce (nil), while
matching (in tiger savannah) with (in tiger
swamp) will produce nil.
Unification introduces one additional feature to normal
pattern matching. It allows a pattern to match a pattern. This
raises a number of complexities. For example, a variable can bind
to another variable, or even to itself. Fortunately, these issues
have been worked out long ago, and the code for implementing
unification, while subtle and subject to pitfalls, is not all
that complicated. It can be found in the unify.lisp
file.
Patterns matching against patterns is important for backward chaining.
The knowledge base for our deductive retriever contains:
(in tiger savannah)(<- (near ?x grass) (in
?x savannah))(-> (predator-of ?x ?y)
(prey-of ?y ?x))Assertions and rules are stored using (tell
form), e.g.,
> (tell '(at tiger savannah)) > (tell '(<- (near ?x grass) (at ?x savannah))) > (tell '(-> (predator-of ?x ?y) (prey-of ?y ?x)))
A query is asked by typing (ask query).
The query can be any form.
If the query is a constant form, then we're effectively asking a yes or no question. For example, to ask "Is the tiger near grass?" we type:
> (ask '(at tiger savannah)) ((AT TIGER SAVANNAH))
The result is a list of answers to the query found in the KB. There's just one answer to this query.
Now consider a query very similar to the previous one:
> (ask '(near tiger grass) ((NEAR TIGER GRASS))
The query is still a "yes/no" questions, and the answer still
says "yes" but (near tiger
grass) was not explicitly stored in the knowledge base. It
was inferred using the backward chaining rule "something is near
grass if it is in a savannah."
In this case, only one rule was needed to derive the answer. In general, though, many rules may be chained together.
For example, let's add a rule that says that if you're near something, you can bite it:
> (tell '(<- (can-bite ?x ?y) (near ?x ?y)))
Now let's ask if the tiger can bite grass:
> (ask '(can-bite tiger grass)) ((CAN-BITE TIGER GRASS))
The answer is obviously yes, but it required two rules and an assertion to prove it. The retriever found a rule that said the tiger could bite grass if it was near grass. It then found a rule that said a tiger would be near grass if it was in a savannah. Finally, it found the assertion that the tiger was indeed in a savannah.
The two rules and the assertion form a chain, or proof, of the assertion that the tiger can bite grass. The retriever will find all the chains that it can. Chains always terminate in assertions.
Deductive retrieval is based on logical principles, but a deductive retriever is different from a theorem prover in a number of significant ways.
First, deductive retrievers are very limited in what they can deduce, for two reasons:
(<- predication
pattern pattern ...); You can't directly represent
statements like "Either Mary or John will come to the
party."P shows that
P is true if (NOT A) is true, while another
chain shows P is true if A is true, but neither
A nor (NOT A)has been asserted. Proving that
P is true requires assuming A and assuming
(NOT A), a technique not in our retriever.Deductive retrieval also uses techniques not available in logic. Prolog, for example, applies rules in the order in which they were defined, so that different proofs are possible, depending on the order in which rules are specified. Prolog also has a "cut" operation that allows a rule to eliminate other proof paths from the search process. Our retriever has a mechanism for allowing arbitrary Lisp code to be executed during retrieval.
There are many ways to represent a rule in English as deductive retrieval rules. Just as with the frame system, you have to develop a coherent ontology, i.e., a set of predicates and objects.
For example, consider the rule "something can satisfy its hunger with something if it's near something it eats." There are many parts to be represented in this rule:
There's no single right answer to these questions, though some answers are better than others, depending how sophisticated you need the retriever to be, how efficient it should, how easy you want it to be for others to add rules, and so on.
We've already chosen a representation for "near," namely
(near x y). "Can" is a complex concept that
still puzzles philosophers. "Hunger" is an interesting property,
because it keeps coming back. "Satisfying hunger" is not an
all-or-none result, but a matter of degree, ranging from "hungry
but surviving" to "stuffed to the gills."
However, for our purposes, we don't care about any of these
issues. We just want to know whether something will starve or
not. Therefore, we'll merge all these concepts into one
predicate. (can-satisfy-hunger-with x y) that
represents the claim that x could satisfy the problem of
hunger by eating y.
For "food it eats" we'll use the predicate (eats
x y) to say that y is a food that
x eats.
Our rule then becomes something like this:
(<- (can-satisfy-hunger-with ?x ?y)
(eats ?x ?y)
(near ?x ?y))
The difference between the above and
(<- (can-satisfy-hunger ?x)
(near ?x ?y)
(eats ?x ?y))
is purely one of possible efficiency. The predicate that generally has fewer answers should be tested first. In the extreme case, a predicate that is often false should be tested as early as possible in a rule, to avoid needless testing of other parts of the rule. The same answers would be returned by either form of the rule.
Since we already have a predator-of predicate,
let's link that to eats:
> (tell '(<- (eats ?y ?x) (predator-of ?x ?y)))
The important point is that every one of these choices for
predicates, include the near predicate, could have been
made differently. The odds are great that the rules that result
from the choices used here will not work well for a system
reasoning about spatial relationships, detailed animal behavior,
and so on.
Assume we've loaded the
assertions in *tiger-kb* in
ddr-tests.lisp.
> (load "ddr-tests.lisp") ... > (in-package :ddr-tests) ... > (init-kb *tiger-kb*) ...
Now, let's ask how our tiger can satisfy its hunger:
> (ask '(can-satisfy-hunger-with tiger ?x))
The retriever will find that it has to choose something
the tiger eats. To do that, it has to choose something a tiger is a
predator of. Suppose it picks zebra. Now, the
retriever will ask (near tiger zebra). This query will
fail, because there is no zebra in the same place as the tiger.
The fact that this query has failed doesn't mean that the
tiger can't eat something. It means that the retriever has to
"back up" and look for another answer to (predator-of tiger
...). The other possible answer is antelope.
This leads to the query (near tiger antelope)
which eventually succeeds.
The backing up process, undoing one possible match when it doesn't pan out, is called backtracking. It's one of the classic weak methods of AI, so called because it always works, even when the system is knowledge poor, but it may not work very efficiently. Basically, backtracking means writing your code to try all possibilities, because your code isn't smart enough to know the right thing to do.
Many predicates are very difficult or inefficient to do with
logical rules. For example, (< x y) should
return true if x is less than y, but try
defining this with pure logic. It can be done, but you have to
axomiatize arithmetic first.
On the other hand, our retriever is running on top of Lisp,
and it's trivial to tell if x is less than y in
Lisp. So, why not allow our retriever to call Lisp functions?
We do this in our retriever by defining retrieval
methods. A retrieval method for a predicate P is
basically a function to call when the retriever sees anything of
the form (P ...). Retrieval methods are defined
using
(define-retrieval-method name (var) body)
name
is the name of the predicate for which the method is being
defined, e.g., <.var will be bound
to the pattern beginning with that predicate, that the retriever
has come across.
The retrieval method has to be prepared for variables in its
arguments. For example, if (< ?x ?y) appears in a
rule, then ?x and ?y will be replaced by their
bindings, if any, but any unbound variable will still be in the
pattern passed to the retrieval method.
In the case of <, we'll assume that it's OK to
simply fail if there are any unbound, or otherwise non-numeric,
arguments.
(define-retrieval-method < (pat)
(destructuring-bind (x y) (pattern-args pat)
(and (numberp x)
(numberp y)
(< x y)
(make-empty-bindings)))
A retrieval method should always return a list of bindings
lists, or NIL, so that the predicate will act like every
other predicate in the system.
Comments?
Send mail to Chris Riesbeck.