The writeup below describes a very simple approach to memory-based parsing. The code that implements this simple approach is in parse.lisp. The accompanying file parse-tests.lisp has the toy examples given below.

We will model parsing for a MOP-based memory as a process that takes a list of words an input and returns a list of the MOPs in memory that the text might be referring to. We'll call this model memory-based because the parsing rules will use lexical, syntactic and semantic knowledge. We'll call this partial parsing because we won't try for completeness. The parser will accept and ignore words it doesn't know or concepts it can't combine together, on the assumption that it's knowledge is very incomplete.

Our parser will return multiple answers because input texts are often ambiguous and can be interpreted in several different ways. More specifically, our parser will return three values:

A parse result is an object that contains the concept identified and any additional attributes specified, the input words thatwere used to construct the result (in reverse order), and any words that followed the words used.

Here are some simple examples. First, an example that, in this small knowledge base is unambiguous:

> (parse '(catch a butterfly))
      M-CATCH-ANIMAL7      an instance of catching an animal
      (#<M-CATCH-ANIMAL (:OBJECT (M-BUTTERFLY)) Used: (A BUTTERFLY CATCH)>)
      (#<M-CATCH-ANIMAL (:OBJECT (M-BUTTERFLY)) Used: (A BUTTERFLY CATCH)>
      #<M-BUTTERFLY Used: (A BUTTERFLY)>)

We can see the memory structure that resulted with show-frame:

> ((show-frame 'm-catch-animal7)
         TYPE :INSTANCE
         ISA M-CATCH-ANIMAL
         :OBJECT M-BUTTERFLY
            TYPE :MOP
            ISA M-INSECT

Another unambiguous sentence that uses "catch" to mean something very different:

>  (parse '(catch a cold))
      M-GET-DISEASE8
      (#<M-GET-DISEASE (:OBJECT (M-COLD-DISEASE)) Used: (COLD CATCH)>)
      (#<M-GET-DISEASE (:OBJECT (M-COLD-DISEASE)) Used: (COLD CATCH)>
      #<M-COLD-DISEASE Used: (COLD)>)

Again, to see the contents of the new instance:

> ((show-frame 'm-get-disease8)
         TYPE :INSTANCE
         ISA M-GET-DISEASE
         :OBJECT M-COLD-DISEASE
            TYPE :MOP
            ISA M-DISEASE

Finally, an ambiguous sentence, using the word "bug" which might mean an insect or a disease:

>  (parse '(catch a bug))
      NIL            no instance because of ambiguity
      (#<M-CATCH-ANIMAL (:OBJECT (M-INSECT)) Used: (BUG CATCH)>
      #<M-GET-DISEASE (:OBJECT (M-DISEASE)) Used: (BUG CATCH)>)
      (#<M-GET-DISEASE (:OBJECT (M-DISEASE)) Used: (BUG CATCH)>
      #<M-CATCH-ANIMAL (:OBJECT (M-INSECT)) Used: (BUG CATCH)>
      #<M-DISEASE Used: (BUG)>
      #<M-INSECT Used: (BUG)>)

Basic Concepts

We're going to attach phrases directly to concepts in memory. A phrase is a list of items. The simplest items are words. Phrases are attached to concepts with defphrase, e.g.,

(defphrase m-tiger tiger)
          
         (defphrase m-kangaroo-rat kangaroo rat)

The general syntax is (defphrase concept item1 item2 ...). The general rule for processing a phrase is:

If you see all the item of phrase in the input, in the given order, then you've recognized the phrase, and therefore have seen a reference to the concept to which the phrase is attached.

The kicker is that another kind of item can appear in a phrase: a role name. When a role name appears, it represents whatever kind of concept fills that role in the concept to which the phrase is attached. So if the phrase (:actor eat :object) is attached to a MOP for eating, then :actor means whatever concept can be an actor for eating, e.g., an animal concept, and :object means whatever concept can be the object of eating, e.g., a food concept.

A concept can have many phrases attached, using multiple defphrase's, and the same phrase can be attached to many concepts. The latter is where many, though not all, ambiguities arise.

A Parsing Algorithm

The basic algorithm uses two steps:

This two-phase approach is common in large knowledge bases using potentially expensive matching operations.

The fetch phase implements the following rules to decide what phrases might apply to a given list of input words. For a phrase to be a potential parser of the input, it must:

Note that the input may have words not in the phrase, that keywords are almost ignored, and that a phrase with all keywords only rejects inputs shorter than it.

The rules for the match phase are slightly more complicated. Matching is done from left to right:

For example, when matching (catch :object) with the base concept m-get-disease against the input (catch a cold):

This is where the list generated by the fetch phrase comes in handy. The only phrases that might satisfy the first condition have already been fetched. Therefore, all the matcher has to do is try those phrases in that list whose base concept satisfies the second condition. There's no need to search all of phrase memory again.