CS 337 -- Intro to Semantic Information Processing -- L. Birnbaum

LECTURE 7:  CAUSAL CHAINS AND INFERENCE, II

Last time, we concluded that the representation of a coherent

paragraph describing some event consists of a CAUSAL CHAIN that links

the individual actions and states involved.  It seems intuitively

clear that understanding involves building such a representation,

since it is a direct outcome of, and reflects, our ability to explain

the event.

Behaviorally, what can we use causal chains for?

To answer questions:

Who gave the slave girl the necklace?

To summarize stories:

Jack was thirsty.  He went to the kitchen and took a beer from the

refrigerator.  His wife asked him to do the dishes.  He told her

he would do them when he finished the beer.  He went to the den.

He turned on the radio.  He heard that the weatherman predicted it

would rain tomorrow.  He went to a chair and sat down.  The chair

fell over and Jack landed on the floor.  The beer spilled all over

the chair.  When Jack's wife saw the mess, she was angry.

Summarization rules (Schank 1975):

1. Dead-end chains will be forgotten.

2. Sequential chains may be shortened.

3. Disconnected pieces will be connected or forgotten.

4. Pieces that have many causal connections are crucial.

Summary: Jack was thirsty.  He got a beer.  He went into the den.

He sat down on a chair.  The chair fell over.  Jack fell on the

floor.  The beer spilled on the chair.  Jack's wife was angry.
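
The first and third rules can be illustrated with a small sketch. It assumes the story has already been turned into events and causal links; the event names and the links below are hand-assigned for illustration, and "keep only what leads to the final outcome" is a simplification, not Schank's actual procedure. Rules 2 and 4 are not modeled.

```python
# Hypothetical encoding of the Jack story: events in story order, plus
# hand-assigned causal links (cause, effect). Both are assumptions.
EVENTS = [
    "thirsty", "get_beer", "wife_asks", "jack_promises", "go_den",
    "radio_on", "hear_weather", "sit_chair", "chair_falls",
    "jack_on_floor", "beer_spills", "wife_angry",
]
LINKS = [
    ("thirsty", "get_beer"),
    ("get_beer", "beer_spills"),
    ("wife_asks", "jack_promises"),   # dead-end chain
    ("go_den", "radio_on"),
    ("radio_on", "hear_weather"),     # dead-end chain
    ("go_den", "sit_chair"),
    ("sit_chair", "chair_falls"),
    ("chair_falls", "jack_on_floor"),
    ("jack_on_floor", "beer_spills"),
    ("beer_spills", "wife_angry"),
]


def summarize(events, links, outcome):
    """Keep only the events that lie on some causal path to `outcome`."""
    causes = {}
    for cause, effect in links:
        causes.setdefault(effect, []).append(cause)
    keep = set()
    stack = [outcome]
    while stack:                      # walk backwards from the outcome
        event = stack.pop()
        if event not in keep:
            keep.add(event)
            stack.extend(causes.get(event, []))
    return [e for e in events if e in keep]   # preserve story order


summary = summarize(EVENTS, LINKS, "wife_angry")
print(summary)
```

Under these assumed links, the dishes and the weather report drop out because nothing that follows depends on them, which is roughly the shape of the summary above.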

Let's turn to Charniak's (1972) theory.  Unlike Rieger and Schank, who

propose theories of both content and process, Charniak's theory is

basically a process theory -- it makes no commitment to the kinds of

inferences that will be drawn.

Charniak doesn't make all inferences all the time.  Some inferences

are made right away, others are deferred.

BASE ROUTINES attached to concepts make immediate inferences, and

activate DEMONS.

DEMONS represent expectations -- they draw inferences only upon

the mention of certain other concepts -- i.e., there is a notion

of RELEVANCE.

Example: The base routine for "piggy bank" will activate the following

demons:

1 If you want money, then you can get it from the piggy bank if

you get the piggy bank.

2 If you want to get money out of the piggy bank, shake it.

3 If you shake a piggy bank, and you don't hear anything, then

there is no money in it.

So consider the following story:

Janet wanted to get some money.  She found her piggy bank and

started to shake it.  She didn't hear anything.

Demon 1 links sentence 1 to sentence 2, demon 2 operates within

sentence 2, and demon 3 links sentence 2 to sentence 3.

Basic model:

                   activate demons
                 |-------<---------|
                 |                 |
                 |                 |
Incoming       Apply             Apply
assertions --> active demons --> base routines
  |              |                 |   \
  |              |                 |    \
  |--------<-----|--------<--------|     \
         inferences                   clean-up
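
The loop in the basic model can be sketched on the piggy bank story. This is a toy under stated assumptions, not Charniak's program: assertions are plain strings, a demon is a set of required assertions plus one inference, a newly activated demon is allowed to look back at what has already been asserted, and the clean-up step is left out. All names below are hypothetical.

```python
from collections import deque

# Hypothetical demons for "piggy bank": (name, required assertions, inference).
PIGGY_BANK_DEMONS = [
    ("demon 1", {"wants-money", "has-piggy-bank"},
     "got-piggy-bank-to-get-money"),
    ("demon 2", {"got-piggy-bank-to-get-money", "shakes-piggy-bank"},
     "shook-piggy-bank-to-get-money-out"),
    ("demon 3", {"shakes-piggy-bank", "hears-nothing"},
     "piggy-bank-is-empty"),
]

# Base routines are keyed by concept; here each one only activates demons.
BASE_ROUTINES = {"piggy-bank": PIGGY_BANK_DEMONS}


def understand(story):
    queue = deque(story)   # incoming assertions
    database = set()       # everything asserted or inferred so far
    active = []            # demons waiting for their conditions
    activated = set()      # concepts whose base routine has already run
    fired = []             # names of demons, in firing order

    def apply_active_demons():
        for demon in list(active):
            name, needs, inference = demon
            if needs <= database:             # all conditions present
                active.remove(demon)
                fired.append(name)
                queue.appendleft(inference)   # inferences re-enter as assertions

    while queue:
        assertion = queue.popleft()
        database.add(assertion)
        apply_active_demons()
        for concept, demons in BASE_ROUTINES.items():
            if concept in assertion and concept not in activated:
                activated.add(concept)
                active.extend(demons)         # base routine activates demons
        apply_active_demons()                 # new demons may look back
    return fired, database


story = ["wants-money", "has-piggy-bank", "shakes-piggy-bank", "hears-nothing"]
fired, database = understand(story)
print(fired)
```

Note that if the piggy bank is never mentioned, no demon is ever activated, so "hears-nothing" triggers no inference at all. That is the notion of relevance in miniature.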

Compare with Rieger:

1 No specific inference types -- i.e., no content theory of causal

inference.

2 Deferral of inference.

3 Demons are more goal-directed.

4 Handles inferences from complex conjunctions of features much

better -- i.e., the inferences are more SPECIFIC.

Now let's consider the following story:

John went to a restaurant.

He asked for a hamburger.

He paid and left.

How could we do this with Rieger inference?

From the first sentence, we can infer that John was at the restaurant

(resultative), from which we can infer that John wanted to be at

the restaurant (motivational), from which we can infer that John

wanted to eat something (functional).  We would also draw many

other, mostly irrelevant inferences, e.g., that John was somewhere

else before he went to the restaurant (enablement).

It's hard to know how much we can expect to get out of the second

sentence "bottom-up", looking only at the words it contains.

Giving Rieger the benefit of the doubt, we'll say it means "John

told someone that he wanted someone to give him a hamburger." From

this we can infer that John wanted someone to give him a hamburger

(enablement), from which we can infer (maybe) that John wanted to

have a hamburger (what inference type would this be?), from which

we can infer that John wanted to eat a hamburger (functional).

This matches the inference from sentence 1 that John wanted to eat

something.  So we know that he asked for the hamburger for the

same reason that he went to the restaurant -- in order to eat.

This hardly captures what we know here -- which is that you ALWAYS

ask for what you want to eat at a restaurant.

"He paid" means that he gave someone some money in exchange for

that someone giving something to him.  Through some kludging, we

might be able to get this to match the fact that John wanted

someone to give him a hamburger.

Thus, with a little juggling, a program using Rieger-style inference

could be made to realize that John went to the restaurant because he

wanted to eat, that he asked for the hamburger for the same reason,

and, with a little more juggling, maybe even that he ate the

hamburger, and that he paid for the hamburger.
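
The matching just described can be sketched as undirected forward chaining. This is only an illustration under assumptions: the tuple encoding, the rule set, the `USED_FOR` table, and the wildcard match are all hypothetical, and Rieger's program used many more inference types. The step from "wanted to be given" to "wanted to have" is labeled "unclassified" because its type is left open above.

```python
# Hypothetical table of what places and objects are normally used for.
USED_FOR = {
    "restaurant": ("eat", "something"),
    "hamburger": ("eat", "hamburger"),
}


def infer(fact):
    """Return (inference type, new fact) pairs that follow from one fact."""
    kind, actor, obj = fact
    if kind == "went-to":
        return [("resultative", ("at", actor, obj)),
                ("enablement", ("was-elsewhere-before", actor, obj))]
    if kind == "at":
        return [("motivational", ("wanted-to-be-at", actor, obj))]
    if kind == "asked-for":
        return [("enablement", ("wanted-to-be-given", actor, obj))]
    if kind == "wanted-to-be-given":
        return [("unclassified", ("wanted-to-have", actor, obj))]
    if kind in ("wanted-to-be-at", "wanted-to-have") and obj in USED_FOR:
        verb, what = USED_FOR[obj]
        return [("functional", ("wanted-to-" + verb, actor, what))]
    return []


def closure(fact):
    """All facts reachable from `fact` by chaining every applicable rule."""
    seen, frontier = [], [fact]
    while frontier:
        current = frontier.pop(0)
        if current not in seen:
            seen.append(current)
            frontier.extend(new for _, new in infer(current))
    return seen


def matches(a, b):
    """Facts match if equal, treating "something" as a wildcard object."""
    return a[:2] == b[:2] and (a[2] == b[2] or "something" in (a[2], b[2]))


s1 = closure(("went-to", "John", "restaurant"))
s2 = closure(("asked-for", "John", "hamburger"))
links = [(a, b) for a in s1 for b in s2 if matches(a, b)]
print(links)
```

The only connection found is that both sentences lead to John wanting to eat. The irrelevant "was-elsewhere-before" fact is generated just as readily as the useful ones, since nothing directs the chaining.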

This is rather unsatisfactory for a number of reasons:

PROCESS: Too many irrelevant inferences.

It would also infer such true but irrelevant gems as the fact

that John was somewhere else before he went to the restaurant,

that he had gone to that somewhere else, that someone gave him

the money that he used to pay, that someone gave that person

the money, etc.

Furthermore, of course, we can write restaurant stories that

leave even more implicit:

John went to a restaurant.

The salad bar looked good.

He left 5 dollars on the table.

CONTENT: We should be using more specific knowledge.

It leaves many questions unanswered: DID John eat the

hamburger or not?  It also doesn't realize:

That John probably sat down.

That he may have looked at a menu.

That a waiter took his order.

That the waiter told the cook his order.

That the waiter brought the hamburger.

That the waiter brought the check.

We also know that people ask for the food they want in

restaurants -- this is not an accidental relation.  That is,

the Rieger program would fail to recognize that the fact that

John went to the restaurant to eat, and that he asked someone

to bring him a hamburger so that he could eat, are JOINTLY

responsible for his getting something to eat.  In other words, they

are RELATED elements in a single plan.  As far as Rieger's

program would know, the above restaurant story is no different

from the following:

John went to the refrigerator.

He asked for a hamburger.

He paid and left.