Benchmark Problems On Production Rules

Benchmark problems on production rules

Summary: Think of this as a midterm, which allows both you yourself and the instructional team to assess your progress in computational thinking in a quantitative way. Work on your own, but feel free to ask questions of the instructional team, both about what the assignment asks for and about the progress that you are making towards a solution. This is a real, interesting, significant program, so it won't be easy. But we'd like everybody to solve it.

For now (Friday Oct 12, 8am), just a the instructions for part one of the programming project are provided. This should be enough to keep you busy this week. Updates as they become available

Overview

In this assignment you will implement a simple production rule system. This kind of system is also known as a deductive database. This metaphor will become clear over the course of the assignment description.

Objectives

This assignment has a number of objectives. The most important is for you to practice the programming skills that you have learned so far in the class. The description of the assignment breaks down the overall project that you need to implement into a set of small parts. Each of these parts can be implemented as a simple scheme function using the kinds of thinking and problem solving that you have been doing so far.

A second objective, however, is to get a sense of how large programs fit together. You can expect that it will take you a lot of work to understand the design for the production rule system as a whole. There are many parts and the way they fit together is very precise. That is typical of programs, so as you develop an understanding of the program as a whole you will get some important insights into the architecture of programs. A particular thing to pay attention to is the way that an overall architecture helps to clarify what a function needs to do - for example why a function has the arguments it has or why it returns its results in the way that it does.

A final objective is to appreciate production rule systems as models for cognitive science. Production rule systems have tons of applications, and they are one of the most effective and inspiring testbeds for linking together algorithms, knowledge and problem solving. We have some examples of this: transitive closure and spatial reasoning; planning; and natural language parsing. Understanding from the inside how these algorithms run - seeing the implementation all the way through and watching the consequences - should give you a much more precise way to think about cognitive decision making.

Workflow

The procedure for doing the assignment is to download an initial scheme file called the skeleton of the assignment. The skeleton contains all the low-level code that you can rely on to complete the assignment. It contains comments that indicate each of the functions that you need to write to complete the assignment. It contains test examples that you can use to make sure that each of these functions is working. And it contains some overall examples that illustrate the functionality of the production rule system as a whole. Then read through the assignment to get an overall picture of what you have to do. Once you've done this, think things through to make sure you have a clear idea of what is required and how to proceed. Then go through the skeleton code gradually implementing the functions that are required to complete the assignment. Be sure to test each function as you write it. That way you find a bug as soon as it appears. In this assignment there are high-level functions that depend on lower-level functions. If you test and debug as you go along you will never get bitten by a bug in a low-level function that you have to discover when a high-level function misbehaves.

Assignment outline

The organization of the assignment is as follows. First, I describe the structure of facts, rules, and bindings. This leads to a characterization of the basic operation of a production rule system, which is to apply a rule to a corresponding fact. In this section, there are a few key functions to write. They can be tested in a simple way, on their own. I encourage you to do this as the first step of the programming for this assignment.

Second, I describe the fundamental control mechanism for a production rule system. There is a bank of facts and a bank of rules that have already been processed. Then there is a queue of facts and rules that have not yet been processed. The basic step is of control is to take a new item off the queue, find the new results from processing this item against the background of facts and rules, and to add those resulting items back onto the queue. You are given an implementation of queues (we will see why over the course of the next few weeks), so this also provides practice with the crucial art of building on existing computational components!

Third, I describe an optional optimization that you can add to your solution. This optimization avoids adding new results to the database when more general results are already present there. This is a difficult programming problem because you need to avoid adding duplicate rules as well as duplicate facts. However, the optimization is important for underwriting a good characterization of the algorithmic complexity of your production rule implementation. We will return to this issue in class over the next few weeks.

Part One: Facts and Rules

Data structures

Facts and rules are symbol structures. They make a distinction between a special kind of symbol called a variable and everything else. The skeleton code contains a function

that returns true if and only if the input symbol s is a variable. It happens to be true if s begins with a question mark. So for example the symbol

is a variable while the symbol

is not a variable. This turns out to be a common convention in Artificial Intelligence.

A fact is determined by a list of symbols that contains no variables. Such structures are also known as ground terms in computational logic. Concretely, then, a fact has the form:

A very famous encoding of facts into a database concerns the "blocks world", which describes the arrangement of children's alphabet blocks in stacks on a work table. You have symbols to represent the blocks, which typically take the form blockn where n is a small number. You have the symbol table to represent the table. You have a relation symbol on which indicates that one block is immediately stacked on top of another in a tower, and a relation symbol above which indicates that one block is located in the same tower as another, anywhere higher up. In this setting, you might have sample facts like these to describe one complete tower:

This situation describes a tower consisting of three blocks stacked on top of one another: block1, block2 and block3. You can see that the above relation is pretty cumbersome to specify and it can be determined automatically from the basic relation of what blocks are on what other blocks. That provides the basis for rules.

A rule has two parts: a pattern and an action. A pattern is a list of symbols, possibly containing a set of variables. An action uses those variables to abstract from a fact or a rule. Note that rules are recursive! Rules can have rules as parts just as all the programs that we have considered so far in class have other programs as parts.

Intuitively, the variables in a pattern are placeholders for a range of specific symbols that the rule might apply to. To use the rule we find a fact that specifies particular values of the variables. Then we replace the variables in the action with these values. And we take the action.

Concretely, a rule has the form:

Here are two example rules.

This says that if block ?a is on block ?b, then it must also be true that block ?a is above block ?b. This is a simple kind of rule that infers one fact from another.

This is more complicated. It says that if ?a is immediately on top of block ?b, and block ?b in turn is anywhere in a tower above another block ?c, then ?a must also be in that tower above ?c. This rule works recursively. Whenever we find one block on another, this rule "fires" and adds a new rule to the situation. That rule encodes our knowledge about the two blocks by looking to extend tower information we have from the lower block to the higher block.

Examples of inference

Let's see how these rules would be instantiated in the particular situation. First off, we can match the first fact with the first rule:

This requires taking ?a to be block1 and ?b to be table. Based on this match, we instantiate the fact which is the action of the rule:

Similar inferences allow us to infer:

Now we can consider the second rule. Again it matches based on the on facts in the database. For example one match is:

Again we take ?a to be block1 and ?b to be table. Based on this match, we have a new rule to add to the knowledge base which represents the action of the rule.

In other words, if we find out that something's under the table (so to speak), then we will also learn that it's under block1. This rule will never do anything, given how the blocks world works. But there are two analogous rules that we can also add in this situation that will start doing work for us. These are the matches with block2 and block3.

These new rules match the above facts we inferred by the first rule we considered - in what you might think of as a basically recursive way. From here:

we get:

and analogously

Processing has to continue with these new facts! We can now run the inference to match

And we finally get

The match operation

We start by building up in pieces a function that computes and action, if possible, from a rule and a fact.

The logic of match

For example, in this case:

The result would be

However, in the case

The result would be

To be precise, the result of match is a list with two elements. The first element is a boolean value which indicates whether the rule applies to the fact to yield a match. The second value is the match if there was one, or the empty list otherwise. This is a common way of returning a complex result for a computation in Scheme. It is especially useful in cases such as this where a function may not succeed, or it may not always make sense to apply the function to the available data.

Bindings

In order to implement match, you will have to use a bunch of auxiliary functions and data structures. This is all spelled out in this section. The key data structure that you need is called a binding. It is a list of pairs of the form (variable, value). Aligning the pattern of the rule and the fact, if it is successful, creates a binding that indicates how the rule applies to this case. In the case of a successful match, substituting according to the binding in the action determines the fact or rule that represents the rule result. So the overall match operation boils down to two steps: align and substitute.

A basic operation that you will therefore need to start is a lookup operation that takes a variable and a binding and gets the value. You've written code like this before, but in this case you need to be prepared that you might not find a value for the variable and you need to indicate whether you found a value for the variable as a result. So you write a function

Which returns a two-element list, as usual, giving whether the variable was defined in the binding and if it was what its value is. By convention, the value of a variable that is not found in the binding is itself. For example:

And

The logic of Align

The specification for the align operation is to take a fact and a pattern and return an optional binding, if there is one, using the same convention with a two-part list as with match. So we write a definition

And we'll get results like:

Or

Align is an interface to a lower-level recursive helper function align-step which incrementally extends an input binding by stepping through the list of symbols associated with a fact - call it the nucleus of the fact - one element at a time. The idea of such a helper function should be familiar - it is the same kind of thing that we used in adding up the elements of a list or evaluating a UFO program. Concretely, we have a definition

And we will get results like

(And the same for every binding!) Or:

And so forth.

It even helps to write a further helper function Align-term defined this way:

That just looks at two elements - one from the nucleus and one from the pattern - and handles separately the case where the element in the pattern is a variable and the case where the element in the pattern is a constant.

The logic of Substitute

Substitute can be written directly by recursion. It is easiest if you just recurse on an arbitrary scheme data structure, applying the substitution whenever a constituent is a list. So define

With results like: