Lecture 4 Quotes

More on intelligence as a computational problem

What is the problem in deciding how to act?

Todd and Gigerenzer see themselves as the heirs to Herbert Simon and his tradition in psychology and cognitive science:

The research program on ecological rationality aims to explicate the mind-world interactions underlying good decision making.

Herbert Simon proposed the metaphor of the mind and world fitting together like the blades of a pair of scissors - the two must be well matched for effective behavior to be produced, and just looking at the cognitive blade will not explain how the scissors cut.

They describe their work as discovering heuristics - "simple decision algorithms that can work well in appropriate environments" that people use both in routine behavior and in important decisions. An example relevant to the mate choice problem we talked about last time in class:

Heuristic: Try-a-dozen. To select a high-valued option from an unknown sequence, set an aspiration level at highest value seen in first 12 options, then choose next option that exceeds aspiration. This is ecologically rational if there is an unknown distribution of option values and no returning to previously seen options. Surprisingly, it leads to near-optimal performance over a wide range of sequence lengths (i.e., the number of available options matters little).
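
A minimal sketch of the heuristic in Scheme, assuming the options arrive as a list of numeric values longer than twelve and that there is no returning to earlier options (the procedure names are mine):

    ;; Set the aspiration level to the highest value among the first 12
    ;; options, then take the next option that exceeds it; if none ever
    ;; does, we are left with whatever option comes last.
    (define (take lst n)
      (if (or (zero? n) (null? lst))
          '()
          (cons (car lst) (take (cdr lst) (- n 1)))))

    (define (drop lst n)
      (if (or (zero? n) (null? lst))
          lst
          (drop (cdr lst) (- n 1))))

    (define (try-a-dozen options)
      (let ((aspiration (apply max (take options 12))))
        (let loop ((rest (drop options 12)))
          (cond ((null? (cdr rest)) (car rest))
                ((> (car rest) aspiration) (car rest))
                (else (loop (cdr rest)))))))

    (try-a-dozen '(3 7 2 9 4 8 1 6 5 2 7 3 8 10 4))  ; => 10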

We return to this and other examples momentarily. The question is how to understand such heuristics as an instance of Newell and Simon's notion of heuristic search as discussed in their 1975 Turing Award Lecture.

Planning as an explanation of behavior

Agre has a good perspective on these issues.

The ambivalence within Miller, Galanter and Pribram's theory of action [between a plan, which is selected whole from a library of action structures, and the plan, which explains all of an agent's activities as goal-directed behavior] reflects their failure to address adequately a central question: How is it that human activity can take account of the boundless variety of large and small contingencies that affect our everyday undertakings while still exhibiting an overall orderliness and coherence and remaining generally routine? In other words, how can flexible adaptation to specific situations be reconciled with the routine organization of activity?

I also like the rhetoric he uses to present his ideas as a contribution to cognitive science!

For these reasons, I propose that activity in worlds of realistic complexity is inherently a matter of improvisation. By "inherently" I mean that this is a necessary result, a property of the universe and not simply of a particular species of organism or a particular type of device. In particular, it is a computational result, one inherent in the physical realization of complex things.

Here is where things engage with the concept of choice as problem solving.

As thinking and acting intertwine, improvisation becomes a matter of continually redeciding what to do… This account is still Cartesian in the sense that each moment's action is brought about by an individual's discrete, deliberate choice, but this is still the only principled account of the relation between thought and action of which anyone can currently make any computational sense. At the same time, it is an interactionist view in one important respect: individuals continually choose among options presented by the world around them.

I propose to understand improvisation as a running argument in which an agent decides what to do by conducting a continually updated argument among various alternatives. This is an engineering proposal in the case of robots and a scientific proposal in the case of human beings… The argument [that agents conduct with themselves] might make reference to plans, maps, mnemonic devices, precedents from the actions of others, or anything else. Unanticipated issues can arise at any time, leading to new patterns of argument and possibly to changed courses of action. At any given moment, the complex of arguments leading to an agent's current actions is called its argument structure. As the agent interacts with its world, the argument structure will evolve.

Argumentation involves the adducing of arguments and counterarguments concerning a proposal for action.

Argumentation is problem solving and a way of thinking about responses to the current situation. The problem is not just to find an action to do. The problem is to find a good argument that justifies the choice of action to do next. An argument is good if all objections to it have been answered. You can test whether an argument is good in a particular situation by showing that all the potential counterarguments have been considered and disposed of. You can generate and search for such arguments by doing inference to construct arguments and taking stock of the relationships among them.
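
Here is a toy rendering of that last idea, with an objection graph I have invented for the mate-choice case. An argument is good when every objection to it has been answered, i.e., when no objection is itself a good argument:

    ;; Each argument is paired with the counterarguments that attack it.
    ;; The graph below is invented purely for illustration.
    (define counterarguments
      '((take-this-option              . (a-better-one-may-come))
        (a-better-one-may-come         . (no-returning-to-passed-options))
        (no-returning-to-passed-options . ())))

    (define (objections arg)
      (let ((entry (assq arg counterarguments)))
        (if entry (cdr entry) '())))

    (define (all pred lst)
      (or (null? lst)
          (and (pred (car lst)) (all pred (cdr lst)))))

    ;; An argument is good when all of its objections fail to be good.
    ;; (This assumes an acyclic objection graph; cycles would need a
    ;; more careful semantics.)
    (define (good? arg)
      (all (lambda (objection) (not (good? objection)))
           (objections arg)))

    (good? 'take-this-option)  ; => #t

The objection "a better option may come" is itself disposed of by the counterargument that there is no returning to passed options, so the argument for taking this option stands.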

Back to fast-and-frugal heuristics

The point of the "try-a-dozen" heuristic is not to solve a problem of finding a possible action to do. As we saw last week, when you encounter an option (for mate selection for example) it's obvious that taking it is a possible action to do. There's no problem there.

The problem is actually to find a good argument for taking it, or for not taking it. Todd and Gigerenzer's heuristics can be thought of as examples of such arguments. These arguments are surprising because they are simple to calculate, can only be objected to on a few relatively simple grounds, and tend to lead to good decisions.

The Take-the-Best heuristic. To decide which of two recognized options is greater on some criterion, search through cues in order of validity, stop search on the first discriminating cue, and choose the option favored by this cue. This heuristic is ecologically rational if cue validities vary highly, but there is moderate to high redundancy between cues. Surprisingly, it can decide more accurately than multiple regression, neural networks, and exemplar models when generalizing to new data.
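
A sketch of Take-the-Best in Scheme, assuming each option is represented as a list of binary cue values already sorted from most to least valid (the cue ordering and the data are my own illustration):

    ;; Search the cues in validity order; the first cue on which the two
    ;; options differ decides; if no cue discriminates, guess.
    (define (take-the-best cues-a cues-b)
      (cond ((null? cues-a) 'guess)
            ((> (car cues-a) (car cues-b)) 'a)
            ((< (car cues-a) (car cues-b)) 'b)
            (else (take-the-best (cdr cues-a) (cdr cues-b)))))

    ;; Which of two cities is larger? The cues, in validity order, might
    ;; be: is-a-capital, has-a-major-league-team, is-on-an-intercity-line.
    (take-the-best '(0 1 1) '(0 0 1))  ; => a, decided by the second cue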

In challenging environments with high variability, low predictability, and little opportunity for learning, good decisions may nonetheless be made more often by simple mechanisms than by complex ones… People are sensitive to the distribution of cues in an environment, appropriately applying either Take The Best or a weighted additive mechanism, depending on which will be more accurate… Exactly how people are able to determine which type of environment they are in, and then which heuristics will be appropriate to apply, remains an open question.

Turing machines, universality and cognitive models

Pylyshyn argues that viewing a process from a computational point of view - at least in the sense that is relevant for cognitive science - depends on characterizing the process semantically as well as formally. You need this to make sense of universality:

Universality implies that a formal symbol-processing mechanism can produce any arbitrary input-output function. Now, even within the confines of symbolic functions, where we abstract from the physical properties of inputs and outputs, such arbitrary flexibility seems paradoxical. After all, if the mechanism is deterministic, it obviously can realize only one function - the one that pairs its inputs with its determinate outputs. To view the mechanism as universal, its inputs must be partitioned into distinct components. One component is assigned a privileged interpretation as instructions or as a specification of a particular input-output function, the other is treated as the proper input to that function. Such a partition is essential for defining a universal mechanism. This way of looking at the phenomenon of universality is important also because it shows that there can only be arbitrary plasticity of behavior if there is some interpretation of its inputs and outputs.
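
Pylyshyn's partition can be made concrete with a toy mechanism: one fixed, deterministic procedure whose input divides into an instruction component and a proper input component (the instruction "language" here is invented):

    ;; Viewed as a whole, this computes exactly one function from
    ;; (instruction, input) pairs to outputs. Viewed through the
    ;; partition, it realizes many different input-output functions.
    (define (universal-machine instruction input)
      (case instruction
        ((double) (* 2 input))
        ((square) (* input input))
        ((negate) (- input))
        (else (error "unknown instruction" instruction))))

    (universal-machine 'double 5)  ; => 10
    (universal-machine 'square 5)  ; => 25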

And you need this to be able to explain and predict behavior at a computational or symbol-processing level, rather than just a functional or physical level.

The states of a computer, when it is viewed as a physical device, are individuated in terms of the identity of physical descriptions; and therefore, its state transitions are connected by physical laws. By abstracting over these physical properties, we can give a functional description of the device, which, for example, might be summarized as a finite-state transition diagram of the sort familiar in automata theory.

If, however, we wish to explain the computation being performed, or the regularities exhibited by a particular computer programmed to carry out a specific function over, say, numbers or sentences, we must refer to objects in the domain that are the intended interpretation, or subject matter, of the computations… Thus, to explain why the machine prints the numeral "5" when it is provided with the expression "(PLUS 2 3)" (with the symbols given their usual interpretation), we must refer to the meanings of the symbols in both the expression and the printout. These meanings are the referents of the symbols in the domain of numbers.

Pylyshyn then develops the example of the plus function in more detail, which we can translate all the way to Scheme.
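
A minimal sketch of that translation (the details are mine): the machine yields 5 for (PLUS 2 3) only relative to an interpretation mapping the symbol PLUS to addition and the numerals to numbers.

    ;; Interpret an expression relative to an environment mapping symbols
    ;; to their meanings: a numeral denotes a number, and a compound
    ;; expression denotes its operator applied to what its
    ;; subexpressions denote.
    (define environment
      (list (cons 'PLUS +)))

    (define (interpret expr)
      (if (number? expr)
          expr
          (apply (cdr (assq (car expr) environment))
                 (map interpret (cdr expr)))))

    (interpret '(PLUS 2 3))           ; => 5
    (interpret '(PLUS 1 (PLUS 2 3)))  ; => 6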

The point here is that

The ability to view a process as a computation depends on its states being structured in terms of interpretable elements.

What this [the codes being atomic] would amount to is, we would get the effect of structured codes by applying collections of rules which, in effect, re-created relevant syntactic structures solely on the basis of the code's identity. This redundant proliferation of rules, needed to accomplish something so natural in a system that assumes the codes themselves to be structured, would have to be independently motivated.
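
The contrast can be made vivid with a toy example (the codes are invented). With structured codes, one rule covers every sum; with atomic codes, a separate rule must re-create the structure for each code's identity:

    ;; Structured code: the parts are recoverable from the expression
    ;; itself, so a single rule handles unboundedly many cases.
    (define (eval-structured expr)      ; expr like (PLUS 2 3)
      (apply + (cdr expr)))

    ;; Atomic code: nothing is recoverable from the code itself, so we
    ;; need one rule per code - the "redundant proliferation" above.
    (define (eval-atomic code)
      (case code
        ((PLUS-2-3) 5)
        ((PLUS-2-4) 6)
        ;; ... and so on, one rule for every expression we might meet
        (else (error "no rule for this atomic code" code))))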

Such considerations as these also explain why programs cannot be described adequately in terms of state-transition diagrams, for example, those used in the description of finite-state automata, which, among other things, show that when a machine is in some state Si and receives input x, it will go on to state Sj; but if it receives input y, it will go into state Sk, and so on… As we have seen, the rule for adding numbers in a computer cannot be directly expressed in a finitary manner as a rule for producing a transition from state Sn to state Sn+1; rather, it must be expressed as a rule or set of rules for transforming an expression of numerals into a new expression.

Thus: to capture the rule-governed quality of computation the process must be viewed in terms of operations on formal expressions rather than in terms of state transitions. Consideration of this view provides insight into why a computer (or a brain, for that matter) is more appropriately described as a Turing machine than as a finite-state automaton, though clearly it is finite in its resources.
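
Here is the point in miniature, assuming unary numerals zero, (s zero), (s (s zero)), and so on: two finitary rules for transforming numeral expressions suffice for addition over unboundedly many numbers, where a state-transition table would need a separate state for every number.

    ;; Addition as expression transformation:
    ;;   zero + n   =  n
    ;;   (s m) + n  =  (s (m + n))
    (define (add m n)
      (if (eq? m 'zero)
          n
          (list 's (add (cadr m) n))))

    (add '(s (s zero)) '(s zero))  ; => (s (s (s zero))), i.e., 2 + 1 = 3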

The difference between Turing machines and finite-state automata might seem to be merely that one is infinite and the other finite; but that isn't so… It is true that the class of functions that can be computed by these two machines depends on the unboundedness of the Turing machine tape. As far as the relevance of the notion of Turing machine to cognition is concerned, however, the potentially infinite length of the Turing-machine tape serves to force a kind of qualitative organization on the process. Although certain cognitive capacities may be strictly finite, the fact that they apply over very large domains (for example, the number of sentences we can understand is very large) means they must be dealt with generatively.

The joys of production systems

How production systems work:

A production system has two main parts - a communication area, called the workspace, and a set of condition-action pairs, called productions. If the condition side of a production is satisfied by the current contents of the workspace, then that production is said to be evoked, and the action on its action side is carried out. The workspace resembles a public bulletin board, with the productions resembling simple-minded bureaucrats who do little except look at the bulletin board. Each bureaucrat, however, is looking for a certain pattern of "notices" on the board. When it finds that pattern, the "bureaucrat" performs whatever its action is - usually merely posting a notice of its own.
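
A minimal production-system interpreter along these lines, assuming the workspace is a list of symbols and each production posts one new symbol when its condition symbol is present (the rules are invented):

    ;; Productions: (condition . action). A production is evoked when its
    ;; condition is on the workspace and its action has not been posted.
    (define productions
      '((goal-make-tea . fill-kettle)
        (fill-kettle   . boil-water)
        (boil-water    . pour-water)))

    (define (run workspace)
      (let loop ((ws workspace) (rules productions))
        (cond ((null? rules) ws)  ; no production evoked: halt
              ((and (memq (caar rules) ws)
                    (not (memq (cdar rules) ws)))
               ;; evoked: post the action and rescan from the top
               (loop (cons (cdar rules) ws) productions))
              (else (loop ws (cdr rules))))))

    (run '(goal-make-tea))
    ;; => (pour-water boil-water fill-kettle goal-make-tea)

Note how control flows entirely through symbols posted on the workspace: the goal symbol evokes the first production, whose posted notice evokes the next, with no hidden control apparatus.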

What's good about them:

The system is responsive to a limited number of symbols at a time… [It permits] a uniform treatment of what psychologists call stimulus-bound and stimulus-independent activity… Since no hidden control apparatus exists, the flow of control must be handled by putting appropriate symbols in the workspace to evoke the relevant actions. These symbols then identify goals… Production systems are highly modular… The workspace is the dynamic, working memory of the system.