Knowledge Discussion Notes

Knowledge discussion notes - Sep 10

What do robots know? How does their knowledge relate to the computations they do? The data structures they maintain? Their capacities to act in the world? These questions turn out to be delicate and complicated. There are a bunch of themes and intuitions that we have to get our heads around.

Knowledge and representation depend on the ability to act in the world

In his AAAI Presidential address, Newell says:

Knowledge is intimately tied with rationality (the principle that actions are selected to attain the agent's goals). Systems of which rationality can be posited can be said to have knowledge. It is unclear in what sense other systems can be said to have knowledge.

Knowledge is whatever can be ascribed to an agent, such that its behavior can be computed according to the principle of rationality.

In their Turing Award lecture, Newell and Simon use the notion of designation to highlight the connection between knowledge, representation and the world:

An expression designates an object if, given the expression, the system can either affect the object itself or behave in ways dependent on the object.

A final quote:

Knowledge, in the principle of rationality, is defined entirely in terms of the environment of the agent, for it is the environment that is the object of the agent's goals, and whose features therefore bear on the way actions can attain goals. This is true even if the agent's goals have to do with the agent itself as a physical system. Therefore, the solutions are ways to say things about the environment, not ways to say things about reasoning, internal information processing states, and the like.

Philosophers have explored a number of surprising consequences to the idea that information is an abstraction of the relationship between an agent and the world. They have also developed interestingly different frameworks for characterizing what the relevant kind of relationship is. In AI, the problem of orchestrating the right relationship between a system and its environment so that the system can be understood to represent the environment is known as the symbol-grounding problem.

Brooks provides a classic set of arguments that symbol grounding is crucial and that system builders ignore it at their peril. First, his account of the background picture he is arguing against:

The implicit idea is that perception and motor interfaces are sets of symbols on which the central intelligence system operates. Thus, the central system, or reasoning engine, operates in a domain independent way on the symbols. Their meanings are unimportant to the reasoner, but the coherence of the complete process emerges when an observer of the system knows the groundings of the symbols within his or her own experience.

His worries about inputs to the central system:

The default assumption has been that the perception system delivers a description of the world in terms of typed, named individuals and their relationships… But for another task… quite a different representation might be important. Psychophysical evidence certainly points to perception being an active and task dependent operation. The effect of the symbol system hypothesis has been to encourage vision researchers to quest after the goal of a general purpose vision system which delivers complete descriptions of the world in a symbolic form. Only recently has there been a movement towards active vision which is much more task dependent, or task driven.

Without a carefully built physical grounding any symbolic representation will be mismatched to its sensors and actuators. These groundings provide the constraints on symbols necessary for them to be truly useful.

Thrun's Stanley has lots of symbolic representations. The way Stanley maintains and uses them sheds light both on Newell's views and on Brooks's. Pose estimation relates perception and action:

While GPS is available, the UKF uses only a "weak" model. This model corresponds to a moving mass that can move in any direction. Hence, in normal operating mode, the UKF places no constraint on the direction of the velocity vector relative to the vehicle's orientation. Such a model is clearly inaccurate, but the vehicle-ground interactions in slippery desert terrain are generally difficult to model. The moving mass model allows for any slipping or skidding that may occur during off-road driving.

However, this model performs poorly during GPS outages, as the position of the vehicle relies strongly on the accuracy of the IMU's accelerometers. As a consequence, a more restrictive UKF motion model is used during GPS outages. This model constrains the vehicle to only move in the direction it is pointed. The integration of the IMU's gyroscopes for orientation, coupled with wheel velocities for computing the position, is able to maintain accurate pose of the vehicle during GPS outages of up to 2 min long; the accrued error is usually in the order of centimeters. Stanley's health monitor will decrease the maximum vehicle velocity during GPS outages to 10 mph in order to maximize the accuracy of the restricted vehicle model.
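
To make the two motion models concrete, here is a minimal Python sketch. It is my illustration, not Stanley's code: the state layout, the input dictionaries, and names like moving_mass_step are assumptions made for the example.

    import numpy as np

    # State: [x, y, heading, speed]. A sketch of the two process models quoted
    # above; layout and names are assumptions, not Stanley's implementation.

    def moving_mass_step(state, accel_xy, dt):
        """'Weak' model used while GPS is available: the vehicle is a point mass
        that may move in any direction, so slipping and skidding are absorbed by
        the model rather than treated as violations of it."""
        x, y, heading, speed = state
        vx, vy = speed * np.cos(heading), speed * np.sin(heading)
        vx += accel_xy[0] * dt
        vy += accel_xy[1] * dt
        return np.array([x + vx * dt, y + vy * dt,
                         np.arctan2(vy, vx), np.hypot(vx, vy)])

    def constrained_step(state, gyro_yaw_rate, wheel_speed, dt):
        """Restrictive model used during GPS outages: the vehicle only moves in
        the direction it is pointed; orientation comes from integrating the
        gyroscopes, position from the wheel velocities."""
        x, y, heading, _ = state
        heading = heading + gyro_yaw_rate * dt
        return np.array([x + wheel_speed * np.cos(heading) * dt,
                         y + wheel_speed * np.sin(heading) * dt,
                         heading, wheel_speed])

    def predict(state, imu, wheels, dt, gps_available):
        """Select the process model based on GPS availability, as the quoted
        passage describes."""
        if gps_available:
            return moving_mass_step(state, imu["accel_xy"], dt)
        return constrained_step(state, imu["yaw_rate"], wheels["speed"], dt)

    state = np.array([0.0, 0.0, 0.0, 5.0])   # at the origin, heading east at 5 m/s
    state = predict(state, {"accel_xy": (0.2, 0.0), "yaw_rate": 0.0},
                    {"speed": 5.0}, dt=0.1, gps_available=True)

The point of the switch is that each model trusts different sensors: the weak model leans on GPS and tolerates slipping, while the constrained model leans on the gyroscopes and wheel speeds, which is why the health monitor caps speed during outages.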

Obstacle detection is finely adapted to the process by which measurements are taken - including kind of sensors, patterns of measurements, and difficulties in interpreting sensor data:

Small pose errors are magnified into large errors in the projected positions of laser points because the lasers are aimed at the road up to 30m in front of the vehicle… For some grid cells, the perceived height is enormous - despite the fact that in reality, the surface is flat. However, this error is not random. The error is strongly correlated with the elapsed time between the two scans. To model this error, Stanley uses a first-order Markov model, which models the drift of the pose estimation error over time. The test for the presence of an obstacle is therefore a probabilistic test. Given two points, the height difference is distributed according to a normal distribution whose variance scales linearly with the time difference.
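
The probabilistic test can be sketched as follows; this is an illustration rather than Stanley's implementation, and the variance constants, the 15 cm height threshold, and the confidence level are invented for the example.

    import math

    # Illustrative constants, not Stanley's calibrated values.
    SENSOR_VAR = 0.005          # baseline measurement variance per point (m^2)
    DRIFT_VAR_PER_SEC = 0.02    # pose-drift variance accrued per second between scans (m^2/s)
    MIN_OBSTACLE_HEIGHT = 0.15  # height difference that matters for driving (m)
    CONFIDENCE = 0.95

    def is_obstacle(z1, t1, z2, t2):
        """Return True if the height difference between two laser points falling
        in the same grid cell is unlikely to be explained by pose drift alone.

        Under the first-order Markov drift model, the height difference is
        normally distributed with variance that grows linearly with |t2 - t1|."""
        dz = abs(z2 - z1)
        var = 2 * SENSOR_VAR + DRIFT_VAR_PER_SEC * abs(t2 - t1)
        sigma = math.sqrt(var)
        # Probability that drift alone would produce a difference at least this large.
        p_drift = math.erfc(dz / (sigma * math.sqrt(2)))
        return dz > MIN_OBSTACLE_HEIGHT and p_drift < (1 - CONFIDENCE)

    # The same 30 cm height difference is an obstacle when the scans are close in
    # time, but plausibly drift when they are seconds apart (with these constants).
    print(is_obstacle(0.0, 10.0, 0.3, 10.1))   # True
    print(is_obstacle(0.0, 10.0, 0.3, 14.0))   # False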

It's hard to think of general mechanisms that would ground out in the specific kinds of processing that Stanley actually needs to ground its symbols. At the same time, it's clear that the fundamental motivation behind these algorithms is to keep Stanley in touch with the world - to track an external reality that is only partially reflected in its sensor readings.

Knowledge is an idealization

Knowledge is an idealization because it is an approximation that takes only certain information into account - Newell again:

One way of viewing the knowledge level is as the attempt to build as good a model of an agent's behavior as possible based on information external to the agent, hence permitting distal prediction. This standpoint makes understandable why such a level might exist even though it is radically incomplete. If such incompleteness is the best that can be done, it must be tolerated.

The knowledge level is only an approximation, and a relatively poor one on many occasions - we called it radically incomplete. It is poor for predicting whether a person remembers a telephone number just looked up. It is poor for predicting what a person knows given a new set of mathematical axioms with only a short time to study them. And so on, through whole meadows of counterexamples. Equally, it is a good approximation in many other cases. It is good for predicting that a person can find his way to the bedroom of his own house, for predicting that a person who knows arithmetic will be able to add a column of numbers. And so on, through much of what is called common sense knowledge.

This move to appeal to approximation… seems weak, because declaring something an approximation seems a general purpose dodge, applicable to dissolving every difficulty, hence clearly dispelling none.

Knowledge is also an idealization because of how it abstracts away from the underlying symbolic processes that realize knowledge:

A representation of any fragment of the world… reveals immediately that knowledge is not finite. Consider our old friend, the chess position… Given a reasonably intelligent agent who desires to win, the list of aspects of the position is unbounded… A seemingly appropriate objection is: (1) the agent is finite so can't actually have unbounded anything; and (2) the chess position is also just a finite structure. However, the objection fails, because it is not possible to state from afar a bound on the set of propositions about the position that will be available… The situation here is not really strange. The underlying phenomenon is the generative ability of computational systems, which involves an active process working on an initially given data structure. Knowledge is the posited extensive form of all that can be obtained potentially from this process. This potential is unbounded when the details of processing are unknown and the gap is closed by assuming (from the principle of rationality) the processing to be whatever makes the correct selection.
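
Newell's point about generativity is easy to make concrete with a toy example (mine, not his): a finite data structure together with an active process yields an unbounded set of true propositions about it.

    from itertools import count

    # A finite structure: a tiny directed graph of rooms.
    GRAPH = {"hall": ["kitchen"], "kitchen": ["hall", "pantry"], "pantry": ["kitchen"]}

    def propositions(start):
        """Yield an unbounded stream of true statements derivable from GRAPH, of
        the form 'there is a walk of length n from start back to start'. The
        structure is finite; the set of derivable propositions is not."""
        frontier = {start}
        for n in count(1):
            frontier = {nxt for room in frontier for nxt in GRAPH[room]}
            if start in frontier:
                yield f"there is a walk of length {n} from {start} to {start}"

    gen = propositions("hall")
    for _ in range(3):
        print(next(gen))   # walks of length 2, 4, 6, ...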

For philosophers, the approximation and idealization involved in attributing knowledge have been the source of important conceptual difficulties. Some doubt that knowledge is really a scientific notion at all: on this view, as we get more accurate theories in psychology or neuroscience, we will no longer have to make these approximations and idealizations, and we will dispense with the notion of knowledge. Others look for a basis for the approximation: a set of norms, or a privileged perspective, that shows why the approximation to knowledge is the right way to analyze systems.

Other worries center around the gap between knowledge and the actual representations that are physically instantiated. We may be uncomfortable with the extent of the abstraction and approximation involved in Newell's notion of knowledge. Brooks's robots remind us that systems can be engineered to produce specific behaviors without representation. Should we say they have knowledge anyway, if they select behaviors that match the environment in which they find themselves and the goals they are designed to achieve? Here is Brooks on Herbert:

Connell programmed Herbert to wander around office areas, go into people's offices and steal empty soda cans from their desks. He demonstrated obstacle avoidance and wall following, real-time recognition of soda-can-like objects, and a set of 15 behaviors which drove the arm to physically search for a soda can in front of the robot, locate it, and pick it up… The remarkable thing about Herbert is that there was absolutely no internal communication between any of its behavior generating modules… The laser-based soda-can object finder drove the robot so that its arm was lined up in front of the soda can. But it did not tell the arm controller that there was now a soda can ready to be picked up. Rather, the arm behaviors monitored the shaft encoders on the wheels, and when they noticed that there was no body motion, initiated motions of the arm, which in turn triggered other behaviors, so that eventually the robot would pick up the soda can… As one example of how the arm behaviors cascaded upon one another, consider actually grasping a soda can. The hand had a grasp reflex that operated whenever something broke an infrared beam between the fingers. When the arm located a soda can with its local sensors, it simply drove the hand so that the two fingers lined up on either side of the can. Given this arrangement, it was possible for a human to hand a soda can to the robot. As soon as it was grasped, the arm retracted - it did not matter whether it was a soda can that was intentionally grasped, or one that magically appeared. The same opportunism among behaviors let the arm adapt automatically to a wide variety of cluttered desktops, and still successfully find the soda can.
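
The coordination Brooks describes can be sketched roughly as follows. This is my reconstruction of the style, not Connell's code; the sensor names, thresholds, and the single control tick are invented for illustration.

    # Each behavior watches the sensors and drives the hardware directly;
    # no behavior sends a message to any other.

    class Sensors:
        wheel_encoder_delta = 0.0   # wheel motion since the last control tick
        ir_beam_broken = False      # something is between the gripper fingers

    class Arm:
        extended = False
        def extend_and_scan(self): self.extended = True
        def retract(self): self.extended = False

    class Hand:
        closed = False
        def close(self): self.closed = True

    def arm_search_behavior(s, arm):
        # The arm is never told "a can is lined up"; it notices via the wheel
        # encoders that the body has stopped moving and starts searching.
        if abs(s.wheel_encoder_delta) < 1e-3 and not arm.extended:
            arm.extend_and_scan()

    def grasp_reflex(s, hand):
        # Fires whenever the infrared beam between the fingers is broken, whether
        # the can was located by the arm or simply handed to the robot by a person.
        if s.ir_beam_broken:
            hand.close()

    def retract_behavior(hand, arm):
        # As soon as something is grasped, retract; no other module is consulted.
        if hand.closed and arm.extended:
            arm.retract()

    # One control tick: the robot has stopped and something breaks the beam.
    s, arm, hand = Sensors(), Arm(), Hand()
    s.ir_beam_broken = True
    arm_search_behavior(s, arm)
    grasp_reflex(s, hand)
    retract_behavior(hand, arm)
    print(arm.extended, hand.closed)   # False True: extended, grasped, retracted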

Achieving knowledge involves getting some very fussy details right; knowledge is particular

Here's Brooks:

Accepting the physical grounding hypothesis as a basis for research entails building systems in a bottom up manner. High level abstractions have to be made concrete. The constructed system eventually has to express all its goals and desires as physical action, and must extract all its knowledge from physical sensors. Thus the designer of the system is forced to make everything explicit. Every short-cut taken has a direct impact upon system competence, as there is no slack in the input/output representations. The forms of the low-level interfaces have consequences which ripple through the entire system.

Here's Newell:

At the knowledge level, the principle of rationality and knowledge present a seamless surface: a uniform principle to be applied uniformly to the content of what is known. There is no reason to expect this to carry down seamlessly to the symbolic level, with separate subsystems for each aspect and a uniform encoding of knowledge. Decomposition must occur, of course, but the separation into processes and data structures is entirely a creation of the symbolic level, which is governed by processing and encoding considerations that have no existence at the knowledge level. The interface between the problem solving processes and the knowledge extraction processes is as diverse as the potential ways of designing intelligent systems. A look at existing AI programs will give some idea of the diversity, though no doubt we still are only at the beginnings of exploration of potential mechanisms. In sum, the seamless surface at the knowledge level is most likely a pastiche of interlocked intricate structures when seen from below, much like the smooth skin of a baby when seen under a microscope.

It seems to me that Brooks and Newell recognize the same practical constraints on technical realization in these passages, but in some sense they draw opposite morals. Can we use ideas of knowledge, representation and function to design systems? Or is the only role of knowledge and related notions to describe the activity of systems that we don't know how to build, or of systems whose internal structures and processes are not relevant to explaining their behavior?

How might Stanley's architecture bear on this question? What does Stanley know? How is knowledge important in explaining Stanley's behavior? Stanley's design?

Figure: Stanley's software architecture (stan-arch.bmp)

Knowledge involves open-ended critical thinking and the ability to learn from experience

I think this issue is increasingly important, but it doesn't show up much in Newell's discussions or Brooks's. Consider how Stanley uses its sensors in combination to make sure that it creates one consistent model of the world, not two separate maps.

To find the road, the vision module classifies images into drivable and nondrivable regions. This classification task is generally difficult, as the road appearance is affected by a number of factors that are not easily measured and change over time, such as the surface material of the road, lighting conditions, dust on the lens of the camera, and so on. This suggests that an adaptive approach is necessary, in which the image interpretation changes as the vehicle moves and conditions change. The camera images are not the only source of information about upcoming terrain available to the vision mapper. Although we are interested in using vision to classify the drivability of terrain beyond the laser range, we already have such drivability information from the laser in the near range. All that is required from the vision routine is to extend the reach of the laser analysis. This is different from the general-purpose image interpretation problem, in which no such data would be available.

Stanley finds drivable surfaces by projecting drivable area from the laser analysis into the camera image. The learning algorithm maintains a mixture of Gaussians that model the color of drivable terrain. When a new image is observed, the pixels in the drivable quadrilateral are … clustered. The clusters are then merged with the memory of the learning algorithm, in a way that allows for slow and fast adaptation. The learning adapts to the image in two possible ways: by adjusting the previously found internal Gaussian to the actual image pixels, and by introducing new Gaussians and discarding older ones.
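
Here is a hedged sketch of that self-supervised scheme: pixels the laser has certified as drivable train a color model, and the model then classifies terrain beyond laser range. The clustering is simplified to one cluster per image, and every parameter and name below is an assumption for illustration, not Stanley's value.

    import numpy as np

    # Illustrative parameters, not Stanley's calibrated values.
    MAX_GAUSSIANS = 5
    MERGE_DIST = 30.0     # RGB distance under which new data is folded into an old Gaussian
    FAST_WEIGHT = 0.5     # weight of fresh pixels when adapting an existing Gaussian
    CLASSIFY_DIST = 3.0   # per-channel-normalized distance threshold for "drivable"

    class DrivableColorModel:
        """A small mixture of Gaussians over the color of terrain the laser has
        already certified as drivable, used to classify pixels beyond laser range."""

        def __init__(self):
            self.means, self.vars = [], []

        def update(self, drivable_pixels):
            """drivable_pixels: (N, 3) RGB values from the image region that
            projects onto laser-certified drivable ground."""
            mean = drivable_pixels.mean(axis=0)
            var = drivable_pixels.var(axis=0) + 1e-3
            for i, m in enumerate(self.means):
                if np.linalg.norm(mean - m) < MERGE_DIST:
                    # Fast adaptation: blend the existing Gaussian toward the new data.
                    self.means[i] = (1 - FAST_WEIGHT) * m + FAST_WEIGHT * mean
                    self.vars[i] = (1 - FAST_WEIGHT) * self.vars[i] + FAST_WEIGHT * var
                    return
            # Otherwise introduce a new Gaussian, discarding the oldest if full.
            self.means.append(mean); self.vars.append(var)
            if len(self.means) > MAX_GAUSSIANS:
                self.means.pop(0); self.vars.pop(0)

        def is_drivable(self, pixel):
            """A pixel beyond laser range counts as drivable if it lies close to
            any learned Gaussian under a diagonal Mahalanobis-style distance."""
            for m, v in zip(self.means, self.vars):
                if np.sqrt(((pixel - m) ** 2 / v).sum()) < CLASSIFY_DIST:
                    return True
            return False

    model = DrivableColorModel()
    # Train on pixels the laser says are road (here, synthetic sandy-road colors).
    model.update(np.random.normal(loc=[120, 110, 100], scale=5, size=(500, 3)))
    print(model.is_drivable(np.array([122, 108, 101])))   # similar color: True
    print(model.is_drivable(np.array([30, 90, 200])))     # very different color: False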

There is a sense in which this is consonant with Newell's discussion, since Newell emphasizes that an intelligent system will be able to recognize the relevant consequences of its knowledge. It is certainly relevant to Stanley that no place in the world is both drivable and not drivable. At the same time, there is a lot of focus in Newell (and in Brooks) on how the system finally selects the actions that it is going to do. The reasoning that the system does to check its knowledge and make sure that it is true is only indirectly related to any specific choice of action that the system will select. So in Newell's discussion, you could easily miss the importance of the kinds of corroboration and learning that Stanley does. It is also quite striking in retrospect how little Brooks says about learning and adaptation. Nowadays (cf. Stanley) machine learning is regarded as central to robotics.

This has philosophical consequences as well. One kind of objection to Brooks, and perhaps ultimately also to Newell, is that computer systems don't really have knowledge or representations of the world. They just have hardwired reactions. Brooks's robots surely do. But why should the more sophisticated deliberations that Newell is talking about be any different? They may be more complex reflexes, but they are reflexes just the same. Learning is a key function that makes knowledge open-ended, that makes it more useful to explain behavior with reference to the outside world than with reference to internal mechanisms. And you can imagine a system whose goals were for real-world outcomes that it could sense only indirectly, so that the critical project of learning and corroborating its knowledge was essential to its success. In short, knowledge may not just be a matter of rational, flexible behavior: it may require specific kinds of functions and architectures - like the right kinds of learning and engagement - that realize cognitive mechanisms that are important to our understanding of ourselves and our own relationship to the world.