A possible answer to Noam Chomsky

Last week I described a new approach to artificial intelligence (AI) based on sensory and motor capabilities. Now I want to show how this approach can be used in the areas of vision and language.

Traditional AI has had a preoccupation with representing the world in a way that allows it to be reasoned about. Many researchers in the area of computational vision, for example, are convinced that a faithful representation of the structure of the external world is a necessary component of a successful vision system. But systems constructed on this basis are proving very hard to extend and generalise. Is there an alternative to the representationalist approach?

Well there is rather compelling evidence that the representational view may be insufficient to account for some phenomena in natural vision, evidence which may serve as a pointer to more effective computational approaches.

A few years ago I acted as a pilot subject for a vision experiment in George McConkie's eye movement laboratory at the University of Illinois at Urbana-Champaign (www.beckman.uiuc.edu/faculty/mcconki.html).

At the time, McConkie was refining a picture perception system that permitted changes to be made to a picture on a computer display during a viewer's eye movements (saccades, as they are called). Display changes made during saccades usually go unnoticed by the subject. The system works by monitoring a subject's eyes with a computer-coupled eye tracker. The computer determines when a saccade has been launched and then makes a change to the displayed picture. So, while I viewed a street scene, fire hydrants moved, cars changed size, houses acquired or lost windows. Yet I was oblivious to these modifications, and to anything unusual happening, save for the sounds of amusement from the experimenters observing the changes I could not see. Whatever was going on in my head, I was certainly not constructing a faithful representation of the picture I was viewing. Otherwise, I would have been able to detect these gross alterations.
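To make the mechanics concrete, here is a minimal sketch of a gaze-contingent display change loop of this general kind. It assumes a hypothetical read_gaze_sample() returning screen coordinates at a fixed rate and a hypothetical swap_image() that updates the display; the threshold and geometry values are illustrative, not McConkie's actual parameters.

```python
import math

SAMPLE_RATE_HZ = 1000        # assumed eye-tracker sampling rate
VELOCITY_THRESHOLD = 30.0    # degrees/second; a common saccade-onset criterion
PIXELS_PER_DEGREE = 35.0     # assumed display geometry

def velocity_deg_per_s(prev, curr):
    """Angular velocity between two consecutive gaze samples."""
    pixels = math.hypot(curr[0] - prev[0], curr[1] - prev[1])
    return (pixels / PIXELS_PER_DEGREE) * SAMPLE_RATE_HZ

def run_trial(read_gaze_sample, swap_image):
    """Change the picture as soon as a saccade is launched.

    The swap completes while the eyes are still in flight, which is why
    the viewer typically never notices it.
    """
    prev = read_gaze_sample()
    while True:
        curr = read_gaze_sample()
        if velocity_deg_per_s(prev, curr) > VELOCITY_THRESHOLD:
            swap_image()   # e.g. move a fire hydrant, resize a car
            break
        prev = curr
```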

Kevin O'Regan of Rene Descartes University in Paris explains our lack of sensitivity to changes of this sort by suggesting that we treat the world as an external memory, which we sample as needed, depending on the task and the context. When you think about it, it makes economic sense to concentrate processing on only the relevant bits of a visual scene. You can get some idea of what it felt like to be McConkie's subject by viewing a computer-based experiment devised by O'Regan and his colleagues at pathfinder.cbr.com/people/clark/java/flicker.html.
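Demonstrations of this sort typically use what vision researchers call the flicker paradigm: two nearly identical pictures alternate with a brief blank between them, and the blank masks the motion signal that would otherwise draw your eye straight to the change. A minimal sketch of the presentation loop, assuming a hypothetical show(image, ms) display routine and illustrative timings:

```python
ORIGINAL_MS, BLANK_MS = 240, 80   # illustrative durations, in milliseconds

def flicker(original, modified, blank, show, cycles=100):
    """Alternate two nearly identical pictures with brief blanks between them.

    The blank wipes out the local motion signal at the changed location,
    so viewers can take a surprisingly long time to spot the difference.
    """
    for _ in range(cycles):
        show(original, ORIGINAL_MS)
        show(blank, BLANK_MS)
        show(modified, ORIGINAL_MS)
        show(blank, BLANK_MS)
```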

Since the very beginning of AI, it has been a goal to develop programs that understand natural language. While there has been considerable progress in this area in recent years, there is a general sense that there is an upper bound on the performance of systems employing traditional symbolic AI approaches. You can achieve a reasonable level of performance, but knowledge of the topic is everything.

If you stray outside the relevant domain, you are met with the computer equivalent of a blank stare of incomprehension.

A number of researchers argue that to achieve a breakthrough in this area we must pay closer attention to the best examples so far of natural language understanding systems: humans.

Recently, researchers at the University of California at San Diego led by Jeff Elman (crl.ucsd.edu/elman/) have been taking the concept of language embodiment seriously. Elman and his group use artificial neural networks to model the behaviour of language learners. Artificial neural networks have emerged as a new and promising branch of AI.

They have some of the characteristics of the real neural networks found in our brains, but are significantly simplified. One of their more important features is their ability to learn from experience. Elman's experiments involved training an artificial neural network to learn a small artificial English-like language. He used only grammatical sentences, mimicking the language a child is exposed to. But when he first tried to train a network to learn the language, it failed to do so.

However, he discovered that if he limited the network's memory capacity early in training and then gradually increased it, the network could ultimately master the grammar. This gradual increase in capacity is analogous to what happens to a child's cognitive abilities as he or she develops. Elman's network succeeded in learning the complex grammar by first acquiring a simpler version of it.
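Here is a minimal numpy sketch of that "starting small" idea: a simple recurrent (Elman-style) network whose context memory is wiped every few words early in training, with the window widened in stages. The toy corpus, the layer sizes, and the output-layer-only weight update are illustrative simplifications, not Elman's actual grammar or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 10, 20   # placeholder sizes, not Elman's

# Weights: input -> hidden, context (previous hidden) -> hidden, hidden -> output.
W_in = rng.normal(0.0, 0.1, (HIDDEN, VOCAB))
W_ctx = rng.normal(0.0, 0.1, (HIDDEN, HIDDEN))
W_out = rng.normal(0.0, 0.1, (VOCAB, HIDDEN))

def one_hot(i):
    v = np.zeros(VOCAB)
    v[i] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_pass(tokens, memory_window, lr=0.1):
    """One pass of next-word prediction with memory truncated to the window."""
    global W_out
    context = np.zeros(HIDDEN)
    for t in range(len(tokens) - 1):
        if t % memory_window == 0:
            context = np.zeros(HIDDEN)   # wipe memory: the "limited" learner
        hidden = np.tanh(W_in @ one_hot(tokens[t]) + W_ctx @ context)
        probs = softmax(W_out @ hidden)
        # Gradient step on cross-entropy, output layer only (a simplification;
        # Elman trained all the weights by backpropagation).
        W_out = W_out - lr * np.outer(probs - one_hot(tokens[t + 1]), hidden)
        context = hidden                 # carry the hidden state forward

# "Starting small": widen the memory window in stages across training.
corpus = rng.integers(0, VOCAB, 500)    # stand-in for sentences from a grammar
for window in (3, 5, 8, 13):
    for _ in range(5):
        train_pass(corpus, window)
```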

What is particularly interesting about this discovery is that, counter-intuitively, a capacity limitation early in development turns out to be an advantage. With Elman's finding we can discern the shape of a response to Chomsky's position on language learnability. Chomsky has argued that we must have an innate knowledge of language if we are to learn natural languages under the conditions a child does. However, a child acquires language under certain cognitive capacity limitations, and these limitations act as a filter on the complexity of the language, transforming it into a simpler, more learnable one. The incremental expansion of these capacities then allows additional features of the language to be built on this simpler foundation. In other words, we can surmount the difficulties Noam Chomsky identified by taking account of the nature of the learner, by treating language as an embodied process.

The positive side of the embodiment approach to AI is the focus it provides on human brain function. The disadvantage is that it may postpone research on significantly intelligent systems and shift it to more basic engineering problems. Projects such as Rodney Brooks' Cog (see last week's article) require a considerable amount of preliminary engineering research before the fruits of this approach can be assessed.

One of the more promising trends in this line of research is the increased collaboration between AI, cognitive science, and neuroscience. We are now reaching a point where the non-invasive techniques available to neuroscientists to see what's going on inside people's brains as they understand language, for example, are becoming more precise.

Consequently, information from the cognitive sciences and the neurosciences will serve as an increasingly important set of constraints on future AI systems.