MIT neuroscientists reveal how the brain learns to recognize objects
Understanding how the brain recognizes objects is a central challenge for understanding human vision, and for designing artificial vision systems. (No computer system comes close to human vision.) A new study by MIT neuroscientists suggests that the brain learns to solve the problem of object recognition through its vast experience in the natural world.
Take, for example, a dog. It may be sitting nearby or far away, standing in sunshine or in shadow. Although each variation in the dog's position, pose or illumination produces a different pattern of light on the retina, we still recognize it as a dog.
One possible way to acquire this ability to recognize an object, despite these variations, is through a simple form of learning. Objects in the real world usually don't suddenly change their identity, so any two patterns appearing on the retina in rapid succession likely arise from the same object. Any difference between the two patterns probably means the object has changed its position rather than having been replaced by another object. So by simply learning to associate images that appear in rapid succession, the brain might learn to recognize objects even if viewed from different angles, distances, or lighting conditions.
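To make the idea concrete, here is a minimal sketch in Python of one way such a temporal-contiguity rule could operate. The toy "retinal patterns," the single linear model neuron, and the learning rate are all illustrative assumptions introduced here; they are not part of the study's methods.

```python
import numpy as np

# A minimal sketch of the temporal-contiguity idea, assuming a toy world:
# "retinal patterns" are random vectors standing in for the image of a dog
# at two sizes, and the "neuron" is a single linear unit. These choices are
# illustrative, not taken from the study.
rng = np.random.default_rng(0)
n_inputs = 100
dog_small = rng.normal(size=n_inputs)   # the dog seen far away (small on the retina)
dog_large = rng.normal(size=n_inputs)   # the same dog seen up close (large)

# Start with a neuron that happens to respond well only to the small dog.
w = 0.05 * dog_small.copy()

def response(pattern):
    """The model neuron's firing to a retinal pattern (a simple dot product)."""
    return float(w @ pattern)

print("before learning:", response(dog_small), response(dog_large))

# Temporal-contiguity rule: when two patterns arrive in rapid succession,
# nudge the response to the later pattern toward the response to the earlier
# one, on the assumption that both came from the same object.
learning_rate = 0.001
for _ in range(500):
    for earlier, later in [(dog_small, dog_large), (dog_large, dog_small)]:
        error = response(earlier) - response(later)
        w += learning_rate * error * later

print("after learning: ", response(dog_small), response(dog_large))
# The two responses converge: the model neuron now responds to the dog at
# either size, the kind of tolerance the hypothesis says experience builds.
```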
To test this idea, called "temporal contiguity," graduate student Nuo Li and associate professor James DiCarlo at MIT's McGovern Institute for Brain Research "tricked" monkeys by exposing them to an altered visual world in which the normal rules of temporal contiguity did not apply. They recorded electrical activity from individual neurons in a region of the monkey brain called the inferior temporal cortex (IT), where object recognition is thought to happen. IT neurons respond selectively to particular objects; a neuron might, for example, fire more strongly in response to images of a Dalmatian than to pictures of a rhinoceros, regardless of the image's size or position on the retina.
In the new study, which appears in the Sept. 23 issue of Neuron, monkeys observed an object on a computer screen as the object became larger or smaller, as though it were approaching or receding from view. But in some cases, the researchers replaced an object with another as it changed in size. For example, as a Dalmatian became larger on the screen, it suddenly transformed into a rhinoceros.
"We know that IT neurons are involved in object recognition, so our prediction was that these neurons would become confused," explains Li. "By exposing them to this artificial visual experience, we undermined the regularities that we hypothesized teach neurons to recognize the object at multiple sizes."
After a few hours, each IT neuron did indeed become confused. For example, a neuron that preferred a dog over a rhino (regardless of size) began to lose this preference specifically among large dogs and large rhinos (the size at which the temporal contiguity rules had been broken by the researchers). In some cases, the object preferences even started to reverse, and the neuron would begin to prefer large rhinos over large dogs. In other words, the altered visual experience was not merely degrading existing patterns of selectivity but also creating new ones. The new results offered the strongest support yet for the "temporal contiguity" hypothesis of object representation.
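The same toy rule, run on a simplified version of the swap exposure, illustrates how such a size-specific reversal could arise. Again, the objects, numbers, and pairing scheme below are hypothetical stand-ins, not the researchers' stimuli, data, or analysis.

```python
import numpy as np

# A hypothetical toy version of the swap exposure, using the same contiguity
# rule sketched earlier. All quantities are illustrative assumptions.
rng = np.random.default_rng(1)
n_inputs = 100
dog_small, dog_large = rng.normal(size=n_inputs), rng.normal(size=n_inputs)
rhino_small, rhino_large = rng.normal(size=n_inputs), rng.normal(size=n_inputs)

# A model neuron that starts out preferring the dog over the rhino at both sizes.
w = 0.05 * (dog_small + dog_large) - 0.05 * (rhino_small + rhino_large)

def response(pattern):
    return float(w @ pattern)

def preference():
    """Dog-minus-rhino response at the small and large sizes."""
    return (response(dog_small) - response(rhino_small),
            response(dog_large) - response(rhino_large))

print("preference (small, large) before:", preference())

# Swap exposure: as an object grows on the screen its identity is switched,
# so the contiguity rule wires the wrong identity to the large size.
learning_rate = 0.001
for _ in range(500):
    for earlier, later in [(dog_small, rhino_large), (rhino_small, dog_large)]:
        error = response(earlier) - response(later)
        w += learning_rate * error * later

print("preference (small, large) after: ", preference())
# The preference at the large (swapped) size weakens and reverses, while the
# small-size preference is left largely intact, echoing the size-specific
# confusion described above.
```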
"Our monkeys saw only a few hundred examples of these altered visual stimuli during the experiment," says DiCarlo. "If we extrapolate to a lifetime of visual experience, we think this effect is a major contributor to object constancy."
In earlier work, Li and DiCarlo had shown a similar effect when object identities were switched during eye movements. "But we couldn't tell whether that result was peculiar to eye movements – for example, you don't need eye movements to observe a large image of a car changing into a small image of the same car," says DiCarlo. "Our new results strongly suggest that this is a general mechanism for learning about object identity across a wide range of real-world conditions."
The researchers did not measure the monkeys' behavior during the new study, so it is not known whether the monkeys themselves became confused by the altered visual experience. But similar effects have been seen in human studies, and it seems likely that the changing pattern of neural activity would lead to changes in perceptual judgments. DiCarlo and colleagues plan to test this in future studies.