Laurent Itti of the University of Southern California's Viterbi School of Engineering and Pierre Baldi of the University of California Irvine's Institute for Genomics and Bioinformatics will present their results December 7 at the Neural Information Processing Systems (NIPS) Conference in Vancouver, B.C.
Itti and Baldi went back to first principles in developing their theory, starting from Claude Shannon's fundamental work creating (in the title of his classic 1948 paper) "A Mathematical Theory of Communication." The pair's mathematical theory of surprise proposes an alternative mode for characterizing and quantifying information, distinct from Shannon's model -- a subjective one.
Shannon's framework is not about a specific observer but about any observer seeking to pick a message out of its noisy environment, or to send one with assurance that it will be read accurately, according to Itti, a research assistant professor in the Viterbi School's department of computer science.
But the same noisy environmental buzz that messages must be packaged to survive itself contains information crucial to individuals -- information, such as potential threats or opportunities, that does not arrive in message form. Individuals clearly develop mechanisms by which they devote attention to certain stimuli, while ignoring others, in the flood of information that they receive from their senses.
As Itti and Baldi write, "efficient and rapid attentional allocation is key to predation, escape, and mating -- in short, to survival."
According to the researchers, previous computational work on the problem has been phrased in the vocabulary of the stream of electronic data making up a video image, as a proxy for the much more complex mixture of sights, sounds, smells and more in a real environment.
Analyzing such a stream, researchers can isolate stimuli with visual attributes that are unique in the mix by breaking down the signal into "feature channels," each describing a particular attribute (e.g., color) in the mix. Such features are called "salient." Itti himself previously developed a measure of saliency.
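As a rough illustration of the feature-channel idea -- a deliberately crude sketch, not the saliency model described here -- a single channel can be scanned for locations that stand out from their surroundings. All names and parameters below are invented for the example:

```python
import numpy as np

def center_surround(channel, surround_size=9):
    """Crude center-surround contrast: each pixel minus the mean of a
    local box around it. A stand-in for the multi-scale filters used
    in real saliency models."""
    h, w = channel.shape
    k = surround_size
    padded = np.pad(channel, k // 2, mode="edge")
    surround = np.zeros((h, w))
    for i in range(h):           # naive loop, kept simple for clarity
        for j in range(w):
            surround[i, j] = padded[i:i + k, j:j + k].mean()
    return np.abs(channel - surround)

# Toy image: uniform gray with one bright "odd one out" patch
img = np.full((32, 32), 0.5)
img[14:18, 14:18] = 1.0
saliency = center_surround(img)
peak = np.unravel_index(saliency.argmax(), saliency.shape)
# The unique bright patch produces the strongest contrast, i.e. it is "salient"
```

The design choice here -- contrast against a local neighborhood rather than a global average -- is what makes a feature "unique in the mix" score high even when the image as a whole is busy.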
A parallel analysis performs similar operations, but does so over time, not space, looking for new elements suddenly appearing. This approach is said to model "novelty."
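A minimal temporal sketch of that idea, invented for illustration and far simpler than the models discussed: each frame is compared against a decaying running average of past frames, so an element that suddenly appears scores high and then habituates.

```python
import numpy as np

def novelty(frames, alpha=0.1):
    """Per-pixel novelty: deviation of each frame from an exponential
    running average of the past. Sudden appearances score high."""
    avg = frames[0].astype(float)
    scores = []
    for f in frames[1:]:
        scores.append(np.abs(f - avg))
        avg = (1 - alpha) * avg + alpha * f   # slowly absorb the new frame
    return scores

# Static scene for five frames, then an object appears at pixel (3, 3)
frames = [np.zeros((8, 8)) for _ in range(10)]
for f in frames[5:]:
    f[3, 3] = 1.0
scores = novelty(frames)
# Novelty spikes when the object first appears, then decays as it
# becomes part of the expected scene
```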
Finally, an analysis can be done purely in terms of Shannon's original equations, which can measure the level of organization or detail found in the data flow, its entropy.
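As a generic illustration (not the specific entropy analysis used in the study), the entropy of a data stream can be estimated from its histogram: a constant stream carries no detail and scores zero, while an evenly spread one scores the maximum for the chosen bin count.

```python
import numpy as np

def shannon_entropy(values, bins=8):
    """Shannon entropy, in bits, of the histogram of a data stream."""
    counts, _ = np.histogram(values, bins=bins)
    probs = counts / counts.sum()
    probs = probs[probs > 0]                 # treat 0 * log(0) as 0
    return float(-(probs * np.log2(probs)).sum())

flat_stream = np.ones(800)                   # no variation at all
varied_stream = np.linspace(0.0, 1.0, 800)   # evenly spread values
low = shannon_entropy(flat_stream)           # -> 0.0 bits
high = shannon_entropy(varied_stream)        # -> log2(8) = 3.0 bits
```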
Itti and Baldi say that in present research, the definitions of both saliency and novelty are empirical, based on analysis of visual streams rather than on predictions about them derived from basic principles.
Their theory boldly proposes to make just such predictions, working from probability theory applied to digital data. The probability theory involved is that known as "Bayesian," which amounts to a way of structuring events observed in the past into predictions about the future.
The equation for making this guess is well known, developed from the probability studies of the English mathematician Thomas Bayes (1702-61). Itti and Baldi work out a way of applying it to the data in a video stream, providing a measure of how observing new data will affect the set of beliefs an observer has developed about the world on the basis of data previously received. "Data that does not change your beliefs is not surprising," says Itti.
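Itti's remark can be made concrete with a toy version of the idea -- a sketch over a discrete grid of candidate beliefs, not the authors' actual formulation. One natural way to quantify "how much an observation changes your beliefs" is the relative entropy (Kullback-Leibler divergence) between the belief distribution after the observation and the one before it; the details below (grid size, number of observations) are invented for the example:

```python
import numpy as np

def bayesian_surprise(prior, likelihood):
    """Surprise of one observation, in bits: the KL divergence between
    the Bayesian posterior and the prior over a discrete parameter grid."""
    posterior = prior * likelihood
    posterior /= posterior.sum()             # Bayes' rule, normalized
    return float((posterior * np.log2(posterior / prior)).sum())

# Belief about a Bernoulli parameter p, starting out flat (know-nothing)
p = np.linspace(0.01, 0.99, 99)
belief = np.ones_like(p) / p.size

# Watch a stream of identical observations (a "1" every time)
surprises = []
for _ in range(20):
    surprises.append(bayesian_surprise(belief, p))  # likelihood of a "1" is p
    belief = belief * p                             # update the belief
    belief /= belief.sum()
# Surprise shrinks with each repetition: once the data stops changing
# the observer's beliefs, it stops being surprising
```

The same habituation effect is what the researchers exploit at scale: regions of a video that keep forcing belief revisions attract attention, while statistically predictable regions do not.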
The next step is to use this theory to analyze a video stream and identify the stream's most "surprising" features. Finally, having performed this analysis, they checked it by recording the eye movements of observers watching the images, to see whether the eyes followed the measure of surprise.
The pair measured the success of their "surprise" prediction against two other analyses. The first was the version of saliency that Itti co-developed as a graduate student studying under Christof Koch at Caltech.
The second was a computation of Shannon entropy by C.M. Privitera and L.W. Stark.
Surprise, they say, outperformed entropy and saliency, "exhibiting a stronger human bias toward surprising locations than towards entropic or salient regions." The pair say they have confirmed these results with a larger study.
The NSF has just funded a research project to further explore this work. Details are at:
The authors conclude: "At the foundation of our model is a simple theory which describes a principled approach to computing surprise in data streams. While surprise is not a new concept it had lacked a formal definition, broad enough to capture the intuitive meaning of the term, yet quantitative and computable.... Beyond vision, computable surprise could guide the development of data mining, as it can in principle be applied to any type of data, including visual, auditory or text."
The NGA, NSF and NIH supported the research. The NIPS presentation can be viewed at http://iLab.