News Release 3-Aug-2022

The power of visual influence

New method to predict what catches our eye, for how long, and our human reaction

Association for Computing Machinery

Image: The new approach determines a user’s real-time reaction to an image or scene based on their eye movement, particularly saccades, the super-quick movements of the eye that jerk between points before fixating on an image or object. The researchers will demonstrate their new work, titled “Image Features Influence Reaction Time: A Learned Probabilistic Perceptual Model for Saccade Latency,” at SIGGRAPH 2022, held Aug. 8-11 in Vancouver, BC, Canada.

Credit: ACM SIGGRAPH

What motivates or drives the human eye to fixate on a target, and how is that visual image then perceived? What is the lag time between seeing something and reacting to it? In the burgeoning field of immersive virtual reality (VR) and augmented reality (AR), connecting those dots in real time, between eye movement, visual targets, and decision-making, is the driving force behind a new computational model developed by a team of computer scientists at New York University, Princeton University, and NVIDIA.

The new approach determines a user’s real-time reaction to an image or scene based on their eye movement, particularly saccades, the super-quick movements of the eye that jerk between points before fixating on an image or object. Saccades allow for frequent shifts of attention to better understand one’s surroundings and to localize objects of interest. Understanding the mechanism and behavior of saccades is vital to explaining human performance in visual environments, and it represents an exciting area of research in computer graphics.

The researchers will demonstrate their new work, titled “Image Features Influence Reaction Time: A Learned Probabilistic Perceptual Model for Saccade Latency,” at SIGGRAPH 2022, held Aug. 8-11 in Vancouver, BC, Canada. The annual conference, which will be in-person and virtual this year, spotlights the world’s leading professionals, academics, and creative minds at the forefront of computer graphics and interactive techniques.

“There has recently been extensive research to measure the visual qualities perceived by humans, especially for VR/AR displays,” says the paper’s senior author Qi Sun, PhD, assistant professor of computer science and engineering at New York University Tandon School of Engineering.

“But we have yet to explore how the displayed content can influence our behaviors, even noticeably, and how we could possibly use those displays to push the boundaries of our performance that are otherwise not possible.”

Drawing on how the human brain transmits data and makes decisions, the researchers implemented a neurologically inspired probabilistic model that mimics the accumulation of “cognitive confidence” that leads to a human decision and action. They conducted a psychophysical experiment with parameterized stimuli to observe and measure the correlation between image characteristics and the time it takes to process them in order to trigger a saccade, and to determine whether and how that correlation differs from the one observed for visual acuity.
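The idea of accumulating “cognitive confidence” until a decision is triggered can be illustrated with a generic evidence-accumulation (drift-diffusion style) simulation. This is only a minimal sketch under stated assumptions, not the authors’ model: the function names, the `drift_rate` parameter standing in for the influence of image features, and all numeric values are hypothetical and chosen purely for illustration.

```python
import random
import statistics


def simulate_saccade_latency(drift_rate, threshold=1.0, noise_std=0.3,
                             dt=0.001, max_time=2.0, rng=None):
    """Simulate one trial: noisy evidence accumulates until it hits a threshold.

    drift_rate is a hypothetical stand-in for how strongly the image features
    push "confidence" upward; the returned value is the elapsed time in seconds.
    """
    rng = rng or random.Random()
    evidence = 0.0
    elapsed = 0.0
    while evidence < threshold and elapsed < max_time:
        # Deterministic drift plus Gaussian noise scaled for the time step.
        evidence += drift_rate * dt + noise_std * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        elapsed += dt
    return elapsed


def mean_latency(drift_rate, trials=500, seed=0):
    """Average simulated latency over many trials with a fixed seed."""
    rng = random.Random(seed)
    return statistics.mean(
        simulate_saccade_latency(drift_rate, rng=rng) for _ in range(trials)
    )


# Assumed values: a weaker stimulus (low drift) vs. a stronger one (high drift).
weak_stimulus = mean_latency(drift_rate=3.0)
strong_stimulus = mean_latency(drift_rate=6.0)
print(f"weak stimulus:   {weak_stimulus * 1000:.0f} ms")
print(f"strong stimulus: {strong_stimulus * 1000:.0f} ms")
```

In this toy setup, a larger drift rate reaches the threshold sooner on average, which mirrors the qualitative claim that image characteristics influence how quickly a saccade is triggered. Because each trial is noisy, the output is a distribution of latencies rather than a single number.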

They validated the model with data from more than 10,000 trials of user experiments on an eye-tracked VR display, to understand and formulate the correlation between visual content and the “speed” of decision-making based on reaction to the image. The results show that the model’s predictions accurately represent real-world human behavior.

The proposed model may serve as a metric for predicting and altering the eye-image response time of users in interactive computer graphics applications, and may also help to improve the design of VR experiences. In other sectors such as healthcare and automotive, the new model could help estimate a physician’s or a driver’s ability to respond rapidly to emergencies. In esports, it could be applied to measure the fairness of competition between players or to better understand how to maximize one’s performance where reaction times come down to milliseconds.

In future work, the team plans to explore the potential of cross-modal effects such as visual-audio cues that jointly affect our cognition in scenarios such as driving. They are also interested in expanding the work to better understand and represent the accuracy of human actions influenced by visual content.

The paper’s authors, Budmonde Duinkharjav (NYU), Praneeth Chakravarthula (Princeton), Rachel Brown (NVIDIA), Anjul Patney (NVIDIA), and Qi Sun (NYU), are set to demonstrate their new method Aug. 11 at SIGGRAPH as part of the Roundtable Session: Perception. The paper can be found here.

About ACM SIGGRAPH
ACM SIGGRAPH is an international community of researchers, artists, developers, filmmakers, scientists and business professionals with a shared interest in computer graphics and interactive techniques. As a special interest group of the Association for Computing Machinery (ACM), the world’s first and largest computing society, we work to nurture, champion and connect like-minded researchers and practitioners to catalyze innovation in computer graphics and interactive techniques.
