Image: A team of researchers at Penn State recently developed an AI-powered smartphone application capable of helping visually impaired people navigate day-to-day tasks. Credit: Caleb Craig/Penn State
UNIVERSITY PARK, Pa. — Systems and applications that help visually impaired people navigate their environment have developed rapidly over the last few years but still have room to grow, according to a team of researchers at Penn State. The team recently combined recommendations from the visually impaired community with artificial intelligence (AI) to develop a new tool that offers support tailored specifically to the needs of people who are visually impaired.
The tool, known as NaviSense, is a smartphone application that identifies the items users are looking for in real time based on spoken prompts, guiding users to objects in the environment using the phone's built-in audio and vibration capabilities. Test users reported an improved experience compared to existing visual-aid options. The team presented the tool and received the Best Audience Choice Poster Award at the Association for Computing Machinery's SIGACCESS ASSETS '25 conference, held Oct. 26-29 in Denver. Details of the tool were published in the conference's proceedings.
According to Vijaykrishnan Narayanan, Evan Pugh University Professor, A. Robert Noll Chair Professor of Electrical Engineering and NaviSense team lead, many existing visual-aid programs connect users with an in-person support team, which can be inefficient or raise privacy concerns. Some programs offer an automated service, but Narayanan explained that these programs have a glaring issue.
“Previously, models of objects needed to be preloaded into the service’s memory to be recognized,” Narayanan said. “This is highly inefficient and gives users much less flexibility when using these tools.”
To address this problem, the team built large language models (LLMs) and vision-language models (VLMs), both types of AI that can process significant amounts of data to answer queries, into NaviSense. The app connects to an external server hosting the LLMs and VLMs, which allows NaviSense to learn about its environment and recognize the objects in it, according to Narayanan.
“Using VLMs and LLMs, NaviSense can recognize objects in its environment in real-time based on voice commands, without needing to preload models of objects,” Narayanan said. “This is a major milestone for this technology.”
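The release does not describe the app's exact client-server protocol, but the general pattern it points to, sending a transcribed voice request together with a camera frame to a remote vision-language model for open-vocabulary recognition, might look roughly like the Python sketch below. The endpoint URL, payload fields, and response format are all assumptions made for illustration, not the NaviSense implementation.

```python
# Minimal sketch (illustrative only): bundle a spoken request and a camera
# frame into a query for a server-hosted vision-language model.
import base64
import json
from urllib import request

def build_query(transcribed_prompt: str, jpeg_frame: bytes) -> bytes:
    """Package the user's transcribed request plus one camera frame so a
    remote VLM can look for the object the request describes."""
    payload = {
        "prompt": transcribed_prompt,  # e.g. "find my blue water bottle"
        "image_b64": base64.b64encode(jpeg_frame).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")

def query_vlm_server(url: str, body: bytes) -> dict:
    """POST the query and return the server's reply, assumed here to be JSON
    such as {"found": true, "label": "water bottle", "box": [x0, y0, x1, y1]}."""
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)
```

Keeping the heavy models on a server means the phone only streams prompts and frames, which is one common way such systems avoid preloading object models on the device.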
According to Ajay Narayanan Sridhar, a computer engineering doctoral student and lead student investigator on NaviSense, the team held a series of interviews with people who are visually impaired before development began so it could tailor the tool's features specifically to users' needs.
“These interviews gave us a good sense of the actual challenges visually impaired people face,” Sridhar said.
NaviSense searches an environment for a requested object, specifically filtering out objects that do not fit a user’s verbal request. If it doesn’t understand what the user is looking for, it will ask a follow-up question to help narrow down the search. Sridhar said that this conversational feature offers convenience and flexibility that other tools struggle to provide.
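As a rough illustration of that clarification step (the release attributes the real dialogue to an LLM, not to fixed rules), a simplified stand-in might look like the following. The function name and the keyword-matching heuristic are invented for the example.

```python
# Minimal sketch (not the NaviSense logic): decide whether to guide the user
# or ask a clarifying question when a request matches several detected objects.
def next_step(user_request: str, detected_labels: list[str]) -> str:
    """Return either a guidance target or a follow-up question."""
    matches = [label for label in detected_labels
               if any(word in label for word in user_request.lower().split())]
    if not matches:
        return "I don't see that yet. Could you describe it another way?"
    if len(matches) > 1:
        options = " or ".join(matches)
        return f"I see more than one match. Do you mean the {options}?"
    return f"Guiding you to the {matches[0]}."

print(next_step("find my mug", ["red mug", "blue mug", "laptop"]))
# -> "I see more than one match. Do you mean the red mug or blue mug?"
```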
Additionally, NaviSense can accurately track the hand movements of a user in real time by monitoring the movement of the phone, offering feedback on where the object they are reaching for is located relative to their hand.
“This hand guidance was really the most important aspect of this tool,” Sridhar said. “There was really no off-the-shelf solution that actively guided users’ hands to objects, but this feature was continually requested in our survey.”
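One plausible way to turn a detected object's position in the camera frame into the kind of cues testers describe ("left," "right," "up," "down," "bullseye") is sketched below. The geometry and thresholds are illustrative assumptions, not the published method, which also tracks the user's hand via the phone's movement.

```python
# Minimal sketch (illustrative only): convert a detected bounding box into a
# short spoken guidance cue relative to the centre of the camera frame.
def direction_cue(box: tuple[float, float, float, float],
                  frame_w: int, frame_h: int,
                  tolerance: float = 0.1) -> str:
    """box is (x0, y0, x1, y1) in pixels; returns a short guidance phrase."""
    cx = (box[0] + box[2]) / 2 / frame_w - 0.5   # -0.5 .. 0.5, negative = left
    cy = (box[1] + box[3]) / 2 / frame_h - 0.5   # negative = up
    cues = []
    if cx < -tolerance:
        cues.append("move left")
    elif cx > tolerance:
        cues.append("move right")
    if cy < -tolerance:
        cues.append("move up")
    elif cy > tolerance:
        cues.append("move down")
    return " and ".join(cues) if cues else "bullseye"

print(direction_cue((900, 100, 1100, 250), frame_w=1280, frame_h=720))
# object centre is right of and above the frame centre -> "move right and move up"
```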
Following the interviews, the team had 12 participants test the tool in a controlled environment, comparing NaviSense to two commercial options. The team tracked the time it took for the tools to identify and guide users to an object, while also monitoring the overall accuracy of the programs’ detection mechanisms.
NaviSense significantly reduced the time users spent looking for objects, while simultaneously identifying objects in the environment more accurately than the commercial options. Additionally, participants reported a better user experience compared to other tools, with one user writing in a post-experiment survey, “I like the fact that it is giving you cues to the location of where the object is, whether it is left or right, up or down, and then bullseye, boom, you got it.”
The current iteration of the tool, while effective and user-friendly, has room for improvement before commercialization, according to Narayanan. The team is working to optimize the application's power usage, reducing how much it drains the smartphone's battery, and to further improve the efficiency of the LLMs and VLMs.
“This technology is quite close to commercial release, and we're working to make it even more accessible,” Narayanan said. “We can use what we’ve learned from these tests and our previous prototypes of this tool to further optimize it for the visually impaired community.”
Other team members affiliated with Penn State include Mehrdad Mahdavi, Penn State Hartz Family Associate Professor of Computer Science and Engineering; and Fuli Qiao, a computer science doctoral student. Additional co-authors include Nelson Daniel Troncoso Aldas, an independent researcher; Laurent Itti, professor of computer science and of psychology at the University of Southern California; and Yanpei Shi, a computer science doctoral candidate from the University of Southern California.
This work was supported by the U.S. National Science Foundation.
Method of Research: Experimental study
Subject of Research: People
Article Title: NaviSense: A Multimodal Assistive Mobile application for Object Retrieval by Persons with Visual Impairment
Article Publication Date: 22-Oct-2025