image: When represented visually, the dynamics of signals in the brains of people with aphasia and in large language models, or LLMs, proved strikingly similar. ©2025 Watanabe et al. CC-BY-ND
Agents, chatbots and other tools based on artificial intelligence (AI) are increasingly part of everyday life. Agents based on so-called large language models (LLMs), such as ChatGPT and Llama, have become impressively fluent in the responses they form, but quite often provide convincing yet incorrect information. Researchers at the University of Tokyo draw parallels between this issue and a human language disorder known as aphasia, in which people may speak fluently but make meaningless or hard-to-understand statements. This similarity could point toward better forms of diagnosis for aphasia, and even provide insight for AI engineers seeking to improve LLM-based agents.
This article was written by a human being, but the use of text-generating AI is on the rise in many areas. As more and more people come to use and rely on these tools, there’s an ever-increasing need to make sure they deliver correct and coherent responses. Many familiar tools, including ChatGPT, appear very fluent in whatever they deliver, but their responses cannot always be relied upon because of the amount of essentially made-up content they produce. Users who are not sufficiently knowledgeable about the subject in question can easily assume this information is right, especially given the high degree of confidence ChatGPT and others show.
“You can’t fail to notice how some AI systems can appear articulate while still producing often significant errors,” said Professor Takamitsu Watanabe from the International Research Center for Neurointelligence (WPI-IRCN) at the University of Tokyo. “But what struck my team and I was a similarity between this behavior and that of people with Wernicke’s aphasia, where such people speak fluently but don’t always make much sense. That prompted us to wonder if the internal mechanisms of these AI systems could be similar to those of the human brain affected by aphasia, and if so, what the implications might be.”
To explore this idea, the team used a method called energy landscape analysis, a technique originally developed by physicists seeking to visualize energy states in magnetic metals, and recently adapted for neuroscience. They examined patterns in resting brain activity from people with different types of aphasia and compared them to internal data from several publicly available LLMs. The analysis revealed some striking similarities. The way digital information or signals are moved around and manipulated within these AI models closely matched the way some brain signals behaved in people with certain types of aphasia, including Wernicke’s aphasia.
“You can imagine the energy landscape as a surface with a ball on it. When there’s a curve, the ball may roll down and come to rest, but when the curves are shallow, the ball may roll around chaotically,” said Watanabe. “In aphasia, the ball represents the person’s brain state. In LLMs, it represents the continuing signal pattern in the model based on its instructions and internal dataset.”
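To make the ball-and-surface picture concrete, here is a minimal sketch of the kind of calculation that energy landscape analysis rests on. It is not the authors' code. It assumes the formulation commonly used in neuroscience, in which each unit (a brain region or a model component) is binarized to -1 or +1 and each overall activity pattern is assigned an energy by a pairwise Ising-type model. The parameters h and J below are hypothetical values chosen for illustration, and the step of fitting them to recorded data is omitted. The resting places of the ball correspond to local minima: patterns whose energy is lower than that of every pattern reachable by flipping a single unit.

```python
import itertools
import numpy as np

def all_states(n):
    """Every activity pattern over n units, coded as -1 (inactive) or +1 (active)."""
    return np.array(list(itertools.product([-1, 1], repeat=n)))

def energy(states, h, J):
    """Ising-type energy of each pattern: E(s) = -h.s - 0.5 * s^T J s."""
    return -states @ h - 0.5 * np.einsum("ki,ij,kj->k", states, J, states)

def local_minima(h, J):
    """Patterns whose energy is lower than all single-flip neighbours."""
    n = len(h)
    states = all_states(n)
    E = energy(states, h, J)
    lookup = {tuple(s): e for s, e in zip(states, E)}
    minima = []
    for s, e in zip(states, E):
        neighbour_energies = []
        for i in range(n):
            flipped = s.copy()
            flipped[i] *= -1  # flip one unit to reach a neighbouring pattern
            neighbour_energies.append(lookup[tuple(flipped)])
        if all(e < nb for nb in neighbour_energies):
            minima.append((tuple(int(x) for x in s), float(e)))
    return minima

# Hypothetical example: three units with no bias and uniform positive coupling.
h = np.zeros(3)
J = np.ones((3, 3)) - np.eye(3)  # symmetric, zero diagonal
print(local_minima(h, J))
```

In this toy example the landscape has two valleys, one where all units are active and one where all are inactive. Scaling J down makes the valleys shallower, which corresponds to the case in the analogy where the ball can wander more freely between states.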
The research has several implications. For neuroscience, it offers a possible new way to classify and monitor conditions like aphasia based on internal brain activity rather than just external symptoms. For AI, it could lead to better diagnostic tools that help engineers improve the architecture of AI systems from the inside out. Despite the similarities they discovered, however, the researchers caution against drawing too many assumptions.
“We’re not saying chatbots have brain damage,” said Watanabe. “But they may be locked into a kind of rigid internal pattern that limits how flexibly they can draw on stored knowledge, just like in receptive aphasia. Whether future models can overcome this limitation remains to be seen, but understanding these internal parallels may be the first step toward smarter, more trustworthy AI too.”
###
Journal article: Takamitsu Watanabe, Katsuma Inoue, Yasuo Kuniyoshi, Kohei Nakajima, Kazuyuki Aihara “Comparison of large language model with aphasia”, Advanced Science, https://doi.org/10.1002/advs.202414016
Funding: This work was supported by Grant-in-aid for Research Activity from the Japan Society for the Promotion of Science (19H03535, 21H05679, 23H04217, JP20H05921), The University of Tokyo Excellent Young Researcher Project, Showa University Medical Institute of Developmental Disabilities Research, JST Moonshot R&D Program (JPMJMS2021), JST FOREST Program (24012854), Institute of AI and Beyond of UTokyo, and the Cross-ministerial Strategic Innovation Promotion Program (SIP) on “Integrated Health Care System” (JPJ012425).
Research Contact:
Professor Takamitsu Watanabe
International Research Center for Neurointelligence
The University of Tokyo Institutes for Advanced Study
7-3-1 Hongo Bunkyo-ku, Tokyo 113-0033 Japan.
takamitsu-watanabe@g.ecc.u-tokyo.ac.jp
International Research Center for Neurointelligence (WPI-IRCN) - https://ircn.jp/en/
Press contact:
Mr. Rohan Mehra
Public Relations Group, The University of Tokyo,
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
press-releases.adm@gs.mail.u-tokyo.ac.jp
About The University of Tokyo:
The University of Tokyo is Japan's leading university and one of the world's top research universities. The vast research output of some 6,000 researchers is published in the world's top journals across the arts and sciences. Our vibrant student body of around 15,000 undergraduate and 15,000 graduate students includes over 4,000 international students. Find out more at www.u-tokyo.ac.jp/en/ or follow us on X (formerly Twitter) at @UTokyo_News_en.
Journal
Advanced Science
Method of Research
Experimental study
Subject of Research
People
Article Title
Comparison of large language model with aphasia
Article Publication Date
15-May-2025