News Release

AI-generated voices now indistinguishable from real human voices

New study reveals that the average listener can no longer distinguish between deepfake voices and those of real human beings

Peer-Reviewed Publication

Queen Mary University of London

EMBARGOED UNTIL: 24.09.25 02:00 ET / 07:00 London  

Many people still think of AI-generated speech as sounding “fake”, unconvincing, and easy to tell apart from human voices. But new research from Queen Mary University of London shows that AI voice technology has now reached a stage where it can create “voice clones”, or deepfakes, that sound just as realistic as recordings of real human voices.

The study compared real human voices with two different types of synthetic voices, generated using state-of-the-art AI voice synthesis tools. Some were “cloned” from recordings of real people and intended to mimic them; others were generated from a large voice model and had no specific human counterpart.

Participants were asked to evaluate which voices sounded most realistic, and which sounded most dominant or trustworthy. Researchers also looked at whether AI-generated voices had become “hyperreal”, given that some studies have shown that AI-generated images of faces are now judged to be human more often than images of real human faces.  

While the study did not find a “hyperrealism effect” from the AI voices, it did find that voice clones can sound as real as human voices, making it difficult for listeners to distinguish between them. Both types of AI-generated voices were evaluated as more dominant than human voices, and some were also perceived as more trustworthy. 

Dr Nadine Lavan, Senior Lecturer in Psychology at Queen Mary University of London who co-led the study, said: “AI-generated voices are all around us now. We’ve all spoken to Alexa or Siri, or had our calls taken by automated customer service systems. 

“Those things don’t quite sound like real human voices, but it was only a matter of time until AI technology began to produce naturalistic, human-sounding speech. Our study shows that this time has come, and we urgently need to understand how people perceive these realistic voices.” 

Dr Lavan pointed out how easily and quickly the team had been able to create clones, or deepfakes, of real voices (with the consent of their owners) using commercially available software. “The process required minimal expertise, only a few minutes of voice recordings, and almost no money,” she said. “It just shows how accessible and sophisticated AI voice technology has become.”  

The pace of improvement has been rapid, Dr Lavan noted, with significant implications for ethics, copyright, and security, particularly around misinformation, fraud, and impersonation.

“However, the ability to generate realistic voices at scale opens up exciting opportunities,” she went on. “There might be applications for improved accessibility, education, and communication, where bespoke high-quality synthetic voices can enhance user experience.” 

ENDS 