News Release

Tired of video conferencing? Research suggests you're right to question its effectiveness

A new study suggests that non-visual communication methods that better synchronize and boost audio cues are in fact more effective

Peer-Reviewed Publication

Carnegie Mellon University

In the year since the coronavirus pandemic upended how just about everyone on the planet interacts, video conferencing has become the de facto tool for group collaboration within many organizations. The prevailing assumption is that technology that mimics face-to-face interaction via a video camera will be most effective in achieving the same results, yet there is little data to back this up. Now, a new study challenges this assumption and suggests that non-visual communication methods that better synchronize and boost audio cues are in fact more effective.

Synchrony Promotes Collective Intelligence

Researchers from Carnegie Mellon's Tepper School of Business and the Department of Communication at the University of California, Santa Barbara, have studied collective intelligence--the ability of a group to solve a wide range of problems--and how synchrony in nonverbal cues helps to develop it. There are many forms of synchrony, but the common view is that synchrony occurs when two or more nonverbal behaviors are aligned. Essentially, conversation is what happens when at least two speakers take turns sharing their thoughts, and nonverbal cues are how they establish when and how to take these turns.

Previous research has shown that synchrony promotes collective intelligence because it improves joint problem solving. It is therefore unsurprising that many would assume that a conversation that can't take place face-to-face is best simulated with both video and audio software.

The researchers focused on two forms of synchrony: facial expression synchrony and prosodic synchrony. Facial expression synchrony is pretty straightforward and involves the perceived movement of facial features. Prosodic synchrony, on the other hand, captures the intonation, tone, stress, and rhythm of speech. They hypothesized that during virtual collaboration, collective intelligence would develop through facial expression synchrony when the collaborators had access to both audio and visual cues. Without visual cues, though, they predicted that prosodic synchrony would enable groups to achieve collective intelligence instead.

Collective Intelligence Is Achievable With or Without Video, but Even More So Without

"We found that video conferencing can actually reduce collective intelligence," says Anita Williams Woolley, Associate Professor of Organizational Behavior and Theory at Carnegie Mellon's Tepper School of Business, who co-authored the paper. "This is because it leads to more unequal contribution to conversation and disrupts vocal synchrony. Our study underscores the importance of audio cues, which appear to be compromised by video access."

Woolley and her colleagues pulled together a large, diverse sample of 198 individuals and divided them into 99 pairs. Forty-nine of these pairs formed the first group, whose members were physically separated and had audio but not video capabilities. The remaining 50 pairs were also physically separated but had both video and audio capabilities. During a 30-minute session, each duo completed six tasks designed to test collective intelligence. As Woolley points out, the results challenge the prevailing assumptions.

The groups with video access did achieve some form of collective intelligence through facial expression synchrony, suggesting that when video is available, collaborators should be aware of these cues. However, the researchers found that prosodic synchrony improved collective intelligence whether or not the group had access to video technology and that this synchrony was enhanced by equality in speaking turns. Most striking, though, was that video access dampened the pairs' ability to achieve equality in speaking turns, meaning that using video conferencing can actually limit prosodic synchrony and therefore impede collective intelligence.

Specifically, groups regulate speaking turns via a set of interaction rules, which include yielding, requesting, or maintaining turns. Collaborators often subtly communicate these rules through visual nonverbal cues, such as eye contact, or through vocal cues, such as changes in volume and rate. However, visual nonverbal cues appear to enable some collaborators to dominate the conversation. By contrast, the study shows that when groups have audio cues only, the lack of video does not prevent them from communicating these interaction rules but actually helps them to regulate their conversation more smoothly by engaging in a more equal exchange of turns and by establishing improved prosodic synchrony.

What does this mean for organizations whose members are still physically separated by the COVID-19 pandemic? It might be worth disabling the video function to promote better communication and social interaction during collaborative problem solving.


Summarized from "Speaking out of turn: How video conferencing reduces vocal synchrony and collective intelligence," by Tomprou, Maria (Carnegie Mellon University), Kim, Young Ji (University of California, Santa Barbara), Chikersal, Prerna (Carnegie Mellon University), Williams Woolley, Anita (Carnegie Mellon University), and Dabbish, Laura A. (Carnegie Mellon University). It appears in PLoS One, published by the Public Library of Science. Copyright 2021. All rights reserved.
