A man, or person, is known by the company he keeps. That old proverb takes on new meaning in the 21st century.
Computer scientists at the University of Rochester have shown that a great deal can be learned about individuals from their interactions in online social media, even when those individuals hide their Twitter messages (tweets) and other posts. The paper, "Finding Your Friends and Following Them to Where You Are," by professors Henry Kautz and Jeffrey Bigham, and graduate student Adam Sadilek, won the Best Paper Award at the Fifth Association for Computing Machinery (ACM) International Conference on Web Search and Data Mining, held in Seattle, Washington.
The researchers were able to determine a person's location within a 100-meter radius with 85 percent accuracy by using only the locations of that person's friends. They were also able to predict a person's Twitter friendships with high accuracy, even when that person's profile was kept private.
In one experiment, Sadilek, Kautz, and Bigham studied the messages and data of heavy Twitter users from New York City and Los Angeles to develop a computer model for determining human mobility and location. The users, who sent out 100 or more tweets per month, had public profiles and enabled GPS location sharing. The location data of selected individuals was sampled over a two-week period, and then was ignored as the researchers tried to pinpoint their locations using only the information from their Twitter friends. In more than eight out of ten instances, they successfully figured out where the individuals lived to within one city block.
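To make the idea concrete, here is a toy sketch of guessing someone's whereabouts from friends alone. It is not the researchers' model, which this article does not describe in detail. It assumes locations have already been reduced to coarse grid cells (roughly a city block), and the function names `predict_cell` and `accuracy` are hypothetical, chosen only for illustration.

```python
from collections import Counter

def predict_cell(friend_cells):
    """Guess a person's grid cell as the most common cell among their friends.

    friend_cells: list of hashable cell ids, one per friend observed in the
    same time slot. Returns None when no friend location is available.
    (Illustrative baseline only, not the method used in the paper.)
    """
    if not friend_cells:
        return None
    return Counter(friend_cells).most_common(1)[0][0]

def accuracy(predictions, truth):
    """Fraction of time slots where the guess matches the held-out true cell.

    Slots with no prediction (None) are skipped rather than counted as wrong.
    """
    scored = [(p, t) for p, t in zip(predictions, truth) if p is not None]
    if not scored:
        return 0.0
    return sum(p == t for p, t in scored) / len(scored)
```

Holding out a person's own GPS data and scoring the guesses against it mirrors, in a very simplified way, the evaluation described above.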
"Once you learn about relationships from people's tweets, it makes sense that you can track them," said Sadilek, the project's first author. "My fiancée may be a good predictor of my location because we have breakfast together every morning."
In the other experiment, the scientists used the same data sets from New York and Los Angeles, but ran the models in reverse. They made full use of individuals' location data and the content of their tweets, but ignored their lists of followers as they set out to predict people's Twitter friendships (mutual following). When they compared the predictions of their models with the actual network of friendships, the researchers found they were correct 90 percent of the time.
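A rough illustration of the reverse direction: score a pair of users by how much their tweet vocabulary and visited places overlap, without looking at follower lists. This is a simplified stand-in and not the authors' model. The equal weighting, the use of Jaccard overlap, and the names `jaccard` and `friendship_score` are all assumptions made for the sake of the example.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections: |A ∩ B| / |A ∪ B| (0.0 if both empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def friendship_score(words_u, words_v, cells_u, cells_v):
    """Score in [0, 1] combining shared tweet words and shared grid cells.

    The 50/50 weighting is arbitrary and purely illustrative; a real system
    would learn how to combine such signals from data.
    """
    text_sim = jaccard(words_u, words_v)
    place_sim = jaccard(cells_u, cells_v)
    return 0.5 * text_sim + 0.5 * place_sim
```

Pairs whose score exceeds some chosen threshold would be predicted as friends, and those predictions could then be compared with the actual network of mutual follows.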
"If people spend a lot of time together online and talk about the same things," said Sadilek, "they're more likely to be friends."
The personal nature of the messages made it a little easier for the researchers to determine relationships. Sadilek explains that heavy Twitter users spend a great deal of time talking about themselves.
"It's harder than most people think it is to protect our privacy online," said Henry Kautz, chairman of the Department of Computer Science, "but there are ways to use this new reality for good."
The team will now apply their models to such tasks as tracking and predicting the spread of communicable diseases. If people and their friends in one location tweet about having a fever and not feeling well, it may be an indication of a flu outbreak.