Public Release: 

Computers use social media data to predict crime

Researchers have developed algorithms that predicted where and when various types of crime would happen in New York City and Brisbane, based on social media data

RMIT University

In a study published in the EPJ Data Science journal, a team of RMIT researchers shows how location and activity data from users of the Foursquare app, when coupled with recommendation algorithms, can be used to predict crimes more accurately than existing models.

Foursquare users share their location and activity when they 'check in' at various places. The study used data from more than 20,000 check-ins by users in Brisbane and nearly 230,000 check-ins by users in New York City.

RMIT computer scientist Dr Flora Salim says this dynamic, real-time data on people's movements around a city is highly valuable in understanding the likelihood of different situations in an area.

But to fill the many gaps in this location-based data, researchers also developed recommendation algorithms, similar to those used to recommend related songs on Spotify.

"Obviously the large majority of people in the city were not always using the app and those committing crimes were likely not posting on the app about it," she says. "So, we used recommender systems to fill in the gaps and predict other activities in any given scenario."
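The gap-filling idea Salim describes can be illustrated with a minimal sketch. This release does not specify the study's actual algorithm, so the code below swaps in a generic neighbourhood-based collaborative filter: each missing check-in count for a region is estimated from regions with similar observed activity. The function names, the region-by-activity layout and the sample counts are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity using only positions observed in both rows."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fill_gaps(matrix):
    """Estimate each missing (None) entry as a similarity-weighted
    average of the same column in the other rows.

    Illustrative only: this is a generic collaborative filter,
    not the method used in the study.
    """
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue  # observed entries are left untouched
            weighted_sum = 0.0
            weight_total = 0.0
            for k, other in enumerate(matrix):
                if k == i or other[j] is None:
                    continue
                weight = cosine_similarity(row, other)
                weighted_sum += weight * other[j]
                weight_total += weight
            if weight_total > 0:
                filled[i][j] = weighted_sum / weight_total
    return filled


# Hypothetical check-in counts: rows are city regions, columns are
# activity categories (for example food, nightlife, transport).
checkins = [
    [5, 3, None],  # region A: no transport check-ins observed
    [4, 3, 2],     # region B: fully observed
    [1, None, 5],  # region C: no nightlife check-ins observed
]

filled = fill_gaps(checkins)
for region in filled:
    print(region)
```

In this sketch the estimate for each missing cell always falls between the smallest and largest observed values in that column, because it is a weighted average with non-negative weights. A production system would more likely use a matrix factorisation or similar model, and would need far more data than three rows.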

In tests on both cities, the system predicted specific types of crime in specific parts of the city better than existing crime prediction models based on crime trends.

In Brisbane, the system was 16% more accurate at predicting assaults than current models, 6% more accurate for unlawful entry, 4% more accurate for drug offences and theft, and 2% more accurate for fraud.

In New York City, it improved prediction accuracy by 4% for theft, drug offences, fraud and unlawful entry, and by 2% for assault.

Salim says that given the sparsity of data sets used in the study, these results are significant.

"Based on these positive results, this technology could allow police to design more effective patrol strategies with limited resources by sending officers to the places where crime is more likely," she says.

The system can also be easily scaled up to process larger samples from almost any social media platform, app or mobile network that collects location-based data.

"The widespread use of social media such as Twitter and Foursquare - which all gather huge amounts of data on our location, activities and preferences - provides unprecedented opportunities to capture the movement and activity of people across a city," she says.

The study is just one example of how our data can be used to predict our actions for a whole range of applications.

Another project Salim is involved in looks at algorithms to predict, with high levels of accuracy, what we'll do in the second half of our day based on historic patterns and data collected from the first half of our day.

"Research into the pattern of human movement, based on data from our mobile apps, often shows how predictable many of our activities are," Salim says.

Lead author and PhD student Shakila Khan Rumi, who is supervised by Salim and Dr Ke Deng, says the study marks a significant step forward on crime prediction models.

"Current state-of-the-art crime prediction models generally rely on relatively static features including long-term historical information, geographical information and demographic information. This information changes slowly over time, meaning these traditional models couldn't capture the short-term variations in crime event occurrences," Rumi says.

"Our test results demonstrate the improvement of prediction performance after adding dynamic features is considerable and statistically significant. That really is revolutionary."

The group now plans to extend the work by training the algorithms on data from one city and improving their ability to apply what they have learned in a different city where patterns differ.

###
