Public Release: 

A new method measures the integration or segregation of immigrants based on their tweets

The location of the messages, along with the language used, makes it possible to find a community's most typical residential areas and to study whether its members are more concentrated there than the local population

Spanish National Research Council (CSIC)

An international team led by researchers from the Spanish National Research Council (CSIC) has developed a method to measure the integration or segregation of immigrants based on the messages they write on the social network Twitter.

The work, published in the journal PLOS ONE, uses Twitter data to analyse the degree of spatial segregation of immigrant communities. "The users' communities of origin are determined by the language in which the tweets are posted, establishing an 'idiomatic algebra' to assign the most likely community to which a tweet belongs," explains the study's director, José Javier Ramasco, CSIC researcher at the Institute for Cross-Disciplinary Physics and Complex Systems, in Mallorca, Spain.

"If all the messages are in the local language, then the user is considered to be a local resident. If, on the other hand, some messages are in the language of an immigrant community, it can be assumed that the user knows that language and belongs to that community," he adds.
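The rule Ramasco describes can be sketched in a few lines of Python. This is only an illustration, not the study's code: the function name `assign_community`, the language codes, and the choice to pick the most frequent non-local language when several appear are all assumptions made for the example.

```python
from collections import Counter


def assign_community(tweet_languages, local_language):
    """Guess the community of one user (illustrative sketch only).

    tweet_languages: list of language codes, one per tweet by the user.
    local_language: language code of the city where the user tweets.
    """
    if not tweet_languages:
        raise ValueError("at least one tweet is needed to classify a user")

    # Count only the tweets written in a language other than the local one.
    foreign = Counter(
        lang for lang in tweet_languages if lang != local_language
    )

    if not foreign:
        # Every message is in the local language: treat the user as a local.
        return "local"

    # Assumption: if several non-local languages appear, take the most
    # frequent one as the user's likely community.
    return foreign.most_common(1)[0][0]
```

For example, a hypothetical user in an English-speaking city with tweets `["en", "en", "pt"]` would be assigned to the Portuguese-speaking community, while a user with only English tweets would be counted as a local resident.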

The language used, together with the location of the messages, makes it possible to find the typical residential areas of the different communities and to study whether they are more or less concentrated in those areas than the local population. "This method has allowed us to analyse immigrant communities in 53 of the world's largest cities. In each one of them we can define a metric that measures the spatial integration capacity of the immigrants living there," Ramasco explains.
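The release does not give the formula for the study's metric. As a stand-in, the sketch below uses the classic index of dissimilarity, a standard segregation measure, to compare how a community and the local population are spread over the same set of areas. The function name, the division of a city into areas, and the counts are hypothetical and are not taken from the paper.

```python
def dissimilarity_index(community_counts, local_counts):
    """Index of dissimilarity between two groups over the same areas.

    Each list holds the number of residents of one group per area.
    Returns 0.0 when both groups are distributed identically and
    1.0 when they share no area at all.
    """
    if len(community_counts) != len(local_counts):
        raise ValueError("both lists must cover the same areas")

    total_community = sum(community_counts)
    total_local = sum(local_counts)
    if total_community == 0 or total_local == 0:
        raise ValueError("each group needs at least one resident")

    # Half the sum of absolute differences between each group's
    # share of residents in every area.
    return 0.5 * sum(
        abs(c / total_community - l / total_local)
        for c, l in zip(community_counts, local_counts)
    )
```

With made-up counts, a community spread as `[5, 5]` across two areas where locals are spread as `[50, 50]` gives 0.0 (fully mixed), while `[10, 0]` against `[0, 10]` gives 1.0 (fully separated).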

Applying this metric divides cities into three categories: those with high integration capacity; those that have few immigrant communities or are highly segregated spatially; and an intermediate category between the two extremes. "In the first group (high integration) we find cities such as London, San Francisco, Tokyo and Los Angeles, while at the other extreme (low integration) we see others such as Detroit, Miami, Toronto and Amsterdam," Ramasco explains.

Beyond individual cities, the method can also be used to analyse how different cultures, characterised by language, are integrated in the countries where these cities are located. The best integration is found among culturally close groups: for example, speakers of Latin-based languages (Portuguese and Italian) in Spanish-speaking South American countries, or people from European countries in the United Kingdom. The greatest segregation occurs between very different cultures.

The method offers a new source of data for analysing the spatial segregation or integration of the places where immigrants live. The online data, although generated for other purposes, is vast and constantly updated. Studies of this kind provide access to near real-time information at a significantly reduced cost and can cover areas on a global scale. "We hope, then, that this first work will open up the possibility for future use of this data to study integration. We also hope that it may be a valuable complement beyond the scientific community for managers and public authorities who are in charge of immigration," concludes Ramasco.

###
