PRINCETON, N.J.--The machine learning techniques scientists use to predict outcomes from large datasets may fall short when it comes to projecting the outcomes of people's lives, according to a mass collaborative study led by researchers at Princeton University.
Published by 112 co-authors in the Proceedings of the National Academy of Sciences, the results suggest that sociologists and data scientists should use caution in predictive modeling, especially in the criminal justice system and social programs.
One hundred and sixty research teams of data and social scientists built statistical and machine-learning models to predict six life outcomes for children, parents, and households. Even after using state-of-the-art modeling and a high-quality dataset containing 13,000 data points about more than 4,000 families, the best AI predictive models were not very accurate.
"Here's a setting where we have hundreds of participants and a rich dataset, and even the best AI results are still not accurate," said study co-lead author Matt Salganik, professor of sociology at Princeton and interim director of the Center for Information Technology Policy, based at Princeton's Woodrow Wilson School of Public and International Affairs.
"These results show us that machine learning isn't magic; there are clearly other factors at play when it comes to predicting the life course," he said. "The study also shows us that we have so much to learn, and mass collaborations like this are hugely important to the research community."
The study did, however, reveal the benefits of bringing together experts from across disciplines in a mass-collaboration setting, Salganik said. In many cases, simpler models outperformed more complicated techniques, and teams with more accurate scoring models came from uncommon disciplines -- like politics, where research on disadvantaged communities is limited.
Salganik said the project was inspired by Wikipedia, one of the world's first mass collaborations, which was created in 2001 as a shared encyclopedia. He pondered what other scientific problems could be solved through a new form of collaboration, and that's when he joined forces with Sara McLanahan, the William S. Tod Professor of Sociology and Public Affairs at Princeton, as well as Princeton graduate students Ian Lundberg and Alex Kindel, both in the Department of Sociology.
McLanahan is principal investigator of the Fragile Families and Child Wellbeing Study based at Princeton and Columbia University, which has been studying a cohort of about 5,000 children born in large American cities between 1998 and 2000, with an oversampling of children born to unmarried parents. The longitudinal study was designed to understand the lives of children born into unmarried families.
Through surveys collected in six waves (when the child was born and then when the child reached ages 1, 3, 5, 9, and 15), the study has captured millions of data points on children and their families. Another wave will be captured at age 22.
At the time the researchers designed the challenge, data from age 15 (which the researchers call the "hold-out data" in the paper) had not yet been made publicly available. This created an opportunity to ask other scientists to predict life outcomes of the people in the study through a mass collaboration.
"When we began, I really didn't know what a mass collaboration was, but I knew it would be a good idea to introduce our data to a new group of researchers: data scientists," McLanahan said.
"The results were eye-opening," she said. "Either luck plays a major role in people's lives, or our theories as social scientists are missing some important variables. It's too early at this point to know for sure."
The co-organizers received 457 applications from 68 institutions from around the world, including from several teams based at Princeton. Using the Fragile Families data, participants were asked to predict one or more of the six life outcomes at age 15. These included child grade point average (GPA); child grit; household eviction; household material hardship; primary caregiver layoff; and primary caregiver participation in job training.
The challenge was based around the common task method, a research design used frequently in computer science but not in the social sciences. This method releases some but not all of the data, allowing people to use whatever technique they want to determine outcomes. The goal is to accurately predict the hold-out data, no matter how fancy a technique it takes to get there.
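As a rough illustration of the common task method described above, the following Python sketch splits a dataset into a released portion and a hold-out portion, fits a simple model on the released data, and scores it against the hold-out outcomes. Everything in it is an assumption made for illustration: the data are synthetic, the function names are hypothetical, and the scoring rule (squared error relative to a baseline that always predicts the training mean) is one common choice, not a metric stated in this article.

```python
import random
from statistics import mean


def split_common_task(rows, holdout_fraction=0.25, seed=0):
    """Organizer side: shuffle the rows and hold back a fraction of them."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]  # (released, hold-out)


def holdout_score(y_true, y_pred, train_mean):
    """1 - (model squared error / squared error of predicting the training mean).

    Assumed scoring rule: 0 means no better than the baseline, 1 means perfect.
    """
    sse_model = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sse_baseline = sum((t - train_mean) ** 2 for t in y_true)
    return 1.0 - sse_model / sse_baseline


def fit_line(train):
    """Participant side: ordinary least squares with a single predictor."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in train) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


# Synthetic stand-in data: one predictor and one noisy outcome per row.
data_rng = random.Random(42)
rows = []
for _ in range(400):
    x = data_rng.uniform(0, 10)
    rows.append((x, 2.0 + 0.5 * x + data_rng.gauss(0, 1.0)))

released, holdout = split_common_task(rows)
train_mean = mean(y for _, y in released)

model = fit_line(released)  # participants only ever see `released`
y_true = [y for _, y in holdout]
model_score = holdout_score(y_true, [model(x) for x, _ in holdout], train_mean)
baseline_score = holdout_score(y_true, [train_mean] * len(holdout), train_mean)

print(f"model score on hold-out: {model_score:.2f}")
print(f"baseline score on hold-out: {baseline_score:.2f}")
```

The point of the design is that participants never see the hold-out outcomes, so any technique, simple or elaborate, is judged only by how well its predictions match data it was not fitted to.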
"The outcomes this challenge produced are incredible," Salganik said. "We now can create these simulated mass collaborations by reusing people's code and extracting their techniques to look at different outcomes, all of which will help us get closer to understanding the variability across families."
The team is currently applying for grants to continue research in this area and has also published 12 of the teams' results in a special issue of Socius, a new open-access journal from the American Sociological Association. To support additional research in this area, all the submissions to the Challenge -- code, predictions and narrative explanations -- are publicly available.
The study was supported by the Russell Sage Foundation, the National Science Foundation (grant no. 1761810), and the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) (grant no. P2-CHD047879).
Funding for the Fragile Families and Child Wellbeing Study was provided by NICHD (grant nos. R01-HD36916, R01-HD39135) and a consortium of private foundations including the Robert Wood Johnson Foundation.
The paper, "Measuring the predictability of life outcomes with a scientific mass collaboration," will be published on March 30 by PNAS.
Proceedings of the National Academy of Sciences