Carnegie Mellon leads NSF project to help people understand Web privacy policies

Effort will teach computers to read and evaluate policy elements

Carnegie Mellon University

PITTSBURGH--Figuring out what information websites are gathering about their users can be daunting, but a new privacy research project led by Carnegie Mellon University will make that task easier with computer tools that leverage the power of crowdsourcing, machine learning and natural language processing.

Working with law school researchers at Fordham and Stanford universities, computer scientists and behavioral economists at Carnegie Mellon will teach computer systems how to read and evaluate the lengthy, often-confusing and subject-to-change privacy policies now posted by major websites. These systems, together with crowdworkers, will then create user-friendly digests that highlight policy elements that matter most to people.

The 42-month, $3.75 million Usable Privacy Policy Project is sponsored by the National Science Foundation through its Secure and Trustworthy Cyberspace (SaTC) program. It is one of three large Frontier awards the NSF recently announced in support of collaborative, multi-university research and education activities to protect critical infrastructure and enable a more secure information society.

"People are increasingly aware that information about them -- the sites they visit, the products they buy -- is being collected, used, shared and recombined in all sorts of ways," said Norman Sadeh, a Carnegie Mellon professor of computer science and leader of the Usable Privacy Policy Project. "But they feel helpless. They have no practical way of finding out about these practices and making informed decisions. Hardly anybody reads privacy policies and, when they do, they usually can't answer even the most trivial questions about those policies."

Earlier attempts to solve this dilemma, whether by encouraging websites to post privacy policies in machine-readable language or by getting website operators to abide by new rules, have encountered significant resistance. Instead, Sadeh and his collaborators aim to work with what is already available -- those rarely read, plain English privacy policies.

Crowdsourcing will be used to identify and extract those policy features that matter most to people. By itself, however, crowdsourcing would not scale to cover the breadth of the Internet and keep pace with changing policies. The researchers will therefore rely on computers to routinely scan through the policies, even though computers can't yet understand all the nuances of human language.

"We are going to develop algorithms that can automatically or semi-automatically read a privacy policy well enough to answer a few questions likely to be of interest to many users and also to policymakers," said Noah Smith, associate professor of language technologies and machine learning at Carnegie Mellon. "This is an exciting opportunity to apply recent developments in robust natural language processing to an everyday dilemma."

One goal is to develop user interfaces or browser add-ons that can summarize the pertinent privacy characteristics of a website in a way that is easily understood. This might be as simple as a letter grade, said Joel Reidenberg, a Fordham University law professor. User studies will help fine-tune the new interfaces, ensuring that people understand and can effectively use privacy information. Researchers also will develop methods using formal logic to provide deeper analysis of policies, identifying inconsistencies and conflicts that can inform ongoing legal and regulatory discussions.


More information is available at the project website. The research team includes Lorrie Cranor, CMU associate professor of computer science and engineering and public policy, and Alessandro Acquisti, associate professor of information technology and public policy, who will contribute their expertise to the design and evaluation of novel privacy displays. Travis Breaux, CMU assistant professor of computer science, Aleecia McDonald, director of privacy at Stanford's Center for Internet & Society, and Fordham's Reidenberg will help in analyzing privacy policies and in informing ongoing public policy efforts in the area.

About Carnegie Mellon University: Carnegie Mellon is a private, internationally ranked research university with programs in areas ranging from science, technology and business, to public policy, the humanities and the arts. More than 12,000 students in the university's seven schools and colleges benefit from a small student-to-faculty ratio and an education characterized by its focus on creating and implementing solutions for real problems, interdisciplinary collaboration and innovation. A global university, Carnegie Mellon has campuses in Pittsburgh, Pa., California's Silicon Valley and Qatar, and programs in Africa, Asia, Australia, Europe and Mexico. The university recently completed "Inspire Innovation: The Campaign for Carnegie Mellon University," exceeding its $1 billion goal to build its endowment, support faculty, students and innovative research, and enhance the physical campus with equipment and facility improvements. The campaign closed June 30, 2013.
