
Next for DARPA: 'Autocomplete' for programmers

Rice University leads $11 million effort in big data software analytics

Rice University

Writing computer programs could become as easy as searching the Internet. A Rice University-led team of software experts has launched an $11 million effort to create a sophisticated tool called PLINY that will both "autocomplete" and "autocorrect" code for programmers, much like the software that completes search queries and corrects spelling on today's Web browsers and smartphones.

"Imagine the power of having all the code that has ever been written in the past available to programmers at their fingertips as they write new code or fix old code," said Vivek Sarkar, Rice's E.D. Butcher Chair in Engineering, chair of the Department of Computer Science and the principal investigator (PI) on the PLINY project. "You can think of this as autocomplete for code, but in a far more sophisticated way."

Sarkar said the four-year effort is funded by the Defense Advanced Research Projects Agency (DARPA). PLINY, which draws its name from the Roman naturalist who authored the first encyclopedia, will involve more than two dozen computer scientists from Rice, the University of Texas-Austin, the University of Wisconsin-Madison and the company GrammaTech.

PLINY is part of DARPA's Mining and Understanding Software Enclaves (MUSE) program, an initiative that seeks to gather hundreds of billions of lines of publicly available open-source computer code and to mine that code to create a searchable database of properties, behaviors and vulnerabilities.

Rice team members say the effort will represent a significant advance in the way software is created, verified and debugged.

"Software today is far more complex than it was 20 years ago, yet it is still largely created by hand, one line of code at a time," said co-PI Swarat Chaudhuri, assistant professor of computer science at Rice. "We envision a system where the programmer writes a few lines of code, hits a button and the rest of the code appears. And not only that, the rest of the code should work seamlessly with the code that's already been written."

He said PLINY will need to be sophisticated enough to recognize and match similar patterns regardless of differences in programming languages and code specifications. The system will have to explore different ways of interweaving code retrieved through search into a programmer's partially completed draft program and analyze the resulting code to make sure that it does not have bugs or security flaws.

The core of the system will be a data-mining engine that continuously scans the massive repository of open-source code. The engine will leverage the latest techniques in deep program analyses and big-data analytics to populate and refine a database that can be queried whenever a programmer needs help finishing or debugging a piece of code.
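To make the idea of a queryable code database concrete, here is a minimal sketch of a toy snippet index. It is an illustration only and not a description of PLINY's design: the class name `SnippetIndex`, the `tokenize` helper and the sample snippets are all hypothetical, and simple word overlap stands in for the deep program analysis the release describes.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(code: str) -> Set[str]:
    """Split code into a set of lowercase word tokens (a crude stand-in for program analysis)."""
    return {tok.lower() for tok in re.findall(r"[A-Za-z_]+", code)}


class SnippetIndex:
    """Hypothetical in-memory index mapping tokens to the snippets that contain them."""

    def __init__(self) -> None:
        self._snippets: List[str] = []
        self._by_token: Dict[str, Set[int]] = defaultdict(set)

    def add(self, snippet: str) -> None:
        """Store a snippet and record which tokens it contains."""
        idx = len(self._snippets)
        self._snippets.append(snippet)
        for tok in tokenize(snippet):
            self._by_token[tok].add(idx)

    def query(self, draft: str) -> List[str]:
        """Return stored snippets that share tokens with the draft, most overlap first."""
        overlap: Dict[int, int] = defaultdict(int)
        for tok in tokenize(draft):
            for idx in self._by_token.get(tok, ()):
                overlap[idx] += 1
        # Sort by descending overlap, breaking ties by insertion order.
        ordered = sorted(overlap, key=lambda i: (-overlap[i], i))
        return [self._snippets[i] for i in ordered]


# Made-up corpus and draft, for illustration only.
index = SnippetIndex()
index.add("with open(path) as f:\n    data = f.read()")
index.add("for line in open(path):\n    print(line)")
index.add("total = sum(values)")

results = index.query("f = open(path)")
```

In this toy setup, a partially written line is matched against the stored snippets, and snippets with no tokens in common are left out of the results entirely.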

"The engine will formulate answers using Bayesian statistics," said co-PI Chris Jermaine, associate professor of computer science at Rice. "Much like today's spell-correction algorithms, it will deliver the most probable solution first, but programmers will be able to cycle through possible solutions if the first answer is incorrect."
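As a rough illustration of the kind of ranking Jermaine describes, the sketch below scores candidate completions with Bayes' rule and orders them from most to least probable. It is a toy example under stated assumptions, not PLINY's implementation: the function name `rank_completions`, the candidate snippets, the corpus counts and the likelihood values are all invented for illustration.

```python
from typing import Dict, List, Tuple


def rank_completions(
    prior_counts: Dict[str, int],
    likelihoods: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank candidates by posterior probability P(candidate | context).

    prior_counts: how often each candidate appears in a (hypothetical) corpus.
    likelihoods:  P(context | candidate), assumed to be estimated elsewhere.
    """
    total = sum(prior_counts.values())
    # Numerator of Bayes' rule: prior * likelihood for each candidate.
    scores = {
        cand: (count / total) * likelihoods.get(cand, 0.0)
        for cand, count in prior_counts.items()
    }
    evidence = sum(scores.values())
    if evidence == 0:
        return []  # no candidate is compatible with the context
    # Normalize so the posteriors sum to 1, then sort most probable first.
    posteriors = [(cand, score / evidence) for cand, score in scores.items()]
    return sorted(posteriors, key=lambda pair: pair[1], reverse=True)


# Made-up numbers, for illustration only.
PRIOR_COUNTS = {
    "for item in items:": 700,
    "while i < len(items):": 200,
    "items.forEach(...)": 100,
}
LIKELIHOODS = {
    "for item in items:": 0.6,
    "while i < len(items):": 0.3,
    "items.forEach(...)": 0.05,
}

ranked = rank_completions(PRIOR_COUNTS, LIKELIHOODS)
for candidate, probability in ranked:
    print(f"{probability:.3f}  {candidate}")
```

A programmer who rejects the first suggestion would simply move on to the next entry in `ranked`, which mirrors the "cycle through possible solutions" behavior described in the quote.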

Sarkar, Chaudhuri and Jermaine will be joined on the PLINY project by Rice co-PIs Keith Cooper and Moshe Vardi.

"This is a dream team that combines Rice's traditional strengths in programming language research with our new capabilities in big-data analytics," Sarkar said. "Add to that our world-class experts from U. Wisconsin, UT-Austin and GrammaTech and we have an exciting four years ahead of us as we embark on addressing this DARPA-hard challenge."

###

Video is available at: http://youtu.be/e4CMTk-3iNk

High-resolution IMAGES are available for download at: http://news.rice.edu/wp-content/uploads/2014/09/0923_PLINY-DARPA-main-lg.jpg

http://news.rice.edu/wp-content/uploads/2014/09/0923_PLINY-DARPA-group1-lg.jpg

This release can be found online at news.rice.edu.

Follow Rice News and Media Relations via Twitter @RiceUNews

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation's top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 3,920 undergraduates and 2,567 graduate students, Rice's undergraduate student-to-faculty ratio is just over 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is highly ranked for best quality of life by the Princeton Review and for best value among private universities by Kiplinger's Personal Finance.

DARPA Distribution Statement "A" (Approved for Public Release, Distribution Unlimited)
