News Release

Moran and Yao to study interpretability for neural language models of source code

Grant and Award Announcement

George Mason University

Kevin Moran, Assistant Professor, Computer Science, and Ziyu Yao, Assistant Professor, Computer Science, are set to receive funding from the National Science Foundation for: "Collaborative Research: SHF: Medium: Toward Understandability and Interpretability for Neural Language Models of Source Code."

Moran and Yao will develop a framework and methodology that enables researchers who build AI-powered developer tools and software engineers who use these tools to interpret why the underlying models make the predictions they do. 

Their objective is to give researchers detailed insight into why a model may not be performing as expected, enabling targeted improvement and the informed creation of new models.

Moran and Yao will integrate their methodology into AI-powered software development tools. This will allow software engineers to make informed decisions about when a tool's suggestion may be helpful or harmful, building trust in these tools. The interpretability framework will also enable new forms of interaction with the tools and provide a mechanism for natural-language feedback that improves the underlying models over time.

This project will produce and disseminate educational materials on best practices for building and using AI-powered programming tools. These materials are intended to be integrated into existing computer-literacy courses at all levels of education. In addition, the project will focus on recruiting and retaining computer science students from traditionally underrepresented groups.

This project has three specific goals. 

First, it will design an automated approach for generating global explanations of the behavior of context-free neural language models of source code. This component of the project will map predictions from large language models to human-interpretable programming-language concepts using causal inference theory, generating explanations of behavior via causal interventions.
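To make the idea concrete, the toy Python sketch below illustrates one way a causal intervention could estimate the effect of a single human-interpretable concept on a model's confidence. Every name here is an illustrative placeholder, not the project's actual framework: the stand-in scoring function and the intervention that strips type annotations merely demonstrate the compare-before-and-after structure of a do-style probe.

    import re
    import statistics

    def score_completion(code: str) -> float:
        # Placeholder for the confidence a code model assigns to its own
        # top completion; a real probe would query the model under study.
        return 0.9 if re.search(r":\s*(?:int|str)", code) else 0.6

    def remove_type_annotations(code: str) -> str:
        # A do-style intervention on one human-interpretable concept:
        # delete parameter type annotations, leaving the code otherwise intact.
        return re.sub(r"(\w+):\s*(?:int|str)", r"\1", code)

    def average_treatment_effect(snippets: list[str]) -> float:
        # Compare each snippet against its intervened counterpart and
        # average the resulting change in model confidence.
        effects = [score_completion(s) - score_completion(remove_type_annotations(s))
                   for s in snippets]
        return statistics.mean(effects)

    corpus = ["def add(x: int, y: int): return x + y",
              "def greet(name: str): print(name)"]
    print(f"Estimated effect of type annotations: {average_treatment_effect(corpus):+.2f}")

A consistently positive effect would suggest the model's predictions causally depend on that concept; the real framework would of course intervene on many concepts across a large corpus.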

Second, it will develop automated techniques for producing local explanations of contextualized language models of code: a set of interpretability methods that generate behavioral, feature-based, and textual explanations for given software engineering (SE) tasks (e.g., program repair).
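As a rough illustration of what a local, feature-based explanation can look like, the sketch below attributes a single prediction to individual tokens by occluding each one and measuring the drop in the model's score. The scoring function is again a stand-in, and occlusion is just one common attribution technique, not necessarily one the project will use.

    def model_score(tokens: list[str]) -> float:
        # Placeholder for a task-specific score, e.g., the probability a
        # program-repair model assigns to a candidate patch being correct.
        # Toy proxy: reward balanced parentheses.
        return 1.0 if tokens.count("(") == tokens.count(")") else 0.2

    def token_attributions(tokens: list[str]) -> list[tuple[str, float]]:
        # Occlusion-style attribution: the score drop when a token is
        # removed indicates how much that token mattered to the prediction.
        base = model_score(tokens)
        return [(tok, base - model_score(tokens[:i] + tokens[i + 1:]))
                for i, tok in enumerate(tokens)]

    patch = ["assert", "(", "x", ">", "0", ")"]
    for tok, weight in token_attributions(patch):
        print(f"{tok:>8}: {weight:+.2f}")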

Finally, the project will create techniques that enable researchers and developers to provide feedback to models based on generated explanations.
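The release does not describe the feedback mechanism itself, but one plausible shape for it, sketched below with purely illustrative names, is to fold a developer's natural-language correction and the generated explanation into a preference record that could later drive preference-based fine-tuning of the model.

    from dataclasses import dataclass

    @dataclass
    class ExplanationFeedback:
        context: str       # code the model was completing or repairing
        model_output: str  # suggestion the developer rejected
        explanation: str   # generated explanation of the model's behavior
        correction: str    # developer's natural-language fix

    def to_preference_pair(fb: ExplanationFeedback) -> dict:
        # Package the feedback as a preference record: the developer's
        # correction is preferred over the model's original output.
        return {
            "prompt": f"{fb.context}\n# Known issue: {fb.explanation}",
            "rejected": fb.model_output,
            "chosen": fb.correction,
        }

    fb = ExplanationFeedback(
        context="def div(x, y):",
        model_output="    return x / y",
        explanation="the model never attends to the zero-divisor case",
        correction="    return x / y if y else 0",
    )
    print(to_preference_pair(fb))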

Regarding the significance of the research, Moran said, “Recent advancements in large language models for code have brought about some of the most powerful tools for developers that we have seen in decades. However, these tools are still largely opaque — that is, developers currently can’t understand why a model arrived at a given decision, and hence they do not always trust the tool’s output. Our project aims to change this by developing tools and techniques for interpreting these models, and giving developers the means to trust and work in concert with these AI assistants to more effectively tackle programming tasks.”

The researchers will receive $745,197 from NSF for this project. Funding will begin in October 2023 and end in late September 2027.

###

About George Mason University

George Mason University is Virginia's largest public research university. Located near Washington, D.C., Mason enrolls 38,000 students from 130 countries and all 50 states. Mason has grown rapidly over the last half-century and is recognized for its innovation and entrepreneurship, remarkable diversity and commitment to accessibility. Learn more at http://www.gmu.edu.

