News Release 22-Jan-2019

Lehigh University computer scientist wins 2019 NSF CAREER Award

Dr. Eric Baumer partners with ProPublica, AEquitas in injecting more humanity into the design of algorithmic systems

Lehigh University

Algorithms, in many ways, control the beat of modern life.

Algorithmic systems give us the reviews, recommendations, and connections that help us find better products, great movies, and new friends. They can be applied to nearly every facet of our lives--commerce, transportation, advertising--and when they work well, they can enrich and facilitate our experiences at home, work, and play.

But decision-by-algorithm is by no means foolproof: Bias on the part of the algorithm's author can create inadvertent, false associations that serve to mislead decision making or to malign entire cultures.

With support from the National Science Foundation (NSF), Dr. Eric Baumer, an assistant professor of computer science and engineering at Lehigh University, is seeking to improve algorithmic system design methods through a decidedly egalitarian approach. His proposal to develop participatory methods for human-centered design of algorithmic systems recently won him a Faculty Early Career Development (CAREER) Award from the NSF.

"I want technology to be more humane," he says. "And for that to happen, a more diverse set of people need to be incorporated into the design of algorithmically-based interactive systems--human beings who are not technologists integrated throughout the design process."

With the NSF's support, Baumer hopes to bridge the gap between people with expertise in computer science and those in other fields. His methods, he says, will give technologists experience in subject domains, and subject experts some fluency with technical skills. Such an approach, he says, will hopefully yield interactive tools that better align with users' existing practices--and the questions they're asking.

"As an interaction designer, you can say, 'Okay, algorithmically-generated content goes here,'" Baumer says, "but you don't really know what that content is going to be. Furthermore, it's difficult to get people whose expertise lies outside computer science engaged with these questions of, 'How do I build the natural language classifier that pulls up documents based on a search query?' or 'How do I cluster pieces of data?' So, essentially my goal is to come up with methods that allow for a shared language, a bridging among diverse people around shared goals in the design of technology systems."

The CAREER award is considered one of the more prestigious awards granted by the NSF. The awards are made annually in support of junior faculty members across the U.S. who exemplify the role of teacher-scholars through outstanding research, excellent education, and the integration of education and research. Each award provides stable support of approximately $500,000 over a five-year period.

Truth, justice, and the human side of data

With NSF support, Baumer will develop and test his methods at two nonprofits. AEquitas develops resources for prosecutors related to gender-based violence and human trafficking, and the newsroom ProPublica is devoted to exposing abuses of power and betrayals of the public trust.

Both groups rely heavily on data but don't necessarily have the tools to help them find what they're looking for. Legal research for AEquitas involves sifting through countless documents to find, for example, patterns of rhetoric or legal codes that can vary across counties, cities, and states.

"It's a lot of text to deal with," says Baumer. "And so it might be valuable to have computational tools that would facilitate this process."

Similarly, ProPublica analyzes reams of political text data for patterns, but lacks the sophisticated systems to do it efficiently.

"Part of what we want to do is look at discursive patterns across different forms of media," he says. "And then you can start asking interesting questions, such as, 'How do those patterns relate to the way the representative actually votes?' or 'How do those patterns relate to campaign contributions?'"

In designing custom systems to assist the organizations, Baumer will work not only with the users of the data, but also with the people from whom the data is generated and those who may be affected by it. For AEquitas, that will include prosecutors and plaintiffs. For ProPublica, it will include readers and subjects of the data, such as members of Congress, their staffers, and lobbyists.

"Most participatory design methods are focused on users, the people who are going to use the technology," he says. "When you start talking about these AI and machine learning systems, there are new kinds of relationships that come into play, particularly those people whose data is being analyzed. A major gap in the field right now is that data are looked at purely as data instead of people. You can't engage with the data without engaging with the human beings from whom that data came, at least not without risking serious harm."

Baumer's participatory design process will unfold in three phases. The first phase will use image diaries, card sorting, and fictional vignettes to gain a better understanding of each organization and its practices--what participants know about the data they analyze and how the data are currently used.

The second phase will involve similar activities, along with a workshop. This time, however, the exercises are meant to teach participants how to articulate tasks that could make their work easier, and to provoke questions and concerns about the reach and effect of algorithmic systems on different groups of people.

The third phase will use computational constraints and design sketches to convey some of the group's ideas. Dialogue, feedback, and iteration here are critical before any system is implemented.

"We will be taking our analysis of the data and our interpretations of it back to the people from whom the data came," Baumer says, "and hopefully catching things earlier on that might be misinformed or misinterpretations or algorithmic biases. It's meant to be a cyclic iterative process. You take it to the judges or you take it to the Congressional staffers or whomever, and you say, 'This is what we think we see going on.' And they come back and say, 'This one aspect seems really interesting, but this other piece looks really odd and I didn't expect that' and then trying to figure out. This involves not just graphical representation, but asking, 'What parameters to tweak within the machine learning algorithm? What coefficients? At what point should I optimize the hyperparameters?'"

The staff of both organizations will also assist with designing the evaluation plan, measuring both quantitative and qualitative changes brought about by the new systems.

The potential applications of these participatory methods are far-reaching. According to Baumer's CAREER proposal, they include: designing systems that help the AEquitas staff evaluate which arguments are most effective in prosecuting cases of violence against women; identifying passages in news articles that frame a story in a particular way, giving reporters the chance to revise them and readers the opportunity to track and study their implications; and combining textual and non-textual data to identify similarities and outliers in relevant groups.

Such systems could be game changers for AEquitas and ProPublica. But for Baumer, the true measure of success with his CAREER award may simply be having one of his prototypes fail.

"I think that a really good marker of success would be generating initial prototypes and having someone say, 'Oh no, that's all wrong,'" he says. "Because that means the design process is doing what it's supposed to do, and catching those problems earlier on rather than letting them get baked into a system that gets publicly released."

###
