News Release

Rice, Intel optimize AI training for commodity hardware

CPU algorithm trains deep neural nets up to 15 times faster than top GPU trainers

Peer-Reviewed Publication

Rice University

Anshumali Shrivastava

image: Anshumali Shrivastava is an assistant professor of computer science at Rice University.

Credit: Photo by Jeff Fitlow/Rice University

HOUSTON -- (April 7, 2021) -- Rice University computer scientists have demonstrated artificial intelligence (AI) software that runs on commodity processors and trains deep neural networks up to 15 times faster than platforms based on graphics processors.

"The cost of training is the actual bottleneck in AI," said Anshumali Shrivastava, an assistant professor of computer science at Rice's Brown School of Engineering. "Companies are spending millions of dollars a week just to train and fine-tune their AI workloads."

Shrivastava and collaborators from Rice and Intel will present research that addresses that bottleneck April 8 at the machine learning systems conference MLSys.

Deep neural networks (DNNs) are a powerful form of artificial intelligence that can outperform humans at some tasks. DNN training is typically a series of matrix multiplication operations, an ideal workload for graphics processing units (GPUs), which cost about three times more than general-purpose central processing units (CPUs).

"The whole industry is fixated on one kind of improvement -- faster matrix multiplications," Shrivastava said. "Everyone is looking at specialized hardware and architectures to push matrix multiplication. People are now even talking about having specialized hardware-software stacks for specific kinds of deep learning. Instead of taking an expensive algorithm and throwing the whole world of system optimization at it, I'm saying, 'Let's revisit the algorithm.'"

Shrivastava's lab did that in 2019, recasting DNN training as a search problem that could be solved with hash tables. Their "sub-linear deep learning engine" (SLIDE) is specifically designed to run on commodity CPUs, and Shrivastava and collaborators from Intel showed it could outperform GPU-based training when they unveiled it at MLSys 2020.
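The idea of treating a layer's computation as a hash-table lookup can be illustrated with a small sketch. This is not SLIDE's code. It assumes a generic locality-sensitive hashing scheme (SimHash with random hyperplanes), chosen here only for illustration, and every name in it (`simhash`, `build_table`, `sparse_forward`) is hypothetical. Neurons are stored in buckets keyed by the hash of their weight vectors, and for a given input only the neurons in the matching bucket are evaluated instead of multiplying by the full weight matrix.

```python
import random
from collections import defaultdict

def simhash(vec, planes):
    """Return a tuple of sign bits, one per random hyperplane."""
    return tuple(
        1 if sum(v * p for v, p in zip(vec, plane)) >= 0 else 0
        for plane in planes
    )

def build_table(weights, planes):
    """Map each hash code to the indices of neurons whose weights share it."""
    table = defaultdict(list)
    for idx, w in enumerate(weights):
        table[simhash(w, planes)].append(idx)
    return table

def sparse_forward(x, weights, table, planes):
    """Compute dot products only for neurons in the input's bucket."""
    active = table.get(simhash(x, planes), [])
    return {i: sum(a * b for a, b in zip(x, weights[i])) for i in active}

# Illustrative sizes only (assumptions, not values from the study).
rng = random.Random(0)
dim, n_neurons, n_bits = 16, 1000, 6
planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_neurons)]
table = build_table(weights, planes)

x = weights[42]  # an input identical to neuron 42's weight vector
out = sparse_forward(x, weights, table, planes)
print(f"evaluated {len(out)} of {n_neurons} neurons")
```

Because the input here equals neuron 42's weights, it hashes to the same bucket, so neuron 42 is always among the neurons evaluated. A practical system would likely use several tables and rebuild them as weights change during training; those details are omitted from this sketch.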

The study they'll present this week at MLSys 2021 explored whether SLIDE's performance could be improved with vectorization and memory optimization accelerators in modern CPUs.

"Hash table-based acceleration already outperforms GPU, but CPUs are also evolving," said study co-author Shabnam Daghaghi, a Rice graduate student. "We leveraged those innovations to take SLIDE even further, showing that if you aren't fixated on matrix multiplications, you can leverage the power in modern CPUs and train AI models four to 15 times faster than the best specialized hardware alternative."

Study co-author Nicholas Meisburger, a Rice undergraduate, said, "CPUs are still the most prevalent hardware in computing. The benefits of making them more appealing for AI workloads cannot be understated."


Additional co-authors include Mengnan Zhao of Rice, Sameh Gobriel and Charlie Tai of Intel, and Yong Wu, formerly of Intel and now with Ant Labs.

The research was supported by the National Science Foundation (1652131, 1838177), the Air Force Office of Scientific Research (FA9550-18-1-0152) and the Office of Naval Research.

The MLSys 2021 paper on SLIDE is available at:

Related research from Rice:

Bad news for fake news: Rice research helps combat social media misinformation - Dec. 10, 2020

Deep learning rethink overcomes major obstacle in AI industry - March 2, 2020

Rice, Amazon report breakthrough in 'distributed deep learning' - Dec. 9, 2019

Rice U. scientists slash computations for 'deep learning' - June 1, 2017

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation's top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 3,978 undergraduates and 3,192 graduate students, Rice's undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction and No. 1 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger's Personal Finance.
