Rochester Institute of Technology professors have received a National Science Foundation award to develop a hands-on data science course for non-computing majors. The course will first be offered at RIT and then across the country, in an effort to promote computing for all.
RIT plans to start offering the new Data Science Principles course in fall 2021. During the course, students from many different backgrounds can learn how to extract knowledge and patterns from data that's important to their field -- and all without needing to know how to program or code.
The demand for data science knowledge stretches beyond computing to many disciplines that have become data-driven, including business, engineering, science, healthcare, and the humanities. According to an IBM study, the number of data science and analytics job listings will grow to about 2.7 million in the U.S. this year, a demand that won't be met unless universities begin teaching data science to non-computing majors within their own domains.
"It is hard to think of any discipline that does not need data science, which can help you discover useful models and make predictions," said Xumin Liu, an associate professor of computer science who is leading the development of the course. "However, even if you throw data at a problem, you need to be able to read it and make appropriate decisions from it -- that is the science part."
Liu and Rajendra Raj, a professor of computer science, received a nearly $300,000 NSF grant to develop a curriculum that can attract non-computing majors. The challenge is creating a class that doesn't require a background in coding or a long chain of prerequisite courses in computer science and mathematics.
"I was actually inspired to develop this course by my elementary school-aged daughter, who was doing a project to study sleep and video game playing time," said Liu. "I realized that just because you don't know how to program in languages like R or Python, that should not stop you from being able to analyze data and solve problems."
To address this issue, the professors are creating the Data Science Learning Platform, a web-based graphical user interface (GUI) that lets users query and analyze data with data science tools, without requiring them to write code. It will also teach the computational aspects of data science through real-time code exemplification and a sandbox where coding can be used if needed. The project also includes a Data Science Course Module that provides the curricular materials needed to use the platform effectively.
The platform will include a help center that explains different computer science and data science terms. It will also feature in-house data sets and the capability for users to upload their own data.
"It's really important that these exercises are hands-on and customizable for each student," said Liu. "The intention is not to make these students computer scientists, but rather to help them appreciate and apply basic data science concepts within their own disciplines."
At RIT, the Data Science Principles course will be available to students outside the Golisano College of Computing and Information Sciences who have already taken the Principles of Computing course for non-computing majors (CSCI-101 or ISCH-110) or Advanced Placement (AP) Computer Science Principles. The plan is to make the course part of the Principles of Computing immersion, offered through the School of Information.
The professors said that many students will find this course useful for their majors.
"For example, health science students can apply data science to process and analyze large-scale electronic health records to quickly discover life-threatening patterns that are useful for diagnosis and treatment," said Liu. "Students in business can leverage data science skills to finalize informed financial decisions and gain a competitive advantage. Engineers can make use of data to understand the impact of different models and establish the best ones for actual implementation."
Once created, this will be one of the first data science courses available to entry-level students in all non-computing majors. The University of California, Berkeley, is another institution that has designed a popular foundations of data science course.
To make a broader impact, the professors will host workshops for faculty at other universities who want to offer the course at their own institutions. The Data Science Learning Platform will be free and open source.
In the future, they plan to create a version of the course for an even wider audience on edX, the massive open online course provider created by Harvard and MIT. They also hope to modify the curriculum and platform for K-12 students.