Podcast host Jorge Salazar interviews Xian-He Sun, Distinguished Professor of Computer Science at the Illinois Institute of Technology.
What if scientists could realize their dreams with big data? On the one hand you have parallel file systems for number crunching. On the other, you have Hadoop file systems, made for cloud computing with data analytics. The problem is that one doesn't know what the other is doing. You have to copy files from the parallel file system to Hadoop. Doing that is so slow it can turn a supercomputer into a super slow computer.
In 2015, computer scientists developed a way for parallel and Hadoop file systems to talk to each other. It's a cross-platform Hadoop reader called PortHadoop, short for portable Hadoop. The scientists have since improved it, and it's now called PortHadoop-R. It's good enough to start work with real data in the NASA Cloud library project. The data are used for real-time forecasts of hurricanes and other natural disasters, and also for long-term climate prediction.
A supercomputer at TACC helped the researchers develop PortHadoop-R. The system is called Chameleon, a cloud testbed funded by the National Science Foundation. Chameleon is a large-scale, reconfigurable environment for cloud computing research co-located at the Texas Advanced Computing Center and also at the University of Chicago.
Chameleon gives researchers 'bare-metal access,' the ability to change and adapt the supercomputer's hardware and customize it to improve reliability, security, and performance.
Sun's PortHadoop research was funded by the National Science Foundation and the NASA Advanced Information Systems Technology Program (AIST).
Feature Story: www.tacc.utexas.edu/-/reaching-for-...-with-chameleon
Music Credits: Raro Bueno, Chuzausen freemusicarchive.org/music/Chuzausen/