Feature Story | 30-Apr-2002

ORNL, IBM, and the Blue Gene Project

DOE/Oak Ridge National Laboratory

ORNL is involved in a cooperative research and development agreement with IBM to help develop the Blue Gene supercomputer that will improve our understanding of how living cells work. This supercomputer will use advanced cellular architectures to allow 1000 trillion calculations per second (petaflops computing). ORNL researchers will write programs to help the machine run effectively. (Ross Toedte)

Advanced cellular architecture in the next-generation supercomputer will help scientists better understand the makeup and purpose of different genes and proteins in living cells.

Massive computing power and the intricacies of biological matter at the molecular level will converge under a cooperative research and development agreement (CRADA) announced August 22, 2001, by ORNL and IBM and funded by IBM and the Department of Energy.

At the heart of the agreement is IBM’s Blue Gene research project, which combines advanced protein science with IBM’s next-generation cellular architecture supercomputer design. Unlike today’s computers, cellular servers will run on chips containing “cells,” which are processors that contain memory and communications circuits. Cellular architecture will help scale computer performance from a teraflop (1 trillion calculations per second) to a petaflop (1000 trillion calculations per second).

The new supercomputer will be a petaflop machine. The fastest existing computer, ASCI White, unveiled by IBM in early August 2001, can perform about 12 trillion calculations per second, or 12 teraflops, so a petaflop machine would be roughly 80 times faster. ASCI White is being used for nuclear weapons stockpile stewardship research at DOE’s Lawrence Livermore National Laboratory. IBM, also known as Big Blue, began its five-year, $100 million Blue Gene project at the end of 1999; its goal is to create a supercomputer that can handle large-scale computing projects.

Supercomputing power of this magnitude (1 petaflop) will improve scientists’ ability to predict future climate, advance the field of nanotechnology, and gain a better understanding of how gene sequences and the folding of proteins relate to diseases.

“Proteins control all processes occurring in the cells of the body,” says Joe Jasinski, manager of the Computational Biology Center for IBM Research. “These proteins are made up of a vast array of different combinations of amino acids that fold and bend into very complex, three-dimensional shapes that determine the exact function of each protein. If the shape of a protein changes because of some environmental, physical, or biological factor, the protein may turn from being beneficial to one that causes a specific disease.”

Understanding how proteins fold is a recognized “grand challenge problem” of great interest to the life sciences. Knowledge gained from protein-folding research could be applied to a variety of problems of scientific and commercial interest, including protein-drug interactions, enzyme catalysis, and refinement of protein structures created through other methods.

“Our collaboration with Oak Ridge National Laboratory is vital to IBM’s work to extend the boundaries for applications of large-scale computing, focusing on the combination of IBM and ORNL’s deep scientific capabilities,” says David McQueeney, vice president of Emerging Business for IBM Research. “Together we have built a common roadmap for an ambitious, multi-year evolution of the simulation and modeling of many complex systems. We are confident that we will break new ground in several domains, including life sciences.”

“The complexity of the protein-folding problem, nanoscale science, and climate dynamics will require computational resources at a scale not yet achieved by any scientific application,” says Thomas Zacharia, ORNL’s associate laboratory director for Computing and Computational Sciences. “This is an exciting next step in ORNL’s history of evaluating new computational architectures and pushing the computational science envelope.”

Before it will be possible to solve problems in biology, climate, and nanotechnology, scientists must devise methods to run applications that use tens of thousands of processors in the Blue Gene supercomputer. Each processor forms a cell with memory, communication, and input/output built in. This approach departs from past designs and offers a glimpse of what’s to come in high-performance computing.

“The world of supercomputing is rapidly changing,” says Ed Oliver, associate director in the Department of Energy’s Office of Advanced Scientific Computing Research. “We need to develop approaches to solving computational problems that are able to scale to thousands of processors and at the same time be tolerant of failures of some of these processors.”

Working with IBM, ORNL researchers led by Al Geist of the Computer Sciences and Mathematics Division will develop fault-tolerant algorithms that allow the Blue Gene supercomputer to work around processors that fail, along with other capabilities to ensure that the machine operates effectively. ORNL scientists led by Ying Xu of the Life Sciences Division will collaborate with IBM on how the supercomputer should be programmed to analyze proteins and predict their structures.

IBM and ORNL hope to use this enormous computing power to explore numerous other areas, as well. This effort merely represents the beginning of what is expected to be a long relationship.

###
