Jefferson Lab's newest cluster computer takes shape
Chip Watson, JLab's High Performance Computing Group leader, stands next to the new cluster computer, 6N, which consists of eight racks of 35 machines and one rack with one machine and the central switching station.
Jefferson Lab's spacious new Computer Center in CEBAF Center Wing F is already hosting its first occupants. Among the machines now located in the expansive room is Jefferson Lab's newest cluster computer, dubbed "6N."
Unlike a regular computer, whose "brain" consists of one or perhaps two processors, a cluster computer's brain can contain hundreds or even thousands of individual processors, called nodes, all wired together. To solve a problem, the cluster splits it into parts; each node computes its designated part and shares the result with the other nodes to produce the final solution. This method allows cluster computers to tackle problems well beyond the capability of most desktop machines, such as providing solutions for the theory of quantum chromodynamics (QCD), which describes the strong force that dictates how quarks and gluons build protons, neutrons and other particles.
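The split, compute and combine pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software that runs on JLab's clusters: the "nodes" here are worker threads on a single machine, and the function names and the sum-of-squares task are hypothetical stand-ins for a real physics calculation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(values, n_nodes):
    """Divide the work into n_nodes nearly equal contiguous parts."""
    size, extra = divmod(len(values), n_nodes)
    parts, start = [], 0
    for i in range(n_nodes):
        # The first `extra` parts each take one additional item.
        end = start + size + (1 if i < extra else 0)
        parts.append(values[start:end])
        start = end
    return parts


def node_compute(part):
    """Work done by one hypothetical node: a partial sum of squares."""
    return sum(x * x for x in part)


def cluster_sum_of_squares(values, n_nodes=4):
    """Split the problem, let each worker compute its part, then combine."""
    parts = split_into_parts(values, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partial_results = list(pool.map(node_compute, parts))
    # Combining the partial results yields the final solution.
    return sum(partial_results)


if __name__ == "__main__":
    print(cluster_sum_of_squares(list(range(10)), n_nodes=4))
```

On a real cluster the parts would be sent to separate machines over the network, which is why the speed of the connections between nodes matters so much.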
The new cluster joins JLab's 128-node 2002 machine, the 256-node 2003 machine (3G), and the 384-node 2004 machine (4G). According to Chip Watson, who leads Jefferson Lab's High Performance Computing Group, the first pieces of the new 6N cluster began arriving in December 2005. It comprises 281 dual-core Intel processors, and it runs the popular Linux operating system, just like its predecessors.
However, each 6N node contains one processor with two processing cores, giving each node a sustained computing capability of 2.5 gigaflops (roughly 2.5 billion floating-point operations per second). The machine also incorporates new InfiniBand connections, which can carry data from one node to the next about 10 times faster than the Gigabit Ethernet connections used in JLab's last two cluster machines.
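The figures quoted so far can be checked with simple arithmetic. The sketch below uses the rack counts from the photo caption and the per-node rate given above. The aggregate figure it produces is not stated in the article; it is an idealized upper bound that assumes every node sustains its full rate at the same time, which real workloads rarely achieve.

```python
# Figures taken from the article.
FULL_RACKS = 8
NODES_PER_FULL_RACK = 35
NODES_IN_SWITCH_RACK = 1
SUSTAINED_GFLOPS_PER_NODE = 2.5


def total_nodes():
    """Count the nodes across all nine racks."""
    return FULL_RACKS * NODES_PER_FULL_RACK + NODES_IN_SWITCH_RACK


def ideal_aggregate_gflops(nodes, gflops_per_node):
    """Idealized total (an assumption): per-node rate times node count."""
    return nodes * gflops_per_node


if __name__ == "__main__":
    nodes = total_nodes()
    print(nodes)  # 281, matching the processor count in the article
    print(ideal_aggregate_gflops(nodes, SUSTAINED_GFLOPS_PER_NODE))  # 702.5
```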
"This is the first year that InfiniBand has been cost-effective," Watson says. "The InfiniBand fabric runs at a gigabyte/second in each direction, which is comfortably more bandwidth than we need today." The overall result of these upgrades is a machine that is more powerful than all three of the previous machines combined and essentially doubles JLab's high-performance computing capacity.
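To get a feel for the bandwidth figures, the sketch below compares idealized transfer times at the quoted InfiniBand rate of one gigabyte per second with the nominal Gigabit Ethernet line rate of one gigabit per second (125 megabytes per second). It is a rough estimate under simplifying assumptions: latency and protocol overhead are ignored, and the 1 GB message size is a hypothetical example. On raw line rates alone the ratio comes out at 8, in the same range as the "about 10 times" figure quoted earlier.

```python
# Quoted in the article: one gigabyte per second in each direction.
INFINIBAND_BYTES_PER_SEC = 1e9
# Nominal Gigabit Ethernet line rate: 1e9 bits per second, 8 bits per byte.
GIGABIT_ETHERNET_BYTES_PER_SEC = 1e9 / 8


def transfer_seconds(size_bytes, bytes_per_second):
    """Idealized transfer time, ignoring latency and protocol overhead."""
    return size_bytes / bytes_per_second


if __name__ == "__main__":
    message_bytes = 1e9  # hypothetical 1 GB exchange between two nodes
    t_ib = transfer_seconds(message_bytes, INFINIBAND_BYTES_PER_SEC)
    t_eth = transfer_seconds(message_bytes, GIGABIT_ETHERNET_BYTES_PER_SEC)
    print(t_ib, t_eth, t_eth / t_ib)
```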
Watson says his group is currently testing a new version of the InfiniBand software and optimizing it for 6N. In the meantime, the group is also preparing for a massive new machine scheduled to arrive next winter. "We'll be hosting the next large cluster for the U.S. QCD community: a $1.5 million machine with 500 nodes that will be twice as powerful as our new machine," Watson explains.
Jefferson Lab's cluster computer program is funded through the Scientific Discovery through Advanced Computing (SciDAC) program in the Department of Energy's (DOE's) Office of Science, with additional funding from JLab's Nuclear Physics program. SciDAC is an effort to develop the scientific computing software and hardware infrastructure needed to use terascale computers to advance DOE research programs in basic energy sciences, biological and environmental research, fusion energy sciences, and high-energy and nuclear physics.
The Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time.