Advanced computing and high energy physics for the 21st century
Physicists and computer scientists can stake a larger claim on the future of high-energy physics, and on the next generation of computing, through their part in the first-ever awards in DOE's Scientific Discovery through Advanced Computing (SciDAC) program.
Fermilab's Feynman Computing Center, with Wilson Hall in the background.
December 10—Through the SciDAC awards, Fermilab will receive approximately $1.28 million a year for the next three years as a participant in three nationwide collaborations: the Particle Physics DataGrid; Advanced Computing for 21st Century Accelerator Science and Technology; and the National Computational Infrastructure for Lattice Gauge Theory.
Together, the efforts share ambitious scientific goals: creating computer tools that will allow physicists to work at their home base with up-to-the-second experimental data from sources anywhere in the world, and adapting those access tools to design the high-energy physics discovery machines of the future more efficiently and economically.
"We know that there are basic questions of nature that can only be answered by building new accelerators to reach higher energy levels," said Fermilab Computing Division Head Matthias Kasemann. "These facilities will absolutely require world-wide collaboration, and computing is a critical element in making this collaboration work."
As the major next step, the Particle Physics Data Grid, Advanced Computing, and Lattice Gauge Theory collaborations will help create a new generation of scientific simulation codes for "terascale" computers: computers capable of performing trillions of floating-point operations per second ("teraflops") while handling trillions of bytes of data ("terabytes").
The collaborations will integrate terascale computing into developing the concept of a "collaboratory," or true collaborative laboratory. Scientists at far-flung institutions could work as if they were side-by-side at a central site, with the collaboratory distributing data for sharing in the analysis of particle physics experiments; in the design and development of future particle accelerators in new energy realms; and in the continuing expansion of theoretical principles and calculations in quantum chromodynamics (QCD), the sector of the Standard Model that describes the strong forces between particles.
"The requirements of high-energy physics push computing technology to ever-higher levels and offer extraordinary opportunities for collaborative efforts," said Fermilab Director Mike Witherell. "SciDAC represents a promising new approach to integrating science and computing at the Department of Energy."
Lattice gauge: Calculating a strategy
The elementary particles called quarks are held together by the strong force. The theory that describes that force is quantum chromodynamics (QCD), and the computational method used to study it is lattice gauge theory. Studying quarks and their binding force is a continuing challenge.
"The only way to get meaningful results is through large-scale numerical simulations," said theorist Robert Sugar of the University of California at Santa Barbara, principal investigator of the Lattice Gauge Theory collaboration.
"In recent years, there have been major advances in the methods for these calculations," Sugar continued, "and the Fermilab theory group has been a leader in developing the technologies to make these calculations feasible. But, to be meaningful, these methods require big increases in computing power."
Fermilab physicist Paul Mackenzie is one of the institutional leaders in the Lattice Gauge Theory collaboration, which has 65 members at 11 institutions (Fermilab, University of Illinois at Urbana-Champaign, University of California-Santa Barbara, M.I.T., Boston University, Columbia University, University of Washington, University of Utah, University of Arizona, Brookhaven National Lab, Thomas Jefferson Lab): "essentially the entire U.S. lattice gauge community," Mackenzie said.
"Fermilab's role will be the development of large, cost-effective clusters of commodity computers for lattice calculations," Mackenzie said. "Essentially, these are 'off-the-shelf' computers. But we will be integrating the machines to make them perform together."
Fermilab's collider experiments are experienced at using computer clusters and have continued developing the concept for Collider Run II of the Tevatron. The Lattice Gauge Theory effort, already underway at the New Muon Lab on site, will also focus on producing the clusters more economically. Mackenzie put the overall goal at reducing costs from the current $10 to $20 per megaflop to as little as a dollar per megaflop.
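To put those cost figures in perspective, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation only: the one-teraflop cluster size is an assumption chosen to match the "terascale" language used earlier, not a size quoted by Mackenzie, and the function name is hypothetical.

```python
# Illustrative arithmetic only; the 1-teraflop target is an assumed example size.
MEGAFLOPS_PER_TERAFLOP = 1_000_000  # 10**12 / 10**6


def cluster_cost(teraflops, dollars_per_megaflop):
    """Return the hardware cost in dollars for a cluster of the given size."""
    return teraflops * MEGAFLOPS_PER_TERAFLOP * dollars_per_megaflop


if __name__ == "__main__":
    # Dollars per megaflop: the quoted current range and the quoted goal.
    for price in (20, 10, 1):
        cost = cluster_cost(1, price)
        print(f"${price}/megaflop -> ${cost:,.0f} per teraflop")
```

Under these assumptions, a one-teraflop cluster would cost $10 million to $20 million at the current prices and about $1 million at the target price.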
Fermilab built its own homemade supercomputer called ACPMAPS for lattice QCD calculations in the last decade, and similar computers have been built at Columbia and M.I.T. But the SciDAC program was the catalyst for these various groups to get together and apply for joint funding.
"The U.S. has traditionally been a leader in the field of QCD," said Sugar, "but recently our nation has lagged behind the Europeans and Japanese in the computing facilities required for this research. The SciDAC grant will allow us to take some very important first steps in creating this needed infrastructure."
When the computer clusters are built, the huge amounts of data they generate must be stored somewhere, and physicists must be able to find what they need. Another $35,000 annually will go to an effort in Storage Resource Management: for example, placing the data on tapes and using robots to carry out physicists' commands for search and retrieval.
"It means matching tens of thousands of tape cartridges to tens of thousands of tape drives, and doing it quickly and reliably," said Donald Holmgren of Fermilab's Computing Division. "We'll be developing software for a physicist to issue a command to find the kind of data needed, instead of 'Go To Tape Number 123456.'"
The grid: Crossing borders and boundaries
The World Wide Web was developed to exchange information among particle physicists, but particle physics experiments now generate more data than the Web can handle. So physicists often put data on tapes and ship the tapes from one place to another—an anachronism in the Age of the Internet. But that's changing, and SciDAC will accelerate the change.
A major element of Fermilab's contributions to the Compact Muon Solenoid (CMS) detector, being built for CERN, the European Particle Physics Laboratory, is the formulation of a distributed computing system for widespread access to data when CERN's Large Hadron Collider begins operation later this decade. Fermilab's DZero experiment has established its own computing grid, called SAM, which offers access to experiment collaborators at six sites in Europe.
With SciDAC support, the nine-institution Particle Physics DataGrid collaboration (Fermilab, SLAC, Lawrence Berkeley Lab, Argonne, Brookhaven Lab, Thomas Jefferson National Accelerator Facility, Caltech, University of Wisconsin, University of California-San Diego) will develop the distributed computing concept for particle physics experiments at the major U.S. high-energy physics research facilities. Both DZero and the US/CMS collaboration are member experiments. The goal: to offer access to the worldwide research community, developing what's called "middleware" to make maximum use of the bandwidth available on the network.
For example, middleware can act as a sort of "search engine" for resources when a physicist isn't sure of the location of needed data. Middleware can determine the best access point for the data, and the best way to transmit it over the network. Instead of using one process at one speed, middleware can manage different processes at different speeds simultaneously to make the best use of the available bandwidth.
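The three middleware roles described above (locating data, choosing an access point, and running transfers at different speeds side by side) can be sketched as follows. This is a toy model under stated assumptions, not the DataGrid design: the site names, bandwidth figures, and function names are all hypothetical.

```python
# Hypothetical sketch: site names, bandwidths, and function names are invented.

# Replica catalog: dataset name -> {site holding a copy: estimated bandwidth in MB/s}
REPLICAS = {
    "run2-sample": {"site-a": 5.0, "site-b": 20.0, "site-c": 15.0},
}


def locate(dataset):
    """'Search engine' step: list the sites that hold a copy of the dataset."""
    return sorted(REPLICAS.get(dataset, {}))


def best_site(dataset):
    """Pick the single site with the highest estimated bandwidth."""
    sites = REPLICAS[dataset]
    return max(sites, key=sites.get)


def split_transfer(dataset, size_mb):
    """Divide a transfer across all sites in proportion to their bandwidth,
    so streams running at different speeds finish at about the same time."""
    sites = REPLICAS[dataset]
    total = sum(sites.values())
    return {site: size_mb * bw / total for site, bw in sites.items()}


if __name__ == "__main__":
    print(locate("run2-sample"))
    print(best_site("run2-sample"))
    print(split_transfer("run2-sample", 4000))
```

Splitting in proportion to bandwidth is only one possible policy. A real system would also have to account for changing network conditions, failures, and competing users, none of which this sketch models.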
The Particle Physics Data Grid collaboration will serve high-energy physics experiments with large-scale computing needs, such as DZero at Fermilab, BABAR at SLAC and the CMS experiment, now under construction to operate at CERN, by making the experiments' data available to scientists at widespread locations.
"This is a very exciting opportunity for people in high-energy physics software development to collaborate with leaders in the computer science field at the universities," said Ruth Pordes of Fermilab's Computing Division, who serves as the collaboration coordinator. "All these proposals offer a new dimension to collaboration across laboratories and across technical and scientific domains."
Kasemann described Fermilab as providing the "test bed" as the distributed computing concept meets the real-world demands of high-energy physics. Fermilab's experiments will offer the first implementation and the first trials of DataGrid, but Kasemann pointed out that the benefits aren't directed only at decentralized experimenters: access to distributed computing power will expand computing capabilities beyond those available at a single central site.
"We as physicists need this greater computing power to fulfill our mission," Kasemann said.
Advanced computing for 21st century accelerators
The accelerators that fuel the demand for expanded computing power also require more computing power for their own design and operation. The collaboration on Advanced Computing for 21st Century Accelerator Science and Technology involves 14 institutions (Fermilab, SLAC, Los Alamos National Lab, Lawrence Berkeley National Lab, Brookhaven National Lab, Thomas Jefferson Lab, Stanford University, UCLA, University of California-Berkeley, University of Maryland, USC, Tech-X, SNL, University of California-Davis), developing high-performance simulation codes to use existing accelerators more efficiently and help streamline the design of future accelerators.
The Fermilab effort will develop simulation software for improving the performance of the lab's accelerators and will contribute to design studies for future machines and improvements, including a proton driver upgrade, ionization cooling for neutrino factories and muon colliders, and the Linear Collider, one of the possibilities for the next generation of high-energy physics accelerators.
"It's more cost-effective to simulate accelerators than to build prototypes that may require several changes," said Fermilab physicist Panagiotos Spentzouris. "It's also important to note the opportunities that are opened up for students. I have two computer science students from Kansas State University working with me."
All told, the DOE SciDAC awards will offer $57 million to 51 projects in this fiscal year, advancing fundamental research in such areas as climate modeling, fusion energy sciences, chemical sciences and nuclear astrophysics as well as high-energy physics and high-performance computing. SciDAC is an integrated program to help create a new generation of scientific simulation codes.