A deep neural network running on an ordinary desktop computer is interpreting highly technical data related to national security as well as - and sometimes better than - today's best automated methods or even human experts.
The progress in tackling some of the most complex problems of the environment, the cosmos and national security comes from scientists at the Department of Energy's Pacific Northwest National Laboratory who presented their work at the 11th MARC conference - Methods and Applications of Radioanalytical Chemistry - in April in Hawaii. Their work employs deep learning, in which machines are enabled to learn and make decisions without being explicitly programmed for all conditions.
The research probes incredibly complex data sets from the laboratory's shallow underground lab, where scientists detect the faintest of signals from a planet abuzz with activity. In the laboratory, buried 81 feet beneath concrete, rock and earth, thick shielding dampens signals from cosmic rays, electronics and other sources. That allows PNNL scientists to isolate and decipher signals of interest collected from anywhere on the planet.
Those signals mark events called radioactive decays, in which a particle such as an electron is emitted from an atom. The process is happening constantly, through both natural and human activity. Scientists can monitor changes in levels of argon-37, which could indicate prior nuclear test activity, and argon-39, whose levels help scientists determine the age of groundwater and learn more about the planet.
The lab has accumulated data on millions of radioactive decay events since it opened in 2010. But it's a noisy world out there, especially for scientists listening for very rare signals that are easily confused with signals of a different and frequently routine origin - for instance, a person flipping on a light switch or receiving a call on a cell phone.
PNNL scientist Emily Mace, who presented at MARC, is an expert in interpreting the features of such signals - when an event might indicate underground nuclear testing, for example, or a rapidly depleting aquifer. Much as physicians peruse X-rays for hints of disease, Mace and her colleagues regularly pore over radioactive decay event data to interpret the signals - their energy, timing, peaks, slopes, duration, and other features.
"Some pulse shapes are difficult to interpret," said Mace. "It can be challenging to differentiate between good and bad data."
Recently Mace and colleagues turned for input to their colleagues who are experts in deep learning, an exciting and active subfield of artificial intelligence. Jesse Ward is one of dozens of deep learning experts at the lab who are exploring several applications through PNNL's Deep Learning for Scientific Discovery Agile Investment. Mace sent Ward information on nearly 2 million energy pulses detected in the Shallow Underground Laboratory since 2010.
Ward used a clean sample set of 32,000 pulses to train the network, inputting many features of each pulse and showing the network how the data was interpreted. Then he fed the network thousands more signals as it taught itself to differentiate between "good" signals that showed something of interest and "bad" signals that amounted to unwanted noise. Finally, he tested the network, feeding it increasingly complex sets of data that are difficult even for experts to interpret.
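The train-then-test workflow described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not Ward's network: the pulse shapes are synthetic, the two features (normalized area and width above half peak) and every function name are invented for the example, and the tiny one-hidden-layer network relies only on Python's standard library.

```python
import math
import random


def make_pulse(good, rng, n=32):
    """Return a synthetic waveform (invented shapes, not PNNL data)."""
    peak = rng.uniform(0.8, 1.2)
    if good:
        # Linear rise over 5 samples, then a smooth exponential decay.
        wave = [peak * t / 5 if t < 5 else peak * math.exp(-(t - 5) / 8.0)
                for t in range(n)]
    else:
        # A single-sample spike, standing in for electronic noise.
        wave = [0.0] * n
        wave[rng.randrange(n)] = peak
    return [v + rng.gauss(0, 0.05) for v in wave]


def features(wave):
    """Two illustrative features: normalized area and width above half peak."""
    peak = max(wave)
    area = sum(wave) / (peak * len(wave))
    width = sum(1 for v in wave if v > peak / 2) / len(wave)
    return [area, width]


def make_dataset(count, rng):
    """Build labeled examples, alternating good (1.0) and bad (0.0) pulses."""
    X, y = [], []
    for i in range(count):
        good = i % 2 == 0
        X.append(features(make_pulse(good, rng)))
        y.append(1.0 if good else 0.0)
    return X, y


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def forward(model, x):
    """Return hidden activations and the predicted probability of 'good'."""
    W1, b1, W2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    p = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, p


def train(X, y, hidden=4, epochs=100, lr=0.5, seed=0):
    """Stochastic gradient descent on cross-entropy loss."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, p = forward((W1, b1, W2, b2), x)
            d_out = p - t  # loss gradient at the output, before the sigmoid
            for j in range(hidden):
                # Backpropagate through tanh before updating W2[j].
                d_h = d_out * W2[j] * (1.0 - h[j] ** 2)
                W2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_h
                for i in range(n_in):
                    W1[j][i] -= lr * d_h * x[i]
            b2 -= lr * d_out
    return W1, b1, W2, b2


rng = random.Random(42)
train_X, train_y = make_dataset(200, rng)   # labeled training pulses
test_X, test_y = make_dataset(100, rng)     # held-out pulses for testing
model = train(train_X, train_y)
correct = sum((forward(model, x)[1] >= 0.5) == (t == 1.0)
              for x, t in zip(test_X, test_y))
accuracy = correct / len(test_X)
print(f"held-out accuracy: {accuracy:.3f}")
```

The sketch keeps the same three stages as the article: label a clean training set, let the network adjust its weights, then score it on pulses it has not seen. Real pulse data would be far messier than these toy waveforms.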
The network he created interprets pulse shape events with an accuracy that equals and sometimes surpasses the know-how of experts like Mace. With straightforward data, the program sorted more than 99.9 percent of the pulses correctly.
Results are even more impressive when the data is noisy and includes an avalanche of spurious signals:
- In an analysis involving 50,000 pulses, the neural network agreed 100 percent of the time with the human expert, outperforming the best conventional computerized techniques, which agreed with the expert 99.8 percent of the time.
- In another analysis of 10,000 pulses, the neural net correctly identified 99.9 percent of pulses compared to 96.1 percent with the conventional technique. Included in this analysis were the toughest pulses to interpret; with that subset, the neural network did more than 25 times better, correctly classifying 386 out of 400 pulses compared to 14 of 400 for the conventional technique.
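The figures for the toughest subset can be checked with simple arithmetic. The short Python sketch below uses only the counts reported above; the variable names are chosen for illustration.

```python
# Counts reported for the 400 hardest-to-interpret pulses.
hard_total = 400
nn_correct = 386
conventional_correct = 14

nn_accuracy = nn_correct / hard_total                      # fraction correct, neural network
conventional_accuracy = conventional_correct / hard_total  # fraction correct, conventional technique
improvement = nn_correct / conventional_correct            # ratio of correct classifications

print(f"neural network: {nn_accuracy:.1%} of the hard subset")
print(f"conventional:   {conventional_accuracy:.1%} of the hard subset")
print(f"ratio:          {improvement:.1f}x")
```

The ratio works out to roughly 27.6, which is consistent with the "more than 25 times better" figure.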
"This is a relatively simple neural network but the results are impressive," said Ward. "You can do productive work on important scientific problems with a fairly primitive machine. It's exciting to consider what else is possible."
The project posed an unexpected challenge, however: The shallow underground lab is so pristine, with most spurious noise signals mitigated before they enter the data stream, that Ward found himself asking Mace for more bad data.
"Signals can be well behaved or they can be poorly behaved," said Ward. "For the network to learn about the good signals, it needs a decent amount of bad signals for comparison."
The problem of sifting through vast amounts of data in search of meaningful signals has a raft of implications and extends to many areas of science. At PNNL, one area is the search for signals that would result from dark matter, the vast portion of matter in our universe whose origin and whereabouts are unknown. Another is the automatic detection of breast cancers and other tissue anomalies.
"Deep learning is making it easier for us to filter out a small number of good events that are indicative of the activity of interest," said Craig Aalseth, nuclear physicist and PNNL laboratory fellow. "It's great to see deep-learning techniques actually doing a better job than our previous best detection techniques."