News Release

How to preserve fleeting digital information with DNA for future generations

Peer-Reviewed Publication

American Chemical Society

BOSTON, Aug. 17, 2015 -- Hand-written letters and printed photos seem quaint in today's digital age. But there's one thing that traditional media have over hard drives: longevity. To address this modern shortcoming, scientists are turning to DNA to save unprecedented amounts of digital data for posterity. One team has demonstrated that DNA they encapsulated can preserve information for at least 2,000 years, and they're now working on a filing system to make it easier to navigate.

The researchers present their work today at the 250th National Meeting & Exposition of the American Chemical Society (ACS). ACS, the world's largest scientific society, is holding the meeting here through Thursday. It features more than 9,000 presentations on a wide range of science topics.

"If you go back to medieval times in Europe, we had monks writing in books to transmit information for the future, and some of those books still exist," says Robert Grass, Ph.D. "Now, we save information on hard drives, which wear out in a few decades."

At the same time, digital technology has spurred an explosion in the amount of information available at any given moment. Any new storage medium scientists develop to preserve even parts of our digital universe would have to be extremely compact. This is where DNA comes in.

"A little after the discovery of the double helix architecture of DNA, people figured out that the coding language of nature is very similar to the binary language we use in computers," says Grass, who is with ETH Zurich. "On a hard drive, we use 0s and 1s to represent data, and in DNA, we have four nucleotides A, C, T and G."
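As a rough illustration of the analogy Grass describes, the following Python sketch packs every two bits into one of the four nucleotides and reads them back. The specific assignment (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for this example; the release does not describe the encoding the team actually used.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Translate bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna: four bases become one byte."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"ACS"
strand = bytes_to_dna(message)
print(strand)                 # 12 bases for 3 bytes
print(dna_to_bytes(strand))   # round-trips to the original bytes
```

With this naive mapping each byte occupies four bases, so the round trip is lossless only as long as no base is damaged or misread.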

But DNA has two major advantages over hard drives: size and durability. An external hard drive about the size of a paperback book can back up five terabytes of information and might last 50 years. In theory, a fraction of an ounce of DNA could store more than 300,000 terabytes. And, from archaeological finds, scientists know that DNA from hundreds of thousands of years ago can still be sequenced today.

A handful of research groups are exploring methods to take advantage of DNA's storage potential. Grass' team has encoded DNA with 83 kilobytes of text from the Swiss Federal Charter from 1291 and the Method of Archimedes from the 10th century. They encapsulated the DNA in silica spheres and warmed it to nearly 160 degrees Fahrenheit for one week, which is the equivalent of keeping it for 2,000 years at about 50 degrees Fahrenheit. When they decoded it, it was error-free.
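The one-week test stands in for millennia because chemical degradation speeds up sharply with temperature. The Python sketch below works out the speed-up implied by the rounded figures above and, assuming simple Arrhenius kinetics, the activation energy that speed-up would correspond to. The Arrhenius model, the reading of both temperatures as Fahrenheit and all variable names are assumptions for illustration; the release does not say how the team derived the 2,000-year equivalence.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def fahrenheit_to_kelvin(deg_f: float) -> float:
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


# Rounded figures quoted in the release.
hot_k = fahrenheit_to_kelvin(160.0)   # accelerated-aging temperature
cold_k = fahrenheit_to_kelvin(50.0)   # assumed storage temperature
test_days = 7.0
equivalent_days = 2000 * 365.25

# How many times faster the DNA must age in the oven than in storage.
acceleration_factor = equivalent_days / test_days

# Arrhenius: k_hot / k_cold = exp((Ea / R) * (1/T_cold - 1/T_hot)),
# solved here for the activation energy Ea.
implied_ea = (GAS_CONSTANT * math.log(acceleration_factor)
              / (1.0 / cold_k - 1.0 / hot_k))

print(f"acceleration factor: {acceleration_factor:,.0f}x")
print(f"implied activation energy: {implied_ea / 1000:.0f} kJ/mol")
```

Because the inputs are rounded ("nearly 160", "about 50"), the result is only a back-of-the-envelope consistency check, not a measured value.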

Now that the researchers have demonstrated how to synthetically preserve DNA for long periods of time, they're tackling the next challenge.

"In DNA storage, you have a drop of liquid containing floating molecules encoded with information," Grass says. "Right now, we can read everything that's in that drop. But I can't point to a specific place within the drop and read only one file." So, he and his colleagues are currently developing ways to label specific pieces of information on DNA strands to make them searchable.

Like many technologies in their early years, DNA storage comes with a hefty price tag. Encoding and saving a few megabytes of data costs thousands of dollars, Grass says. In other words, consumers won't have the option of buying DNA-based hard drives anytime soon.

So what will this technology accomplish? Grass says that question has yet to be answered. If it were up to him, he says he would take data snapshots of the ever-evolving Wikipedia, for example, to preserve its various iterations so they're not lost forever as users make edits. DNA storage also could preserve troves of historical texts, government documents or entire archives of private companies, all in a droplet.

"This interest in preserving information is something we have lost, especially in a digital world," he says. "And that's what I'd like to help address and encourage people to do: Save information we have today for future times."

A press conference on this topic will be held Monday, Aug. 17, at 2:30 p.m. Eastern time in the Boston Convention & Exhibition Center. Reporters may check in at Room 153B in person, or watch live on YouTube at http://bit.ly/ACSLiveBoston. To ask questions online, sign in with a Google account.

###

The researchers cite funding from ETH Zurich, Mag(net)icFun ITN and the Swiss National Science Foundation.

The American Chemical Society is a nonprofit organization chartered by the U.S. Congress. With more than 158,000 members, ACS is the world's largest scientific society and a global leader in providing access to chemistry-related research through its multiple databases, peer-reviewed journals and scientific conferences. Its main offices are in Washington, D.C., and Columbus, Ohio.

To automatically receive news releases from the American Chemical Society, contact newsroom@acs.org.

Note to journalists: Please report that this research is being presented at a meeting of the American Chemical Society.

Title

The engineering of DNA for the long-term storage of digital information

Abstract

The long-term preservation of the vast amounts of information our modern world creates is an emerging problem. As (bio)chemical engineers we see DNA as a means of preserving large amounts of information: about 750 megabytes of genetic information are stored in every cell of our body, and theoretically one gram of DNA could store more than 300,000 terabytes of information.[1] Furthermore, it is known from archaeological studies that, if well preserved, DNA can endure for several hundred thousand years.[2]

In this presentation we will show how modern chemical and information engineering tools can be used to safeguard actual digital information in the form of DNA. To this end, we have combined the information theory concept of forward error correction with the chemical tool of DNA encapsulation.[3,4] In a first experimental validation of the idea, 83 kB of digital information were encoded with a Reed-Solomon error correction code and translated to DNA sequences (4,991 sequences, each 117 bp long). The DNA sequences were synthesized by a microarray technology and encapsulated in a silica matrix. This encapsulation resulted in very low DNA degradation rates, which were measured by accelerated aging experiments in various atmospheres. Following a simulated 2,000-year room-temperature storage of the DNA, the digital information could be recovered without error with the aid of the error correction capabilities introduced during the coding. Besides giving an insight into the state of the art of information preservation in DNA, we will also discuss future challenges and needs of digital data preservation in the form of chemical information.
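The figures in the abstract give a feel for how much of the synthesized DNA carries payload and how much is left for error correction and other overhead. The Python sketch below does that arithmetic under stated assumptions: 1 kB is taken as 1,000 bytes, two bits per base is taken as the theoretical ceiling for four nucleotides, and all 117 bases of every sequence are counted, even though some may serve as indexing or handling regions the abstract does not break out.

```python
NUM_SEQUENCES = 4991          # from the abstract
BASES_PER_SEQUENCE = 117      # from the abstract
PAYLOAD_BYTES = 83 * 1000     # assumes 1 kB = 1,000 bytes
MAX_BITS_PER_BASE = 2         # four nucleotides -> log2(4) bits at most

total_bases = NUM_SEQUENCES * BASES_PER_SEQUENCE
raw_capacity_bits = MAX_BITS_PER_BASE * total_bases
payload_bits = 8 * PAYLOAD_BYTES

# Payload bits actually carried per synthesized base.
net_bits_per_base = payload_bits / total_bases

# Share of the theoretical capacity not used for payload
# (error correction, indexing and anything else).
overhead_fraction = 1 - payload_bits / raw_capacity_bits

print(f"total bases synthesized: {total_bases:,}")
print(f"net density: {net_bits_per_base:.2f} bits per base")
print(f"non-payload share of raw capacity: {overhead_fraction:.0%}")
```

Under these assumptions a little over one bit of payload is stored per base, with the remainder of the theoretical capacity available for redundancy such as the Reed-Solomon code.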

[1] Church et al. Science 2012, 337, 1628.
[2] Meyer et al. Nature 2014, 505, 403.
[3] Paunescu et al. Nat. Protoc. 2013, 8, 2440.
[4] Grass et al. Angew. Chem. Int. Ed. 2015, 54, 2552.

