Microsoft announced that it has successfully stored 200 megabytes of data on strands of synthetic DNA, surpassing the previous record of 22 MB. In this picture, the DNA can be seen as a faint pink smear at the end of the test tube. Tara Brown Photography/University of Washington

Imagine a future where all the data in the world is stored in a device the size of a shoebox. In a significant step toward realizing this vision, Microsoft on Thursday announced that it had successfully stored 200MB of data on strands of synthetic DNA, surpassing the previous record of 22MB.

Microsoft, which worked on the project with researchers from the University of Washington and the San Francisco startup Twist Bioscience, said in a statement that the data, once encoded, occupied a space smaller than the tip of a pencil.

“It’s essentially a test tube and you can barely see what’s in it,” Karin Strauss, the principal Microsoft researcher on the project, said. “It looks like a little bit of salt was dried in the bottom.”

Scientists have previously demonstrated that DNA is a good storage medium, with its four nucleotide bases (adenine, thymine, cytosine and guanine) standing in for the 1s and 0s of a binary digital file. In a proof-of-concept experiment conducted in 2012, George Church, a molecular biologist at Harvard, encoded a 50,000-word book, which occupied less than a megabyte, into DNA by translating the binary files into strings of the four nucleotides.

For this particular project, the researchers did something similar. After the binary data had been translated into the nucleotide letters, they synthesized the corresponding DNA, turning those letters from their electronic form into the molecules themselves.
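To make the translation step concrete, here is a minimal sketch in Python of how binary data could be mapped onto the four nucleotide letters and back. The two-bits-per-base table and the function names are assumptions made for illustration; the article does not describe the encoding the Microsoft and University of Washington team actually used, and a practical scheme would also need error correction and would avoid sequences that are hard to synthesize or read.

```python
# Hypothetical mapping for illustration only: each pair of bits becomes one base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of nucleotide letters (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a string of nucleotide letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)  # prints CAGACGGC
assert decode(strand) == b"Hi"
```

Under this toy mapping every byte becomes four letters, so the two-character message above turns into an eight-letter strand that decodes back to the same bytes.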

“DNA is an amazing information storage molecule that encodes data about how a living system works. We’re repurposing that capacity to store digital data — pictures, videos, documents,” Luis Henrique Ceze, an associate professor of computer science and engineering at the University of Washington, said in the statement. “This is one important example of the potential of borrowing from nature to build better computer systems.”

In addition to occupying a minuscule fraction of the space taken up by present-day data centers, which rely on magnetic tapes for long-term storage, DNA has another advantage: it is remarkably durable. Unlike any current storage technology, DNA stays intact and readable for as long as 1,000 to 10,000 years, and sometimes even longer. For instance, scientists recently announced that they had been able to use fragments of nuclear DNA from a tooth and a leg bone to reconstruct the genomes of Neanderthals living in Spain roughly 430,000 years ago.

Although the cost of making the 200MB DNA data store is estimated to be a few thousand dollars, researchers at Microsoft said that in the coming years, the cost is likely to drop significantly, making DNA molecules a viable alternative to the devices currently being used.

“As long as there is DNA-based life on the planet, we’ll be interested in reading it,” Strauss said. “So it’s eternally relevant.”