Microsoft recently purchased over ten million strands of synthetic DNA for the purpose of advancing the newest frontier of digital data storage technology.
Digital data storage technology is constantly under pressure to advance so that more data can be stored safely; after all, the amount of digital data in the world doubles nearly every two years. This is part of the reason the last ten years have seen the further development of hard drives as well as the invention of solid-state drives, flash drives, and 3D NAND, and ongoing experiments with storing data in quartz and even with sound.
The latest iteration of this constant effort to create technology fit to store the exponential growth of digital data is perhaps more surprising than any other publicized experiment. Microsoft will be attempting to store digital data directly in DNA.
According to the tech giant, its experiments are made possible by a partnership with Twist Bioscience.
“[The] vast majority of digital data is stored on media that has a finite shelf life and periodically needs to be re-encoded. DNA is a promising storage media, as it has a known shelf life of several thousand years, offers a permanent storage and can be read for continuously decreasing costs,” explained Emily Leproust, CEO of Twist Bioscience.
Because the digital universe is expected to reach 44 trillion gigabytes by 2020, Twist Bioscience claims that DNA is the best archival solution for data storage in terms of long lifespan and high data density. After all, with only a single gram of DNA, it’s possible to store nearly one trillion gigabytes of digital data.
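To put those two figures side by side, here is a quick back-of-the-envelope calculation. It is only a rough sketch: it rounds "nearly one trillion gigabytes" per gram up to exactly one trillion, and the variable names are made up for illustration.

```python
# Figures quoted in the article; the per-gram capacity is rounded up
# from "nearly one trillion" to exactly one trillion gigabytes.
DIGITAL_UNIVERSE_GB = 44e12   # projected size of the digital universe by 2020
GB_PER_GRAM_OF_DNA = 1e12     # approximate capacity of one gram of DNA

# Grams of DNA needed to hold the whole projected digital universe.
grams_needed = DIGITAL_UNIVERSE_GB / GB_PER_GRAM_OF_DNA

if __name__ == "__main__":
    print(f"About {grams_needed:.0f} grams of DNA")
```

Under that rounding, the entire projected digital universe would fit in roughly 44 grams of DNA.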
Storing data on DNA was first proposed and proven possible by Harvard geneticist George Church. He was able to encode a book in DNA.
It’s also important to note that Twist Bioscience can create a custom DNA sequence for only 10 cents per base. The company’s R&D team is currently working around the clock to find ways to bring that cost down to two cents per base. Twist Bioscience currently mass produces synthetic DNA using its own cutting-edge machines. The company used to manufacture bits of DNA to place into microbes, which in turn would produce desirable nutrients.
How exactly is digital information written onto DNA? Put simply, the data is translated into genetic code, which is written with the four bases A, C, G, and T. These bases are the chemical building blocks of DNA, but sequences of them can also represent the binary building blocks of digital information. This has been confirmed by tests conducted jointly by Microsoft and Twist Bioscience in which 100 percent of the digital data encoded could be recovered from synthetic DNA produced with Twist Bioscience's silicon-based synthesis process.
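The idea of translating binary data into bases can be illustrated with a minimal sketch. The mapping below, two bits per base, is an assumption chosen for simplicity; the article does not describe the scheme Microsoft and Twist Bioscience actually used, and practical schemes typically add error correction and avoid long runs of the same base. The function names are hypothetical.

```python
# Hypothetical mapping: each pair of bits becomes one base.
# This is NOT the scheme used in the Microsoft/Twist tests; it is only
# the simplest way to show binary data written with A, C, G, and T.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

In this toy version every byte costs four bases, so the two-letter message "Hi" becomes an eight-base sequence that a synthesizer could, in principle, be asked to produce.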
“They give us the DNA sequence, we make the DNA from scratch,” Leproust continued. Perhaps a relief to cybersecurity activists, this process can be done without the DNA producer having any idea what the digital data actually is. Apparently even during the test with Microsoft, Twist Bioscience “[didn’t] have the decoder key, so [they had] no idea what it is.”
Doug Carmean, a member of Microsoft’s technology and research division, had this to say about the ongoing project: “We’re still years away from a commercially-viable product, but our early tests with Twist demonstrate that in the future we’ll be able to substantially increase the density and durability of data storage.”
DNA sequencing continues to drop in price as the years go on. For example, the Human Genome Project, which ran for thirteen years and ended in 2003, cost $3 billion to complete. This year, the same task could be performed for around $1,000. The falling price of these services, coupled with the American Chemical Society’s confirmation that DNA could store data for up to 2,000 years without deterioration, suggests that Microsoft might be onto something.