Researchers at Harvard have encoded an entire book in molecules of DNA. It is the first time that a whole book (53,400 words, 11 JPG images and a JavaScript program) has been encoded into DNA sequences.

DNA is a chain of nucleotides, and because there are four possible bases, each nucleotide can in principle encode two bits of data. Theoretically, just four grams of DNA would be enough to store all the digital data created in a year. In theory, DNA could hold billions of gigabytes of data, making it a gigantic hard drive that would put any computer to shame.
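The two-bits-per-nucleotide figure can be turned into a rough density estimate. The sketch below is a back-of-the-envelope calculation only: the average nucleotide mass of about 330 g/mol for single-stranded DNA is an assumption not taken from the study, and the function name is made up for illustration. It also ignores addressing, redundancy and everything else a real storage scheme would need.

```python
import math

AVOGADRO = 6.02214076e23          # molecules per mole
NUCLEOTIDE_MOLAR_MASS = 330.0     # g/mol; assumed average for single-stranded DNA
BASES = "ACGT"


def theoretical_bytes_per_gram(molar_mass: float = NUCLEOTIDE_MOLAR_MASS) -> float:
    """Upper-bound estimate of how many bytes one gram of DNA could hold."""
    bits_per_nucleotide = math.log2(len(BASES))   # four bases -> 2 bits
    nucleotides_per_gram = AVOGADRO / molar_mass
    return nucleotides_per_gram * bits_per_nucleotide / 8


if __name__ == "__main__":
    per_gram = theoretical_bytes_per_gram()
    print(f"{per_gram:.2e} bytes per gram")
    print(f"{4 * per_gram:.2e} bytes in four grams")
```

Under these assumptions the estimate comes to hundreds of billions of gigabytes per gram, which is in line with the "billions of gigabytes" figure quoted above.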

DNA has an advantage over current storage technology because information stored in it is more stable and can be read thousands of years after being encoded. The major disadvantages, at least at present, are that DNA is easily fragmented and that synthesizing it is very expensive. Until now, storing vast amounts of data in DNA has largely been the stuff of science fiction.

The idea of encoding digital information in genetic material isn't new, but the Harvard study, which describes the encoding of the book Regenesis, is exciting because “it's using some simple ideas in very elegant ways to improve the density of information that one can store,” Anne Condon, a computer scientist at the University of British Columbia in Vancouver, told Nature.

DNA contains genetic information coded by the nucleotide bases adenine (A), guanine (G), cytosine (C) and thymine (T).

The researchers encoded the book in short chains of DNA nucleotides. They first converted the text, images and JavaScript program to HTML and then to a sequence of 5.27 million 0s and 1s. Each nucleotide was used to encode a single bit, rather than the theoretical maximum of two: the bases A and C stood for 0, while G and T stood for 1.
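The one-bit-per-base mapping is simple enough to sketch in a few lines of Python. This is an illustration, not the team's software: the function names are invented, and because the article does not say how the researchers chose between the two bases available for each bit, the sketch simply picks one at random.

```python
import random

ZERO_BASES = "AC"   # either base stands for a 0 bit
ONE_BASES = "GT"    # either base stands for a 1 bit


def bits_from_bytes(data: bytes) -> str:
    """Turn raw bytes into a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)


def encode_bits(bits: str, rng: random.Random) -> str:
    """Map each bit to one nucleotide (random choice is an assumption)."""
    bases = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a bit: {bit!r}")
        bases.append(rng.choice(ONE_BASES if bit == "1" else ZERO_BASES))
    return "".join(bases)


def decode_bases(dna: str) -> str:
    """Map each nucleotide back to the bit it stands for."""
    bits = []
    for base in dna:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)


def bytes_from_bits(bits: str) -> bytes:
    """Pack a bit string (length divisible by 8) back into bytes."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    message = b"Regenesis"
    dna = encode_bits(bits_from_bytes(message), random.Random(0))
    print(dna)
    print(bytes_from_bits(decode_bases(dna)))
```

Because two different bases can represent the same bit, many different DNA sequences decode to the same data, while decoding itself stays unambiguous.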

The sequence was then broken into fragments, and each fragment was replicated many times over. Every fragment was given an address identifying where its data belongs in the book.
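One way to picture the addressing step is to cut the bit sequence into fixed-size blocks and put a binary block number in front of each one, so the pieces can be put back in order no matter how they are read. The sketch below assumes exactly that; the block and address sizes and the function names are placeholders chosen for illustration, not figures from the study.

```python
import random


def split_with_addresses(bits: str, block_bits: int = 12, address_bits: int = 4) -> list[str]:
    """Cut a bit string into blocks, each prefixed with its binary index."""
    n_blocks = -(-len(bits) // block_bits)   # ceiling division
    if n_blocks > 2 ** address_bits:
        raise ValueError("address field too small for this many blocks")
    fragments = []
    for index in range(n_blocks):
        payload = bits[index * block_bits:(index + 1) * block_bits]
        fragments.append(f"{index:0{address_bits}b}" + payload)
    return fragments


def reassemble(fragments: list[str], address_bits: int = 4) -> str:
    """Rebuild the original bit string from fragments in any order."""
    by_address = {int(f[:address_bits], 2): f[address_bits:] for f in fragments}
    return "".join(by_address[index] for index in sorted(by_address))


if __name__ == "__main__":
    original = "110100111000101011110000101"
    pieces = split_with_addresses(original)
    random.Random(1).shuffle(pieces)          # fragments come back unordered
    print(reassemble(pieces) == original)
```

The address costs extra bits in every fragment, which is the price of being able to store the data as a loose pool of short pieces rather than one long strand.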

The DNA strands were then printed onto a microchip. The chip was stored for three months, after which the DNA was dissolved and sequenced. Each block of DNA was sequenced thousands of times so that any error could be fixed by comparing it with other copies.
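Comparing many copies of the same block amounts to a majority vote at each position. The sketch below shows that idea under simplifying assumptions that are not from the study: every read of a block has the same length, the reads are already lined up, and errors are substitutions only. The function name is invented for illustration.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Return the most common base at each position across aligned reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must all have the same length")
    # zip(*reads) walks the reads column by column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


if __name__ == "__main__":
    reads = ["ACGT", "ACGT", "AGGT", "ACTT", "ACGT"]   # two reads contain one error each
    print(consensus(reads))
```

With thousands of reads per block, a stray error in any single read is easily outvoted, which is why deep sequencing keeps the final error rate low.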

The encoded book was then translated back into a digital version, and the team found that the process had introduced typos at a rate of two errors per million bits.

The technology used to write and read the book in DNA is both slow and expensive, "but the field is moving fast and the technology will soon be cheaper, faster, and smaller," said Daniel Gibson, a synthetic biologist at the J. Craig Venter Institute in Rockville, Maryland, Science reports.

The study was published in the journal Science.