Book written in DNA code | Science | The Guardian


Scientists have for the first time used DNA to encode the contents of a book. At 53,000 words, and including 11 images and a computer program, it is the largest amount of data yet stored artificially using the genetic material.

The researchers claim that the cost of DNA coding is dropping so quickly that within five to 10 years it could be cheaper to store information using this method than in conventional digital devices.

Deoxyribonucleic acid or DNA – the chemical that stores genetic instructions in almost all known organisms – has an impressive data capacity. One gram can store up to 455bn gigabytes: the contents of more than 100bn DVDs, making it the ultimate in compact storage media.

A three-strong team led by Professor George Church of Harvard Medical School has now demonstrated that the technology to store data in DNA, while still slow, is becoming more practical. They report in the journal Science that the 5.27 megabit collection of data they stored is more than 600 times bigger than the largest dataset previously encoded this way.

Writing the data to DNA took several days. “This is currently something for archival storage,” explained co-author Dr Sriram Kosuri of Harvard’s Wyss Institute, “but the timing is continually improving.”

DNA has numerous advantages over traditional digital storage media. It can be easily copied, and is often still readable after thousands of years in non-ideal conditions. Unlike ever-changing electronic storage formats such as magnetic tape and DVDs, the fundamental techniques required to read and write DNA information are as old as life on Earth.

The researchers, who have filed a provisional patent application covering the idea, used off-the-shelf components to demonstrate their technique.

To maximise the reliability of their method, and keep costs down, they avoided the need to create very long sequences of code – something that is much more expensive than creating lots of short chunks of DNA. The data was split into fragments that could be written very reliably, and was accompanied by an address book listing where to find each code section.

Written By: Geraint Jones


  1. Oh, come on, the first book to be coded this way and they chose something the team leader had written – so cravenly lacking in imagination!

    Surely it would have been more artful to choose something meaningful, like Watson and Crick’s original 1953 paper on the discovery of DNA, or Darwin’s Origin of Species? Or even the Complete Works of Shakespeare or the Iliad.

    Heck, why not even go for a joke. Code Ayn Rand’s Atlas Shrugged into DNA form and create the most selfish gene in the history of biology!

  2. Who needs authors? We will evolve our own books! Of course the editing will be a nightmare but...

  3. “DNA is such a dense storage system because it is three-dimensional. Other advanced storage media, including experimental ones such as positioning individual atoms on a surface, are essentially confined to two dimensions.”

    How does DNA store information 3 dimensionally? If they just use a binary code I don’t see how 3 dimensions (or even two for that matter) are relevant. It seems it would just be a long string of bits (0 or 1).

  4. “How does DNA store information 3 dimensionally? If they just use a binary code I don’t see how 3 dimensions (or even two for that matter) are relevant. It seems it would just be a long string of bits (0 or 1).”

    I think they’re referring to where the bits are physically stored. For example, DVDs and hard drives store information 2-dimensionally: the bits are distributed over a flat surface (as burn marks in DVDs or magnetized sectors in hard drives) and are then read back by a laser or a magnetic head.

    3-dimensional storage would be something like distributing the bits throughout an object. I remember an article in Scientific American about changing the quantum spin of the atoms inside some kind of cube-shaped crystal, letting “up” spin denote a 0, and “down” denote a 1.

    Of course, from the computer’s perspective, all data storage mechanisms will appear as a 1-dimensional string of bits, simply because that’s the most practical format for software to use.

  5. Hasn’t Dawkins speculated about a spy using DNA to encode a secret message? 90% of what is discussed on these forums has already been dealt with somewhere in his books, and usually more elegantly than we could manage!

  6. Not 0s and 1s, but As, Cs, Gs and Ts, so not binary, but Quaternary. Also the DNA molecule doesn’t exist normally as a straight line, but is massively coiled up and compacted, and I think that’s what they mean by three-dimensional.

  7. “Not 0s and 1s, but As, Cs, Gs and Ts, so not binary”  

    In the article they specifically said they were treating A, T, C and G as 0 or 1 only. Here is the quote:

    “Although DNA offers the ability to use four “numbers”: A, C, G and T, to minimise errors Church’s team decided to stick with binary encoding, with A and C both indicating zero, and G and T representing one.”

  8. Dawkins has speculated about it. It was also used in an episode of Star Trek TNG. In “The Drumhead”, a Klingon spy uses “a modified hypospray syringe to encode privileged computer information into amino acid sequences which can be injected for secret transport”…

  9. What the ****?

    I posted my comment yesterday, but now it’s gone? I really hate this Disqus system.

    Anyway, in reply to Red Dog:

    From a computer’s perspective, any data storage mechanism would appear as a 1-dimensional string of bits, simply because it’s the most practical format for software to use.

    The article, however, refers to how the bits are distributed physically through the medium. For example, DVDs store information two-dimensionally, as burn marks (I think) scattered over the surface, which can then be read back by the laser in your DVD drive.

    As for 3-dimensional storage… I do recall a Scientific American article a couple years ago, discussing a method of storing data by altering the quantum state of the atoms within some sort of cube-shaped crystal. Basically, instead of being on the surface of an object, the bits are embedded within it.

    “Not 0s and 1s, but As, Cs, Gs and Ts, so not binary, but Quaternary.”

    Except, I’m sure that when this technology becomes practical, each base would represent two bits: for example, A, C, G, T would denote “00”, “01”, “10”, and “11” respectively.

  10. Thanks, I see what they meant now. I was thinking of it differently, as if they were somehow using the 3 dimensions as part of the information, like a three-dimensional array of bits rather than just a one-dimensional string. But they just meant a difference in the physical medium.

  11. Very cool.  How about encoding “On the Origin of Species” – what a tribute to Darwin that would be!  
    Also, if possible, I’d like to have my favorite Darwin quote from Origin of Species written into part of my non-coding germ-line DNA, like a message to future generations.  

  12. Yes, I remember that story about Professor Crickson. This sounds a lot like science fiction.
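
The two encodings debated in comments 6, 7 and 9 can be compared with a short Python sketch. The one-bit mapping follows the article quote in comment 7 (A and C for zero, G and T for one); the rule used here for choosing between the two letters, which simply avoids repeating the previous base, is an assumption made for illustration, as are the function names. The two-bit mapping is the hypothetical one proposed in comment 9.

```python
# One bit per base, as quoted from the article: A or C means 0, G or T means 1.
ONE_BIT = {"0": "AC", "1": "GT"}
BASE_TO_BIT = {"A": "0", "C": "0", "G": "1", "T": "1"}

# Two bits per base, the hypothetical scheme suggested in comment 9.
TWO_BIT = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_PAIR = {base: pair for pair, base in TWO_BIT.items()}


def encode_one_bit(bits: str) -> str:
    """Encode each bit as one base, never repeating the previous base (assumed rule)."""
    bases = []
    for bit in bits:
        first, second = ONE_BIT[bit]
        bases.append(second if bases and bases[-1] == first else first)
    return "".join(bases)


def decode_one_bit(dna: str) -> str:
    """Recover the bits; either letter of a pair decodes to the same value."""
    return "".join(BASE_TO_BIT[base] for base in dna)


def encode_two_bit(bits: str) -> str:
    """Encode each pair of bits as one base, halving the sequence length."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return "".join(TWO_BIT[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_two_bit(dna: str) -> str:
    """Recover the bits from a two-bits-per-base sequence."""
    return "".join(BASE_TO_PAIR[base] for base in dna)


print(encode_one_bit("0011"))      # ACGT
print(encode_two_bit("00011011"))  # ACGT
```

The one-bit scheme needs twice as many bases for the same data, but the spare letter in each pair gives the encoder freedom to avoid awkward sequences such as long runs of a single base, which is one way a redundant code can help minimise errors.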
