By Robert Lee Hotz
In the latest attempt to corral society’s growing quantities of digital data, Harvard University researchers encoded an entire book into the genetic molecules of DNA, the basic building block of life, and then accurately read back the text.
Their experiment, reported online Thursday in Science, translated the English text of a coming book on genomic engineering into actual DNA, using the chemical ingredients of genes as a code.
In that form, a billion copies of the book could fit easily in a test tube and, under normal conditions, last for centuries, the researchers said.
The unconventional exercise—one that is a long way from being commercially viable—highlights the potential of DNA as a stable, long-term archive for ordinary information, such as photographs, books, financial records, medical files and videos.
“It shows that the vast increase in capacity to synthesize and sequence DNA can be applied to store significant amounts of data,” said pioneering synthetic biologist Drew Endy at Stanford University, who wasn’t involved in the project. “If you wanted to have your library encoded in DNA, you could probably do that now.”
Molecular biologists have long known that DNA is a natural information-storage system inside every cell that encodes the recipe for individual heredity.
DNA is a complex molecule that contains the genetic instructions for life written in a simple but powerful code made up of four chemicals called bases: adenine (A), guanine (G), cytosine (C) and thymine (T).
The exact order of those bases—which for the average person is a sequence of about three billion—determines the meaning of the biological instructions stored in genes and chromosomes, just as letters of the alphabet make up words and sentences.
But some scientists have been experimenting with ways to use that code to store other kinds of information.
In a series of recent experiments, independent research groups in the U.S., Europe and Canada devised ways to use DNA to encode trademarks and secret messages in cells. When genomics pioneer Craig Venter and his colleagues created the first synthetic cell in 2010, they wrote their names into its chemical DNA code, the way an artist might sign a painting, along with three apt literary quotations and a website address.
Other researchers used DNA to encode poetry and popular music inside the living cells of bacteria. In 2003, genetic engineers at the Pacific Northwest National Laboratory in Washington state created micro-organisms that carry the tune of Disney’s “It’s a Small World (After All)” in their DNA.
“For some archival problems, this could be the wave of the future,” said Harvard geneticist George Church, who was the project’s senior researcher. “A device the size of your thumb could store as much information as the whole Internet.”
To begin, the researchers translated the manuscript into the standard digital binary computer code of zeros and ones, used today to store videos on DVDs and photos in digital cameras, among other uses. Then they used two of DNA’s four chemical bases, A and C, to signify zero and the other two, G and T, to signify one.
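As a rough illustration of that translation step, the Python sketch below turns a short piece of text into bits and then writes one DNA base per bit. It is not the researchers' software. The function names are hypothetical, the text is assumed to be UTF-8, and the sketch assumes A or C for zero and G or T for one; the opposite assignment would work equally well provided the writer and the reader agree on it.

```python
import random

# Assumed assignment for this sketch: each bit value has two interchangeable bases.
BASES_FOR_BIT = {"0": "AC", "1": "GT"}
BIT_FOR_BASE = {base: bit for bit, bases in BASES_FOR_BIT.items() for base in bases}


def text_to_bits(text: str) -> str:
    """Return the UTF-8 bytes of the text as a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_dna(bits: str, rng: random.Random) -> str:
    """Write one base per bit, picking either of the two bases allowed for that bit."""
    return "".join(rng.choice(BASES_FOR_BIT[bit]) for bit in bits)


def dna_to_bits(dna: str) -> str:
    """Read the bases back as bits; both bases of a pair map to the same bit."""
    return "".join(BIT_FOR_BASE[base] for base in dna)


def bits_to_text(bits: str) -> str:
    """Regroup the bits into bytes and decode them as UTF-8."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    dna = bits_to_dna(text_to_bits("DNA"), random.Random(0))
    print(dna)                             # 24 bases, one per bit
    print(bits_to_text(dna_to_bits(dna)))  # the original text
```

Because two bases stand for each bit value, the same text can be written as many different DNA sequences, and all of them decode to the same bits.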
Next, they created short strands of DNA that held the coded sequence—almost 55,000 of them in all. Each strand contained a portion of the text and an address that indicated where it occurred in the flow of the book.
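The idea of cutting the text into addressed pieces can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name is hypothetical, the block and address sizes are parameters chosen for illustration, and each piece is kept as a string of bits, with the conversion to bases left as a separate step.

```python
from typing import List


def split_into_strands(bits: str, block_bits: int = 96, address_bits: int = 19) -> List[str]:
    """Cut a bit string into fixed-size blocks, each prefixed with its position in binary."""
    n_blocks = -(-len(bits) // block_bits)  # ceiling division
    if n_blocks > 2 ** address_bits:
        raise ValueError("address field too small for this many blocks")
    strands = []
    for index in range(n_blocks):
        block = bits[index * block_bits:(index + 1) * block_bits]
        block = block.ljust(block_bits, "0")          # pad the final block with zeros
        address = format(index, f"0{address_bits}b")  # fixed-width binary address
        strands.append(address + block)
    return strands


if __name__ == "__main__":
    strands = split_into_strands("1" * 200)
    print(len(strands))      # number of pieces
    print(strands[2][:19])   # address of the third piece
```

Since every piece carries its own address, the pieces can be stored and read in any order and still be put back in sequence.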
“It is a very simple way to store information,” said bioengineer Sriram Kosuri at the Wyss Institute for Biologically Inspired Engineering at Harvard, who was the project’s lead researcher. “This is sequential, like a magnetic tape, where you have to spool through stuff to get at the data.”
For the foreseeable future, however, the DNA book is expensive and time-consuming to read. It requires a series of laboratory procedures, microarray chips and a high-speed gene-sequencing machine to assemble the strands in the proper order, correct errors and read the final text.
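The software side of that reading step, putting pieces in order and correcting errors, might look something like the Python sketch below. It is an assumption-laden illustration, not the laboratory's pipeline: the function name is hypothetical, each read is assumed to be a bit string that starts with a fixed-width address, all reads of one piece are assumed to have the same length, and errors are corrected by a simple majority vote across repeated reads of the same piece.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def assemble(reads: Iterable[str], address_bits: int) -> str:
    """Group reads by address, take a per-position majority vote, and join blocks in order."""
    by_address: Dict[int, List[str]] = defaultdict(list)
    for read in reads:
        address = int(read[:address_bits], 2)
        by_address[address].append(read[address_bits:])

    blocks = []
    for address in sorted(by_address):
        copies = by_address[address]
        # zip(*copies) walks the copies column by column; the most common bit wins.
        consensus = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*copies)
        )
        blocks.append(consensus)
    return "".join(blocks)


if __name__ == "__main__":
    reads = [
        "01" + "0110",
        "00" + "1010",
        "00" + "1000",  # one misread bit
        "01" + "0111",  # one misread bit
        "00" + "1010",
        "01" + "0110",
    ]
    print(assemble(reads, address_bits=2))
```

The sketch has clear limits: a misread inside the address files a piece in the wrong place, and a piece that was never read is silently skipped rather than reported.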
The experiment, though, has set a milestone of sorts in book marketing.
Dr. Church wrote the book that was used as the text in the experiment. Called “Regenesis,” it is scheduled for more conventional commercial publication in October.