A Visual Guide to DNA Sequencing

Ella Watkins-Dulaney for Asimov Press.

When the Human Genome Project (HGP) released its initial draft sequence in 2000, President Bill Clinton hailed it as “the most wondrous map ever produced by mankind.” After roughly a decade of work, an estimated $3 billion in research costs, and a “genome war” with Craig Venter’s private company, Celera Genomics, the project had produced a nearly complete sequence of a human genome.

UK Prime Minister Tony Blair predicted that this map would yield “a revolution in medical science whose implications far surpass even the discovery of antibiotics.” (Whether this claim turned out to be true is debatable.) A few months later, the two teams — from HGP and Celera — published cover stories in Nature and Science, respectively.

Although the quest to sequence a human genome began in 1990, the techniques it used had already been in development for more than twenty years. And those DNA sequencing methods, in turn, were directly inspired by protein and RNA sequencing research stretching all the way back to the 1940s.

In the twenty years after the draft human genome was first released, the average sequencing cost per genome fell roughly 100,000-fold, ending up just north of $500. In that same period, the cost to sequence a million letters, or one “megabase,” of DNA fell to six tenths of a cent. This plummeting price is due largely to technological innovation, including new sequencing chemistries, computational methods for assembling raw reads into finished genomes, and highly efficient commercial sequencing machines.

Out of the many sequencing methods developed over the decades, five are particularly important. These are their histories.

Sanger Sequencing

Fred Sanger was biology’s great decoder. A British biochemist who spent his entire career at the University of Cambridge, Sanger earned two Nobel Prizes in the same field: first, the 1958 Nobel Prize in Chemistry for creating a method to determine the amino acid sequence of proteins (most famously insulin) and, second, a share of the 1980 Nobel Prize in Chemistry for inventing methods to sequence DNA.

After winning his first Nobel, Sanger turned his gaze to RNA, seeking to become the first person to sequence a full strand. He was beaten by Cornell biochemist Robert Holley, however, who reported the full 77-nucleotide sequence of the alanine transfer RNA molecule in 1965.

Although many scientists today assume that Sanger was the first to figure out how to sequence DNA, that’s not the case. As with RNA, Sanger was edged out by a Cornell biochemist. This time it was Ray Wu, who, in 1970, published a method to “read” specific sections of two bacterial virus genomes, called λ and bacteriophage 186. Wu’s method was only capable of sequencing “cohesive ends,” short single-stranded sections of these particular phage genomes, and so wasn’t considered a “general” solution to the DNA sequencing problem. In 1974, Wu’s lab refined this technique into the first general sequencing method, but it proved extremely labor-intensive and failed to catch on.
