Perspective

Emerging technologies in DNA sequencing


Michael L. Metzker
Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston,
Texas 77030, USA

Demand for DNA sequence information has never been greater, yet current Sanger technology is too costly, time
consuming, and labor intensive to meet this ongoing demand. Applications span numerous research interests,
including sequence variation studies, comparative genomics and evolution, forensics, and diagnostic and applied
therapeutics. Several emerging technologies show promise of delivering next-generation solutions for fast and
affordable genome sequencing. In this review article, the DNA polymerase-dependent strategies of Sanger
sequencing, single nucleotide addition, and cyclic reversible termination are discussed to highlight recent advances
and potential challenges these technologies face in their development for ultrafast DNA sequencing.

E-mail [email protected]; fax (713) 798-5741.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3770505.
Genome Research 15:1767–1776 ©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05; www.genome.org

More than just a mapping and sequencing endeavor, the Human Genome Project (HGP) has altered the mindset and approach to many basic and applied research efforts. Early skepticism and controversy (Koshland 1989; Luria et al. 1989; Roberts 1989b; Fox et al. 1990) were soon laid to rest by well-developed strategies (Roberts 1989a; Collins and Galas 1993; Collins et al. 1998) that led to the successful execution of mankind’s largest biology project. At the core of the HGP was technology development that advanced the pace of sequencing a mammalian-size genome from years to months. Along the way, numerous strategies emerged that hold promise for rapid, efficient, and inexpensive delivery of DNA sequence information. For the HGP, a brute-force approach was adopted for completing the job by coupling the core technologies of Sanger sequencing and fluorescence detection. The completion of the sequencing phase could not have been accomplished without major innovations in recombinant protein engineering, fluorescent dye development, capillary electrophoresis, automation, robotics, informatics, and process management. The result was completion of a high-quality, reference sequence of the human genome in April, 2003 (Collins et al. 2003), marking the 50-year anniversary of the discovery of the double-helix structure. For many outside the genome community, that heroic milestone signaled the end of this international scientific project, but for the rest of us, it only marked the beginning of things to come.

The need for sequencing has never been greater than it is today, with applications spanning diverse research sectors including comparative genomics and evolution, forensics, epidemiology, and applied medicine for diagnostics and therapeutics. Arguably, the strongest rationale for ongoing sequencing is the quest for identification and interpretation of human sequence variation as it relates to health and disease. The most common form of variation is the single nucleotide polymorphism (SNP). Although two unrelated people share, on average, 99.9% sequence identity (i.e., one difference in a thousand base pairs), the average occurrence of an SNP in the general population is once every few hundred base pairs. As such, more than nine million unique SNPs have been cataloged in the public database, dbSNP (Crawford and Nickerson 2005), with many more expected to be found in large-scale resequencing efforts.

A great deal of attention has been focused on common SNPs with a minor allele frequency >5% and their potential role in common disease (Lander 1996; Risch and Merikangas 1996; Collins et al. 1997). Recent, large-scale genotyping efforts of these common SNPs have shown that much of the human genome can be parsed into common haplotype blocks (Daly et al. 2001; Patil et al. 2001; Gabriel et al. 2002). The International HapMap Consortium (2003) was formed to characterize common patterns of sequence variation by determining allele frequencies and the degree of association between SNPs among geographically distinct groups, leading to the identification of “tagSNPs” for genome-wide, disease-based association studies. With this method of characterization, however, rare SNPs/haplotypes may be overlooked, as highlighted by Liu et al. (2005), who described an association of rare variants/haplotypes with osteoporosis.

A shift in large-scale strategies from genotyping to resequencing is currently taking place to explore the significance of less-common SNPs to human biology and disease. The “re” in this approach is the sequencing of additional genomes related to a reference genome for de novo SNP discovery and comparative genomics application. The ENCODE Project Consortium (2004) has described significant efforts toward resequencing megabase-sized blocks of the human genome. Consequently, genome centers are now diverting at least 10%–20% of their resources, which currently translates to ∼5% capacity, to resequencing hundreds to thousands of gene regions. This increase in momentum for high-throughput resequencing will greatly facilitate studies to determine the genetic basis of susceptibility to common disease, cancer biology, and disease association in model and nonmodel organisms.

Current sequencing technologies are too expensive, labor intensive, and time consuming for broad application in human sequence variation studies. Genome center cost is calculated on the basis of dollars per 1000 Q20 bases (defined below) and can be generally divided into the categories of instrumentation, personnel, reagents and materials, and overhead expenses. Currently, these centers are operating at less than one dollar per 1000 Q20 bases, with at least 50% of the cost resulting from DNA sequencing instrumentation alone. Developments in novel detection methods, miniaturization in instrumentation, microfluidic separation technologies, and an increase in the number of assays per run will most likely have the biggest impact on reducing cost. It should be emphasized, however, that new sequencing strategies will be needed to use these high-throughput platforms effectively. In September, 2004, the National Human Genome Research Institute (NHGRI) initiated two new programs aimed at bringing the cost of whole-genome sequencing down to $100,000 (http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-002.html), with the eventual goal being $1000 (http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-003.html).

Numerous strategies and platforms for ultrafast DNA sequencing currently under development include sequencing-by-hybridization (SBH), nanopore sequencing, and sequencing-by-synthesis (SBS), the latter of which encompasses many different DNA polymerase-dependent strategies. Use of the term SBS has become increasingly ambiguous in the literature; therefore, I propose a classification of DNA polymerase-dependent strategies into three major categories: Sanger sequencing, single nucleotide addition (SNA), and cyclic reversible termination (CRT) (Text Box 1). In this review, I will focus only on DNA polymerase-dependent strategies, which represent the broadest area of research and development. For the SNA and CRT strategies, I will emphasize the chemistry in an effort to illustrate the advantages and challenges of these methods. Because of the competitive nature of technology development, the exchange of scientific ideas is often thwarted, as many companies do not readily publish results. Although this review will highlight recent advances reported in the literature, readers are directed to the Web sites of companies who are active in the sequencing field (Table 1). A recent review by Shendure et al. (2004) provides a comprehensive overview of SBH and nanopore sequencing technologies. Important issues surrounding whole-genome sequencing, such as ownership, consent, privacy, and legal, ethical, and social implications, will not be addressed here (Foster and Sharp 2002; Robertson 2003; Bonham et al. 2005).

Table 1. Companies involved in DNA sequencing technology development

Company names                                   Web site addresses
454 Life Sciences Corp.                         www.454.com
Agencourt Biosciences Corp.                     www.agencourt.com
GE Healthcare, formerly Amersham Biosciences    www.amershambiosciences.com
Applied Biosystems, Inc.                        www.appliedbiosystems.com
Genovoxx                                        www.genovoxx.de
Helicos Bioscience Corp.                        www.helicosbio.com
LaserGen, Inc.                                  www.lasergen.com
Li-Cor, Inc.                                    www.licor.com
Microchip Biotechnologies, Inc.                 www.mcbiotech.com
Nanofluidics                                    www.nanofluidics.com
SeqWright                                       www.seqwright.com
Solexa-Lynx                                     www.solexa.com
Visigen Biotechnologies, Inc.                   www.visigenbio.com

Sanger sequencing: State-of-the-art technology

The Sanger method is a mixed-mode process involving synthesis of a complementary DNA template using natural 2′-deoxynucleotides (dNTPs) and termination of synthesis using 2′,3′-dideoxynucleotides (ddNTPs) by DNA polymerase (Sanger et al. 1977). Balanced appropriately, competition between synthesis and termination processes results in the generation of a set of nested fragments, which differ in nucleoside monophosphate units. The ratio of dNTP/ddNTP in the sequencing reaction determines the frequency of chain termination, and hence the distribution of lengths of terminated chains. The nested fragments are then separated by their size using high-resolution gel electrophoresis and analyzed to reveal the DNA sequence. Advancements in fluorescence detection (Smith et al. 1986; Prober et al. 1987), enzymology (Tabor and Richardson 1989, 1995), fluorescent dyes (Ju et al. 1995; Metzker et al. 1996; Lee et al. 1997), dynamic-coating polymers and their derivatives (Ruiz-Martinez et al. 1993; Carrilho et al. 1996; Madabhushi et al. 1996, 1999; Madabhushi 1998; Salas-Solano et al. 1998; Guttman 2002a, 2002b), and capillary array electrophoresis (CAE) (Takahashi et al. 1994; Kheterpal et al. 1996) have helped to define current DNA sequencing platforms.

For automated Sanger sequencing, either the primer or the terminating ddNTP is tagged with a specific fluorescent dye (e.g., ddATP is labeled with the green dye). As these dye-labeled fragments pass through the detection region, fluorophores are excited by the laser in the DNA sequencer, producing fluorescence emissions of four different colors. The determination of the color is the underlying method for assigning a base call, and the order of the fluorescent fragments reveals the DNA sequence. The “raw” fluorescence signals, however, must be transformed. Removal of cross-talk, correction for dye mobility alterations, and normalization of emission intensities must be performed before readable DNA sequence information can be obtained (Smith et al. 1987). Base-calling and error probability assignment (Ewing and Green 1998; Ewing et al. 1998) applications are then used to call the DNA sequence and assess the accuracy of the call. A Phred20 or Q20 score, equivalent to an error probability of 1% for a given base call, is considered a high-quality base and serves as the commodity standard throughout the sequencing community.
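The link between a Phred quality score and its error probability can be made concrete with the standard Phred definition, Q = -10 log10(P), under which Q20 corresponds to the 1% error probability quoted above. The following is a minimal Python sketch; the function names are illustrative and are not taken from the Phred software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q into a per-base error probability P."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert a per-base error probability P into a Phred quality score Q."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def max_expected_errors(read_length: int, min_quality: float) -> float:
    """Upper bound on expected miscalls in a read whose bases all meet min_quality."""
    return read_length * phred_to_error_probability(min_quality)


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_probability(q):.4%}")
    # A read of 805 bases in which every base is at least Q20
    print(max_expected_errors(805, 20))
```

Because the scale is logarithmic, each additional 10 points of quality lowers the error probability tenfold, which is why a fixed threshold such as Q20 works as a simple accounting unit for sequencing output.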

Text Box 1. DNA polymerase-dependent strategies

In the broadest sense, all methods involving a DNA polymerase could be considered a SBS approach, if synthesis alone was the defining process. The defining element of these DNA polymerase-dependent methods, however, is not really synthesis at all but rather the means by which DNA synthesis terminates. From this point of view, the DNA sequencing approaches highlighted here have been organized according to their termination strategies.

Sanger sequencing and “dideoxy” sequencing are frequently used as synonymous terms. These unnatural ddNTP terminators replace the OH with an H at the 3′-position of the deoxyribose molecule and irreversibly terminate DNA polymerase activity, unless the nucleotide is removed by the process of phosphorolysis. This process is mediated by high concentrations of pyrophosphate or ATP and is a major cause of “drop-outs” in DNA sequence data.

Single nucleotide addition (SNA) methods such as pyrosequencing use limiting amounts of individual natural dNTPs to cause DNA synthesis to pause, which, unlike the Sanger method, can be resumed with the addition of natural nucleotides. Limiting the amount of a given dNTP is required to minimize misincorporation effects observed at higher concentrations. A major drawback with the SNA approach is the incomplete extension through homopolymer repeats.

Cyclic reversible termination (CRT) uses reversible terminators containing a protecting group attached to the nucleotide that terminates DNA synthesis. For the reversible terminator, removal of the protecting group restores the natural nucleotide substrate, allowing subsequent addition of reversible terminating nucleotides. One example of a reversible terminator is a 3′-O-protected nucleotide (Fig. 4B), although protecting groups can be attached to other sites on the nucleotide as well. This stepwise base addition approach, which cycles between coupling and deprotection, mimics many of the steps of automated DNA synthesis of oligonucleotides.
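The pause-and-resume behavior of single nucleotide addition can be illustrated with a small, idealized simulation. The Python sketch below rests on several assumptions: each sequence is written as the strand being synthesized (the complement of the template), extension through homopolymers is always complete, and there is no misincorporation. The function names and the example alleles are hypothetical. The first function reports how many bases are added at each dispensation, and the second checks whether two alleles of a heterozygous sample are at the same position once both have extended past the variable site.

```python
def extend(seq: str, pos: int, nucleotide: str) -> int:
    """Advance pos while the next base to be synthesized matches the dispensed dNTP."""
    while pos < len(seq) and seq[pos] == nucleotide:
        pos += 1
    return pos


def pyrogram(seq: str, dispensing_order: str, n_dispensations: int):
    """Return (dNTP, bases incorporated) for each dispensation of an idealized run."""
    pos, peaks = 0, []
    for i in range(n_dispensations):
        nt = dispensing_order[i % len(dispensing_order)]
        new_pos = extend(seq, pos, nt)
        peaks.append((nt, new_pos - pos))  # peak height is proportional to bases added
        pos = new_pos
    return peaks


def stays_synchronous(allele_a: str, allele_b: str, variable_index: int,
                      dispensing_order: str, max_dispensations: int = 200) -> bool:
    """True if both alleles sit at the same position once both have passed the SNP."""
    pos_a = pos_b = 0
    for i in range(max_dispensations):
        nt = dispensing_order[i % len(dispensing_order)]
        pos_a = extend(allele_a, pos_a, nt)
        pos_b = extend(allele_b, pos_b, nt)
        if pos_a > variable_index and pos_b > variable_index:
            return pos_a == pos_b
    raise ValueError("alleles did not extend past the variable site")


if __name__ == "__main__":
    # The GG homopolymer yields a single peak of height 2 at the G dispensation.
    print(pyrogram("TAGGC", "ACGT", 8))
    # Hypothetical A/C heterozygote at index 1, followed by a G.
    print(stays_synchronous("TAGTCA", "TCGTCA", 1, "ACGT"))
    print(stays_synchronous("TAGTCA", "TCGTCA", 1, "AGCT"))
```

In this toy case the alleles fall out of step when G, the base after the variable site, is dispensed between the two allele-specific nucleotides. The sketch ignores the limiting-dNTP effects that prevent quantitative measurement of long homopolymers in real assays.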


High-throughput DNA sequencing is conducted primarily
at large genome centers that continue to refine the sequencing
process and strive for Q20 bases at lower cost. For example, the
Baylor College of Medicine Human Genome Sequencing Center
(BCM-HGSC) produces approximately four million sequencing
reactions per month (R.A. Gibbs, pers. comm.). The current pro-
duction efficiency or pass rate is approximately 89% (after re-
moval of failed reactions, vector sequences, etc.), with sequenc-
ing reads averaging 805 Q20 bases in length. These metrics trans-
late into the equivalent of sequencing one mammalian-size
genome per month. Redundancy is required to improve the base-
calling accuracy and contiguity of assembled genomes, resulting
in the generation of six times the genome size in Q20 bases for
production of a draft-quality sequence. Thus, delivery of a mam-
malian-size, draft-quality sequence requires approximately six
months and $12 million. Ongoing advances in new technologies
will be critical to meet the goal of rapid, genome-scale sequenc-
ing for the price of $100,000 and, ultimately, $1000 per genome.
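The production figures above can be checked with simple arithmetic. The Python sketch below assumes a mammalian-size genome of roughly 3 billion bases, a value that is not stated in the text, and otherwise uses only the numbers quoted in this paragraph. The variable and function names are illustrative.

```python
# Figures quoted in the text for the BCM-HGSC
REACTIONS_PER_MONTH = 4_000_000
PASS_RATE = 0.89
Q20_BASES_PER_READ = 805
DRAFT_COVERAGE = 6            # six times the genome size in Q20 bases
DRAFT_COST_DOLLARS = 12_000_000

# Assumption (not from the text): approximate mammalian genome size in bases
GENOME_SIZE = 3_000_000_000


def q20_bases_per_month() -> float:
    """Q20 bases produced per month after removing failed reactions."""
    return REACTIONS_PER_MONTH * PASS_RATE * Q20_BASES_PER_READ


def months_for_draft() -> float:
    """Months needed to produce draft coverage of one genome."""
    return DRAFT_COVERAGE * GENOME_SIZE / q20_bases_per_month()


def dollars_per_1000_q20_bases() -> float:
    """Implied cost per 1000 Q20 bases for a draft-quality sequence."""
    return DRAFT_COST_DOLLARS / (DRAFT_COVERAGE * GENOME_SIZE) * 1000


if __name__ == "__main__":
    print(f"Q20 bases per month: {q20_bases_per_month():,.0f}")
    print(f"Months for a 6x draft: {months_for_draft():.1f}")
    print(f"Dollars per 1000 Q20 bases: {dollars_per_1000_q20_bases():.2f}")
```

Under this assumption the monthly output is a little under one genome equivalent, a sixfold draft takes about six months, and the implied cost is below one dollar per 1000 Q20 bases, in line with the figures given in the text.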

Sanger sequencing: Recent advances

Microfluidic separation platforms


Technology development remains active for the fluorescence-based Sanger approach with emphasis on producing faster and cheaper sequencing reads. One key area of research is the application of microfluidic separation devices to DNA sequencing. These microfluidic devices can be fabricated using a variety of substrate materials, with several molecular biology processes integrated onto a single device (e.g., lab-on-a-chip). A number of reviews have been devoted to microfluidic devices (Becker and Gartner 2000; Carrilho 2000; McDonald et al. 2000; Quake and Scherer 2000; Boone et al. 2002; Paegel et al. 2003; Kan et al. 2004), recent advances of which I will highlight as they relate to DNA sequencing. These miniature devices have several advantages over CAE, including improved sample injection and faster separation times.

The separation principles of microfabricated devices are similar to those of conventional CAE; however, their injection methods are very different. With CAE, the sample is introduced by electrokinetic injection into the capillary. The injection time, which defines the length of the sample plug, is typically short and allows only a minute fraction of the sample to be analyzed. A further drawback is that data quality is compromised with increasing impurities in the sample and an intrinsic bias in favor of shorter DNA fragments over longer ones. Microfluidic devices, on the other hand, are less susceptible to these injection problems because the sample is introduced via a channel network by a variety of process strategies (Zhang and Manz 2001). Although early microfabricated chips employed a “T”-injector design (Harrison et al. 1992), the cross-T design (Harrison et al. 1993) is widely used today because of its superior sample control (Fig. 1A). The narrow width of the injector affords greater control in selection of sample plug size, which contributes to higher resolution separations with shorter separation lengths compared with CAE.

Figure 1. Microfabricated technologies. (A) Examples of a T-injector and cross-T injector layout. (B) Expanded view of the sample injector and pinched turn. (C) Schematic of the 96 channels in a radial chip design. (B,C) Reprinted with permission from National Academy of Sciences, U.S.A. © 2002, Paegel et al. 2002.

Most microfabricated devices use borofloat glass or fused silica substrates, which have the advantages of (1) high-quality optical properties, (2) good thermal conductivity, (3) well-documented surface chemistry, and (4) effective translation of capillary innovations. Woolley and Mathies (1995) demonstrated the first application of DNA sequencing using a microfabricated glass device in 1995, reporting single-base resolution using their four-color scanner technology. Data quality and read-lengths have improved significantly since then, because of an increase in the effective separation lengths with run times of 30 minutes or less (Table 2) (Woolley and Mathies 1995; Liu et al. 1999; Schmalzing et al. 1999; Backhouse et al. 2000; Koutny et al. 2000; Liu et al. 2000; Salas-Solano et al. 2000; Simpson et al. 2000; Boone et al. 2002; Paegel et al. 2002; Shi and Anderson 2003). For example, Liu et al. (1999) reported 99.4% accuracy over 500 bases in 20 minutes, with an increase in separation length from 3.5 cm to 6.5 cm. More recent developments by Boone et al. (2002) and Shi and Anderson (2003) have shown the first DNA sequencing applications on plastic chips (Table 2). These chips can be fabricated with high geometric aspect ratios (i.e., deep and narrow channels) at significantly lower cost. Deep and narrow channel structures have the advantages of improved electrophoretic resolution (i.e., longer read-length) and better detection sensitivity.

Table 2. Summary of microfabricated devices for DNA sequencing applications

Research group              Number of  Template        Separation   Accuracy  Read-length  Read-out
                            channels   source          length (cm)  (%)       (bp)         time (min)

Single channel
Woolley & Mathies (1995)    1          M13mp18         3.5          97        147          9
Liu et al. (1999)           1          M13mp18         6.5          99.4      500          20
Schmalzing et al. (1999)    1          M13 clones (a)  11.5         99        505          27
Salas-Solano et al. (2000)  1          M13mp18         11.5         98.5      640          30
Boone et al. (2002)         1          M13mp18         18.0         98        640          30
Shi & Anderson (2003)       1          Unknown         4.5          99.1      320          13

Multiple channel arrays
Liu et al. (2000)           16         M13mp18         7.5          99        457          16
Simpson et al. (2000)       48         M13mp18         10.0         97        400          25–45
Backhouse et al. (2000)     48         BigDye Std      46.5         98        640          150
Koutny et al. (2000)        32         M13 clones (a)  40.0         98        800          78
Paegel et al. (2002)        96         M13mp18         15.9         99        430          24

(a) Mixture of M13mp18 vector or twelve M13 clones from human chromosome 17 project.

While single-channel devices are useful for demonstrating feasibility, the construction of multiple channel arrays is essential for high-throughput DNA sequencing. A summary of DNA sequence metrics from several microfabricated multiple channel array devices is presented in Table 2. While Backhouse et al. (2000) and Koutny et al. (2000) reported improved read-lengths by increasing the effective separation lengths to 46.5 cm and 40 cm, respectively, these microfabricated channels were constructed on glass plates ≥50 cm in length, which is out of line with current efforts to miniaturize devices. One approach to circumvent this dilemma has been the introduction of turns along
the length of the separation channel. Early studies, however, reported lower separation efficiency in channel turns due to band broadening (Jacobson et al. 1994) and differential field strength effects (Culbertson et al. 1998). Paegel et al. (2000) introduced a “pinched-turn” design (Fig. 1B) with an effective separation length of 15.9 cm on a 15-cm-diameter silica disc, which has been multiplexed into a 96-channel radial device (Fig. 1C) showing tremendous potential for increasing throughput in DNA sequencing applications (Paegel et al. 2002). Most of the data shown in Table 2, however, were derived using the standard M13mp18 vector as the sequencing template, and similar performance is not typically observed under the same conditions with “real-world” samples such as those from genome center production lines.

Fluorescence detection

The most widely used detection method for four-color DNA sequencing was initially described almost 20 years ago (Smith et al. 1986; Prober et al. 1987). This method is based on resolution of the emission signal from a dye-labeled nucleotide into color, with subsequent assignment in the DNA sequence. While successful for the sequencing of numerous higher and lower eukaryotic and prokaryotic genomes, these four-color systems have several disadvantages, including inefficient excitation of the fluorescent dyes, significant spectral overlap, and inefficient collection of the emission signals. The issue of inefficient excitation has been partially addressed by the use of fluorescence resonance energy-transfer (FRET) dyes (Ju et al. 1995; Metzker et al. 1996; Lee et al. 1997). At present, FRET dye-labeled ddNTP terminators are widely used throughout the sequencing community. The resulting improvements in acceptor dye signal intensities, however, are suboptimal compared with those of single dyes excited at their absorption maxima by the appropriate laser source.

To overcome these deficiencies, some investigators have proposed strategies using additional properties such as fluorescence life-time (Nunnally et al. 1997; Lieberwirth et al. 1998; Lassiter et al. 2000; Zhu et al. 2003, 2004) and radio frequency (RF) modulation (Alaverdian et al. 2002). For DNA sequencing applications, fluorescence life-time measurements have been described using pulsed lasers with high repetition rates (picosecond time-scale) with detection in the photon-counting mode. Soper and colleagues have recently demonstrated a combined approach of emission wavelength and fluorescence life-time measurements, with the potential to increase the number of fluorescent components in DNA sequencing assays (Zhu et al. 2003, 2004). Alaverdian et al. (2002) proposed using four continuous wave (CW) mode lasers, which are modulated at different RFs. To estimate the fluorescence signal for each dye, however, the resulting emission intensity pattern must be demodulated, which introduces a significant computational load for each capillary signal channel. Coupled with repetition rates on the order of ≥100 Hz, the RF method does not appear to be compatible with conventional CCD technology, limiting its scalability for detection of high-density capillary arrays.

Recently, Lewis et al. (2005) described a simple but effective method for multifluorescence discrimination called pulsed multiline excitation (PME). The underlying principle of this four-laser system is the correlation of sequential laser pulses with detector response (Fig. 2A). Advantages of PME are such that (1) absorption maxima for the four fluorescent dyes are matched to the excitation sources yielding maximum signal intensities, (2) temporal separation of the laser pulses and expansion of the dye set across the visible spectrum eliminate cross-talk between the dyes, and (3) collection of emission signals is improved by eliminating the requirement for dispersing elements (prisms or gratings) in color separation. In other words, PME measures multicomponent fluorescence assays in a color-blind manner. To demonstrate these advantages, Lewis et al. (2005) applied the PME technology to capillary electrophoresis for DNA sequencing. Figure 2B shows the unprocessed signals from the four PME laser waveforms for a portion of the PCR amplicon for the TCF1 (formerly known as HNF1A) exon 10. Transformation of the data into unambiguous sequence data (Fig. 2C) is accomplished by applying only dye mobility correction software, eliminating the need for cross-talk and signal normalization software transformation. The PME technology holds promise for real-time field applications for DNA sequencing.

Figure 2. (A) Illustration of the PME technology. Here, each laser operates in a CW mode with mechanical shutters pulsing the different excitation beams in sequential order. The single coaxial PME beam interrogates the fluorescently labeled DNA fragments, which are separated by capillary gel electrophoresis. Scattered laser light is rejected via specific long-pass or wavelength notch filters, with pulsed emission signals from the dye-labeled DNA fragments being detected by the photomultiplier tube (PMT) without use of any dispersing elements. (B) Unprocessed fluorescence data obtained during the electrophoretic run for the TCF1 exon 10 gene region using PME dye-primers. Blue, green, black, and red traces are AF-405, BODIPY-FL, 6-ROX, and Cy5.5 dye-primers terminated with ddCTP, ddATP, ddGTP, and ddTTP respectively. (C) Transformation of the raw trace data derived from the experiment described in B into readable, DNA sequence data using mobility software correction. Reprinted with permission from National Academy of Sciences, U.S.A. © 2005, Lewis et al. 2005.

SNA methodologies

Pyrosequencing

Arguably the most successful non-Sanger method developed to date is pyrosequencing, first described in the literature by Hyman (1988). Pyrosequencing is a nonfluorescence technique that measures the release of inorganic pyrophosphate, which is proportionally converted into visible light by a series of enzymatic reactions (Ronaghi et al. 1996, 1998). Unlike other sequencing approaches that use 3′-modified dNTPs to terminate DNA synthesis, the pyrosequencing assay manipulates DNA polymerase by single addition of dNTPs in limiting amounts. Upon addition of the complementary dNTP, DNA polymerase extends the primer and pauses when it encounters a noncomplementary base. DNA synthesis is reinitiated following the addition of the next complementary dNTP in the dispensing cycle. The light generated by the enzymatic cascade is recorded as a series of peaks called a pyrogram, which corresponds to the order of complementary dNTPs incorporated and reveals the underlying DNA sequence. Applications for pyrosequencing have been reviewed by Ronaghi (2001) and Langaee and Ronaghi (2005).

Although elegant in design, the pyrosequencing approach has several limitations. For example, sequence reads are typically fewer than 100 bases in length, which has application in sequence tag identification such as serial analysis of gene expression (SAGE) (Velculescu et al. 1995), mini-sequencing for known SNPs, and mapping related genomes to a reference sequence, but limited application for whole-genome sequencing. Recent reports describe the use of single-stranded binding protein (Ronaghi 2000) and the isomeric Sp form of the dATPαS nucleotide (Gharizadeh et al. 2002), which may improve read-lengths up to 100 bases in routine settings. Secondly, homopolymer repeats greater than five nucleotides cannot be quantitatively measured. This is attributed to incomplete extension by DNA polymerase, which results from limiting the dNTP concentration to minimize nucleotide misincorporation effects. It has been suggested that re-addition of the same dNTP may be performed to ensure complete polymerization (Ronaghi 2001), although its practicality for high-throughput sequencing is unclear. Finally, the dispensing order of dNTPs determines the pyrogram profile, which must be carefully designed to avoid asynchronistic extensions of heterozygous sequences.

For a given dispensing order, approximately one half of all heterozygous sequences will result in asynchronistic extensions past the variable site. A survey of heterozygous variants detected by direct DNA sequencing of the TCF1 gene revealed that 16 of 37 SNPs would result in nonsynchronistic extension after the heterozygous base (data not shown). If one allele extends past the heterozygous base position before the other and advances to the next nucleotide cycle, the nonsynchronicity becomes permanent. An illustration of the effect of dispensing order on asynchronistic extension is shown in Figure 3A. This observation is further highlighted by Entz et al. (2005) with the identification of more than 40 unique dispensing orders for the accurate typing of HLA-DQB1 and HLA-DRB1 alleles. Pyrosequencing may, therefore, be suited for pattern matching of known SNP profiles, while its application for de novo SNP discovery is less certain. Not surprisingly, base-calling for de novo SNPs is problematic and still performed manually (Langaee and Ronaghi 2005).

The 454 Corporation has recently introduced a whole-genome sequencing strategy by integrating pyrosequencing with their PicoTiterPlate (PTP) platform, which has been shown to amplify and image approximately 300,000 PCR templates captured on Sepharose beads (Leamon et al. 2003). The PTP is manufactured by anisotropic etching of a fiber optic faceplate with a well diameter of approximately 40 µm. The 454 group has developed a solution-based emulsion strategy to create microreactors for clonal amplification of single DNA molecules and attachment to these beads. One advantage of the clonal amplification strategy is that it addresses the dependence issue of dispensing order for sequencing of heterozygous bases discussed above. Following an enrichment step, DNA positive beads are loaded into individual PTP wells, which contain additional beads coupled with the necessary enzymes to perform the pyrosequencing chemistry (Margulies et al. 2005). Recently, the company announced its first complete genome sequencing of a recombinant adenoviral construct and the shotgun sequencing of the Mycoplasma genitalium genome.

The assembly of non-Sanger sequencing data will represent new challenges because the input read will differ in length, quantity, and quality. The complexity of the genome under analysis may also prove more difficult for assemblies compared with Sanger data, even when the offset is higher coverage of shorter reads. Chaisson et al. (2004) recently performed a simulated assembly study (short, error-free reads sampled at 30× coverage) using genome sequences from adenovirus, two mouse BACs, and two bacteria: Campylobacter jejuni, which contains very few repeat sequences (Parkhill et al. 2000b), and Neisseria meningitidis, which contains several hundred repetitive elements (Parkhill et al. 2000a). Compared with Sanger data, Chaisson et al. (2004) found that the read-length was inversely proportional to the number of contigs in the assembly (i.e., longer reads gave fewer contigs). Increasing genome complexity, on the other hand, directly increases the number of contigs. Here, they found that 95% of the genome was contained within 9–10 contigs for the


BAC clones, and the number of contigs increased from 21 to 344 for C. jejuni and N. meningitidis genome sequences, respectively. Observed errors for real sequence data will undoubtedly decrease assembly performance for short reads. Thus, the success of the non-Sanger strategies for whole-genome sequencing applications will be highly dependent on the degree of genome complexity, which appears to traverse all three phylogenetic domains.

Other single addition dNTP strategies

Methods other than pyrophosphate detection can be used to monitor single dNTP additions. For example, Braslavsky et al. (2003) used the technique of single-pair FRET (spFRET) to determine the order of nonconsecutive nucleotide additions. With this single molecule approach, Cy3-labeled-UTP was initially incorporated into the primer strand, serving as the donor dye. Subsequent incorporation of a complementary Cy5-labeled-UTP or Cy5-labeled-dCTP substrate resulted in the spFRET signal. Following photobleaching of the Cy5 dye, the natural nucleotides dATP and dGTP were added to increase the nucleotide distance between subsequent Cy5-labeled dNTP additions, which would otherwise have resulted in a significant reduction in incorporation efficiencies due to steric hindrance effects. For the DNA template sequence, written 3′-ATCGTCATCG-5′ for convenience, the read-out would be the fingerprint sequence of 5′-UCUC. Levene et al. (2003) have recently described a zero-mode waveguide approach to single-molecule detection of R110-labeled-dCTP and coumarin-labeled-dCTP incorporation events by DNA polymerase.

Taking advantage of the steric effects observed in consecutively incorporated dye-labeled dNTPs, Mitra et al. (2003) introduced fluorescently labeled dNTPs, which contained cleavable linkers, to remove the bulky fluorescent group following incorporation by DNA polymerase. This method, called fluorescent in situ sequencing (FISSEQ), used linkers containing either a disulfide bridge, which is efficiently cleaved with a reducing agent, or a photocleavable group (Fig. 3B). Using the polony technology (Mitra and Church 1999), Church and colleagues elegantly demonstrated the addition of single Cy5-SS-dNTPs followed by dye cleavage for accurate DNA sequencing of several templates. The presence of a fluorescence signal corresponding to the dispensing order of the Cy5-SS-dNTPs revealed the DNA sequence. Although read-lengths up to eight bases were demonstrated, several miscalls were reported. One such miscall resulted from nucleotide read-through. That is, consecutive incorporations of dye-labeled dNTPs can occur (e.g., the sequence 5′-CAGCC was read as 5′-CAGC), presumably with different efficiencies that are dependent on the local DNA sequence context. A second error occurred as a result of a single nucleotide insertion (e.g., the sequence 5′-ATGT was read as 5′-AGTGT). Although more difficult to interpret, it is possible that the residual linker structure, remaining on the nucleobases following dye cleavage, could alter nucleotide specificity and incorporation efficiency of subsequent incoming dNTPs in a sequence-dependent manner. More recently, Seo et al. (2004, 2005) described a similar strategy using four different dye-labeled dNTPs with photocleavable linkers (Fig. 3B) and reported read-lengths of 12 bases. A key advantage of the four-color approach is that all four dNTPs can be assayed simultaneously, although both reports demonstrated use of the single dNTP addition method.

Kartalov and Quake (2004) proposed a different approach to overcome the steric effects of consecutive dye-labeled bases by use of single-addition, same-nucleobase mixtures (e.g., dCTP/TAMRA-labeled ddCTP) as a method for DNA sequencing. The nucleobase mixture strategy serves the dual purpose of dye-labeling for fluorescence detection (reporter phase) and ongoing DNA synthesis of the complementary nucleotide (extension phase). The dNTP and dye-labeled ddNTP concentrations are balanced appropriately so that only a fraction of the primer strands incorporate the dye-labeled ddNTP. The presence of a fluorescence signal reveals the complementary nucleotide in the DNA sequence, but reporters are eliminated from subsequent dNTP additions. With each nucleotide addition, signal loss is inversely proportional to the increased accumulation of termination products. The fluorescence is then quenched by photobleaching before the next nucleobase mixture is dispensed to repeat the process. Configured in a microfluidic device, the average read-length for the mixed nucleobase addition scheme was three bases, which can be partially attributed to signal loss with subsequent base additions. The accuracy of the method is highly dependent on the reporter phase mimicking the extension phase. For example, a simple homopolymer repeat of two bases will be undercalled in the DNA sequence, as the reporter phase will reflect a single base addition while the extension phase will incorporate two bases.

Figure 3. SNA technologies. (A) Simulated effects of two different dNTP dispensing orders on the outcome of the pyrogram profile. (B) The photocleavage reaction of a fluorescently labeled dNTP coupled with a photocleavable linker.

CRT

While CRT technology represents tremendous potential for whole-genome sequencing, this strategy still faces significant challenges in its implementation. The CRT cycle is comprised of


three steps: incorporation, imaging, and deprotection, as illustrated in Figure 4A. The advantages of CRT over Sanger are (1) elimination of gel electrophoresis and (2) formatting of the CRT assay in a highly parallel fashion. Its advantages over pyrosequencing are that (1) all four bases are present during the incorporation phase, (2) step-wise control allows for single-base additions through homopolymer repeats, and (3) synchronistic extensions are maintained past heterozygous bases. An additional advantage is that unlike the pyrosequencing assay, which must be contained within a defined reaction well, the CRT assay can be performed on a number of highly parallel platforms, such as high-density oligonucleotide arrays (Pease et al. 1994; Albert et al. 2003), PTP arrays (Leamon et al. 2003), polony arrays (Mitra and Church 1999), or random dispersion of single molecules. Albert et al. (2003) have demonstrated the 5′→3′ synthesis of oligonucleotides on a high-density array and the application of incorporation of dye-labeled ddNTPs by DNA polymerase. These advantages of the CRT technology could represent significant improvements in speed, throughput, and accuracy over Sanger and SNA approaches.

At the center of the CRT chemistry is the reversible terminator. Ideally, these terminators should exhibit fast and efficient deprotection kinetics, efficient incorporation kinetics by DNA polymerase, and labels with desired characteristics, such as fluorophores with good fluorescence properties. Of the challenges associated with CRT for high-throughput genome sequencing, creating these reversible terminators with the desired properties and identifying DNA polymerases that recognize these substrates with high affinities are the most demanding aspects of the technology. The latter point is exemplified by the presence of competing natural nucleotides, which can readily cause asynchronistic base extensions (Metzker et al. 1998). The first examples of reversible terminators using commercially available DNA polymerases were reported by Canard and Sarfati (1994) and Metzker et al. (1994).

For CRT terminators to function properly, the protecting group must be efficiently cleaved under mild conditions while coupled to the primer. Removal of the protecting group generally involves either treatment with strong acid or base, catalytic or chemical reduction, or a combination of these methods. Unfortunately, these conditions may chemically perturb the DNA polymerase, nucleotides, oligonucleotide-primed template, or the solid support. Use of photocleavable protecting groups is an attractive alternative to rigorous chemical treatment and can be employed in a noninvasive manner. Of the various photocleavable protecting groups (Pillai 1980), the light-sensitive 2-nitrobenzyl group has been widely used. For example, it has been applied to natural nucleotides (Metzker et al. 1994, 1998), to the linker structure coupling a fluorescent dye to nucleobases (Li et al. 2003; Mitra et al. 2003), and to other nucleic acid structures as well (Ohtsuka et al. 1974; Pease et al. 1994; Chaulk and MacMillan 1998; Singh-Gasson et al. 1999). Under appropriate deprotection conditions (e.g., ultraviolet light >300 nm), the 2-nitrobenzyl group can be efficiently cleaved (Fig. 4B) without affecting either the pyrimidine or purine bases (Bartholomew and Broom 1975; Pease et al. 1994).

Other protecting groups have been described for reversible terminators as well. For example, Metzker et al. (1994) first described the synthesis and incorporation of a 3′-O-allyl-dATP by DNA polymerase, with the O-allyl group being removed using the well-known palladium (Pd) catalyst chemistry (Hayakawa et al. 1986, 1993; Honda et al. 1997). Recently, Ruparel et al. (2005) reported the synthesis of the first fluorescently labeled 3′-O-allyl-dNTPs. These unique reversible terminators require dual deprotection steps using UV light to cleave the fluorophore from the nucleotide (Fig. 3B), and the Pd catalyst reaction to restore the natural 3′-OH substrate. At this year’s Advances in Genome Biology and Technology/Automation in Mapping and Sequencing meeting, Solexa reported on a similar CRT chemistry with a sequence read-length of approximately 20 bases (http://www.agbt.org) and recently reported the complete sequencing of the φX174 genome (http://www.solexa.com).

Earlier concerns regarding short read-lengths and assemblies for SNA strategies will prove relevant to CRT as well. To overcome this issue, research efforts in CRT technology development will continue to focus on the cycle efficiency. The CRT read-length is governed by the overall cycle efficiency, which is highly dependent on the product of deprotection and incorporation efficiencies. For example, if one considers the conservative loss of 50% signal as the assay’s end-point, the read-length is a function of the cycle efficiency (Ceff) (Fig. 4C). Here, a read-length of only seven bases will be achieved with an overall cycle efficiency of 90% and can be increased beyond 100 bases in length by improving cycle efficiency to >99%. Figure 4D illustrates the effect that chemical modifications of the 2-nitrobenzyl ring system have on deprotection efficiency and thymidine production (V.A. Litosh, W. Wu, B. Stupi, and M. Metzker, unpubl.). Thus, recent improvements in chemical engineering of reversible terminators are important developments for CRT as an emerging technology for DNA sequencing applications.
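
The link between cycle efficiency and read-length quoted above can be reproduced with a short calculation. The sketch below is illustrative only and rests on an assumption the text does not state explicitly: that the in-phase signal remaining after n cycles is Ceff raised to the power n, so that the 50% end-point is reached at n = ln(0.5)/ln(Ceff). The function name `crt_read_length` and the `signal_floor` parameter are hypothetical names chosen for this example.

```python
import math


def crt_read_length(cycle_efficiency, signal_floor=0.5):
    """Estimate how many CRT cycles run before signal falls to signal_floor.

    Assumption (not taken from the source): the signal after n cycles is
    cycle_efficiency ** n, so the end-point solves
    cycle_efficiency ** n == signal_floor.
    """
    if not 0.0 < cycle_efficiency < 1.0:
        raise ValueError("cycle_efficiency must be strictly between 0 and 1")
    if not 0.0 < signal_floor < 1.0:
        raise ValueError("signal_floor must be strictly between 0 and 1")
    # Solve Ceff ** n = floor for n; the result is a continuous estimate.
    return math.log(signal_floor) / math.log(cycle_efficiency)


if __name__ == "__main__":
    # Compare a few cycle efficiencies at the 50% signal end-point.
    for ceff in (0.90, 0.99, 0.995):
        print(f"Ceff = {ceff:.3f}: about {crt_read_length(ceff):.1f} bases")
```

Under this assumed model, a 90% cycle efficiency gives roughly 6.6 cycles, in line with the seven bases quoted above, whereas 99% gives roughly 69. Passing 100 bases would require an efficiency of about 99.3% or better, which agrees with the ">99%" figure in the text.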
Figure 4. CRT technologies. (A) The CRT cycle. (B) The photocleavage reaction of a 3′-O-2-nitrobenzyl-nucleoside. (C) Effect of cycle efficiency on CRT read-length. (D) Kinetic study of photocleavage reaction for single substituted (2-SSNB) and double substituted (2-dsNB) 2-nitrobenzyl thymidine analogs. Percentage thymidine (%Thy) was calculated according to the equation: %Thy = A_Thy/(A_Thy + A_s2NB), where A_Thy and A_s2NB are the integrated peak areas from RP-HPLC analysis for thymidine and substituted 2-nitrobenzyl thymidine analogs, respectively.

Conclusions

Recent developments in DNA polymerase-dependent strategies highlight the central role these methods play in determination of the overall success of the sequencing assay. Although the standards for current Sanger technology have set the mark for emerging SNA and CRT technologies, these measures have evolved over several decades and from numerous research laboratories. The integration of additional technologies will be key for development of robust DNA sequencing platforms, including instrumentation, microfluidics, robotics, automation, software control, data acquisition, and informatics.

Beyond the integrated instrumentation built around the chemistry, the method by which genomes are sequenced will be important. Most strategies described in this review will employ the random approach of whole-genome shotgun sequencing and assembly (Weber and Myers 1997), including resequencing efforts for human sequence variation studies. While the random approach has the advantage of simplicity, it will require a tremendous number of sequence reads (i.e., a minimum of 900 million 100-base reads will be needed to achieve a 30× assembly for a mammalian-size genome of roughly 3 billion bases) to produce comprehensive sequence data for comparative studies between genomes. A directed approach, which targets specific regions across the genome, can effectively reduce genome size and complexity and, therefore, the number of sequencing reads needed to produce these comprehensive data sets. One example of a directed strategy for human resequencing could be the application of the CRT method to 5′→3′ synthesized high-density oligonucleotide arrays (Albert et al. 2003) by relying on the reference sequence as anchor points along the genome. The careful selection of unique and functional priming sites would represent an oligonucleotide tiling path across the genome. Priming CRT reactions from these anchor points and sequencing to adjacent priming sites would provide contiguous coverage of the targeted regions of interest. CRT reads could then be aligned to the known positions along the reference genome in a straightforward manner. This approach could also be used for mapping sequence reads to related genomes for comparative genomics studies. Alignment of random reads could be performed using conventional assembly algorithms, guided by the reference sequence, to produce contiguous DNA sequence information.

Although in its infancy, the potential for these emerging sequencing strategies to deliver next-generation technologies looks promising. Improvements in speed, efficiency, throughput, and sensitivity will all contribute to a reduction in cost over the next several years. The timing of these strategies coincides with an increasing demand for resequencing capacity, which will provide valuable insight into the role of specific sequence variation with common disease. Integration of multidisciplinary technologies will translate into practical and affordable sequencing devices capable of whole-genome analyses. Application of genome sequence information to health benefits could revolutionize disease prevention measures and early disease interventions, and make the possibility of personalized therapies routine.

Acknowledgments

I am extremely grateful to Richard A. Gibbs, Donna M. Muzny, and Sherry Metzker for critical review of the manuscript; Steven A. Soper for technical discussion; and NHGRI for their support from grants R01 HG003573, R41 HG003072, R41 HG003265, and R21 HG002443.

References

Alaverdian, L., Alaverdian, S., Bilenko, O., Bogdanov, I., Filippova, E., Gavrilov, D., Gorbovitski, B., Gouzman, M., Gudkov, G., Domratchev, S., et al. 2002. A family of novel DNA sequencing instruments based on single-photon detection. Electrophoresis 23: 2804–2817.
Albert, T.J., Norton, J., Ott, M., Richmond, T., Nuwaysir, K., Nuwaysir, E.F., Stengele, K.-P., and Green, R.D. 2003. Light-directed 5′→3′ synthesis of complex oligonucleotide microarrays. Nucleic Acids Res. 31: e35.
Backhouse, C., Caamano, M., Oaks, F., Nordman, E., Carrillo, A., Johnson, B., and Bay, S. 2000. DNA sequencing in a monolithic microchannel device. Electrophoresis 21: 150–156.
Bartholomew, D.G. and Broom, A.D. 1975. One-step chemical synthesis of ribonucleosides bearing a photolabile ether protecting group. J. Chem. Soc. Chem. Commun. Issue 2: 38.
Becker, H. and Gartner, C. 2000. Polymer microfabrication methods for microfluidic analytical applications. Electrophoresis 21: 12–26.
Bonham, V.L., Warshauer-Baker, E., and Collins, F.S. 2005. Race and ethnicity in the genome era: The complexity of the constructs. Am. Psychol. 60: 9–15.
Boone, T., Fan, Z., Hooper, H., Ricco, A., Tan, H., and Williams, S. 2002. Plastic advances microfluidic devices. Anal. Chem. 74: 78A–86A.
Braslavsky, I., Hebert, B., Kartalov, E., and Quake, S.R. 2003. Sequence information can be obtained from single DNA molecules. Proc. Natl. Acad. Sci. 100: 3960–3964.
Canard, B. and Sarfati, R. 1994. DNA polymerase fluorescent substrates with reversible 3′-tags. Gene 148: 1–6.
Carrilho, E. 2000. DNA sequencing by capillary array electrophoresis and microfabricated array systems. Electrophoresis 21: 55–65.
Carrilho, E., Ruiz-Martinez, M.C., Berka, J., Smirnov, I., Goetzinger, W., Miller, A.W., Brady, D., and Karger, B.L. 1996. Rapid DNA sequencing of more than 1000 bases per run by capillary electrophoresis using replaceable linear polyacrylamide solutions. Anal. Chem. 68: 3305–3313.
Chaisson, M., Pevzner, P., and Tang, H. 2004. Fragment assembly with short reads. Bioinformatics 20: 2067–2074.
Chaulk, S. and MacMillan, A. 1998. Caged RNA: Photo-control of a ribozyme reaction. Nucleic Acids Res. 26: 3173–3178.
Collins, F. and Galas, D. 1993. A new five-year plan for the U.S. Human Genome Project. Science 262: 43–46.
Collins, F.S., Guyer, M.S., and Chakravarti, A. 1997. Variations on a theme: Cataloging human DNA sequence variation. Science 278: 1580–1581.
Collins, F.S., Patrinos, A., Jordon, E., Chakravarti, A., Gesteland, R., Walters, L., and Members of the DOE and NIH Planning Groups. 1998. New goals for the U.S. Human Genome Project: 1998–2003. Science 282: 682–689.
Collins, F.S., Green, E.D., Guttmacher, A.E., and Guyer, M.S. 2003. A vision for the future of genomics research. Nature 422: 835–847.
Crawford, D.C. and Nickerson, D.A. 2005. Definition and clinical importance of haplotypes. Annu. Rev. Med. 56: 303–320.
Culbertson, C.T., Jacobson, S.C., and Ramsey, J.M. 1998. Dispersion sources for compact geometries on microchips. Anal. Chem. 70: 3781–3789.
Daly, M.J., Rioux, J.D., Schaffner, S.F., Hudson, T.J., and Lander, E.S. 2001. High-resolution haplotype structure in the human genome. Nat. Genet. 29: 229–237.
The ENCODE Project Consortium. 2004. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306: 636–640.
Entz, P., Toliat, M.R., Hampe, J., Valentonyte, R., Jenisch, S., Nürnberg, P., and Nagy, M. 2005. New strategies for efficient typing of HLA class-II loci DQB1 and DRB1 by using pyrosequencing. Tissue Antigens 65: 67–80.
Ewing, B. and Green, P. 1998. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8: 186–194.
Ewing, B., Hillier, L., Wendl, M.C., and Green, P. 1998. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8: 175–185.
Foster, M.W. and Sharp, R.R. 2002. Race, ethnicity, and genomics: Social classifications as proxies of biological heterogeneity. Genome Res. 12: 844–850.
Fox, M.S., Magasanik, B., Signer, E.R., Solomon, F., Gellert, M.F., Haber, J.E., Daniel, J., Koshland, E., and Muschel, L.H. 1990. The Genome Project: Pro and con. Science 247: 270.
Gabriel, S.B., Schaffner, S.F., Nguyen, H., Moore, J.M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., et al. 2002. The structure of haplotype blocks in the human genome. Science 296: 2225–2229.
Gharizadeh, B., Nordström, T., Ahmadian, A., Ronaghi, M., and Nyrén, P. 2002. Long-read pyrosequencing using pure 2′-deoxyadenosine-5′-O′-(1-thiotriphosphate) Sp-isomer. Anal. Biochem. 301: 82–90.
Guttman, A. 2002a. Capillary electrophoresis using replaceable gels. U.S.


patent no. RE37,606.
Guttman, A. 2002b. Capillary electrophoresis using replaceable gels. U.S. patent no. RE37,941.
Harrison, D.J., Manz, A., Fan, Z., Luedi, H., and Widmer, H.M. 1992. Capillary electrophoresis and sample injection systems integrated on a planar glass chip. Anal. Chem. 64: 1926–1932.
Harrison, D.J., Fluri, K., Seiler, K., Fan, Z., Effenhauser, C.S., and Manz, A. 1993. Micromachining a miniaturized capillary electrophoresis-based chemical analysis system on a chip. Science 261: 895–897.
Hayakawa, Y., Kato, H., Uchiyama, M., Kajino, H., and Noyori, R. 1986. Allyloxycarbonyl group: A versatile blocking group for nucleotide synthesis. J. Org. Chem. 51: 2400–2402.
Hayakawa, Y., Hirose, M., and Noyori, R. 1993. O-Allyl protection of guanine and thymine residues in oligodeoxyribonucleotides. J. Org. Chem. 58: 5551–5555.
Honda, M., Morita, H., and Nagakura, I. 1997. Deprotection of allyl groups with sulfinic acids and palladium catalyst. J. Org. Chem. 62: 8932–8936.
Hyman, E.D. 1988. A new method of sequencing DNA. Anal. Biochem. 174: 423–436.
The International HapMap Consortium. 2003. The International HapMap Project. Nature 426: 789–796.
Jacobson, S.C., Hergenroder, R., Koutny, L.B., Warmack, R.J., and Ramsey, J.M. 1994. Effects of injection schemes and column geometry on the performance of microchip electrophoresis devices. Anal. Chem. 66: 1107–1113.
Ju, J., Ruan, C., Fuller, C., Glazer, A., and Mathies, R. 1995. Fluorescence energy transfer dye-labeled primers for DNA sequencing and analysis. Proc. Natl. Acad. Sci. 92: 4347–4351.
Kan, C.-W., Fredlake, C.P., Doherty, E.A.S., and Barron, A.E. 2004. DNA sequencing and genotyping in miniaturized electrophoresis systems. Electrophoresis 25: 3564–3588.
Kartalov, E.P. and Quake, S.R. 2004. Microfluidic device reads up to four consecutive base pairs in DNA sequencing-by-synthesis. Nucleic Acids Res. 32: 2873–2879.
Kheterpal, I., Scherer, J., Clark, S., Radhakrishnan, A., Ju, J., Ginther, C., Sensabaugh, G.F., and Mathies, R.A. 1996. DNA sequencing using a four-color confocal fluorescence capillary array scanner. Electrophoresis 17: 1852–1859.
Koshland, D.E. 1989. Sequences and consequences of the human genome. Science 246: 189.
Koutny, L., Schmalzing, D., Salas-Solano, O., El-Difrawy, S., Adourian, A., Buonocore, S., Abbey, K., McEwan, P., Matsudaira, P., and Ehrlich, D. 2000. Eight hundred-base sequencing in a microfabricated electrophoretic device. Anal. Chem. 72: 3388–3391.
Lander, E.S. 1996. The new genomics: Global views of biology. Science 274: 536–539.
Langaee, T. and Ronaghi, M. 2005. Genetic variation analyses by pyrosequencing. Mutat. Res. 573: 96–102.
Lassiter, S.J., Stryjewski, W., Benjamin, J., Legendre, L., Erdmann, R., Wahl, M., Wurm, J., Peterson, R., Middendorf, L., and Soper, S.A. 2000. Time-resolved fluorescence imaging of slab gels for lifetime base-calling in DNA sequencing applications. Anal. Chem. 72: 5373–5382.
Leamon, J.H., Lee, W.L., Tartaro, K.R., Lanza, J.R., Sarkis, G.J., deWinter, A.D., Berka, J., Weiner, M., Rothberg, J.M., and Lohman, K.L. 2003. A massively parallel PicoTiterPlate™ based platform for discrete picoliter-scale polymerase chain reactions. Electrophoresis 24: 3769–3777.
Lee, L., Spurgeon, S., Heiner, C., Benson, S., Rosenblum, B., Menchen, S., Graham, R., Constantinescu, A., Upadhya, K., and Cassel, J. 1997. New energy transfer dyes for DNA sequencing. Nucleic Acids Res. 25: 2816–2822.
Levene, M.J., Korlach, J., Turner, S.W., Foquet, M., Craighead, H.G., and Webb, W.W. 2003. Zero-mode waveguides for single-molecule analysis at high concentrations. Science 299: 682–686.
Lewis, E.K., Haaland, W.C., Nguyen, F., Heller, D.A., Allen, M.J., MacGregor, R.R., Berger, C.S., Willingham, B., Burns, L.A., Scott, G.B.I., et al. 2005. Color-blind fluorescence detection for four-color DNA sequencing. Proc. Natl. Acad. Sci. 102: 5346–5351.
Li, Z., Bai, X., Ruparel, H., Kim, S., Turro, N.J., and Ju, J. 2003. A photocleavable fluorescent nucleotide for DNA sequencing and analysis. Proc. Natl. Acad. Sci. 100: 414–419.
Lieberwirth, U., Arden-Jacob, J., Drexhage, K.H., Herten, D.P., Muller, R., Neumann, M., Schulz, A., Siebert, S., Sagner, G., Klingel, S., et al. 1998. Multiplex dye DNA sequencing in capillary gel electrophoresis by diode laser-based time-resolved fluorescence detection. Anal. Chem. 70: 4771–4779.
Liu, S., Shi, Y., Ja, W., and Mathies, R.A. 1999. Optimization of high-speed DNA sequencing on microfabricated capillary electrophoresis channels. Anal. Chem. 71: 566–573.
Liu, S., Ren, H., Gao, Q., Roach, D.J., Loder Jr., R.T., Armstrong, T.M., Mao, Q., Blaga, I., Barker, D.L., and Jovanovich, S.B. 2000. Automated parallel DNA sequencing on multiple channel microchips. Proc. Natl. Acad. Sci. 97: 5369–5374.
Liu, P.-Y., Zhang, Y.-Y., Lu, Y., Long, J.-R., Shen, H., Zhao, L.-J., Xu, F.-H., Xiao, P., Xiong, D.-H., Liu, Y.-J., et al. 2005. A survey of haplotype variants at several disease candidate genes: The importance of rare variants for complex diseases. J. Med. Genet. 42: 221–227.
Luria, S.E., Cooper, D.M., and Berkowitz, A. 1989. Human Genome Project. Science 246: 873–874.
Madabhushi, R.S. 1998. Separation of 4-color DNA sequencing extension products in noncovalently coated capillaries using low viscosity polymer solutions. Electrophoresis 19: 224–230.
Madabhushi, R.S., Menchen, S.M., Efcavitch, J.W., and Grossman, P.D. 1996. Polymers for separation of biomolecules by capillary electrophoresis. U.S. patent no. 5,567,292.
Madabhushi, R.S., Menchen, S.M., Efcavitch, J.W., and Grossman, P.D. 1999. Polymers for separation of biomolecules by capillary electrophoresis. U.S. patent no. 5,916,426.
Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y.-J., Chen, Z., et al. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376–380.
McDonald, J.C., Duffy, D.C., Anderson, J.R., Chiu, D.T., Wu, H., Schueller, O.J.A., and Whitesides, G.M. 2000. Fabrication of microfluidic systems in poly(dimethylsiloxane). Electrophoresis 21: 27–40.
Metzker, M.L., Raghavachari, R., Richards, S., Jacutin, S.E., Civitello, A., Burgess, K., and Gibbs, R.A. 1994. Termination of DNA synthesis by novel 3′-modified deoxyribonucleoside triphosphates. Nucleic Acids Res. 22: 4259–4267.
Metzker, M.L., Lu, J., and Gibbs, R.A. 1996. Electrophoretically uniform fluorescent dyes for automated DNA sequencing. Science 271: 1420–1422.
Metzker, M.L., Raghavachari, R., Burgess, K., and Gibbs, R.A. 1998. Elimination of residual natural nucleotides from 3′-O-modified-dNTP syntheses by enzymatic Mop-Up. BioTechniques 25: 814–817.
Mitra, R. and Church, G. 1999. In situ localized amplification and contact replication of many individual DNA molecules. Nucleic Acids Res. 27: e34.
Mitra, R.D., Shendure, J., Olejnik, J., Edyta-Krzymanska-Olejnik, and Church, G.M. 2003. Fluorescent in situ sequencing on polymerase colonies. Anal. Biochem. 320: 55–65.
Nunnally, B.K., He, H., Li, L.-C., Tucker, S.A., and McGown, L.B. 1997. Characterization of visible dyes for four-decay fluorescence detection in DNA sequencing. Anal. Chem. 69: 2392–2397.
Ohtsuka, E., Tanaka, S., and Ikehara, M. 1974. Studies on transfer ribonucleic acids and related compounds. IX(1) Ribooligonucleotide synthesis using a photosensitive o-nitrobenzyl protection at the 2′-hydroxyl group. Nucleic Acids Res. 1: 1351–1357.
Paegel, B.M., Hutt, L.D., Simpson, P.C., and Mathies, R.A. 2000. Turn geometry for minimizing band broadening in microfabricated capillary electrophoresis channels. Anal. Chem. 70: 3030–3037.
Paegel, B.M., Emrich, C.A., Wedemayer, G.J., Scherer, J.R., and Mathies, R.A. 2002. High throughput DNA sequencing with a microfabricated 96-lane capillary array electrophoresis bioprocessor. Proc. Natl. Acad. Sci. 99: 574–579.
Paegel, B.M., Blazej, R.G., and Mathies, R.A. 2003. Microfluidic devices for DNA sequencing: Sample preparation and electrophoretic analysis. Curr. Opin. Biotechnol. 14: 42–50.
Parkhill, J., Achtman, M., James, K.D., Bentley, S.D., Churcher, C., Klee, S.R., Morelli, G., Basham, D., Brown, D., Chillingworth, T., et al. 2000a. Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature 404: 502–506.
Parkhill, J., Wren, B.W., Mungall, K., Ketley, J.M., Churcher, C., Basham, D., Chillingworth, T., Davies, R.M., Feltwell, T., Holroyd, S., et al. 2000b. The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences. Nature 403: 665–668.
Patil, N., Berno, A.J., Hinds, D.A., Barrett, W.A., Doshi, J.M., Hacker, C.R., Kautzer, C.R., Lee, D.H., Marjoribanks, C., McDonough, D.P., et al. 2001. Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science 294: 1719–1723.
Pease, A.C., Solas, D., Sullivan, E.J., Cronin, M.T., Holmes, C.P., and Fodor, S.P.A. 1994. Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc. Natl. Acad. Sci. 91: 5022–5026.
Pillai, V.N.R. 1980. Photoremovable protecting groups in organic synthesis. Synthesis Issue 2: 1–26.
Prober, J., Trainor, G., Dam, R., Hobbs, F., Robertson, C., Zagursky, R.,


Cocuzza, A., Jensen, M., and Baumeister, K. 1987. A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides. Science 238: 336–341.
Quake, S. and Scherer, A. 2000. From micro- to nanofabrication with soft materials. Science 290: 1536–1540.
Risch, N. and Merikangas, K. 1996. The future of genetic studies of complex human diseases. Science 273: 1516–1517.
Roberts, L. 1989a. New game plan for genome mapping. Science 245: 1438–1440.
Roberts, L. 1989b. Watson versus Japan. Science 246: 576–578.
Robertson, J.A. 2003. The $1000 genome: Ethical and legal issues in whole genome sequencing of individuals. Am. J. Bioeth. 3: W-IF1.
Ronaghi, M. 2000. Improved performance of pyrosequencing using single-stranded DNA-binding protein. Anal. Biochem. 286: 282–288.
Ronaghi, M. 2001. Pyrosequencing sheds light on DNA sequencing. Genome Res. 11: 3–11.
Ronaghi, M., Karamohamed, S., Pettersson, B., Uhlén, M., and Nyrén, P. 1996. Real-time DNA sequencing using detection of pyrophosphate release. Anal. Biochem. 242: 84–89.
Ronaghi, M., Uhlén, M., and Nyrén, P. 1998. A sequencing method based on real-time pyrophosphate. Science 281: 363, 365.
Ruiz-Martinez, M.C., Berka, J., Belenkii, A., Foret, F., Miller, A.W., and Karger, B.L. 1993. DNA sequencing by capillary electrophoresis with replaceable linear polyacrylamide and laser-induced fluorescence detection. Anal. Chem. 65: 2851–2858.
Ruparel, H., Bi, L., Li, Z., Bai, X., Kim, D.H., Turro, N.J., and Ju, J. 2005. Design and synthesis of a 3′-O-allyl photocleavable fluorescent nucleotide as a reversible terminator for DNA sequencing by synthesis. Proc. Natl. Acad. Sci. 102: 5932–5937.
Salas-Solano, O., Carrilho, E., Kotler, L., Miller, A.W., Goetzinger, W., Sosic, Z., and Karger, B.L. 1998. Routine DNA sequencing of 1000 bases in less than one hour by capillary electrophoresis with replaceable linear polyacrylamide solutions. Anal. Chem. 70: 3996–4003.
Salas-Solano, O., Schmalzing, D., Koutny, L., Buonocore, S., Adourian, A., Matsudaira, P., and Ehrlich, D. 2000. Optimization of high-performance DNA sequencing on short microfabricated electrophoretic devices. Anal. Chem. 72: 3129–3137.
Shi, Y. and Anderson, R.C. 2003. High-resolution single-stranded DNA analysis on 4.5 cm plastic electrophoretic microchannels. Electrophoresis 24: 3371–3377.
Simpson, J.W., Ruiz-Martinez, M.C., Mulhern, G.T., Berka, J., Latimer, D.R., Ball, J.A., Rothberg, J.M., and Went, G.T. 2000. Transmission imaging spectrograph and microfabricated channel system for DNA analysis. Electrophoresis 21: 135–149.
Singh-Gasson, S., Green, R.D., Yue, Y., Nelson, C., Blattner, F., Sussman, M.R., and Cerrina, F. 1999. Maskless fabrication of light-directed oligonucleotide microarrays using a digital micromirror array. Nat. Biotechnol. 17: 974–978.
Smith, L., Sanders, J., Kaiser, R., Hughes, P., Dodd, C., Connell, C., Heiner, C., Kent, S., and Hood, L. 1986. Fluorescence detection in automated DNA sequence analysis. Nature 321: 674–679.
Smith, L.M., Kaiser, R.J., Sanders, J.Z., and Hood, L.E. 1987. The synthesis and use of fluorescent oligonucleotides in DNA sequence analysis. Methods Enzymol. 155: 260–301.
Tabor, S. and Richardson, C.C. 1989. Effect of manganese ions on the incorporation of dideoxynucleotides by bacteriophage T7 DNA polymerase and Escherichia coli DNA polymerase I. Proc. Natl. Acad. Sci. 86: 4076–4080.
Tabor, S. and Richardson, C.C. 1995. A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides. Proc. Natl. Acad. Sci. 92: 6339–6343.
Takahashi, S., Murakami, K., Anazawa, T., and Kambara, H. 1994. Multiple sheath-flow gel capillary-array electrophoresis for multicolor fluorescent DNA detection. Anal. Chem. 66: 1021–1026.
Velculescu, V.E., Zhang, L., Vogelstein, B., and Kinzler, K.W. 1995. Serial analysis of gene expression. Science 270: 484–487.
Weber, J.L. and Myers, E.W. 1997. Human whole-genome shotgun sequencing. Genome Res. 7: 401–409.
Woolley, A.T. and Mathies, R.A. 1995. Ultra-high-speed DNA sequencing using capillary electrophoresis chips. Anal. Chem. 67: 3676–3680.
Zhang, C.-X. and Manz, A. 2001. Narrow sample channel injectors for capillary electrophoresis on microchips. Anal. Chem. 73: 2656–2662.
Zhu, L., Stryjewski, W., Lassiter, S., and Soper, S.A. 2003. Fluorescence multiplexing with time-resolved and spectral discrimination using a near-IR detector. Anal. Chem. 75: 2280–2291.
Sanger, F., Nicklen, S., and Coulson, A.R. 1977. DNA sequencing with
chain-terminating inhibitors. Proc. Natl. Acad. Sci. 74: 5463–5467. Zhu, L., Stryjewski, W.J., and Soper, S.A. 2004. Multiplexed fluorescence
Schmalzing, D., Tsao, N., Koutny, L., Chisholm, D., Srivastava, A., detection in microfabricated devices with both time-resolved and
Adourian, A., Linton, L., McEwan, P., Matsudaira, P., and Ehrlich. D. spectral-discrimination capabilities using near-infrared fluorescence.
1999. Toward real-world sequencing by microdevice electrophoresis. Anal. Biochem. 330: 206–218.
Genome Res. 9: 853–858.
Seo, T.S., Bai, X., Ruparel, H., Li, Z., Turro, N.J., and Ju, J. 2004.
Photocleavable fluorescent nucleotides for DNA sequencing on a
chip constructed by site-specific coupling chemistry. Proc. Natl. Acad. Web site references
Sci. 101: 5488–5493.
Seo, T.S., Bai, X., Kim, D.H., Meng, Q., Shi, S., Ruparel, H., Li, Z., Turro, http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-002.html;
N.J., and Ju, J. 2005. Four-color DNA sequencing by synthesis on a RFA-HG-04-002. 2004. $100,000 genome RFA.
chip using photocleavable fluorescent nucleotides. Proc. Natl. Acad. http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-003.html;
Sci. 102: 5926–5931. RFA-HG-04-003. 2004. $1000 genome RFA.
Shendure, J., Mitra, R.D., Varma, C., and Church, G.M. 2004. Advanced http://www.agbt.org; Home page for the Advances in Genome Biology
sequencing technologies: Methods and goals. Nat. Rev. Genet. and Technology meeting.
5: 335–344. http://www.solexa.com; Home page for Solexa, Inc.

