
REVIEWS

APPLICATIONS OF NEXT-GENERATION SEQUENCING

Single-cell sequencing-based
technologies will revolutionize
whole-organism science
Ehud Shapiro1,2, Tamir Biezuner1,2 and Sten Linnarsson3
Abstract | The unabated progress in next-generation sequencing technologies is fostering a
wave of new genomics, epigenomics, transcriptomics and proteomics technologies. These
sequencing-based technologies are increasingly being targeted to individual cells, which will
allow many new and longstanding questions to be addressed. For example, single-cell
genomics will help to uncover cell lineage relationships; single-cell transcriptomics will
supplant the coarse notion of marker-based cell types; and single-cell epigenomics and
proteomics will allow the functional states of individual cells to be analysed. These
technologies will become integrated within a decade or so, enabling high-throughput,
multi-dimensional analyses of individual cells that will produce detailed knowledge of the
cell lineage trees of higher organisms, including humans. Such studies will have important
implications for both basic biological research and medicine.

DNA sequencing has undergone constant improvement since its inception in the 1970s. Today, next-generation sequencing (NGS) approaches are accelerating in speed and decreasing in cost more quickly than Moore’s law1. DNA sequencing technologies have improved in precision and throughput, and have enabled the sequencing of entire genomes of species2,3 and individuals4. An increasing number of questions can be addressed by DNA-sequencing-based technologies. In particular, transcriptomic5, epigenomic6 and proteomic7 analyses are being carried out using methods that reduce a specific analysis problem to a DNA-sequencing problem, as explained in FIG. 1.

DNA sequencing technology has not only scaled up rapidly in throughput but, through advances in sample preparation, has also scaled down in terms of the amount of DNA that is required for analysis, to the point at which it is now feasible to analyse the DNA content of individual cells8,9. This opens up a wealth of previously impossible applications in both basic research and clinical science. Examples are: the study of microorganisms that cannot be cultured, using direct single-cell genome sequencing10; transcriptome analysis of rare, circulating tumour cells11; characterization of the earliest differentiation events in human embryogenesis; the investigation of transcriptional noise and stochastic fate choice; and the study of tumour heterogeneity12 and microevolution13.

Single cells can be studied and tracked using many detection technologies, including quantitative imaging and mass spectrometry. However, our Review focuses on single-cell analysis using DNA-sequencing-based technologies. Although single-cell sequencing-based analysis has been applied to both unicellular14 and multicellular organisms15, this Review focuses on mammalian (primarily mouse and human) single-cell analysis. We first survey current technologies for single-cell isolation, which is essential for DNA-sequencing-based single-cell analysis. We then review technologies for single-cell genomic and transcriptomic analysis, and their applications. We briefly discuss methods for sequencing-based epigenomic and proteomic analyses that have yet to be scaled to single cells. Finally, we describe the impact that the integration of these methods will have on whole-organism science (FIG. 1). We predict an era of integrated single-cell genomic, epigenomic, transcriptomic and proteomic analysis, which we believe will revolutionize whole-organism science by enabling the reconstruction of organismal cell lineage trees for higher organisms, culminating in the reconstruction of an entire human cell lineage tree16, which will have broad implications for human biology and medicine.

Naturally, in such a diverse, rapidly developing and interdisciplinary field, we cannot possibly cover all of the work that has been carried out over the past few years.

Next-generation sequencing (NGS). High-throughput DNA sequencing of a large number of DNA molecules in parallel. There is a trade-off between read length and throughput that depends on the sequencing technology, run time and quality.

1 Department of Computer Science and Applied Math and 2 Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel.
3 Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Scheeles väg 2, 17177 Stockholm, Sweden.
Correspondence to E.S. and S.L.
e-mails: ehud.shapiro@weizmann.ac.il; [email protected]
doi:10.1038/nrg3542
Published online 30 July 2013

618 | SEPTEMBER 2013 | VOLUME 14 www.nature.com/reviews/genetics

© 2013 Macmillan Publishers Limited. All rights reserved



[Figure 1, panel a: Cell → assay that transforms a cell property into a DNA library reflecting it → DNA library → DNA sequencer → sequence data → computational analysis inferring the cell property → knowledge of the cell property (e.g. transcriptome).]
[Figure 1, panel b: Cell → assays that transform each cell property Pi to a DNA library reflecting Pi → DNA libraries → DNA sequencer → sequence data → computational analysis inferring properties P1 … Pn → knowledge of P1 (e.g. mutations), knowledge of P2 (e.g. methylation), …, knowledge of Pn (e.g. mRNA count).]

Figure 1 | Single-cell sequencing-based analysis methods and their anticipated integration. a | Architecture of single-cell DNA-sequencing-based technologies. Current implementations include single-cell genomics (targeted exome or mutational analysis9,16,56,59, copy-number variation8,9,58, and recombination analysis in germ cells27,68), transcriptomics (transcriptome analysis11,99,102,104 and recombination analysis in the immune system136) and epigenomics113. b | Architecture of future integrated single-cell DNA-sequencing-based analysis. We expect that within a decade this architecture will allow the simultaneous analysis of multiple properties of an individual cell, including genomics3,4,38,60–63,76,78–81,83,84,137–139, epigenomics (methylation6,82,106,107,140, chromatin108 and conformational110,111 analysis), transcriptomics (transcriptome analysis5,141–143, allele-specific gene expression94 and molecule counting93–97) and proteomics7,127, all of which are currently limited to bulk experiments.

Also, we expect that by the time this Review is published, additional progress will have been made, which we have been unable to cover. We apologize to the authors whose work we have not discussed.

Methods for single-cell isolation
Tissues are rarely homogenous, and typically consist of tens or hundreds of distinct cell types, which are often intermingled and present at widely different abundances. Single cells can be isolated from such tissues in various ways (TABLE 1), which can be classified as either unbiased (randomized) or biased (targeted) sampling. In principle, an unbiased sample better reflects the composition of the tissue, but a targeted sample may be necessary in order to isolate rare cell types.

There are two key steps in the isolation of single cells from a solid tissue. First, the tissue must be removed from the animal or plant (typically by dissection or biopsy) and dissociated into its constituent individual cells, usually using enzymatic disaggregation. Second, single cells must be placed into individual reaction chambers for lysis and further processing.

Individual cells can be isolated using micromanipulation, for example, using a simple mouth pipette9,17 or by serial dilution18,19. As micromanipulation methods are easy and cheap, they are the most commonly used single-cell isolation methodologies. Their disadvantages are that they are only applicable to cells in suspension, they are low-throughput, and they are susceptible to errors, such as misidentification of a cell under a microscope. These disadvantages are partially addressed by semi-automated devices for cell isolation, with which an expert operator can isolate approximately 50–100 cells per hour20. A different approach, which is also classified as micromanipulation, is the optical tweezers technology, which uses a laser beam to capture cells. Although not commonly used, it allows specific cell micromanipulation and measurement21.

Cell isolation can also be achieved by flow sorting using fluorescence-activated cell sorting (FACS), either using cell-type-specific markers for a biased, targeted sample, and/or using the light-scattering properties of cells to obtain an unbiased sample. The main advantages of FACS-based sorting are the ability to choose between biased and unbiased isolation, high levels of accuracy and high-throughput single-cell isolation12. However, FACS requires a large number of cells in suspension as starting material, which might affect the yield with respect to low-abundance cell subpopulations. In addition, the rapid flow in the machine might damage the cells, and care must be taken to ensure the viability of the collected cells if live cells are necessary for downstream protocols.

Organismal cell lineage tree. A mathematical entity capturing all cell division and death events in the life of an organism up to a particular time point. The tree consists of labelled nodes, which represent all organismal cells, and directed edges, which represent progeny relationships among them. A reconstructed tree describes lineage relationships among cells sampled from an organism, and is precise only if it is a subtree of the (true) organismal cell lineage tree.

Cell type. A classification of cells by morphology, genotype, phenotype or developmental origin. There is no consensus on which properties are necessary and sufficient for this classification, nor is there general agreement on the actual number of cell types or their proper classification in any higher organism, including in humans.

Fluorescence-activated cell sorting (FACS). A tool that enables high-speed counting and/or sorting of cells according to features detected by fluorescence.


Table 1 | Advantages and disadvantages of common single-cell isolation methods

Method | Unbiased (randomized) or biased (targeted)? | Throughput | Cost | Manual or automatic isolation process? | Refs
Micromanipulation | Unbiased | Low-throughput | Low | Mainly manual | 9,17–20
Fluorescence-activated cell sorting | Either biased or unbiased | High-throughput | High | Automatic | 12
Laser-capture microdissection | Unbiased | Low-throughput | High | Manual | 22–24
Microfluidics | Unbiased | High-throughput | High | Automatic | 26–29

Laser-capture microdissection (LCM)22,23 can be used to cut cells from fixed tissues or cryosections and is effective for collecting nuclei for genomic analyses. The great advantage of LCM is that knowledge of the spatial location of a sampled cell within a tissue is retained, unlike methodologies in which tissue disaggregation is required. There are several current disadvantages of LCM. First, it requires expert manual operation and is a low-throughput technique. Second, in our opinion it is less suitable than other methods for transcriptome analysis, because it is nearly impossible to capture all or most of the cytoplasm of a cell without also collecting material from neighbouring cells. Third, because the section to be dissected has to be of a single-cell width, DNA might be lost by partial nuclei dissection. Finally, selection may be biased owing to the misuse of markers24. For these reasons, LCM is less widely used than other methods for single-cell isolation.

Recently introduced microfluidic devices have opened new horizons in single-cell isolation and analysis12,25. These devices allow the compartmentalization and controlled management of nanolitre reactions using fabricated microfluidic chips, and they use controlled liquid streaming. The ability to accurately construct low-volume chambers and tubes makes microfluidics ideal for single-cell isolation, as well as for further downstream processes. Microfluidic devices provide inherent advantages by allowing higher throughput with less effort, reducing reagent cost and improving accuracy. In recent years several implementations of microfluidic devices have been presented for single-chromosome isolation26 and single-cell isolation followed by analysis27,28. We expect microfluidic technologies and products to continue their advance and ultimately to provide a robust foundation for single-cell sequencing-based analysis29.

Single-cell genomics
Reconstructing cell lineage trees using somatic mutations. Different cells from the same individual were initially thought to harbour identical genomes. This turns out to be false, not only for the immune system30 and cancer cells31 (which both undergo somatic evolution) and for germline cells that undergo recombination27, but for all cells in our bodies. During normal mitotic cell division DNA is replicated with very high, but not absolute, precision, which leads to the incorporation of somatic mutations. These somatic mutations, accumulated since the zygotic stage, endow each cell in our bodies with a genomic signature that is unique with a very high probability16. As the differences in cellular genomic signatures are mostly without phenotypic effect, what would science gain by knowing them?

The answer is that knowing the unique genomic signatures of our body cells allows the reconstruction of cell lineage trees with very high precision16. Central unresolved problems in human biology and medicine are in fact questions about the human cell lineage tree: its structure, dynamics and variability during development, growth, renewal, ageing and disease. For example: does the oocyte pool renew during adulthood32? Do β-cells renew33? Do neural progenitor cells produce each brain cell type as needed, or do specialized progenitors each produce a single cell type34,35? Information about the cell lineage trees of higher organisms consists largely of data from cell fate maps36,37, which are mostly derived from clonal-marking experiments that are not applicable to humans. Complete knowledge of the unique somatic mutations that are accumulated in each cell would allow the reconstruction of cell lineage trees with extremely high precision16,38.

Work in this direction has focused on identifying somatic mutations in microsatellites39 that are hypermutable in normal cells and even more so in microsatellite-instable (MSI) cells19,40,41 and in mismatch repair (MMR)-deficient organisms16,42,43. Knowing only a small proportion of such mutations allowed fairly precise lineage reconstruction using standard phylogenetic algorithms, depending on cell depth40,44,45. By applying this approach to samples of cells from tissues of interest, key aspects of the underlying cell state dynamics were characterized. The cell lineage trees thus obtained provided information about the substructure of the population, such as the existence of small populations of stem cells. Such information has applications for developmental biology (for example, oocyte maturation, colon crypt development18 and muscle stem cell lineages46) and for leukaemia19.

Somatic mutations can be used for cell lineage reconstruction only if: the mutations do not confer a selective advantage or disadvantage, they are associated with DNA replication (rather than elapsed time, for example) and/or their dynamics is well understood and can be modelled. The accuracy of lineage reconstruction increases with the fraction of the genome analysed per cell, and there is a trade-off between accuracy and

Laser-capture microdissection (LCM). A method that combines high-resolution microscopy and the accurate isolation of user-defined regions of a tissue slice for downstream analysis. Typically, a powerful laser is used to cut an outline of the target region, which can then be ejected into a sample tube.

Microsatellites. Repetitive elements in the genome that consist of basic units 1–6 bp long that are repeated from a few to a few dozen times. Microsatellites occupy 3% of the human genome.

Cell depth. The number of divisions a cell underwent since the zygote.
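To make the idea of lineage reconstruction from microsatellite mutations more concrete, the following is a minimal sketch rather than the method used in the cited studies. The text refers only to "standard phylogenetic algorithms"; UPGMA clustering and the mean absolute difference in repeat number are assumptions chosen here for brevity, and the cell names and repeat counts are invented for illustration.

```python
from itertools import combinations


def repeat_distance(a, b):
    """Mean absolute difference in repeat number across microsatellite loci.

    This distance measure is an assumption made for illustration only.
    """
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def upgma(profiles):
    """Cluster cells into a nested-tuple tree by average-linkage (UPGMA).

    profiles maps a cell name to its list of repeat counts, one per locus.
    """
    clusters = list(profiles)                  # current subtrees (leaves are names)
    size = {name: 1 for name in profiles}      # number of cells under each subtree
    d = {frozenset(p): repeat_distance(profiles[p[0]], profiles[p[1]])
         for p in combinations(profiles, 2)}
    while len(clusters) > 1:
        # Join the two closest subtrees (first pair wins on ties).
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        for c in clusters:
            if c != a and c != b:
                # Size-weighted average of the distances to the joined subtrees.
                d[frozenset((merged, c))] = (
                    size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        clusters = [c for c in clusters if c != a and c != b] + [merged]
    return clusters[0]


# Hypothetical repeat counts at four microsatellite loci in four sampled cells.
profiles = {
    "c1": [12, 15, 9, 20],
    "c2": [12, 15, 10, 20],
    "c3": [13, 17, 10, 20],
    "c4": [13, 17, 11, 21],
}
tree = upgma(profiles)
print(tree)
```

In this toy data set, cells that share more repeat-length changes are grouped together first. With real data, the choice of distance measure and tree-building algorithm would need to reflect the mutational dynamics of the loci analysed.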


Box 1 | Whole-organism science: the Caenorhabditis elegans benchmark and the principle of biological uncertainty
Caenorhabditis elegans is the best-studied multicellular organism and hence offers a benchmark for the systematic and integrative study of organismal biology at the molecular, cellular, organ and organismal levels, which we refer to as ‘whole-organism science’. The scaffold on which the comprehensive knowledge of C. elegans biology is structured is its cell lineage tree131,132, a fragment of which is shown in the figure. The structure of the tree shows the lineage relationships among all the organism’s cells, past and present; the labels give the identities and types of organismal cells, and the lengths of the tree edges represent the timing of cell division and death events. Additional knowledge that is not shown in the tree is the location within the organism of each cell. New knowledge is constantly being added to this scaffold (for example, the transcriptome of each organismal cell99).

Whereas the development of C. elegans is deemed to be deterministic, higher organisms exhibit great variability during development, renewal, ageing and disease, which is caused by genetic and environmental differences. Owing to this variability, we postulate that whole-organism science of higher organisms must deal with a ‘biological uncertainty principle’. Heisenberg’s uncertainty principle states that it is impossible to measure accurately and simultaneously the position and momentum of an elementary particle. Similarly, in general it is not possible to measure accurately and simultaneously the ‘cellular position’ (for example, the genomic, transcriptomic, epigenomic and proteomic state of a cell) and the ‘cellular momentum’ (for example, the next differentiation, division or degradation event of a cell) for individual cells in an organism. In order to know accurately the state of a cell, one must destroy it and analyse its content, thereby eliminating cellular momentum. Alternatively, to observe cellular momentum, one cannot interfere with the behaviour of the cell and hence must compromise on precisely knowing the state of the cell. For example, using fluorescent reporters that are minimally invasive, limited information can be obtained on both the state and momentum of single cells133–135. The use of external markers may still have an effect on the cell, and currently their use is limited to measuring a small number of parameters and is not applicable to humans. In non-deterministic higher organisms the structure of the cell lineage tree may be affected by nature (the genome), or nurture (the environment), as well as stochastic events such as cancerous mutations. Integrated single-cell analysis providing knowledge of the genome, epigenome, transcriptome and proteome of each sampled cell can be used to reconstruct detailed lineage trees of the sampled cells and to infer the functional state of ancestor cells. Such inferred trees can be used to predict the next differentiation or division decision of a cell on the basis of its functional state, thus overcoming (to some extent) the limitations imposed by the biological uncertainty principle. The figure is reproduced, with permission, from http://www.wormatlas.org/images/BYUFlineages.jpg © (2013) WormAtlas.

[Box 1 figure: fragment of the C. elegans cell lineage tree descending from cell Y, labelled ‘Post-cloacal sensilla’. Axes: time (hours), 0–45; developmental stage, L1–L4. Labelled cells: Y.a; PDA, Y.plaa; PChL, Y.praa; PChR, Y.plpa; PCBL, Y.prpa; PCBR, Y.plap; PCsoL, Y.prap; PCsoR, Y.plppd; PCAL, Y.plppv; PCshL, Y.prppd; PCAR, Y.prppv; PCshR.]
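A cell lineage tree of the kind described in Box 1 can be represented with a very small data structure. The sketch below is only an illustration under stated assumptions: the class name `Cell`, the helper functions and the cell labels are hypothetical, and timing of divisions (edge lengths) is left out for brevity. It shows labelled nodes, progeny relationships, the depth of a cell (number of divisions since the zygote) and the most recent common ancestor of two sampled cells.

```python
class Cell:
    """A labelled node in a cell lineage tree (hypothetical helper class)."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent      # None for the zygote (root of the tree)
        self.children = []        # stays empty for a cell that has not divided

    def divide(self, *labels):
        """Record a division event and return the daughter cells."""
        self.children = [Cell(label, parent=self) for label in labels]
        return self.children


def cell_depth(cell):
    """Number of divisions separating a cell from the zygote."""
    depth = 0
    while cell.parent is not None:
        depth += 1
        cell = cell.parent
    return depth


def leaves(cell):
    """Cells without progeny, i.e. the cells that could be sampled."""
    if not cell.children:
        return [cell]
    return [leaf for child in cell.children for leaf in leaves(child)]


def common_ancestor(a, b):
    """Most recent common ancestor of two cells from the same tree."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = node.parent
    node = b
    while node not in ancestors:
        node = node.parent
    return node


# Toy tree with invented labels: the zygote divides into A and B, then A divides.
zygote = Cell("zygote")
a, b = zygote.divide("A", "B")
a1, a2 = a.divide("A1", "A2")

print([leaf.label for leaf in leaves(zygote)])
print(cell_depth(a1), common_ancestor(a1, b).label)
```

Annotating each node with cell type or state, as the text envisages, would amount to adding further attributes to each node.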

cost per cell. Given a fixed fraction of the genome to be analysed, the accuracy of its sequencing is crucial. One way to increase sequencing accuracy (up to a point) is by increasing the sequencing depth. Sequencing accuracy is decreased by the bias and infidelity introduced by the biochemical steps of preparing cellular DNA for sequencing, including whole-genome amplification (elaborated on below), library preparation and the sequencing process itself47. Trade-offs between cost and accuracy require fine-tuning these parameters (for example, carrying out additional sequencing runs using fewer cells per run or increasing the number of analysed loci).

A disadvantage of cell lineage reconstruction using somatic mutations is that it cannot provide, by itself, information on the state of inferred ancestor cells. It can show the depth of sampled cells and the lineage relationships among them, but not the type of the ancestor cells that are represented by internal nodes in the reconstructed cell lineage tree. In particular, the results of ‘time-lapse’ experiments (in which different tissue samples of the same or different organisms are analysed at different organism ages) cannot be superimposed on the same cell lineage tree, as has been done for Caenorhabditis elegans (BOX 1), and labels of internal nodes of a cell lineage tree can only be approximated from the properties of the sampled cells, namely the leaves of the tree. Labelling the leaves and internal nodes of the reconstructed cell lineage tree with cell type and state information requires further single-cell epigenomic and transcriptomic analyses, as explained below.

Cell lineage reconstruction of cancer will elucidate its development. Cancer patients typically do not die from the effects of the primary tumour but from those of its metastases. Yet, despite decades of research, the key question of where metastases originate from has not been fully answered48 (FIG. 2). For example, can metastases originate from any tumour cell or only from a distinct tumour subclone (for example, circulating tumour cells49)? In the latter case, are these subclones created early or late in the development of the tumour23,50? Alternatively, perhaps metastases and the primary tumour are both independent descendants of cancer stem cells51,52. Or maybe metastases are formed

Sequencing depth. The total amount of raw sequence mapped to a reference genome, divided by the length of the genome.

Whole-genome amplification (WGA). Refers to methods that are used to amplify the genomic DNA of single cells to increase the number of copies of DNA for downstream processing.
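The definition of sequencing depth given here (total mapped sequence divided by genome length) can be illustrated with a short calculation. This is a sketch on invented toy data: the function names, the read positions and the 20-bp "genome" are assumptions for illustration, and the second function counts the number of reads covering each position.

```python
def sequencing_depth(reads, genome_length):
    """Total mapped sequence divided by the length of the genome.

    reads is a list of (start, length) pairs for mapped reads, 0-based.
    """
    return sum(length for _, length in reads) / genome_length


def position_coverage(reads, genome_length):
    """Number of reads covering each position of the genome."""
    coverage = [0] * genome_length
    for start, length in reads:
        for pos in range(start, min(start + length, genome_length)):
            coverage[pos] += 1
    return coverage


# Toy example: three reads mapped to a hypothetical 20-bp reference.
reads = [(0, 10), (5, 10), (12, 8)]
print(sequencing_depth(reads, 20))
print(position_coverage(reads, 20))
```

The same average depth can hide very uneven per-position coverage, which is one reason why amplification bias matters for single-cell data.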


[Figure 2 panels: a, Random cells; b, Specific deep subpopulation; c, Specific shallow subpopulation; d, Other metastases; e, Cell fusion. Key: primary tumour, metastasis, mobile cells.]

Figure 2 | Alternative hypotheses on the origin of metastases. a | Metastases originate from random cells during tumour development. b | Following tumour growth, metastases originate from a specific tumour subpopulation, which underwent many divisions (that is, a ‘deep’ subpopulation). c | At the initial tumour growth stages, metastases originate from a specific tumour subpopulation (that is, a ‘shallow’ subpopulation). d | Metastases originate from other metastases. e | Tumour cells engage metastasis by fusion with other cells, which endow a mobility property.

through the fusion of primary tumour cells and normal mobile cells such as macrophages53,54. As another example, consider the origin of cancer relapse after chemotherapy. Is this caused by ordinary tumour cells escaping chemotherapy stochastically, or by a distinct subpopulation of infrequently dividing cancer-initiating cells that escape chemotherapy owing to their slow division rate? The answers to these questions are encoded in the patient’s cancer cell lineage tree19,55. Understanding the emergence and distribution of driver mutations in the context of the cancer cell lineage tree is also of prime importance56.

Early experiments analysed a few key markers in each individual cell. In one recent example, heterogeneity and tumour origin in acute lymphoblastic leukaemia were studied by assaying the occurrence of up to eight chromosomal aberrations and their combinations in single cells using fluorescence in situ hybridization (FISH). This allowed an analysis of subclonal architecture during cancer progression57. More recently, sequencing of hundreds of single nuclei was used to generate approximated copy-number profiles for individual breast cancer cells8,58, thus allowing the reconstruction of tumour population structure and evolutionary history. In another study59, whole-exome single-cell sequencing in a patient with myeloproliferative neoplasm was carried out to reconstruct tumour ancestries and to identify candidate driver mutations. In a final example, single-cell DNA templates were extracted following clonal expansion and were sequenced60 (a method that is discussed below) to study the lineage of normal cells and to determine the earliest precancerous mutations that ultimately led to the development of the tumour.

Bulk sequence analysis methods are practical and efficient predecessors to single-cell studies. They can be used to extract efficiently distributions of markers of interest (for example, somatic mutations) from a large number of cells. Recent bulk studies have used a two-tier design of low-depth, whole-genome sequencing combined with deep sequencing of loci underlying putative driver carcinogenic events. The approach can quantify the frequencies of genetic and epigenetic variants in vitro38 or in vivo; for example, it can be used to estimate tumour cellular dynamics such as mutation penetrance, to construct models that distinguish driver and passenger mutational events and to reconstruct tumour ancestries using coalescent models23,61–63.

Lineage tracing using NGS of somatic mutations has been demonstrated in vivo for bulk cell populations but not for single cells. Bulk methods do not provide accurate information on how different combinations of mutations or aberrations emerge, and precise answers to such questions await single-cell lineage analysis of cancer.

The road to single-cell genomics. Although sequencing the DNA of a cell population is now straightforward61, sequencing DNA from single cells is still a challenge. Historically, the cost of sequencing multiple individual cells at adequate depth for genetic profiling was prohibitively high, and despite the remarkable recent increase in throughput, the sequencing cost is still a hurdle to large-scale single-cell genomics, transcriptomics and epigenomics. The current prevailing DNA sequencing approach combines WGA with the preparation of amplified, nanogram-sized DNA libraries. WGA can be achieved through multiple variants of PCR-based amplification8,58,64–66 or isothermal amplification using multiple-displacement amplification16,27,56,59,67. Demand for unbiased single-cell DNA amplification has inspired the development of new techniques for WGA. These include the multiple-annealing and looping-based amplification cycles (MALBAC) method9,68 and single-cell-specific WGA kits such as the single-cell RepliG kit, by Qiagen. Performance of the available methods varies between applications, and a comprehensive side-by-side comparison of the different methods is still much needed14,69. In this article we do not provide a comprehensive summary of all the available techniques. For a recent review summarizing WGA methods see REF. 14.

A high-fidelity, low-bias method for genome amplification is especially crucial for single-cell DNA analysis because the initial copy number is one, unlike for DNA sequencing from bulk cell populations or even for single-cell RNA analysis. Low-fidelity amplification can produce non-representative and biased sequencing results, which in turn may lead to incorrect single-nucleotide polymorphism calls (SNP calls), uneven sequencing coverage and

Clonal expansion. A method to retrieve representative DNA from a single cell following its proliferation. A single cell is isolated, cultured ex vivo, and allowed to divide several times. DNA is isolated from the bulk cell population using standard DNA extraction techniques that do not involve amplification.

Single-nucleotide polymorphism calls (SNP calls). Following sequencing read assembly, this is the identification of single nucleotides that are different from the nucleotide at the same position in a specific reference genome. This process requires high-quality sequencing and adequate sequencing depth for statistical significance.

Sequencing coverage. In a sequencing experiment, the number of reads covering a specific nucleotide position is the coverage of that position. Increasing read depth leads to increasing coverage, and to increasing accuracy of the base calls.


Box 2 | How many individual cells are needed for quantitative transcriptomic analysis?
In a standard bulk RNA sequencing (RNA-seq) experiment, precision is limited only by sequencing depth. Typically, ten
million reads are generated, and a threshold of 50 reads per kb per million reads (RPKM) is considered adequate to call a
gene as expressed. For a gene that is 1 kb long, this corresponds to 500 reads, thus leading to a minimum coefficient of
variation (CV; which is equal to the standard deviation divided by the mean) of 4%, as given by the Poisson distribution.
In a fairly typical single mammalian cell containing 200,000 mRNA molecules, 50 RPKM corresponds to about ten mRNA
molecules. Again, assuming a Poisson distribution across cells, the expected CV is 32%, but this can be reduced by
pooling data from many cells. How many cells are needed to reduce this error to that of the bulk experiment? The
answer is 50, because the pooled data from 50 cells will contain 500 mRNA molecules. These are ideal numbers, and in
practice more cells will be required. For example, if the efficiency of converting mRNA to cDNA is only 10% (which is not
an unrealistic assumption), then tenfold more cells will be required. Similarly, when additional noise is introduced (for
example, by PCR amplification) the number of cells required will increase correspondingly. Furthermore, if the sample is
heterogeneous, then enough cells must be analysed so that all representative cell types are observed. Finally, all these
estimates assume that the single-cell measurements are accurate, as systematic inaccuracies (for example, due to
amplification bias) will not be cured by collecting more cells.
Although necessarily simplistic, these ‘back‑of‑the-envelope’ calculations suggest that hundreds or thousands of
single cells will need to be analysed to answer targeted questions in single tissues. For a whole-organism view, at least
millions of cells will need to be analysed (that is, thousands of cells in thousands of tissues and time points), which is a
feat that will require miniaturization, automation and further reductions in the cost of DNA sequencing.
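The arithmetic in Box 2 can be reproduced with a few lines of Python. This is only a sketch under the box's own idealized assumptions: counts are Poisson distributed, and RPKM is read as "copies per million transcripts" for a 1 kb gene. The function names and the `capture_efficiency` parameter are illustrative and not part of any published tool.

```python
import math

def poisson_cv(mean_count):
    """Coefficient of variation of a Poisson count: sd / mean = 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(mean_count)

def reads_for_gene(rpkm, gene_length_kb, total_reads):
    """Expected number of reads for a gene at a given RPKM in a bulk library."""
    return rpkm * gene_length_kb * total_reads / 1e6

def molecules_per_cell(rpkm, mrna_per_cell):
    """Approximate mRNA copies per cell for a 1 kb gene, treating RPKM as
    copies per million transcripts (the simplification used in Box 2)."""
    return rpkm * mrna_per_cell / 1e6

def cells_needed(target_count, molecules_in_one_cell, capture_efficiency=1.0):
    """Cells to pool so that the pooled molecule count reaches target_count."""
    detected_per_cell = molecules_in_one_cell * capture_efficiency
    return math.ceil(target_count / detected_per_cell)

bulk_reads = reads_for_gene(50, 1.0, 10_000_000)   # 500 reads
per_cell = molecules_per_cell(50, 200_000)         # 10 molecules

print(bulk_reads, round(poisson_cv(bulk_reads), 3))  # 500.0 0.045
print(per_cell, round(poisson_cv(per_cell), 3))      # 10.0 0.316
print(cells_needed(bulk_reads, per_cell))            # 50
print(cells_needed(bulk_reads, per_cell, 0.1))       # 500
```

With a 10% mRNA-to-cDNA conversion efficiency the same target requires 500 cells, which is the tenfold increase mentioned in the box.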

missing loci (termed locus-dropout or allele-dropout). Such biases have less effect when sequencing bulk cell populations or even WGA products from a few hundred cells14.

An alternative single-cell DNA extraction technique uses clonal expansion60,70. However, this method suffers from several drawbacks. First, the efficiency of the proliferation of a single cell ex vivo is dependent on the cell type, cell stress that occurs post-isolation and the availability of suitable conditioned growth media71. Second, cell death or decomposition prevents culturing8. Third, this approach is contamination-prone, laborious and more time consuming than single-cell WGA procedures, especially when a large number of cells need to be cultured and their DNA extracted independently. Finally, mutations are introduced during this procedure, especially in MMR-deficient cells. When single-cell WGA techniques mature, it will be valuable to compare them comprehensively with clonal expansion for reproducibility and for artefacts caused by polymerase bias. Nevertheless, as previously explained, subsequent biochemical steps following clonal expansion can also cause artefacts, even more than the single-cell WGA itself32.

Single-molecule sequencing methods (often referred to as third-generation sequencing technologies)72–75 eliminate the amplification step before sequencing. As such, they eliminate amplification bias and hold great promise for single-cell sequencing. Yet, these technologies currently suffer from high error rates, low throughput and low sequencing efficiency, owing to slow and non-robust detection74,75.

For some applications, analysing the entire genome (or transcriptome) is not essential, and targeting a genomic subset using genomic enrichment methods may allow higher sensitivity and lower per-sample cost. For example, genomic subsets can include exomes76,77 or specific mutations in genes of interest78–81. High-throughput sequencing has the combined advantages of both high-throughput analysis and sample multiplexing using DNA barcodes in a single sequencing run, which makes it ideal for large-scale analysis of multiple single cells (BOX 2; TABLE 2). Genomic enrichment was initially approached through PCR amplification of a few to a hundred amplicons, and single-cell isolation followed by PCR is a current practical alternative to whole-genome sequencing due to technological advancements in this field80,81. Specifically, the ability to cost-effectively synthesize thousands of custom-designed oligonucleotides enabled the development of more-powerful genome enrichment techniques based on the hybridization of target material to oligonucleotide probes and subsequent processing (namely selective circularization methods82,83 and hybridization-based capture methods56,59,62,84). These methods allow cost-efficient targeted DNA enrichment and high-throughput NGS library preparation. Further development of the sensitivity and throughput of these techniques will probably make these methods more common in single-cell genomics, as an interim step to whole-genome sequencing or as a long-term companion to such a capability.

Amplicons
DNA products of PCR amplifications.

Single-cell transcriptomics
The molecular state of cell populations. Given a heterogeneous cell population, measurement of the mean values of key factors, such as the genotype, RNA output or epigenetic state of a locus of interest, provides only a partial characterization of the state of the system. Unfortunately, most of the methods that are used for quantifying the molecular state of a cell population, from transcriptional profiling to proteomics, are based on estimating mean behaviours in ensembles of millions of cells by averaging the signal of individual cells. For example, it is impossible to determine the cell-to-cell variation of gene expression based on microarray or RNA sequencing (RNA-seq) data, or to determine whether intermediate levels of a signalling protein are a consequence of a bimodal or uniform intrapopulation distribution based on standard proteomics. Going beyond mean-based characterization of


Table 2 | Current trade-offs in sampling heterogeneous cell populations


Experimental approach | Bulk average | Tagged libraries | Multi-dimensional cell sorting | Deep sequencing of bulk samples | Small samples of single cells | Large samples of single cells
Number of cells | Millions | Hundreds per marker | Millions | Millions | Tens to hundreds | Thousands to tens of thousands
Molecular markers | Any | RNA or tagged proteins | Surface markers or signalling molecules | Genetics or DNA methylation | RNA, genetics or DNA methylation | RNA, genetics or DNA methylation
Typical costs | Low | High setup cost but subsequently low | High | Low to medium | Medium, depending on the sequencing component | High, depending on the sequencing component, but low per-cell cost if samples are multiplexed
Mean? | Global | For markers (thousands) | For markers (<50) | Yes | Yes | Yes
Variance? | No | For markers (thousands) | For markers (<50) | No | Yes (of limited accuracy) | Yes
Pairwise covariance? | No | No | For profiled markers (<50) | Only linked marks | Yes (of limited accuracy) | Yes
Complex correlations and/or causal networks? | No | No | Among profiled markers (<50) | No | No | Possibly
Subpopulation structure? | No | No | Excellent, but only if markers are appropriate | Model-based (for example, carcinogenesis) | Good, but only for subpopulations with significant (>10%) frequency | Excellent
Cell lineage tree? | No (averaged to most-recent common ancestor (MRCA)) | No (averaged to MRCA) | No (averaged to MRCA) | No (averaged to MRCA) | Yes | Yes

cell populations requires balancing the number of sampled cells and the completeness of functional coverage on varying scales (TABLE 2).

Applications of single-cell transcriptomics. One major application for single-cell transcriptomics is in the analysis of rare cell types. For example, circulating tumour cells can be obtained from patient blood, but typically only a few cells are isolated per blood sample and these will often be contaminated by a larger number of normal cells. Single-cell RNA-seq could be used to differentiate between these cell types and simultaneously to obtain expression data from the tumour. Similarly, the early human embryo by definition contains only rare cell types, which exist only transiently. Key questions about early development could be addressed using transcriptomics. In this context, transcriptomics has the advantage of being able to use sequence polymorphisms (for example, SNPs) to distinguish transcripts that are derived from each of the two parental genomes. Another area that will benefit immensely from single-cell transcriptomics is the study of adult stem cells, which are often rare, sometimes exist only transiently, and can be intermingled with other cell types. However, by using single-cell RNA-seq, each cell type can be extensively sampled simply by taking unbiased samples of cells from the tissue.

Individual cells differ greatly in their size, morphology, developmental origin and functional properties. Yet, our current level of understanding of cell types, their origin, evolution and diversity is embarrassingly poor, despite progress in some specific cases85,86. Furthermore, there is no general agreement on the number of cell types in a mammalian body. In fact, there is no agreement on what defines a cell type, and finding such a definition must surely be one of the most important goals as we embark on large-scale single-cell transcriptome analysis. As a starting point, we suggest that cell types can be provisionally identified as cells for which global transcriptional states are similar. Just how similar, and just which parts of the transcriptome are relevant, will be crucial questions for the future. But this provisional concept of cell type leads immediately to an unbiased method of cell-type discovery (FIG. 3): collect a large, unbiased sample of cells from the tissue of interest, generate transcriptomes for each cell and use computational methods to find sets of similar cells. Established clustering and dimension-reduction methods (such as K-means, affinity propagation and hierarchical clustering, and principal component analysis) will be useful starting points87. Because some laboratories are already analysing hundreds or thousands of single-cell transcriptomes, we anticipate that the time will soon be ripe to embark on large-scale, whole-body cell-type discovery and characterization.

A further area of application for single-cell transcriptomics is the characterization of transcriptional


[Figure 3 panels: a, Obtain an unbiased sample of single cells; b, Generate single-cell expression profiles; c, Identify cell types by clustering]

Figure 3 | Cell-type discovery by unbiased sampling and transcriptome profiling of single cells.
a | A sample of cells is taken from the tissue of interest, with the aim of obtaining a representative sample of the
types of cells that are present in the tissue. b | Each cell is profiled using single-cell RNA sequencing (RNA-seq).
c | Subsequently, the resulting expression profiles are clustered. The result is a map of ‘cell space’, in which similar
cells are grouped close to each other. The strategy is shown here in cartoon form, but in practice it will be
necessary to collect and analyse thousands of cells in each tissue (that is, millions of cells overall) to make a
comprehensive cell space map of a whole organism.
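The three steps of Figure 3 can be sketched in code. The example below is a minimal illustration, not the authors' pipeline: it uses plain K-means (one of the established methods named in the text) written with the standard library only, a log2(count + 1) transform that is an assumption made here, and a small table of made-up molecule counts. The deterministic farthest-point initialization is a convenience so that the toy run is reproducible.

```python
import math

def log_transform(profile):
    """log2(count + 1) transform of one cell's expression profile (an assumption here)."""
    return [math.log2(c + 1.0) for c in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50):
    """Plain K-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from every centre chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each cell joins its nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each centre moves to the mean profile of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

# Hypothetical data: rows are cells, columns are genes (molecule counts).
counts = [
    [50, 2, 0, 1],
    [45, 3, 1, 0],
    [55, 1, 0, 2],
    [1, 0, 40, 35],
    [0, 2, 38, 42],
    [2, 1, 45, 30],
]
profiles = [log_transform(c) for c in counts]
labels = kmeans(profiles, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a real experiment the number of clusters is not known in advance and the profiles have thousands of dimensions, so dimension reduction and a principled choice of k would be needed before a step like this.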

fluctuations. Dynamic changes in RNA content are associated with cyclic processes, such as the cell cycle in dividing cells and circadian rhythms. Other fluctuations are stochastic and reflect the fact that transcription is a discrete process composed of many probabilistic steps. Further heterogeneity is introduced by uneven partitioning of the cellular content at cell division (for example, REF. 88). Direct transcriptome analysis of large numbers of single cells should open up the study of oscillatory and stochastic regulatory processes in unperturbed cell populations. In a population of putatively identical cells, sets of co-regulated genes can be identified. Each set must be part of a functional process, such as an oscillator or a stochastic process. For example, genes that share a common upstream regulator would presumably show correlated expression. At present, the number of single cells that must be analysed in order to discover covariant genes is unknown, and finding first estimates of these numbers will be a key task in the near future. There is also evidence that transcription is subjected to strong intrinsic fluctuations89,90. Models to explain this intrinsic noise lead to predictions about the shape of the mRNA copy-number distribution, which can be tested against experimentally measured distributions89. Such tests cannot be carried out using bulk measurements, which do not give any information about the variance or any higher moments. Nonetheless, single-cell transcriptome analysis provides only a snapshot in time, and it will remain important to complement this view with dynamic, long-term measurements by, for example, time-lapse microscopy91.

Higher moments
Measures of the shape of a statistical distribution beyond mean and variance, such as skewness and kurtosis.

The road to single-cell transcriptomics. Despite advances in single-molecule DNA72–74 and RNA92 sequencing, it is not yet possible to sequence RNA directly from single cells. Currently, RNA needs to be converted to cDNA and amplified, and this must be achieved with minimal losses and without introducing too much quantitative bias.

There are several sources of noise in a single-cell transcriptome experiment. There are biological fluctuations, both global (that is, affecting the total amount of RNA in the cell) and local (for example, due to co-regulation or large-scale chromatin modifications). There is also technical noise, for example due to pipetting errors, temperature differences, differences in sequencing depth, PCR amplification bias and differences in reverse transcription efficiency. It is important to realize that single-cell transcriptome analysis is also a single-molecule analysis, because many genes are expressed at only a few mRNA molecules per cell. Amplification from small numbers of molecules is subject to the Monte Carlo effect, in which stochastic events in the first few cycles of PCR are amplified exponentially, causing large quantitative errors.

The ultimate goal of quantitative single-cell transcriptome analysis must be to count every RNA molecule in the cell exactly, resulting in near-zero technical error. This is required, for example, if we are to use the shape of mRNA count distributions to infer the kinetics of transcription. Accurate molecule counting is in fact possible by using unique labels for molecules93–97. After amplification and deep sequencing, each original molecule can be identified. As long as the sample is sequenced deeply enough, so that each molecular label is observed at least once, differences in amplification efficiency do not matter. Although unique molecular labels have until now been used only for bulk samples, they are a key advance that will probably enable a more quantitative analysis of single-cell transcriptomes.

Another source of error is losses, which can be severe. The detection limit of published protocols is 5–10 molecules of mRNA. If, as seems likely, the limit of detection is primarily determined by losses during sample preparation, this would indicate that 80–90% of mRNA was lost. Or, to put it the other way around, a 90% loss leads to an approximately 50% chance of failing to detect a gene that is expressed at a level of seven mRNA molecules (from the binomial distribution). These losses are especially problematic in small cells, such as stem cells, in which the mRNA content is low to begin with. But even in larger cells, such losses introduce a severe quantitative error owing to the stochastic sampling of small numbers of molecules. For example, measuring 100 molecules with a 90% loss leads to 10 ± 3 detected


Table 3 | Recently published single-cell RNA-seq methods


Method | Principle | Strand-specific? | Positional bias? | Early multiplexing? | Ref
Tang et al. | Homopolymer tailing | No | Weakly 3′-biased | No | 102
STRT | Template switching | Yes | 5′ (TSS) | Yes | 104
SMART–seq | Template switching | No | Nearly full-length | No | 11
CEL–seq | In vitro transcription | Yes | Strongly 3′-biased | Yes | 99
Quartz–seq | Homopolymer tailing | No | Weakly 3′-biased | No | 144
CEL–seq, cell expression by linear amplification and sequencing; SMART–seq, switching mechanism at the 5ʹ end of the RNA
template sequencing; STRT, single-cell tagged reverse transcription; TSS, transcription start site.
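The loss and read-budget figures quoted in the discussion of these protocols follow from the binomial distribution and simple multiplication, and can be checked with the short sketch below. It assumes that every mRNA molecule is lost independently with the same probability; the function names are illustrative only.

```python
import math

def prob_undetected(n_molecules, loss_fraction):
    """Probability that all copies of a transcript are lost,
    assuming each molecule is lost independently."""
    return loss_fraction ** n_molecules

def detected_mean_sd(n_molecules, loss_fraction):
    """Mean and standard deviation of the number of detected molecules
    under a binomial model with detection probability 1 - loss_fraction."""
    p = 1.0 - loss_fraction
    mean = n_molecules * p
    sd = math.sqrt(n_molecules * p * (1.0 - p))
    return mean, sd

def reads_per_cell(mrna_per_cell, oversampling):
    """Reads needed to sample each mRNA molecule 'oversampling' times on average."""
    return mrna_per_cell * oversampling

# Seven molecules with a 90% loss: roughly a 50% chance of missing the gene.
print(round(prob_undetected(7, 0.9), 3))   # 0.478

# 100 molecules with a 90% loss: 10 +/- 3 detected, a 30% standard deviation.
mean, sd = detected_mean_sd(100, 0.9)
print(round(mean, 1), round(sd, 1))        # 10.0 3.0

# 200,000 mRNA molecules at tenfold oversampling: two million reads per cell.
print(reads_per_cell(200_000, 10))         # 2000000
```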

molecules, which means that the loss alone has introduced a 30% standard deviation. To mitigate the impact of technical noise, we suggest analysing large numbers of single cells (BOX 2).

The earliest single-cell transcriptomes were generated by in vitro transcription (IVT)98, and recently IVT was used to produce libraries for Illumina sequencing, in a method called cell expression by linear amplification and sequencing (CEL–seq)99. The main advantage of IVT is the linear amplification, which should in theory be less biased than exponential amplification methods such as PCR. A disadvantage is that the resulting library is biased towards the 3′ end of genes, and this bias can be difficult to control. By contrast, PCR-based protocols are capable of amplifying full-length cDNA.

A second approach is to add a homopolymer tail to the first-strand cDNA, which allows the cDNA strand to be amplified by PCR. An early example used deoxyguanosine-tailing followed by PCR100. Subsequently, this protocol was optimized101 and adapted for sequencing102. Like IVT, homopolymer tailing is biased towards the 3′ end.

A third approach uses ‘template switching’. Common reverse transcriptases of the Moloney murine leukaemia virus family tend to add a short tail of (preferentially) cytosines to the end of the first-strand cDNA. If a helper oligonucleotide, carrying a short GGG motif, is included in the reaction, it will anneal to the cytosine motif and the reverse transcriptase will switch template and copy the helper oligonucleotide sequence103. The result is that an arbitrary sequence can be introduced at the 5′ end (by tailing the reverse transcription primer) and at the 3′ end (by template switching) of the cDNA, thus allowing subsequent amplification by PCR. Additionally, template switching has a preference for 5′-capped RNA, so that the resulting cDNA is enriched for full-length transcripts. This is also the main disadvantage, as template switching will occur only if reverse transcription successfully reaches the 5′ end of the mRNA; any partially reverse-transcribed mRNA will fail to be amplified, which limits the total yield. Two alternative approaches have been published for processing the full-length cDNA: single-cell tagged reverse transcription (STRT)104, which isolates and sequences the 5′ end, corresponding to the transcription start site; and switching mechanism at the 5′ end of the RNA template (SMART)–seq11, which fragments the cDNA and generates reads that cover the full length of each transcript.

Protocols also differ in when they introduce a barcode for multiplexing. The great advantage of introducing barcodes at the first step is that many cells (for example, 96) can be processed together in one tube, reducing both cost and time by a considerable factor. However, no early-multiplexed protocols are currently capable of sequencing the full length of RNA, because barcodes are added only to one end of each cDNA molecule.

Several recently published protocols are compared in TABLE 3. The most important differences between them are shown, but it is also important to stress that the approaches have much in common: similar detection limits (5–10 molecules of mRNA), quantitative biases due to amplification, limitation to polyadenylated RNAs, and gene-specific biases due to GC content or secondary structure.

Through automation and the optimization of reagent consumption, the sample preparation costs of all of the published protocols are similar, and the overall cost is dominated by the cost of sequencing. For a typical mammalian cell that contains 200,000 mRNA molecules, and assuming tenfold oversampling, at least two million reads must be generated. The current minimal cost per cell, when sequencing at high throughput on an Illumina HiSeq 2000 machine, will be approximately US$10. However, sequencing costs continue to decrease exponentially, which should make it feasible, within five years, to analyse millions of single-cell transcriptomes.

Single-cell epigenomics and proteomics
Clearly, the genome and transcriptome of a cell capture only part of its state, and much of the function of the cell is determined by its epigenome and proteome, which add to the diversity of cells in a population. The epigenomic state of a cell includes epigenomic marks such as DNA methylation and histone methylation and acetylation, the structural and regulatory proteins bound to chromatin, the spatial interactions between enhancers and promoters forming transcriptional complexes, and the three-dimensional orientation of the chromosomes.

Bulk bisulphite sequencing provides information on the average DNA methylation states for groups of clustered CpG sites at a locus. Depletion of CpG methylation is associated with transcriptional activation, and may be a consequence of the binding of regulatory proteins. Bulk experiments can provide data on the distribution of methylation within cells or alleles105,106, or support


models for the stochastic emergence of differential methylation107. However, in bulk experiments it is generally impossible to determine whether two methylated sites are actually present in an individual cell, unless the methylated sites are so close that they can be detected in a single sequencing read.

Chromatin immunoprecipitation followed by sequencing (ChIP–seq) is used to study protein–DNA interactions genome-wide108, as well as to generate genome-wide maps of histone modifications. ChIP–seq has been used to determine genome-wide patterns of transcription factor binding and their relationships to active transcription and epigenomic marks. Using chromosome conformation analysis and all its derivative methods (for example, 3C109, 4C110 and Hi-C111), it is also possible to measure the interaction between distal chromatin elements directly, thus revealing the large-scale chromosome organization within the nucleus, as well as the finer details of enhancer–promoter interactions at individual loci. Again, however, by using bulk experiments it is impossible to know if a complex chromatin conformation or a combination of bound transcription factors actually exists in an individual cell. For example, consider the analysis of a tumour sample. The observation that a transcription factor is bound to a promoter and that the corresponding gene is transcribed does not necessarily imply that these two events have occurred in the same cell. Instead, it is possible that one event occurred in the tumour and another in the infiltrating stromal cells. Combined measurements of epigenomic and transcriptomic states in single cells are required to settle the issue.

Broad applications of sequencing-based methods to single-cell epigenomics have yet to be reported. The challenges in extending epigenetics to the single-cell level are similar to those faced by single-cell transcriptomics: avoiding loss of material and minimizing quantitative bias. For this reason, widespread and largely binary marks such as DNA methylation and histone modifications should be relatively easy to detect in single cells. Indeed, proof-of-concept single-cell epigenetic analyses have already been demonstrated for both DNA methylation112,113 and histone modification114. By contrast, ChIP–seq targeting transcription factors in single cells is a formidable challenge because of the small number of transcription factors that are present in any single cell, the low affinity for their target sequence and the often imperfect nature of antibodies.

Epigenetic markers were used on bulk cell populations to analyse the dynamics of colorectal cancer41,115,116 and to construct lineage trees for colon crypt stem cells117–119.

Proteomic analysis methods include protein arrays120, FACS analysis121, co-immunoprecipitation122, pull-down assays123 and mass spectrometry assays124, and they reveal different protein properties in a sample (for example, protein concentration or protein–protein interactions). Methods for DNA-based proteomic analysis have also been developed (for example, immuno-PCR125 and proximity ligation assays126), and these were recently applied using NGS7,127. As in epigenomics, broad applications of sequencing-based methods to single-cell proteomics have yet to be reported, although preliminary proof-of-concept studies have been published128,129.

Conclusions
Single cells are the fundamental units of life. Therefore, single-cell analysis is not just one more step towards more-sensitive measurements, but is a decisive jump to a more-fundamental understanding of biology. Here we have described recent advances in sequencing-based single-cell analysis. These advances include sequencing the genomes and transcriptomes of single cells, and we predict that it will soon be possible to sequence fully all the nucleic acids in many thousands or even millions of cells. In addition, we have described how other cellular phenomena can be converted into a DNA-sequence-based readout. For example, epigenomic marks such as histone modifications can be converted into a DNA signal by ChIP–seq. Similarly, protein modifications and interactions can be converted to DNA by the proximity ligation assay.

The enormous and ever-increasing power of DNA sequencing means that many different cellular phenomena are likely to be convertible to a DNA readout. A fortuitous consequence of this convergence is that it should allow integrated measurements of multiple modalities. The feasibility of such integration has already been demonstrated for genomic and transcriptomic analysis100, and simultaneous DNA, RNA and protein measurements in single cells can be used to quantitatively describe the central dogma of molecular biology130. Nonetheless, although single-cell analysis methods for single properties (such as only DNA or only RNA) are developing at a rapid pace, there is still a long road ahead for assaying multiple properties in single-cell integrated analyses. The biochemical differences between the cellular properties lead to variations in the methods that are needed to isolate them, and modifications of current isolation methods will be needed to develop a unified single-cell multi-property analysis protocol.

Such integrated single-cell genetic, epigenetic, transcriptional and proteomic sequencing-based analyses (FIG. 1) will allow modelling of the relationships among multiple molecular markers, unbiased identification of complex cell population structure, and characterization of direct, indirect and in some cases causal dependencies among factors. Development of complex single-cell genetic analysis methods may allow for a better understanding of these cellular properties and for redefining the concept of ‘cell type’. The feasibility of such integration has already been demonstrated for genomic and transcriptomic analysis100.

Finally, the accumulation of mutations in single cells during development can be used to infer the lineage ancestry of each cell. Although cell-fate maps describe potential next states for cells in a particular state36, they do not capture precise lineage relationships. By contrast, cell lineage trees reconstructed using somatic mutations capture the lineage relationships among the sampled cells, but do not provide information on the state of ancestor cells. C. elegans is the first and highest


organism with a known cell lineage tree that captures both its cell fate map and the lineage relationships among cells131,132. We anticipate that integrated single-cell analysis culminating from the wave of developments reviewed here will allow similarly powerful results for higher organisms such as mice and humans. If the states of the sampled cells that constitute the leaves of a reconstructed cell lineage tree (as determined by their transcriptomes and epigenomes, and perhaps further enhanced by their proteomes) could be known with high precision, then additional assumptions about the states of ancestor cells represented by internal nodes in the tree can be formalized into a mathematical model. This would allow the reconstruction paradigm to be expanded to describe state dynamics and to integrate cell lineage trees with cell fate mapping.

Cell lineage trees of higher organisms harbour answers to many open questions in human biology and medicine, and have the potential to transform medicine towards personalized, rather than generalized, diagnosis and treatment. Almost a decade ago it was suggested16 that advances in single-cell genomics may inspire the initiation of a ‘Human Cell Lineage Project’, the aim of which would be to reconstruct an entire human cell lineage tree. We believe that the advances reviewed and anticipated here in single-cell sequencing-based technologies will bring us closer to achieving this goal and along the way will revolutionize whole-organism science.
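As a rough illustration of how somatic mutations can be turned into a lineage tree, the sketch below groups cells by the similarity of their mutation profiles. It is not the reconstruction method used in the cited studies: it swaps in simple average-linkage (UPGMA) clustering on Hamming distances, the mutation table is invented, and the cell names are hypothetical. The internal nodes of the resulting tree carry no state information, which is the limitation noted in the text.

```python
from itertools import combinations

def hamming(a, b):
    """Number of loci at which two mutation profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def upgma(profiles):
    """Average-linkage (UPGMA) clustering of cells.
    Returns the tree as nested tuples of cell names."""
    # Each cluster is a pair: (subtree, list of member cell names).
    clusters = [(name, [name]) for name in sorted(profiles)]

    def avg_dist(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(hamming(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters[0][0]

# Hypothetical data: 1 = somatic mutation present at that locus, 0 = absent.
mutations = {
    "cell_A": [1, 1, 0, 0, 0],
    "cell_B": [1, 1, 1, 0, 0],
    "cell_C": [0, 0, 0, 1, 1],
    "cell_D": [0, 0, 0, 1, 0],
}
tree = upgma(mutations)
print(tree)  # (('cell_A', 'cell_B'), ('cell_C', 'cell_D'))
```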

1. Wetterstrand, K. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program [online] http://www.genome.gov/sequencingcosts (2013).
2. Walker, T. M. et al. Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: a retrospective observational study. Lancet Infect. Dis. 13, 137–146 (2013).
3. Lander, E. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
4. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
5. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008).
6. Gu, H. et al. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nature Protoc. 6, 468–481 (2011).
7. Darmanis, S. et al. ProteinSeq: high-performance proteomic analyses by proximity ligation and next generation sequencing. PLoS ONE 6, e25583 (2011).
8. Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90–104 (2011).
9. Zong, C., Lu, S., Chapman, A. R. & Xie, X. S. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338, 1622–1626 (2012).
10. Kalisky, T., Blainey, P. & Quake, S. R. Genomic analysis at the single-cell level. Annu. Rev. Genet. 45, 431–445 (2011).
11. Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nature Biotech. 30, 777–782 (2012).
This paper described the first single-cell RNA-seq method to achieve near full-length coverage of
17. Kurimoto, K., Yabuta, Y., Ohinata, Y. & Saitou, M. Global single-cell cDNA amplification to provide a template for representative high-density oligonucleotide microarray analysis. Nature Protoc. 2, 739–752 (2007).
18. Reizel, Y. et al. Colon stem cell and crypt dynamics exposed by cell lineage reconstruction. PLoS Genet. 7, e1002192 (2011).
19. Shlush, L. I. et al. Cell lineage analysis of acute leukemia relapse uncovers the role of replication-rate heterogeneity and microsatellite instability. Blood 120, 603–612 (2012).
20. Choi, J. H. et al. Development and optimization of a process for automated recovery of single cells identified by microengraving. Biotechnol. Prog. 26, 888–895 (2010).
21. Zhang, H. & Liu, K. K. Optical tweezers for single cells. J. R. Soc. Interface 5, 671–690 (2008).
22. Frumkin, D. et al. Amplification of multiple genomic loci from single cells isolated by laser micro-dissection of tissues. BMC Biotechnol. 8, 17 (2008).
23. Yachida, S. et al. Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature 467, 1114–1117 (2010).
24. Bhattacherjee, V. et al. Laser capture microdissection of fluorescently labeled embryonic cranial neural crest cells. Genesis 39, 58–64 (2004).
25. Guo, M. T., Rotem, A., Heyman, J. A. & Weitz, D. A. Droplet microfluidics for high-throughput biological assays. Lab. Chip 12, 2146–2155 (2012).
26. Fan, H., Wang, J., Potanina, A. & Quake, S. Whole-genome molecular haplotyping of single cells. Nature Biotech. 29, 51–57 (2011).
27. Wang, J., Fan, H. C., Behr, B. & Quake, S. R. Genome-wide single-cell analysis of recombination activity and de novo mutation rates in human sperm. Cell 150, 402–412 (2012).
28. White, A. et al. High-throughput microfluidic
37. Schepers, A. G. et al. Lineage tracing reveals Lgr5+ stem cell activity in mouse intestinal adenomas. Science 337, 730–735 (2012).
38. Carlson, C. et al. Decoding cell lineage from acquired mutations using arbitrary deep sequencing. Nature Methods 9, 78–80 (2012).
39. Ellegren, H. Microsatellites: Simple sequences with complex evolution. Nature Rev. Genet. 5, 435–445 (2004).
40. Salipante, S. & Horwitz, M. Phylogenetic fate mapping. Proc. Natl Acad. Sci. USA 103, 5448–5453 (2006).
41. Tsao, J. et al. Colorectal adenoma and cancer divergence - evidence of multilineage progression. Am. J. Pathol. 154, 1815–1824 (1999).
42. Zhou, W. et al. Use of somatic mutations to quantify random contributions to mouse development. BMC Genomics 14, 39 (2013).
43. Vilkki, S. et al. Extensive somatic microsatellite mutations in normal human tissue. Cancer Res. 61, 4541–4544 (2001).
44. Wasserstrom, A. et al. Reconstruction of cell lineage trees in mice. PLoS ONE 3, e1939 (2008).
45. Wasserstrom, A. et al. Estimating cell depth from somatic mutations. PLoS Comput. Biol. 4, e1000058 (2008).
46. Segev, E. et al. Muscle-bound primordial stem cells give rise to myofiber-associated myogenic and non-myogenic progenitors. PLoS ONE 6, e25605 (2011).
47. Ross, M. G. et al. Characterizing and measuring bias in sequence data. Genome Biol. 14, R51 (2013).
48. Fidler, I. & Kripke, M. Metastasis results from preexisting variant cells within a malignant-tumor. Science 197, 893–895 (1977).
49. Kim, M. Y. et al. Tumor self-seeding by circulating cancer cells. Cell 139, 1315–1326 (2009).
50. Fidler, I. Critical determinants of metastasis. Seminars Cancer Biol. 12, 89–96 (2002).
transcripts, and demonstrated transcriptome single-cell RT‑qPCR. Proc. Natl Acad. Sci. USA 108, 51. Eaves, C. J. Cancer stem cells: here, there,
sequencing from single circulating tumour cells. 13999–14004 (2011). everywhere? Nature 456, 581–582 (2008).
12. Dalerba, P. et al. Single-cell dissection of 29. Lecault, V., White, A. K., Singhal, A. & Hansen, C. L. 52. Frank, N. Y., Schatton, T. & Frank, M. H.
transcriptional heterogeneity in human colon tumors. Microfluidic single cell analysis: from promise to The therapeutic promise of the cancer stem cell
Nature Biotech. 29, 1120–1127 (2011). practice. Curr. Opin. Chem. Biol. 16, 381–390 concept. J. Clin. Invest. 120, 41–50 (2010).
13. Cristofanilli, M. et al. Circulating tumor cells: a novel (2012). 53. Pawelek, J. M. & Chakraborty, A. K. Fusion of tumour
prognostic factor for newly diagnosed metastatic 30. Schatz, D. G. & Swanson, P. C. V(D)J recombination: cells with bone marrow-derived cells: a unifying
breast cancer. J. Clin. Oncol. 23, 1420–1430 mechanisms of initiation. Annu. Rev. Genet. 45, explanation for metastasis. Nature Rev. Cancer 8,
(2005). 167–202 (2011). 377–386 (2008).
14. Blainey, P. C. The future is now: single-cell genomics 31. Yates, L. R. & Campbell, P. J. Evolution of the cancer 54. Lazova, R. et al. A melanoma brain metastasis with a
of bacteria and archaea. FEMS Microbiol. Rev. 37, genome. Nature Rev. Genet. 13, 795–806 (2012). donor-patient hybrid genome following bone marrow
407–427 (2013). 32. Reizel, Y. et al. Cell lineage analysis of the transplantation: first evidence for fusion in human
A review of single-cell genomics of microorganisms, mammalian female germline. PLoS Genet. 8, cancer. PLoS ONE 8, e66731 (2013).
including currently available WGA techniques. e1002477 (2012). 55. Blagosklonny, M. V. Target for cancer therapy:
15. Gundry, M., Li, W., Maqbool, S. B. & Vijg, J. 33. Szabat, M. et al. Maintenance of β-cell maturity proliferating cells or stem cells. Leukemia 20,
Direct, genome-wide assessment of DNA mutations and plasticity in the adult pancreas: developmental 385–391 (2006).
in single cells. Nucleic Acids Res. 40, 2032–2040 biology concepts in adult physiology. Diabetes 61, 56. Xu, X. et al. Single-cell exome sequencing reveals
(2012). 1365–1371 (2012). single-nucleotide mutation characteristics of a kidney
16. Frumkin, D., Wasserstrom, A., Kaplan, S., Feige, U. & 34. Ming, G. & Song, H. Adult neurogenesis in the tumor. Cell 148, 886–895 (2012).
Shapiro, E. Genomic variability within an organism mammalian brain: significant answers and significant 57. Anderson, K. et al. Genetic variegation of clonal
exposes its cell lineage tree. PLoS Computat. Biol. 1, questions. Neuron 70, 687–702 (2011). architecture and propagating cells in leukaemia.
382–394 (2005). 35. Chojnacki, A. K., Mak, G. K. & Weiss, S. Identity crisis Nature 469, 356–361 (2011).
A conceptual and theoretical basis for organism for adult periventricular neural stem cells: 58. Baslan, T. et al. Genome-wide copy number analysis of
cell lineage tree reconstruction using the genomic subventricular zone astrocytes, ependymal cells or single cells. Nature Protoc. 7, 1024–1041 (2012).
variability among organismal cells. It is also a both? Nature Rev. Neurosci. 10, 153–163 (2009). 59. Hou, Y. et al. Single-cell exome sequencing and
preliminary proof-of-concept demonstration of 36. Yona, S. et al. Fate mapping reveals origins and monoclonal evolution of a JAK2‑negative
reconstructing cell lineage trees using somatic dynamics of monocytes and tissue macrophages myeloproliferative neoplasm. Cell 148, 873–885
mutations in a small panel of microsatellites. under homeostasis. Immunity 38, 79–91 (2013). (2012).

60. Jan, M. et al. Clonal evolution of preleukemic hematopoietic stem cells precedes human acute myeloid leukemia. Sci. Transl Med. 4, 149ra118 (2012).
61. Ding, L. et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481, 506–510 (2012).
62. Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N. Engl. J. Med. 366, 883–892 (2012).
    An exposition of the heterogeneity within different regions in a single tumour, demonstrating the importance of the integration of several analysis methods including DNA and RNA sequencing.
63. Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012).
64. Cheung, V. & Nelson, S. Whole genome amplification using a degenerate oligonucleotide primer allows hundreds of genotypes to be performed on less than one nanogram of genomic DNA. Proc. Natl Acad. Sci. USA 93, 14676–14679 (1996).
65. Arneson, N., Hughes, S., Houlston, R. & Done, S. Whole-genome amplification by improved primer extension preamplification PCR (I‑PEP-PCR). CSH Protoc. 2008, pdb.prot4921 (2008).
66. Klein, C. A. et al. Comparative genomic hybridization, loss of heterozygosity, and DNA sequence analysis of single cells. Proc. Natl Acad. Sci. USA 96, 4494–4499 (1999).
67. Dean, F. et al. Comprehensive human genome amplification using multiple displacement amplification. Proc. Natl Acad. Sci. USA 99, 5261–5266 (2002).
68. Lu, S. et al. Probing meiotic recombination and aneuploidy of single sperm cells by whole-genome sequencing. Science 338, 1627–1630 (2012).
69. Peng, W., Takabayashi, H. & Ikawa, K. Whole genome amplification from single cells in preimplantation genetic diagnosis and prenatal diagnosis. Eur. J. Obstet. Gynecol. Reprod. Biol. 131, 13–20 (2007).
70. Salipante, S. J., Kas, A., McMonagle, E. & Horwitz, M. S. Phylogenetic analysis of developmental and postnatal mouse cell lineages. Evol. Dev. 12, 84–94 (2010).
71. Zaretsky, I. et al. Monitoring the dynamics of primary T cell activation and differentiation using long term live cell imaging in microwell arrays. Lab. Chip 12, 5007–5015 (2012).
72. Harris, T. D. et al. Single-molecule DNA sequencing of a viral genome. Science 320, 106–109 (2008).
73. Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138 (2009).
74. Schadt, E., Turner, S. & Kasarskis, A. A window into third-generation sequencing. Hum. Mol. Genet. 19, R227–R240 (2010).
75. Xu, M., Fujita, D. & Hanagata, N. Perspectives and challenges of emerging single-molecule DNA sequencing technologies. Small 5, 2638–2649 (2009).
76. Ng, S. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276 (2009).
77. Teer, J. & Mullikin, J. Exome sequencing: the sweet spot before whole genomes. Hum. Mol. Genet. 19, R145–R151 (2010).
78. Giulino-Roth, L. et al. Targeted genomic sequencing of pediatric Burkitt lymphoma identifies recurrent alterations in antiapoptotic and chromatin-remodeling genes. Blood 120, 5181–5184 (2012).
79. Valencia, C. A. et al. Comprehensive mutation analysis for congenital muscular dystrophy: a clinical PCR-based enrichment and next-generation sequencing panel. PLoS ONE 8, e53083 (2013).
80. Hollants, S., Redeker, E. & Matthijs, G. Microfluidic amplification as a tool for massive parallel sequencing of the familial hypercholesterolemia genes. Clin. Chem. 58, 717–724 (2012).
81. Tewhey, R. et al. Microdroplet-based PCR enrichment for large-scale targeted sequencing. Nature Biotech. 27, 1025–1031 (2009).
82. Li, J. et al. Multiplex padlock targeted sequencing reveals human hypermutable CpG variations. Genome Res. 19, 1606–1615 (2009).
83. Johansson, H. et al. Targeted resequencing of candidate genes using selector probes. Nucleic Acids Res. 39, e8 (2011).
84. Diaz-Horta, O. et al. Whole-exome sequencing efficiently detects rare mutations in autosomal recessive nonsyndromic hearing loss. PLoS ONE 7, e50628 (2012).
85. Arendt, D. The evolution of cell types in animals: emerging principles from molecular studies. Nature Rev. Genet. 9, 868–882 (2008).
    In this Review, the author discusses the origin and evolution of diverse cell types in animals, an issue that has been curiously neglected by biologists.
86. Vickaryous, M. K. & Hall, B. K. Human cell type diversity, evolution, development, and classification with special reference to cells derived from the neural crest. Biol. Rev. Cambridge Philos. Soc. 81, 425–455 (2006).
    This paper is a careful review of all human cell types that have been given names in the literature, which is a useful starting point for future cell-type discovery experiments.
87. Gehlenborg, N. et al. Visualization of omics data for systems biology. Nature Methods 7, S56–S68 (2010).
88. Johnston, I. G. et al. Mitochondrial variability as a source of extrinsic cellular noise. PLoS Computat. Biol. 8, e1002416 (2012).
89. Raj, A., Peskin, C. S., Tranchina, D., Vargas, D. Y. & Tyagi, S. Stochastic mRNA synthesis in mammalian cells. PLoS Biol. 4, e309 (2006).
    This study using single-molecule imaging of mRNAs shows that mRNA abundances vary tremendously within putatively homogenous cell populations, and provides initial estimates of transcriptional burst kinetics in mammalian cells.
90. Raj, A. & van Oudenaarden, A. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216–226 (2008).
91. Endele, M. & Schroeder, T. Molecular live cell bioimaging in stem cell research. Ann. NY Acad. Sci. 1266, 18–27 (2012).
92. Ozsolak, F. et al. Direct RNA sequencing. Nature 461, 814–818 (2009).
93. Casbon, J. A., Osborne, R. J., Brenner, S. & Lichtenstein, C. P. A method for counting PCR template molecules with application to next-generation sequencing. Nucleic Acids Res. 39, e81 (2011).
94. Kivioja, T. et al. Counting absolute numbers of molecules using unique molecular identifiers. Nature Methods 9, 72–74 (2011).
95. Shiroguchi, K., Jia, T. Z., Sims, P. A. & Xie, X. S. Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes. Proc. Natl Acad. Sci. USA 109, 1347–1352 (2012).
96. Fu, G. K., Hu, J., Wang, P. H. & Fodor, S. P. Counting individual DNA molecules by the stochastic attachment of diverse labels. Proc. Natl Acad. Sci. USA 108, 9026–9031 (2011).
97. Kinde, I., Wu, J., Papadopoulos, N., Kinzler, K. W. & Vogelstein, B. Detection and quantification of rare mutations with massively parallel sequencing. Proc. Natl Acad. Sci. USA 108, 9530–9535 (2011).
98. Eberwine, J. et al. Analysis of gene expression in single live neurons. Proc. Natl Acad. Sci. USA 89, 3010–3014 (1992).
99. Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012).
100. Klein, C. A. et al. Combined transcriptome and genome analysis of single micrometastatic cells. Nature Biotech. 20, 387–392 (2002).
    This study reported a simultaneous genomic and transcriptomic analysis of individual cells using a microarray readout. This is a first example of an integrated single-cell analysis.
101. Kurimoto, K. et al. An improved single-cell cDNA amplification method for efficient high-density oligonucleotide microarray analysis. Nucleic Acids Res. 34, e42 (2006).
102. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nature Methods 6, 377–382 (2009).
    The first demonstration of single-cell RNA-seq with accurate detection of alternatively spliced transcripts in single mouse oocytes.
103. Maleszka, R. & Stange, G. Molecular cloning, by a novel approach, of a cDNA encoding a putative olfactory protein in the labial palps of the moth Cactoblastis cactorum. Gene 202, 39–43 (1997).
104. Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res. 21, 1160–1167 (2011).
    The first demonstration of highly multiplexed single-cell RNA-seq showing that cell types can be distinguished in an unbiased manner on the basis of unfiltered single-cell gene expression profiles.
105. Arand, J. et al. In vivo control of CpG and non-CpG DNA methylation by DNA methyltransferases. PLoS Genet. 8, e1002750 (2012).
106. Taylor, K. H. et al. Ultradeep bisulfite sequencing analysis of DNA methylation patterns in multiple gene promoters by 454 sequencing. Cancer Res. 67, 8511–8518 (2007).
107. Landan, G. et al. Epigenetic polymorphism and the stochastic formation of differentially methylated regions in normal and cancerous tissues. Nature Genet. 44, 1207–1214 (2012).
108. Jothi, R., Cuddapah, S., Barski, A., Cui, K. & Zhao, K. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res. 36, 5221–5231 (2008).
109. Dekker, J., Rippe, K., Dekker, M. & Kleckner, N. Capturing chromosome conformation. Science 295, 1306–1311 (2002).
110. van de Werken, H. J. et al. Robust 4C‑seq data analysis to screen for regulatory DNA interactions. Nature Methods 9, 969–972 (2012).
111. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
112. Kantlehner, M. et al. A high-throughput DNA methylation analysis of a single cell. Nucleic Acids Res. 39, e44 (2011).
113. Denomme, M. M., Zhang, L. & Mann, M. R. Single oocyte bisulfite mutagenesis. J. Vis. Exp. 64, e4046 (2012).
114. Hayashi-Takanaka, Y. et al. Tracking epigenetic histone modifications in single cells using Fab-based live endogenous modification labeling. Nucleic Acids Res. 39, 6475–6488 (2011).
115. Tsao, J. L. et al. Genetic reconstruction of individual colorectal tumor histories. Proc. Natl Acad. Sci. USA 97, 1236–1241 (2000).
116. Siegmund, K., Marjoram, P., Woo, Y., Tavare, S. & Shibata, D. Inferring clonal expansion and cancer stem cell dynamics from DNA methylation patterns in colorectal cancers. Proc. Natl Acad. Sci. USA 106, 4828–4833 (2009).
117. Nicolas, P., Kim, K., Shibata, D. & Tavare, S. The stem cell population of the human colon crypt: analysis via methylation patterns. PLoS Computat. Biol. 3, 364–374 (2007).
118. Yatabe, Y., Tavaré, S. & Shibata, D. Investigating stem cells in human colon by using methylation patterns. Proc. Natl Acad. Sci. USA 98, 10839–10844 (2001).
119. Kim, K. M. & Shibata, D. Methylation reveals a niche: stem cell succession in human colon crypts. Oncogene 21, 5441–5449 (2002).
120. Hodgkinson, V., ElFadl, D., Drew, P., Lind, M. & Cawkwell, L. Repeatedly identified differentially expressed proteins (RIDEPs) from antibody microarray proteomic analysis. J. Proteom. 74, 698–703 (2011).
121. Bendall, S. et al. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 332, 687–696 (2011).
122. Lee, H. W. et al. Real-time single-molecule co‑immunoprecipitation analyses reveal cancer-specific Ras signalling dynamics. Nature Commun. 4, 1505 (2013).
123. Jain, A. et al. Probing cellular protein complexes using single-molecule pull-down. Nature 473, 484–488 (2011).
124. Keshishian, H. et al. Quantification of cardiovascular biomarkers in patient plasma by targeted mass spectrometry and stable isotope dilution. Mol. Cell Proteom. 8, 2339–2349 (2009).
125. Niemeyer, C., Adler, M. & Wacker, R. Detecting antigens by quantitative immuno-PCR. Nature Protoc. 2, 1918–1930 (2007).
126. Fredriksson, S. et al. Multiplexed protein detection by proximity ligation for cancer biomarker validation. Nature Methods 4, 327–329 (2007).
127. Turner, D. J. et al. Toward clinical proteomics on a next-generation sequencing platform. Anal. Chem. 83, 666–670 (2011).
128. Salehi-Reyhani, A. et al. A first step towards practical single cell proteomics: a microfluidic antibody capture chip with TIRF detection. Lab. Chip 11, 1256–1261 (2011).
129. Shi, Q. et al. Single-cell proteomic chip for profiling intracellular signaling pathways in single tumor cells. Proc. Natl Acad. Sci. USA 109, 419–424 (2012).

130. Li, G. W. & Xie, X. S. Central dogma at the single-molecule level in living cells. Nature 475, 308–315 (2011).
    A review of the central dogma of molecular biology in terms of stochastic kinetics in single cells and of imaging-based methods for single-cell and single-molecule analysis.
131. Sulston, J. E., Schierenberg, E., White, J. G. & Thomson, J. N. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 100, 64–119 (1983).
132. Sulston, J. E. & Horvitz, H. R. Post-embryonic cell lineages of the nematode, Caenorhabditis elegans. Dev. Biol. 56, 110–156 (1977).
    The first reconstruction of a complete organism cell lineage, of the C. elegans nematode, published almost four decades ago. Complete cell lineage trees of higher organisms are yet to be reconstructed.
133. Noctor, S. C., Martinez-Cerdeno, V., Ivic, L. & Kriegstein, A. R. Cortical neurons arise in symmetric and asymmetric division zones and migrate through specific phases. Nature Neurosci. 7, 136–144 (2004).
134. Murray, J. et al. Automated analysis of embryonic gene expression with cellular resolution in C. elegans. Nature Methods 5, 703–709 (2008).
135. Murray, J. et al. Multidimensional regulation of gene expression in the C. elegans embryo. Genome Res. 22, 1282–1294 (2012).
136. DeKosky, B. J. et al. High-throughput sequencing of the paired human immunoglobulin heavy and light chain repertoire. Nature Biotech. 31, 166–169 (2013).
137. Peters, B. et al. Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells. Nature 487, 190–195 (2012).
138. Nik-Zainal, S. et al. Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012).
139. Timmermann, B. et al. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis. PLoS ONE 5, e15661 (2010).
140. Diep, D. et al. Library-free methylation sequencing with bisulfite padlock probes. Nature Methods 9, 270–272 (2012).
141. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).
142. Nagalakshmi, U. et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349 (2008).
143. Cloonan, N. et al. Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nature Methods 5, 613–619 (2008).
144. Sasagawa, Y. et al. Quartz-Seq: a highly reproducible and sensitive single-cell RNA-Seq reveals non-genetic gene expression heterogeneity. Genome Biol. 14, R31 (2013).

Acknowledgements
The work of S.L. was supported by grant 261063 from the European Research Council and by the Swedish Research Council STARGET consortium. The work of E.S. and T.B. was supported by The European Union FP7‑ERC-AdG grant and by a grant from the Kenneth and Sally Leafman Appelbaum Discovery Fund. E.S. is the Incumbent of The Harry Weinrebe Professorial Chair of Computer Science and Biology. The contribution of E.S. to this Review was inspired by a research proposal prepared by E.S. in collaboration with I. Amit, A. Tanay and M. Schwarz.

Competing interests statement
The authors declare no competing financial interests.

FURTHER INFORMATION
Nature Reviews Genetics article series on applications of next-generation sequencing: http://www.nature.com/nrg/series/nextgeneration/index.html
