Tertiary Motifs in RNA Structure

This article reviews the critical role of RNA in cellular information transfer and its ability to adopt complex three-dimensional structures for various biological functions. It discusses the hierarchical folding of RNA, the significance of tertiary structural motifs, and the methods used to elucidate RNA structures, such as X-ray crystallography and NMR spectroscopy. The review emphasizes the importance of base stacking and metal ion interactions in stabilizing RNA architecture and guiding the folding process.

REVIEWS

Tertiary Motifs in RNA Structure and Folding


Robert T. Batey, Robert P. Rambo, and Jennifer A. Doudna*

RNA plays a critical role in mediating every step of the cellular information transfer pathway from DNA-encoded genes to functional proteins. Its diversity of biological functions stems from the ability of RNA to act as a carrier of genetic information and to adopt complex three-dimensional folds that create sites for chemical catalysis. Atomic-resolution structures of several large RNA molecules, determined by X-ray crystallography, have elucidated some of the means by which a global fold is achieved. Within these RNAs are tertiary structural motifs that enable the highly anionic double-stranded helices to tightly pack together to create a globular architecture. In this article we present an overview of the structures of these motifs and their contribution to the organization of large, biologically active RNAs. Base stacking, participation of the ribose 2'-hydroxyl groups in hydrogen-bonding interactions, binding of divalent metal cations, noncanonical base pairing, and backbone topology all serve to stabilize the global structure of RNA and play critical roles in guiding the folding process. Studies of the RNA-folding problem, which is conceptually analogous to the protein-folding problem, have demonstrated that folding primarily proceeds through a hierarchical pathway in which domains assemble sequentially. Formation of the proper tertiary interactions between these domains leads to discrete intermediates along this pathway. Advances in the understanding of RNA structure have facilitated improvements in the techniques that are utilized in modeling the global architecture of biologically interesting RNAs that have been resistant to atomic-resolution structural analysis.

Keywords: molecular modeling · nucleic acids · ribozymes · RNA · structure elucidation

[*] Dr. J. A. Doudna, Dr. R. T. Batey, R. P. Rambo
Department of Molecular Biophysics and Biochemistry and Howard Hughes Medical Institute
Yale University
266 Whitney Ave., New Haven, CT 06520-8114 (USA)
Fax: (+1) 203-432-3104
E-mail: doudna@[Link]

Angew. Chem. Int. Ed. 1999, 38, 2326–2343. © WILEY-VCH Verlag GmbH, D-69451 Weinheim, 1999. 1433-7851/99/3816-2327 $ 17.50+.50/0

1. Introduction

The discovery that RNA is capable of catalyzing chemical reactions was a watershed event that led to the realization that in many ways RNA is more akin to proteins than to its chemical cousin DNA.[1, 2] Like proteins, RNAs adopt complex three-dimensional folds for the precise presentation of chemical moieties that is essential for their function as biological catalysts, translators of genetic information, and structural scaffolds. To understand RNA in this light we must begin to address the same fundamental questions that are central to the study of protein structure and enzymology. First, how are three-dimensional structure and function related? This generally entails detailed mechanistic studies of the chemical reaction catalyzed by the enzyme coupled with structural characterization of the ground state, transition state, and functional groups involved in catalysis. Second, how is the macromolecule capable of rapidly folding into the complex three-dimensional shape necessary for catalytic activity in spite of the nearly infinite number of potential available conformations (the famous "Levinthal paradox")?[3] As will be detailed in this review, recent insights into the "tertiary" level of RNA structure have revealed some of the means by which RNA is capable of achieving a complex global fold and how these interactions serve to direct the folding pathway.

2. RNA Structure

2.1. Elements of RNA Structure

RNA structure is divided into three fundamental levels of organization: primary, secondary, and tertiary structure. Primary structure refers to the nucleotide sequence of an RNA, which can be obtained to a first approximation from the DNA sequence of the gene encoding the RNA. Since many biologically active RNAs are post-transcriptionally modified, the DNA sequence often does not reveal the true primary structure. These modifications include the methylation of nucleotide bases and 2'-hydroxyl groups of ribose sugars, formation of unusual bases such as pseudouridine (Ψ) and dihydrouridine (D), insertion or deletion of nucleotides in messenger RNA, and splicing of internal sequences (introns) from pre-messenger RNA. Thus, to completely determine the primary sequence the RNA must be purified from its native source and characterized by using a combination of sequencing methods[4] and mass spectrometry.[5]
The secondary structure of RNA is presented as a two-dimensional representation of its Watson–Crick base pairs (Figure 1 a) and intervening "unpaired" regions. Commonly described secondary structural elements are: duplexes, single-stranded regions, hairpins, bulges, internal loops, and junctions, as illustrated in Figure 1 b (a comprehensive review has been written by Chastain and Tinoco[6]). For example, the secondary structure of transfer RNA (tRNA) is organized into three hairpin structures, the D-, T-, and anticodon arms, and the acceptor stem, which is represented as the classic cloverleaf structure shown in Figure 2 a.
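A two-dimensional representation of this kind is often stored as plain text. The following minimal sketch assumes the widely used dot-bracket notation, which this review does not itself use: matched parentheses mark Watson–Crick pairs and dots mark unpaired nucleotides. The function names and the toy cloverleaf string are hypothetical illustrations, not the tRNAPhe structure.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (i, j) base-pair positions (0-based)."""
    open_positions = []  # stack of unmatched '(' positions
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


def hairpin_loops(structure):
    """Return the closing pairs of hairpins: pairs that enclose only dots."""
    return [
        (i, j)
        for i, j in parse_dot_bracket(structure)
        if set(structure[i + 1:j]) <= {"."}
    ]


# Toy cloverleaf (assumption): one closing stem and three hairpin arms.
TOY_CLOVERLEAF = "((((..(((...)))..(((....)))..(((...)))..))))"

if __name__ == "__main__":
    print(len(parse_dot_bracket(TOY_CLOVERLEAF)), "base pairs")
    print("hairpin closing pairs:", hairpin_loops(TOY_CLOVERLEAF))
```

In this toy string the outer stem plays the role of the acceptor stem and the three enclosed hairpins stand in for the D-, anticodon, and T-arms. Bulges, internal loops, and junctions could be identified from the same pair list in a similar way.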
This level of RNA structure is determined by several methods. The simplest technique, but the one most fraught with uncertainty, is to use secondary structure prediction programs that fold the input primary sequence into potential secondary structures. These programs are all based upon finding the secondary structure with the lowest total free energy by calculating the free energy of a number of base-pairing schemes, and returning the lowest energy potential secondary structures as the most probable.[7] A more reliable means for determining the secondary structure of biological RNAs is comparative sequence analysis,[8] which exploits the tendency for the global architecture of biological RNAs to be conserved. The phylogenetic covariance of two or more nucleotides that are distant in the primary sequence implies that they interact at some level. In helical regions this covariation manifests itself as the exchange of one type of Watson–Crick base pair for another (for example, a CG to an AU pair). For many RNAs there is a sufficient database of sequences from a broad evolutionary spectrum to allow for very accurate secondary structures to be derived from newly sequenced RNA genes from almost any organism.
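The covariation argument can be illustrated with a small sketch. It assumes a gap-free toy alignment and a deliberately simple criterion (every sequence shows a Watson–Crick pair at the two columns, and at least two different pair types occur); the function name, the threshold, and the three sequences are hypothetical, and real comparative analyses use far larger alignments and statistical measures.

```python
from itertools import combinations

# The four Watson-Crick pairs, written as (5' base, 3' base).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def covarying_columns(alignment, min_pair_types=2):
    """Return column pairs (i, j) whose bases always form Watson-Crick
    pairs while the pair type itself changes between sequences."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i, j in combinations(range(length), 2):
        observed = {(seq[i], seq[j]) for seq in alignment}
        if observed <= WATSON_CRICK and len(observed) >= min_pair_types:
            hits.append((i, j))
    return hits


# Toy alignment (assumption): a two-base-pair stem closed by an invariant GAAA loop.
TOY_ALIGNMENT = [
    "GGGAAACC",
    "ACGAAAGU",
    "CUGAAAAG",
]

if __name__ == "__main__":
    print(covarying_columns(TOY_ALIGNMENT))
```

Columns 0 and 7, and columns 1 and 6, exchange one Watson–Crick pair for another across the three sequences, so they are reported as candidate helix positions; the invariant loop columns are not.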
A secondary structure generated from the above methods must be experimentally verified by biochemical techniques that probe the solution structure of the RNA. The most common technique involves probing the RNA with ribonucleases (RNases) and chemicals that target specific features of the RNA.[9] Most of these reagents react at sites in the RNA that are not involved in Watson–Crick base pairing and are solvent accessible; sites of modification are subsequently revealed by sequencing 32P-end-labeled RNA or by reverse transcription of the modified RNA with a 32P-end-labeled DNA primer. Comparing the results of biochemical probing with predicted RNA secondary structures generally yields an accurate map of the Watson–Crick pairing scheme.

Figure 1. a) The standard Watson–Crick base pairs found in RNA, along with the numbering scheme for the nucleotide bases and ribose sugar. b) Common RNA secondary structural elements.

Jennifer A. Doudna was born in 1964 in Washington DC (USA). She received her B.A. in Chemistry with honors from Pomona College in Claremont, CA, in 1985, and her Ph.D. in Biochemistry from Harvard University in 1989. At Harvard she worked with Prof. Jack Szostak on the design of self-replicating RNA, which led to her interest in RNA structure and folding. She was awarded a Lucille P. Markey Scholar fellowship in 1991 to study with Prof. Tom Cech at the University of Colorado, Boulder. During the next three years she developed methods for the crystallization of RNA in collaboration with Prof. Craig Kundrot. She joined the Yale faculty in the Department of Molecular Biophysics and Biochemistry in 1994, and she was promoted to Associate Professor in 1997 and to Professor in 1999. She received the Johnson Foundation Prize for Innovative Research in 1996 for her work on the crystal structure of the P4–P6 domain of the Tetrahymena self-splicing intron. She is currently an Assistant Investigator at the Howard Hughes Medical Institute.

Tertiary structural elements primarily involve an interaction between distinct secondary structural elements. Beyond tRNA, not much was known about this level of RNA architecture until very recently because these types of interactions are extremely difficult to predict or experimentally determine. Progress has been made possible by improvements in the synthesis of milligram quantities of large RNAs[10] and the techniques for structural analysis such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy (for excellent reviews see references [11, 12] respectively). Within the high resolution structures of large RNAs solved to date (tRNA,[13, 14] the hammerhead ribozyme,[15, 16] the P4–P6 domain of the Tetrahymena thermophila self-splicing intron,[17] and the hepatitis delta virus ribozyme[18]) it is tertiary interactions that play a dominant role in establishing the global fold of the molecule.

The tertiary interactions observed in phenylalanine–tRNA (tRNAPhe), along with a representation of the secondary structure that reflects the three-dimensional structure, are shown in Figure 2 b. This view of tRNA illustrates a fundamental feature of biological RNAs. The invariant nucleotides in class I tRNAs (shown in boldface in Figure 2 a, b and as black bars in Figure 2 c) fall into two distinct types: biologically functional nucleotides, such as those found at the 3'-terminus of the aminoacyl acceptor arm, and those that are important for the establishment of the global architecture of the RNA, such as the conserved nucleotides in the D- and T-loops. The conservation of nucleotides involved in the formation of tertiary structure indicates that all tRNAs of this class have the same basic fold.

Figure 2. The levels of structure in yeast tRNAPhe. a) Nucleotide sequence, modifications, and secondary structure of tRNAPhe as represented in the classic cloverleaf structure. Watson–Crick base pairs are indicated as solid dots and the GU wobble pair with an open circle. Nucleotide positions that are invariant among class I tRNAs are highlighted in boldface. (Adapted from reference [135].) b) The tertiary interactions of tRNAPhe as represented in a modified secondary structure representation that reflects its three-dimensional structure (the dashed lines denote tertiary interactions between bases). The invariant nucleotides are primarily involved in tertiary interactions, clustered in the region of the tRNA responsible for formation of the global architecture. c) Representation of the X-ray crystallographic structure of tRNAPhe with invariant nucleotides represented as black bars (Protein Data Bank (PDB) accession number 6tna).[136] Figures of RNA structures in this review were created with RIBBONS 2.0.[137]

In the following section the structures of the elements of tertiary structure that have been elucidated by X-ray crystallography and NMR spectroscopy will be described. For the purposes of this review these motifs will be classified into three general categories: interactions between two double-stranded helical regions, between a helical region and a non-double-stranded region, and between two non-helical regions (this classification scheme has been adopted from Westhof and Michel[19]). In this review we will focus most of our attention on two RNA molecules whose structure and folding have been characterized in great detail: transfer RNA (Figure 2 a–c) and the Tetrahymena thermophila self-splicing group I intron (Th-intron; Figure 3 a, b). We refer the interested reader to Wedekind and McKay for a comprehensive review of the structure and function of the hammerhead ribozyme.[20]

2.2. Interactions Between Helical Motifs

2.2.1. Coaxial Stacking

Coaxial stacking of helical regions, the most fundamental method by which RNA achieves higher order organization, is a consequence of the highly favorable energetic contributions


of stacking interactions between the π-electron systems of the nucleotide bases to the overall stability of nucleic acid structure.[21] The dominance of base stacking in stabilizing RNA structure is convincingly demonstrated by tRNA, in which only 41 of 76 bases are involved in the classic helical structure, yet 72 bases are involved in stacking interactions.[22] All of the other known large RNA structures display a similar high degree of base stacking.

The contribution of coaxial stacking to the global fold of an RNA was first observed in the crystal structure of tRNAPhe.[13, 14, 23] In the three-dimensional structure the stems of the D- and anticodon arms stack upon one another as do the stems of the T-arm and aminoacyl acceptor arm (Figure 2 b, c).[23] These two coaxial stacks are oriented perpendicularly with respect to one another by tertiary interactions between the D- and T-loops to yield the overall L-shape of the molecule. The predominance of coaxial stacking in the organization of RNA structure is also evident in the structures of the P4–P6 domain (Figure 3 b) and the hepatitis delta ribozyme (see Figure 8); each of these structures can be described as two sets of coaxially stacked helices packed against one another.

Figure 3. a) Tetrahymena group I intron secondary structure and tertiary interactions. The two principal structural domains P4–P6 and P3–P7 are boxed, the long-range interactions are shown as green arrows, and interactions involving the binding of the P1 substrate helix are shown as green dashed lines. The tertiary interactions in the intron that are mentioned in this review are: 1) the A-rich bulge (see Section 2.3.3), 2) the tetraloop–tetraloop receptor (see Section 2.3.2), 3) loop–loop interactions (see Section 2.4.1), 4) the triple helical scaffold, and 5) the J8/7 triplex (see Section 4.2). In the nomenclature of the group I introns, P refers to a paired region, J refers to a "joining", or single-stranded, region, and L refers to a hairpin loop. This figure was created using data from the group I intron comparative database[138] and Lehnert et al.[62] b) The three-dimensional structure of the P4–P6 domain from this intron (PDB accession number 1gid).[17] The tertiary interactions that hold this domain together, the A-rich bulge and the tetraloop–tetraloop receptor, are highlighted in blue and green, respectively.

The organization of junctions, in which three or more helices intersect, by coaxial stacking is often achieved through the binding of divalent metals near the site of the stack. The direct influence of metal-ion binding on the folding of this secondary structural motif is clearly demonstrated in studies of the three-way junction at the catalytic center of the hammerhead ribozyme. In the crystal structure two of the helices are seen to coaxially stack, and the third is oriented relative to the coaxial stack by both tertiary contacts and hydrated magnesium ions specifically bound to the RNA (Figure 4 a, b).[15, 16, 24] The orientation of the helices with respect to one another is extremely sensitive to the concentration of divalent cations, as demonstrated by native gel electrophoresis,[25, 26] fluorescence resonance energy transfer,[27] and transient electric birefringence[28] studies. In the absence of Mg2+ this junction forms an extended structure in which none of the helices are stacked (Figure 4 c).[25] At


low concentrations of divalent cations (500 μM Mg2+) helices II and III coaxially stack and helix I forms an acute angle to helix III. At higher magnesium concentrations (5 mM), helix I forms an acute angle with respect to helix II; this form corresponds to the catalytically active form of this ribozyme. Although extremely high concentrations of monovalent ions (4 M Li2SO4) can drive the hammerhead ribozyme into an active structure,[29] under physiological conditions the conformational rearrangement of the junction is dependent upon site-specific binding of divalent ions. Studies of other junction elements indicate that the binding of divalent cations generally mediates their folding.[30–33] This mechanism is also observed in other tertiary motifs requiring coaxial stacking (see Sections 2.4.1 and 2.4.2).

Figure 4. The hammerhead ribozyme. a) The secondary structure of the hammerhead ribozyme that was utilized for structure determination. Nucleotides critical for catalytic activity are highlighted in gray, the base at the cleavage site in light gray, and the observed tertiary interactions denoted as dashed lines. b) The three-dimensional structure of the hammerhead ribozyme with associated cobalt ions (shown as spheres; PDB accession number 379D).[24] c) Schematic representation of the magnesium-dependent conformation of the hammerhead ribozyme. (This figure was adapted from reference [26].)

2.2.2. The Adenosine Platform

The crystal structure of the P4–P6 domain of the Th-intron reveals a motif that facilitates the interaction between helices by creating a site for a base-stacking interaction.[17, 34] The adenosine platform (A-platform), which occurs in three separate locations in this RNA, consists of two sequential adenosine groups arranged side-by-side to create a "pseudo base pair" (Figure 5 a). The 3'-adenosine position of the platform can also accommodate a cytidine base, as observed in the solution structure of the theophylline-binding RNA aptamer by NMR spectroscopy,[35] and in an in vitro selection for variants of the tetraloop–tetraloop receptor interaction (see Section 2.3.2).[36] Each adenosine platform has a noncanonical base pair (a pairing interaction that is not Watson–Crick) immediately below it, either a GU wobble pair or a Hoogsteen-type A·U pair, which creates a local conformation in the helix that enhances the stacking between the platform and the noncanonical pair. The other face of the 3'-adenosine of the platform remains free to participate in tertiary stacking interactions, as occurs in all three platforms.

Two of the adenosine platforms in the P4–P6 domain are involved in an intermolecular RNA–RNA contact between two adjacent molecules in the crystal lattice (Figure 5 b, c). The A-platform within the internal loop J6/6a (the internal loop between helices P6 and P6a) is formed by two adjacent adenosines on the 5'-side of the loop, with the Watson–Crick CG base pair on the 5'-side of the loop stacking upon the platform and a wobble GU pair on the 3'-side. This allows the platform to stack within a helix connecting P6 and P6a and causes the three nucleotides on the 3'-side of the internal loop to be extruded from the helix. Two of the extruded nucleotides form Watson–Crick pairs with nucleotides in L5c of an adjacent RNA, which creates a small two base pair helix that stacks upon the A-platform capping L5c. The other platform in the P4–P6 domain is involved in an intramolecular tertiary interaction involving a four nucleotide loop and an internal loop (see Section 2.3.2). Since all three A-platforms observed in P4–P6 promote tertiary interactions it is likely that this motif will be found in many other large RNAs.

Figure 5. a) The structure of the adenosine platform motif. Two adjacent adenosines (green) are placed in a side-by-side arrangement immediately below a GU wobble base pair (red). b) Secondary and c) tertiary structure of an interaction between the J6/6a internal loop (purple) and L5c (blue) of two P4–P6 molecules in the asymmetric unit mediated by two adenosine platforms (green).

2.2.3. 2'-Hydroxyl-Mediated Helical Interactions

Small duplex RNAs face an extreme problem of how to pack into a three-dimensional lattice in crystals. While natural


RNAs have a variety of specialized interhelical packing motifs, the strictly helical nature of duplexes does not present much chemical diversity with which to create these tertiary interactions. The interhelical contacts created in these lattices, though, may provide insights into how helices pack in some natural RNAs. In most of the crystals observed thus far, the duplex oligonucleotides coaxially stack to create "pseudo-infinite" helices that pack into a three-dimensional lattice in two distinct ways.

In the case of the RNA dodecamer 5'-GGCGCUUGCGUC-3' the quasi-continuous helices pack such that the backbone of one helix contacts the shallow minor groove of a perpendicular helix.[37] At each site of contact, extensive direct hydrogen bonding occurs between 2'-hydroxyl groups, the 3'-oxygen atom, and phosphate oxygen atoms of the backbone of one helix and the pyrimidine O2 atom, the exocyclic amine of guanosine, and 2'-hydroxyl groups in the minor groove of the adjacent helix (Figure 6). The majority of the hydrogen bonds are mediated by the 2'-hydroxyl groups: 8 out of 13 in the first contact site and 5 out of 8 in the second contact site between helices. Similar sets of contacts are observed in crystals with the same mode of helical packing.[38, 39]

Figure 6. The perpendicular packing arrangement of pseudo-infinite RNA helices. The backbone of one helix interacts with nucleotide bases in the wide, shallow minor groove of an adjacent helix in the crystal lattice. This figure was adapted from the crystal structure of the oligonucleotide 5'-GGCGCUUGCGUC-3' (Lietzke et al.;[37] PDB accession number 280D).

This set of intimate contacts between helices is in stark contrast to crystal structures in which the pseudo-infinite helices are oriented parallel with respect to one another. In these crystals the backbones of the RNA duplexes make a limited number of interhelical contacts, primarily through water-mediated backbone–backbone contacts and 2'-hydroxyl groups of the phosphate contacts.[40, 41] The dominance of 2'-hydroxyl groups in mediating helical packing in both arrangements is reflected in many of the tertiary interactions found in natural RNAs, such as the ribose zipper motif (see Section 2.3.4).

2.3. Interactions Between Helical and Unpaired Motifs

2.3.1. Base Triples and Triplexes

Early biophysical studies of heteroduplexes of poly(A) (homopolymeric adenosine) and poly(U) (homopolymeric uridine) demonstrated that the duplexes converted into poly(U)-poly(A)-poly(U) triple helices and single-stranded poly(A) at high ionic strength.[42, 43] Strands of poly(A) and poly(U) form standard Watson–Crick base pairs in the triplex, while the other poly(U) strand is placed in the major groove of the RNA duplex to form Hoogsteen-type pairs with the poly(A) strand (Figure 7 a). Similarly, a protonated cytosine can hydrogen bond with the Hoogsteen face of a guanosine involved in a Watson–Crick GC pair to create an isosteric (C·G-C)+ triple (Figure 7 b). Structural characterization of a model RNA triplex containing both types of triple base pairs by NMR spectroscopy revealed that the third strand can be placed into the deep major groove of RNA without significant conformational distortion from ideal A-form helical geometry.[44]

Despite the stability of the extended pyrimidine-purine-pyrimidine triplex, this motif has not been observed in any naturally occurring RNA. Instead, many RNAs employ this motif in a limited fashion as isolated major and minor groove base triples. Structural examples exist for both types of pairing. Major-groove triples are observed within the crystal structure of tRNAPhe, where the D-arm forms two consecutive triple base pairs.[13, 14, 23] Unlike the model triplexes, in which only pyrimidines are used in the Hoogsteen pairing strand, both triple base pairs involve a purine interacting with the Hoogsteen face of a purine involved in a Watson–Crick pair (Figure 7 c). Minor groove triples are also observed in tRNAPhe, where A21, which is coplanar to a U8·A14 reverse-Hoogsteen base pair, interacts through the minor groove by forming a hydrogen bond between its N1 and the 2'-hydroxyl group of U8 (Figure 7 d). Minor groove triple base pairs are found within the Th-intron in the GAAA tetraloop–tetraloop interaction[17] (see Section 2.3.2) and the triple-helical scaffold, a structure critical for proper alignment of the P4–P6 and P3–P7 domains. Triple base pairs also play a critical role in the interactions of many small molecule ligands and peptides with RNA.[45]

The diverse morphology of major- and minor-groove triples found in biologically active RNAs suggests that the nature of triplexes in these RNAs is quite different from that of the model pyrimidine-purine-pyrimidine triple helices. Instead, extended single stranded regions appear to tie helical regions together through the formation of base–triple interactions. The variable loop of tRNAPhe (Figure 2 b, c) forms two base triples with the major groove of the D-arm, as well as pairing interactions with the D-loop and the coaxial stack between the D- and anticodon arms. In the hepatitis delta ribozyme (Figure 8) a stretch of four single-stranded nucleotides (J4/2) form extensive interactions with the two sets of coaxially stacked helices in the region that is the proposed catalytic pocket.[18] Two invariant adenosines in this single-stranded region interact with a helix through a minor groove triple with a Watson–Crick GC pair and a ribose zipper motif (see Section 2.3.4). Cytosine 75 (shown in red in Figure 8), which is critical for the catalytic activity of this ribozyme, forms a hydrogen bond between its N4 atom and a phosphate group within the other set of coaxially stacked helices. There is extensive biochemical evidence that a minor-groove triple helix formed by J8/7 within the catalytic core of the Th-intron


is critical for bridging the P1 substrate helix and the P3 and P4 helices that form the heart of the catalytic core (see Figure 17).[46] Thus, it appears from these examples that single-stranded regions often serve to bind helical regions together through the formation of triple interactions between bases in many large RNAs.

Figure 7. Examples of triple base pairs found in RNA. a) Hoogsteen-type U·A-U triple base pair, with the second uracil hydrogen bonding to the major-groove face of a standard Watson–Crick AU pair. b) A protonated Hoogsteen-type C+·G-C pair. c) A purine triple base pair found in the major groove of tRNAPhe. d) A triple base pair found in tRNAPhe in which an adenosine hydrogen bonds with a 2'-hydroxyl group on the minor groove face of a reverse-Hoogsteen A·U base pair.

2.3.2. The Tetraloop Motif

Among the most prevalent motifs observed in natural RNAs are several four nucleotide loop sequences (tetraloops) found in the 16S- and 23S-ribosomal RNAs (rRNAs), groups I and II self-splicing introns, ribonuclease P, and bacteriophage T4 messenger RNA.[47] For instance, 16S-rRNA tetraloops account for about 55 % of all hairpin loops in eubacteria, while five nucleotide loops, the next most prevalent loop size, account for 13 % of the loops.[48] The majority of 16S-rRNA tetraloops fall into two sequence categories: the "UNCG" motif and the "GNRA" motif. The UNCG-tetraloop motif (N indicates that the second nucleotide position in this sequence motif may be any nucleotide) has been shown to confer unusually high thermodynamic stability to an RNA hairpin,[47, 49] but has not yet been implicated in the formation of tertiary interactions.

The GNRA sequence motif, in which any nucleotide is accommodated in the second position and the third position is a purine base (R = A or G), is the most prevalent tetraloop motif in naturally occurring RNAs.[48] Structural analyses of several of these tetraloops (GAGA, GCAA, and GAAA) by NMR spectroscopy reveal a network of hydrogen-bonding and base-stacking interactions that create a stable structure (Figure 9 a).[50, 51] In all of the tetraloops the position 1 guanosine (G1) and the adenosine in position 4 (A4) form a sheared base pair in which G1–N3 hydrogen bonds with A4–N6, and G1–N2 forms a hydrogen bond with A4–N7. In the GCAA and GAAA tetraloops the G1–N3 to A4–N6 distance is too long to be a direct hydrogen bond, and is thus likely to be water mediated.[51] In the GAAA tetraloop this pair is surrounded by hydrogen bonds between the phosphate of A4 and the Watson–Crick face of G1, and the 2'-hydroxyl group of G1 interacting with the Hoogsteen face of G/A3 and A4–N6. In the GAGA and GAAA tetraloops the nucleotide in position 2 is stacked upon the nucleotide in position 3, whereas in the GCAA loop the cytosine in position 2 is disordered. This variability in the hydrogen-bonding and stacking interactions between different members of the GNRA family does not significantly alter their thermodynamic stability, but may be critically important for the ability of this motif to form tertiary interactions with a diverse array of RNA motifs.

Covariation analysis of the groups I and II introns and RNase P has revealed several RNA motifs that form tertiary interactions with the GNRA-type tetraloop. The structure of an internal loop motif, termed a tetraloop receptor, which interacts with the GAAA tetraloop with high affinity and specificity,[52–54] was elucidated as part of the Th-intron P4–P6


domain (Figure 9 b, c).[17] The tetraloop receptor adopts a conformation consisting of two Watson–Crick GC base pairs, a reverse-Hoogsteen AU base pair, an adenosine platform, and a wobble GU base pair. The GAAA tetraloop bound to the receptor maintains a conformation that is almost identical with the unbound form, with the three adenosines stacking upon the adenosine platform in the receptor. This allows for a network of hydrogen bonds between the adenosines of the tetraloop and the minor groove of the receptor to stabilize this tertiary interaction.

A second mode of tetraloop–RNA interaction involves a GNRA tetraloop binding to the minor groove face of two tandem base pairs. This interaction involves a GNAA tetraloop contacting the minor groove of two tandem Watson–Crick GC base pairs or a GNGA tetraloop interacting with a [5'CU...AG3'] dinucleotide. This type of interaction was observed in the crystal structure of the hammerhead ribozyme, in which the GAAA tetraloop of one ribozyme is observed to dock against two tandem GC pairs in an adjacent molecule.[15, 55] Biochemical studies of the interactions between GNRA tetraloops and RNA indicate that both the loop and its recognition site can tolerate a surprisingly high degree of variability without sacrificing binding affinity or specificity.[36, 56] This suggests that there are as yet unidentified motifs that form intramolecular contacts with the tetraloop motif within biological RNAs.
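
The tetraloop families named in this section are defined purely by sequence pattern, so they can be sketched as a small pattern match. The following Python fragment is only an illustration under stated assumptions: the function and variable names are hypothetical, only the GNRA and UNCG families are covered, and a sequence match says nothing about whether a given loop actually adopts the corresponding fold.

```python
import re

# IUPAC-style codes used in the motif names: N = any nucleotide, R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]", "R": "[AG]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif name such as 'GNRA' into a compiled regular expression."""
    return re.compile("".join(IUPAC[code] for code in motif))


# Hypothetical lookup table covering only the two families discussed in the text.
TETRALOOP_FAMILIES = {name: motif_to_regex(name) for name in ("GNRA", "UNCG")}


def classify_tetraloop(loop: str) -> str:
    """Return 'GNRA', 'UNCG', or 'other' for a four-nucleotide loop sequence."""
    loop = loop.upper().replace("T", "U")  # accept DNA-style input as well
    if len(loop) != 4:
        raise ValueError("a tetraloop has exactly four nucleotides")
    for name, pattern in TETRALOOP_FAMILIES.items():
        if pattern.fullmatch(loop):
            return name
    return "other"
```

For example, `classify_tetraloop("GAAA")`, `classify_tetraloop("GAGA")` and `classify_tetraloop("GCAA")` all fall in the GNRA family, whereas `classify_tetraloop("UUCG")` is assigned to UNCG.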
The importance of the minor groove in mediating tertiary interactions, such as in tetraloop docking, may be a fundamental reason why RNA, but not DNA, folds into complex three-dimensional structures. The minor groove in A-form helical RNA is wide and shallow (10–11 Å), which allows for easy access to this face of the helix, unlike B-form helical DNA, which has a much narrower minor groove (5.8 Å). This allows for functional groups, particularly the 2'-hydroxyl group, of the ribose sugar to participate in an interaction.

Although the major groove of RNA can be sufficiently widened at internal loops containing purine–purine pairs or at the end of helices to allow for recognition, the importance of the major groove appears to be significantly less than that of the minor groove.

Figure 8. The hepatitis delta virus ribozyme. a) A representation of the secondary structure of the RNA construct used for an X-ray structure determination which reflects its three dimensional fold. This fold divides the ribozyme into two coaxially stacked helices, (blue and magenta), along with the single-stranded region J4/2 (orange) that contains a cytosine (red) essential for catalytic activity. b) The three-dimensional structure of the hepatitis delta ribozyme showing the overall double-pseudoknotted topology of the backbone (PDB accession number 1drz).[18] The gray region at the bottom of the ribozyme was part of an engineered U1A protein binding site that was utilized to facilitate the crystallization of the RNA (the protein is not shown). c) A 90° rotation of the ribozyme with the P2/P3 coaxial stack in front. The highly conserved nucleotides C75, A77, and A78 of J4/2 are thrust deep within the catalytic cleft of the ribozyme between the two sets of coaxially stacked helices.

Figure 9. a) Tertiary structure of the GAAA tetraloop along with its hydrogen-bonding network (shown as orange dashed lines). Note that the three adenosines in the tetraloop stack upon one another to further stabilize this motif. This figure was adapted from the energy-minimized solution structure of P5b (Kieft and Tinoco;[139] PDB accession number 1ajf). b) Representation of the secondary structure in the interaction between the GAAA tetraloop (purple/blue) and the tetraloop receptor (gold, with the A-platform shown in green). Long range hydrogen-bonding contacts between the tetraloop and its receptor are shown in red. c) Representation of the structure of the interaction between the GAAA tetraloop and its receptor. The conformation of the receptor-bound tetraloop is nearly identical to that in the free form, with the three stacked adenosines forming a contiguous stack with the adenosine platform within the receptor.

2.3.3. The Metal-Core Motif

Within the P4–P6 domain of several subclasses of group I introns there is a bulge motif in helix P5 whose nucleotide composition and distance from helix P4 are conserved. This bulge, which contains two invariant adenosine residues (A184 and A186 of the Th-intron) and another that is highly conserved (A183), is critical for the folding of the intron. The ribose–phosphate backbone of this bulge forms a corkscrew turn in which the phosphates are placed towards
the interior and the nucleotide bases are flipped outward (Figure 10).[17] Potentially unfavorable electrostatic interactions created by the close packing of the phosphate backbone are relieved by two bound magnesium ions, with the RNA providing many of the inner sphere coordinating ligands for each ion.[57] The nucleotide bases splayed out from the bulge form tertiary contacts to two regions of the RNA. A183 and A184 interact with functional groups within the minor groove of the P4 helix by using the ribose zipper motif (see Section 2.3.4). A186 extensively hydrogen bonds with the minor groove face of the C137–G181 base pair of P5a and G164 in the sheared A139–G164 pair found at the top of P5b. Stacking interactions between A186 and the A139–G164 pair facilitate the P5a–5b coaxial stack at this junction. The fourth adenosine of the bulge, A187, forms a noncanonical base pair with U135. Thus, the adenosine-rich bulge presents nucleotide bases in a conformation that allows helix P4 to be tightly docked against the P5a–5b helical coaxial stack, which, along with the GAAA tetraloop–tetraloop receptor, firmly locks the two sets of helices comprising the P4–P6 domain together to form the overall fold of this RNA.

Figure 10. The adenosine-rich bulge of the Tetrahymena group I intron. Highly conserved adenosine residues (numbered) within the A-rich bulge interact with the P4 helix and the P5abc subdomain. The corkscrew structure of the A-rich bulge is stabilized by two specifically bound magnesium ions (shown as spheres).

The crystal structure of the trans-activation response region (TAR) of the human immunodeficiency virus (HIV) displays a similar motif.[58] In this RNA a three nucleotide bulge with the sequence UCU binds three calcium ions. These ions coordinate phosphate oxygen atoms in the backbone of the RNA in and around the bulged nucleotides. As a consequence the three pyrimidine bases face away from the RNA helix, which allows the stems flanking the bulge to coaxially stack upon one another. It has been demonstrated by using transient electric birefringence measurements that Mg²⁺ also binds this pyrimidine-rich bulge, which causes the two helices to coaxially stack. If the bulge sequence is changed to three adenosines then magnesium ions no longer effect this change, which indicates that the nucleotide sequence affects the ability of the backbone to specifically bind metal ions.[59, 60] This motif has also been observed in the recognition of single-stranded DNA of the canine parvovirus by the viral capsid protein through chelation of a divalent metal ion by the DNA phosphate backbone, which organizes the nucleotide bases for recognition by the protein.[61] Thus, this motif appears to be a general means by which nucleotide bases can be presented for recognition by nucleic acids and proteins.

2.3.4. The Ribose Zipper

Within the tertiary interactions mediated by the adenosine-rich bulge and the GAAA tetraloop–tetraloop receptor interaction, the ribose–phosphate backbone is brought into close contact with itself as two helices are docked against one another. At these sites of close contact the ribose sugars of antiparallel strands become interdigitated. Bifurcated hydrogen bonds between the 2'-hydroxyl group of a ribose from one helix and the 2'-hydroxyl group and the N3 atom of a purine or the O2 atom of a pyrimidine in the opposite helix create a “zipper” of ribose sugars in the minor grooves of two helices (Figure 11). Similar interactions have been observed in the intermolecular interaction of a GAAA tetraloop and the minor groove in the hammerhead crystal structure[55] and in the crystal structure of the hepatitis delta virus ribozyme.[18] This interaction might be quite pervasive within many biological RNAs because it can potentially pack RNA strands and helices together with few sequence-specific requirements. However, the lack of sequence specificity of the ribose zipper makes it extremely difficult to predict by covariation analysis or many biochemical probing techniques.

Figure 11. The hydrogen-bond network between 2'-hydroxyl groups and nucleotide bases within the ribose zipper motif found between two adenosines in the A-rich bulge and the minor-groove face of the P4 helix of the Tetrahymena group I intron P4–P6 domain.

2.4. Tertiary Interactions Between Unpaired Regions

2.4.1. Loop–Loop Interactions

Hairpin loops present rich potential for the formation of tertiary contacts through pairing interactions between their nucleotide bases to create new helices. This type of interaction is represented in a limited form in the structure of tRNAPhe, in which nucleotides in the D- and T-loops form two base pairs as part of the extensive tertiary contacts that form the elbow of the molecule (see Figure 2 b, c). In the Th-intron helices P13 and P14 are formed through the formation of five to seven contiguous Watson–Crick base pairs between complementary hairpin loops, which tie together distinct structural domains (see Figure 3 a).[62]


The structural nature of complementary loop–loop interactions has been investigated in detail with NMR spectroscopy, using the “kiss” complex formed between two RNAs involved in the regulation of Col E1 plasmid replication[63, 64] and the TAR loop sequence of HIV-2[65, 66] as model systems. Each complex is a single composite coaxially stacked helix comprising the two original hairpin stems and a new helix formed between them created by the Watson–Crick base pairing of the nucleotides in the complementary loops (Figure 12 a, b). Notably, all of the nucleotides in each loop are stacked on the 3'-side of the central helix and are involved in pairing interactions. Stable formation of this stacked structure, like the coaxial stacks at junction motifs, requires magnesium ions.[67] The Col E1 kiss complex is further stabilized by a purine–purine cross-strand stack,[64] an RNA motif which has also been found in 5S rRNA,[68] the sarcin–ricin loop,[69] the hammerhead ribozyme,[15, 16] and the P4–P6 domain.[17] This motif involves the six-membered ring of a purine stacking upon that of a purine in the opposite strand of the duplex, as opposed to stacking upon its same-strand neighbor, as is found in a regular A-form RNA duplex.

Figure 12. a) Secondary and b) tertiary structure of the kissing complex between stem loops found in the genomic RNA of the HIV-2 trans-activation region (PDB accession number 1kis).[66]

2.4.2. The Pseudoknot

A pseudoknot is defined as a motif in which nucleotides of a hairpin loop base pair with a complementary single-stranded sequence. The classic pseudoknot, whose structure has been well characterized by NMR spectroscopy, consists of a hairpin loop pairing with a complementary sequence next to the hairpin stem to form a contiguous coaxially stacked helix (Figure 13 a).[70] Similar to junctions and kissing loops, the coaxial stacking observed in the pseudoknot requires either Mg²⁺ ions or a high concentration of Na⁺ ions for complete stabilization.[71] Additionally, many pseudoknots have bends at the coaxial stack caused by unpaired nucleotides intercalating between the two helices,[72] which is an essential feature for the in vivo function of one retroviral mRNA pseudoknot.[73]

The loops (L1 and L2) that span the two helices of the pseudoknot are inequivalent in that L1 crosses the major groove of the 3'-stem 2 (S2) and L2 crosses the minor groove of the 5'-stem (S1). In a recent structure of the pseudoknot from the tRNA-like genomic TYMV RNA, nucleotides from both loops were observed to interact with the helices (Figure 13 b, c).[74] Loop 2, which crosses the minor groove, displays hydrogen-bonding interactions between some of its nucleotides and functional groups within the minor groove of stem 1. An adenosine within this loop, which is highly conserved among plant viral RNAs, is proposed to make hydrogen-bonding contacts between its N1 atom and the exocyclic amino group of a stem 2 guanosine and between the adenosine exocyclic amino group and the N3 atom of an adjacent guanosine in the helix. The adjacent cytosine in loop 2 also makes a hydrogen-bonding contact between its exocyclic amine and a 2'-hydroxyl group in the minor groove of stem 2. This suggests that pseudoknots may employ triple-helical buttressing by the loops to further stabilize the structure. A very recent high-resolution crystal structure of a pseudoknot from the beet western yellow virus mRNA demonstrates extensive tertiary interactions in both the major and the minor grooves.[75]

Figure 13. The pseudoknot motif. a) The classical pseudoknot is formed through a base pairing interaction between nucleotides in the loop of a stem loop and an adjacent single-stranded region. b) Secondary structure of the pseudoknot found in the genomic RNA of the turnip yellow mosaic virus (TYMV). c) The three-dimensional structure of the TYMV pseudoknot (PDB accession number 1a60).[74] Nucleotides in both loops are situated in the major and minor grooves of the adjacent helices, which allows them to form triple interactions with the base pairs.

A remarkable example of the pseudoknot motif being utilized to direct the global architecture of an RNA is demonstrated in the structure of the hepatitis delta ribozyme.[18] This ribozyme is exceptionally stable; both the
genomic and antigenomic variants exhibit enhanced stability and activity in chemicals that typically denature nucleic acid structures (5–8 M urea or 10–18 M formamide) as well as elevated temperatures (65–70 °C).[76, 77] No divalent cations or unusual base pairs were found in the crystal structure; instead the structure is stabilized entirely through Watson–Crick base pairing and the highly convoluted double pseudoknot topology of the backbone (see Figure 8).[18] Thus, an RNA fold can be created by packing helices using either specific tertiary docking motifs, as in the P4–P6 domain, or by backbone topology, as in the hepatitis delta virus ribozyme.

3. The Folding of RNA into Higher Order Structures

Analogous to the “protein-folding problem”, there is the question of how RNA establishes a three-dimensional fold from its primary sequence. In studies of protein folding the unfolded state is characterized as a random coil, such that the folding process encompasses the acquisition of secondary structure (α-helices, β-sheets, etc.) and tertiary structure. With RNA, however, the unfolded state is regarded as having a majority of the native secondary structure formed, with the folding process being concerned with the sequential formation of tertiary interactions to establish the native structure. Thus, the tertiary interactions described in the previous section are principal players in the folding landscape of large RNAs, as discussed in this section.

3.1. The Domain Architecture of RNA

Most large biological RNAs are organized into domains that are generally reflected in their secondary structure. The term domain does not have a strict definition with respect to RNA, but generally refers to a region of secondary structure that folds as a single unit. For instance, all group I introns have two fundamental domains necessary for formation of their catalytic core: the P4–P6 domain and the P3–P7 domain (see Figure 3 a). In the Th-intron these two domains are supplemented by the P2–P2.1 and P9 peripheral extensions, as well as the P5abc subdomain. Extensive evidence indicates that these domains and extensions on the active ribozyme interact with each other almost exclusively through tertiary interactions rather than base pairing.

The transition between the unfolded and folded states of the Th-intron requires the specific binding of at least three magnesium ions to assemble into a fully functional catalytic ribozyme; in the absence of magnesium or other divalent ions only the secondary structure of the ribozyme is formed. This magnesium-facilitated folding can be monitored with Fe(II)–EDTA (EDTA = ethylenediamine tetraacetate), a reagent that generates free hydroxyl radicals that are capable of cleaving the backbone of single- or double-stranded RNA at any point where it is freely accessible to bulk solvent.[78, 79] Regions of the RNA become solvent inaccessible upon higher order folding, and hence are protected from free radical mediated strand scission; regions of protection are subsequently revealed by sequencing ³²P-end-labeled RNA.[78] Most of the intron is accessible to solvent below 0.75 mM Mg²⁺, which indicates that no significant tertiary structure has formed. Above this concentration there are two highly cooperative transitions, the first corresponding to the formation of the P5abc subdomain and the second to the folding of the catalytic core. Subsequent studies after removing the stabilizing peripheral extension P9.1–9.2 demonstrated that the entire P4–P6 domain becomes stably folded at lower magnesium concentrations than the catalytic core requires for stable folding.[80]

The ability of the P4–P6 domain to fold at lower divalent ion concentrations than the rest of the intron indicates that it is capable of forming stable tertiary structure independently of the rest of the intron. Probing an isolated 160-nucleotide fragment corresponding to the P4–P6 domain with Fe(II)–EDTA and dimethyl sulfate, a chemical that modifies the imino nitrogen atom of adenine and cytosine bases, revealed that in the presence of magnesium the solvent-inaccessible region of this RNA is a large subset of that observed in P4–P6 in the intact intron.[81] This magnesium-dependent folding of the P4–P6 domain is critically dependent upon the GAAA tetraloop of P5b; mutations in this region significantly increase the amount of magnesium that is required to produce the pattern of Fe(II)–EDTA protection observed in the wild-type domain,[53] which is consistent with the tertiary interactions observed in the crystal structure of this domain.[17] Similarly, mutations in the tetraloop receptor and the J5/5a internal loop, the flexible hinge between the two sets of coaxially stacked helices in P4–P6, are detrimental to the formation of the solvent-inaccessible core.

While these mutations destroy most of the tertiary structure of the P4–P6 domain, the adenosine-rich bulge and the three-way junction J5abc remain solvent inaccessible, which suggests that the P5abc subdomain folds independently of the rest of P4–P6. The isolated P5abc subdomain displays the same pattern of Fe(II)–EDTA protections as in the context of the intact intron,[81] as well as a similar pattern of magnesium ion binding as the P4–P6 domain.[57] Substitution of the pro-Rp oxygen atoms in the phosphate backbone of an RNA with sulfur has been shown to interfere with inner-sphere coordination by magnesium ions.[82, 83] Single atom substitution of any of the four pro-Rp oxygens in the A-rich bulge and L5c that directly coordinate magnesium ions in the crystal structure of the P4–P6 domain significantly destabilizes the folding of the isolated P5abc domain.[57]

Other domains of the Th-intron are not capable of achieving an independently stable fold outside the context of the intact molecule. Structural probing of an RNA construct corresponding to the P3–P7 domain and the P9 peripheral extension demonstrated that while its secondary structure is similar to that observed in the intact group I intron, tertiary interactions within this domain are absent.[84] Only upon association with the P4–P6 domain is the P3–P7 region capable of forming a stable set of intradomain tertiary interactions, which indicates that the P4–P6 domain provides a structural scaffold for higher order folding of P3–P7.

Association of the secondary structural domains of the Th-intron is primarily mediated by tertiary interactions. Deletion of P5abc from the intron (ΔP5abc-intron) significantly impairs its activity under standard splicing conditions (5 mM MgCl₂, pH 7.5), but activity can be rescued at higher ionic strengths (15 mM MgCl₂, 2 mM spermine, pH 7.5).[85] Wild-type activity can be restored by supplying the P5abc subdomain to the ΔP5abc intron in a trans arrangement (a trans association involves the binding of two separate RNA molecules, in contrast to a cis association in which two interacting domains are on the same RNA chain). Formation of this bimolecular complex is mediated entirely through three tertiary interactions: the tetraloop–tetraloop receptor, the A-rich bulge and the minor groove of P4, and the kissing-loop interaction between L5c and L2. Formation of this intermolecular complex is magnesium dependent and extraordinarily tight; at 10 mM MgCl₂ the apparent dissociation constant (Kd) is 100 pM.[86] The Th-intron can also be divided into three separate pieces consisting of the P4–P6 domain, P1–P3 substrate, and the P3–P7/P9 domain, which are all capable of associating solely through tertiary interactions to form a functional ribozyme.[87]

It appears that a general feature of most large biological RNAs is that they are constructed from independently folding secondary structural domains that can associate in the trans form, primarily through tertiary-type interactions rather than base pairing, to form functionally active molecules. The catalytic component of eubacterial RNase P, a ribonucleoprotein enzyme responsible for post-transcriptional processing of small cellular RNAs, resides entirely within the RNA.[2, 88] In the absence of the protein subunit RNase P RNA (P RNA) binds the precursor tRNA substrate by specifically recognizing the coaxially stacked T-stem/acceptor stem of tRNA through tertiary interactions involving a number of 2'-hydroxyl and phosphate groups in both the P RNA[89, 90] and tRNA substrate.[91, 92] P RNA consists of two independently folding domains that show similar patterns of protection when separated from each other as they do in the context of the intact ribozyme, as evident from probing experiments with Fe(II)–EDTA in the presence of magnesium ions.[93, 94] Although these domains are individually unable to catalyze tRNA processing, they associate through tertiary interactions to form a catalytically competent complex. Thus, interdomain interactions and substrate recognition occur through tertiary interactions in the same manner as in the Th-intron. Similarly, some of the six phylogenetically conserved domains of group II introns are capable of efficiently binding through a trans association to form a functional RNA (for an excellent review see reference [95]).

3.2. The Hierarchical Folding Model of RNA

Pioneering studies on tRNA in the 1970s elucidated some of the fundamental features of RNA folding pathways. By using temperature-jump relaxation measurements and NMR spectroscopy it was shown that under conditions of moderate ionic strength (174 mM Na⁺, no Mg²⁺) E. coli tRNAfMet undergoes five distinct transitions during thermal unfolding.[96] The lowest temperature transitions involve the disruption of tertiary interactions between the D- and T-loops followed by the weak secondary structural elements in the D-stem.[97] Higher temperature transitions involve the melting of the secondary structural elements corresponding to the T-, anticodon, and acceptor stems (Figure 14). Since thermal unfolding is a reversible reaction in tRNA it follows that its structure forms hierarchically in the folding pathway, with almost all of the secondary structure forming prior to the tertiary structure. Similar behavior is observed in the thermal unfolding of two other tRNAs, which indicates that this is a general mechanism for tRNA folding.[98, 99]

Under conditions containing 3 mM Mg²⁺ E. coli tRNAfMet displays a single, cooperative unfolding transition at high temperature such that tertiary and secondary structure are disrupted simultaneously.[100] This transition reflects the binding of a single magnesium ion to tRNA with high affinity (Kd = 33 µM) along with a number of weakly bound ions[101] associated with the formation of tertiary structure.[100] This uptake of magnesium ions by RNA during the folding process primarily affects the stabilization of tertiary structure rather than the formation of secondary structure. Since tertiary structure formation involves the close juxtaposition of the highly negatively charged backbone in several regions in tRNA it is not surprising that the formation of tertiary structure creates regions for high affinity multivalent cation binding sites. The structure of tRNAPhe reveals several specifically bound magnesium ions at sites of tertiary interaction, although no specific cation can be attributed to the stabilization observed in the thermal melting profile.

Figure 14. The unfolding pathway of transfer RNA as determined from thermal denaturation studies with approximate time constants for each step of the
folding reaction. (Adapted from references [97, 140].)
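
The dissociation constants quoted in this section can be put on a free-energy scale with the standard relation ΔG° = RT ln Kd, and the occupancy of a single independent binding site follows from a simple binding isotherm. The Python sketch below is only illustrative: the function names are hypothetical, the temperature of 37 °C is an assumption (the text does not state one), and the occupancy formula assumes a single non-cooperative site with free ligand in large excess.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_kelvin: float = 310.15) -> float:
    """Standard free energy of association, dG = RT ln(Kd), in kcal/mol.

    The default temperature (37 degrees C) is an assumption, not a value
    taken from the text.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_kelvin * math.log(kd_molar)


def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Occupancy of one independent site: [L] / (Kd + [L]), with [L] the free ligand."""
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return ligand_molar / (kd_molar + ligand_molar)


# Apparent Kd of 100 pM quoted for the bimolecular P5abc complex at 10 mM MgCl2.
print(round(binding_free_energy(100e-12), 2))  # roughly -14.2 kcal/mol at the assumed 37 degrees C
```

With the ligand concentration equal to Kd, `fraction_bound` returns 0.5, which is the usual operational definition of the dissociation constant.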

Thus, these studies illustrate two fundamental features of RNA folding. First, tertiary structure formation is highly dependent upon the prior formation of secondary structure. Sequential formation of secondary and tertiary structures in RNA is the basis of the “hierarchical model” of RNA folding. Secondly, the binding of multivalent ions to RNA during the folding process generally serves to stabilize the formation of tertiary structure rather than secondary structure. This property of RNA folding is particularly evident in light of the structures of tertiary interactions, in which metal ions are often found intimately involved.

3.3. Kinetic Folding of the Tetrahymena Group I Intron

Recent work on RNA folding has primarily utilized the Th-intron as a model system. Similar to tRNA, this group I intron exhibits multiple transitions in its thermal denaturation, as measured by UV absorbance.[102] Also, chemical probing of the RNA at varying temperatures indicates that the lower temperature transition involves the disruption of tertiary structural elements rather than the opening of base pairs in the secondary structure.[102]

In order to observe folding on the millisecond timescale a variation of footprinting was developed in which a beam of high energy photons from synchrotron radiation is used to produce a burst of hydroxyl radicals in an aqueous sample that is capable of probing an RNA in a similar fashion as the Fe(II)–EDTA reaction, with exposure times as short as 10 milliseconds.[103, 104] By equipping the synchrotron beamline with a stopped-flow apparatus, the magnesium-dependent folding of any RNA can be monitored starting at 20 ms after the induction of folding. This is fast enough to potentially monitor the tertiary folding of a tRNA, whose tertiary interactions form in approximately 100 ms.[97]

Figure 15. A model of the Mg²⁺-dependent folding of the Tetrahymena group I intron, based upon time-dependent synchrotron hydroxyl radical footprinting. (Adapted from reference [104].)

These experiments revealed that the P5abc domain folds most rapidly within the P4–P6 domain, where nucleotides in the magnesium core of the adenosine-rich bulge and three-way junction are protected from solvent, with a rate of 2 s⁻¹ (Figure 15). The remainder of the tertiary interactions in the P4–P6 domain form in a concerted fashion at slightly slower rates. Nucleotides in the peripheral extensions P2–2.1 and P9, which had been previously shown to assist in the formation of the catalytic core,[105] are protected from cleavage at a rate of about 0.3 s⁻¹. These two peripheral extensions are proposed to wrap around the ribozyme core, and be stabilized by the formation of the P13 and P14 tertiary interactions. At this point in the folding pathway, the exterior of the ribozyme, which consists of the P4–P6 domain and the peripheral extensions, is completely folded, but the interior of the ribozyme, which contains the catalytic core, remains disordered. Formation of the active structure with a solvent inaccessible P3–P7 domain is very slow and requires minutes (kobs = 0.02 s⁻¹) to fold. Thus, the two structural domains P4–P6 and P3–P7 are also kinetic folding domains. This model of the kinetic folding pathway of the Th-intron, which was originally proposed by Zarrinkar and Williamson from studies performed with a time-dependent oligonucleotide hybridization assay,[106, 107] has been further supported by other time-dependent techniques, including chemical modification,[108] UV cross-linking,[109] and Fe(II)–EDTA footprinting.[103]

By using the kinetic oligonucleotide hybridization assay to monitor folding it has been shown that the limiting step of the folding reaction is a magnesium-independent step involving the formation of tertiary structure in the triple-helical scaffold that is responsible for properly orienting the two structural domains.[107] In order to understand the nature of this rate-limiting step a selection scheme was developed to find mutations in the intron that accelerate the RNA's passage through this step in the folding pathway.[110] This selection yielded five variants with a faster-folding phenotype and wild-type catalytic activity.[110] Four of the variants contained single-point mutations in the P5abc subdomain that conferred the fast-folding phenotype, which suggests that they affect the rate-limiting step of folding through a common mechanism, and importantly, each mutation does not affect the catalysis or the stability of the intron. The characterized mutations cluster around the adenosine-rich bulge of P5abc, likely disrupting the stable formation of this structure, just as single phosphorothioate substitutions in this same region disrupt the stable folding of the P4–P6 domain.[57] Even though P5abc does not directly contact the P3–P7 domain in models of the group I intron,[62] formation of this domain has a significant effect upon the rate of folding of P3–P7. Treiber et al.[110] conclude that since the entire P4–P6 domain folds prior to the rate-limiting step and destabilization of the metal-core motif in
P5abc accelerates the rate of formation of the P3–P7 domain, a tertiary interaction that is also present in the native state causes a kinetic trap in the folding pathway.

Further evidence that native tertiary interactions within the intron create kinetic barriers in the folding pathway was obtained by measuring the rate of folding of the wild-type and the fast-folding mutants in the presence of urea.[110, 111] Chemical denaturants act to increase the folding rate of proteins[112] and RNA[113, 114] by disrupting the formation of interactions that disfavor rapid progression towards the fully folded, native state. The folding of the wild-type intron is accelerated by increasing the concentration of urea up to 3 M, but the folding rates of the selected faster-folding mutants are affected to a significantly lesser extent.[110, 111] A detailed study of the effects of urea and mutations on the temperature dependence of the folding rate revealed that they both act to reduce the activation enthalpy for P3–P7 folding, which further supports the theory that a primary kinetic trap in the folding of the wild-type intron involves native tertiary interactions.[111]

Rather than folding being a simple process involving the sequential formation of intermediates as implied by the hierarchical folding model, it is more accurately described as an ensemble of molecules following parallel folding pathways, as represented in a "folding energy landscape".[115, 116] This is indicated by the observation that in a given population of group I RNAs, there is a small fraction capable of rapidly folding to reach the native state, while the rest of the molecules fold slowly, kinetically trapped in various intermediates.[113] Moderate amounts of chemical denaturant alter the distribution of molecules following parallel folding pathways by increasing the fraction of the population capable of evading kinetic traps to rapidly fold.[113] Thus, the kinetic intermediates described in the hierarchical folding model represent the most populated pathway in the folding energy landscape under standard folding conditions. Studies of the folding of the Th-intron have revealed that both native structures (such as the metal core of P4–P6)[110, 111] and non-native interactions (such as incorrect base-pair formation)[113, 117] play a significant role in shaping this landscape. Changes in temperature, solvent conditions, and nucleotide sequence have a profound effect upon the energy landscape, and have revealed new kinetic barriers and traps and allowed alternative pathways of folding to become significantly populated.[111]

4. Developing Working Models of Large RNAs

4.1. Modeling the Tertiary Folding of RNA

Despite significant advances in the ability to prepare and crystallize RNA,[118–121] it remains difficult to obtain crystals that yield high-resolution structural information. In the absence of these structures, modeling the three-dimensional architecture of RNA has been a useful alternative. By far the most successful RNA modeling effort to date has been that for the Th-intron developed by Michel and Westhof.[62, 122] Comparative sequence analysis of 35 sequences of subgroup IC1 introns (to which the Tetrahymena intron belongs) provided crucial information for generating a consensus secondary structure and some types of tertiary interactions, and revealed invariant nucleotides likely to be essential for the formation of the correct fold or for biochemical function. These data were augmented by constraints provided by chemical and enzymatic probing of the intron RNA, and site-directed mutagenesis experiments verified the existence of base-specific tertiary interactions such as base-triples[123] and loop–loop interactions.[62] From these data a three-dimensional model of the Tetrahymena intron was constructed by using an approach based upon the hierarchical nature of RNA folding. Elementary substructures and motifs were built based upon known structures (such as double-stranded helices, hairpin loops, and pseudoknots), then hooked together by using interactive computer graphics, and finally the model was geometrically and stereochemically refined with a least-squares refinement routine.[124] The resulting model (Figure 16) has provided a framework for the interpretation of chemical activity and folding, has withstood extensive biochemical testing, and agrees with the overall fold of a group I intron catalytic core determined at low resolution (5–6 Å) by X-ray crystallography.[125]

Figure 16. a) Model of the Tetrahymena group I intron developed by Lehnert et al.[62] The ribbon traces the backbone of the molecule, with the P4–P6 domain shown in black, the P3–P7 domain in light gray, and the P2 and P9 peripheral extensions in gray. b) A 90° rotation of the model, highlighting the packing of the P3–P7 domain between the P4–P6 and the peripheral extensions.

Producing accurate RNA models requires prior identification of as many tertiary interactions as possible, since these provide critical constraints on the global fold of the RNA that cannot be obtained from the secondary structure. For example, the model of the hepatitis delta virus ribozyme[126] has a global fold different from that seen in the 2.3 Å resolution crystal structure.[18] This difference stems mainly from the failure to predict the helix P1.1, which is essential for establishing the double-pseudoknot fold of the ribozyme. This interaction was hard to predict from the secondary structure of the ribozyme, as there are only two sequences (the genomic and antigenomic variants) with which to perform comparative analysis. Therefore, the key to better models of large RNAs lies in developing biochemical techniques for identifying tertiary contacts.
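The reliance of comparative analysis on the number of available sequences can be illustrated with a small sketch. The code below is not the procedure used by Michel and Westhof; it is a generic mutual-information covariation score, and the function names, the 1.5-bit threshold, and the toy alignments are all assumptions made for illustration. It flags pairs of alignment columns that vary together while remaining complementary, and shows that two sequences alone can never exceed 1 bit of mutual information, which is one way to see why an interaction such as P1.1 is hard to infer from only the genomic and antigenomic variants.

```python
from collections import Counter
from math import log2

# Watson-Crick and GU wobble pairs accepted as "complementary" (assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return the residues found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    col_i, col_j = column(alignment, i), column(alignment, j)
    p_i, p_j = Counter(col_i), Counter(col_j)
    p_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in p_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((p_i[a] / n) * (p_j[b] / n)))
    return mi


def covarying_pairs(alignment, min_mi=1.5):
    """List column pairs that stay complementary and covary above min_mi bits."""
    length = len(alignment[0])
    hits = []
    for i in range(length):
        for j in range(i + 1, length):
            observed = zip(column(alignment, i), column(alignment, j))
            if all(pair in PAIRS for pair in observed):
                if mutual_information(alignment, i, j) >= min_mi:
                    hits.append((i, j))
    return hits


# Hypothetical toy alignments: columns 0 and 5 change in a compensatory way.
toy_alignment = ["GAAAAC", "CAAAAG", "AAAAAU", "UAAAAA"]
two_sequences = ["GAAAAC", "CAAAAG"]

print(covarying_pairs(toy_alignment))  # [(0, 5)]
print(covarying_pairs(two_sequences))  # []
```

With four diverse sequences the compensatory pair reaches 2 bits and is detected; with two sequences the same pair scores only 1 bit, indistinguishable from any two columns that happen to differ between the variants.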

4.2. High-Resolution Probing of the Tertiary Structure of RNA

A clever method for biochemically determining tertiary interactions within RNAs relies on the use of phosphorothioate-tagged nucleotide analogues randomly incorporated into RNA transcripts. Following the separation of functional molecules from inactive (or unfolded) ones, the nucleotide positions where analogue substitution interferes with RNA structure or activity can be mapped by cleavage of the RNA with I₂.[127–129] The primary caveat of this technique is that if a phosphorothioate linkage itself interferes with RNA function, then the effect of nucleotide analogues at that position will be masked. By using interference mapping, 2'-deoxy and 2'-methoxy analogues were used to identify the essential 2'-hydroxyl groups in tRNA for recognition by RNase P,[92, 128] and inosine analogues revealed the exocyclic amines in the Th-intron that are important for catalysis.[129] This approach has been greatly extended by Strobel and co-workers through the synthesis of a large library of nucleotide analogues (for a comprehensive review see reference [130]); these analogues have been used to probe the chemical and structural contribution of almost every chemical moiety within every adenosine in the Th-intron.[131]

A second experimental approach needed to be developed for the "chemogenetic" technique outlined above to mirror biological genetics. In genetics, after identification of a mutation that is deleterious to function, subsequent genetic screens are performed to find secondary mutations that restore function. If the second mutation occurs at a site distinct from the original mutation, it is classified as a suppressor and may reveal an intramolecular or intermolecular interaction. Similarly, once interesting functional groups or atoms are determined from analogue interference mapping, potential tertiary interactions involving that position can be identified by synthesizing an RNA molecule that contains the original functional group mutation along with another randomly incorporated analogue.[132] This pool of molecules is again subjected to a folding or activity selection, followed by mapping of the sites of interference by I₂ cleavage of the RNA. In this experiment all of the analogues that were identified in the original interference mapping would still interfere except for the analogue that interacts with the site-specific mutation. Since the energetic cost of disruption of the tertiary interaction was already incurred by the site-specific mutation that is incorporated into every RNA molecule, the analogue that disrupts its partner does not further disrupt RNA function, that is, the deleterious effect of the analogue is suppressed by the presence of the site-specific mutation. Multiple tertiary interactions involving hydrogen-bonding pairs can potentially be revealed by testing a number of site-specific mutants, which then provide constraints for modeling the structure of an RNA.

With this methodology a structurally detailed model has been built for the catalytic core of the Th-intron.[46] This model includes the tertiary interactions necessary for the docking of the substrate helix (P1) into the catalytic cleft of this ribozyme, which is primarily mediated through hydrogen bonding of 2'-hydroxyl groups[133] and an exocyclic amine of a critical GU wobble base pair[134] in P1 that defines the cleavage site. It was determined from interference suppression analysis that a 2'-hydroxyl group and exocyclic amine of the guanosine in the GU pair interact with the minor groove of two noncanonical AA pairs in J4-5.[132] The modeled tertiary interaction resembles interactions observed between adjacent P4–P6 domain molecules in the crystal lattice, and between minor grooves observed in crystal structures of perpendicularly packed RNA duplexes (see Section 2.2.3). A second set of tertiary interactions that have been modeled in the catalytic core involves the formation of a triple helix in the minor groove between the J8/7 strand and the P1, P3, and P4 helices (Figure 17). This model suggests that the 2'-hydroxyl groups play a central role in the establishment of a minor groove triplex, as observed in a tRNAPhe and a model triplex (see Section 2.3.1). Since this model is predicated on biochemical experiments that probe structure and function at the atomic level, it may reveal details at higher resolution than models based upon covariation analysis and site-directed mutagenesis.

Figure 17. A model of the catalytic core of the Tetrahymena group I intron based upon pairwise constraints (shown by dotted lines) provided by nucleotide analogue interference mapping.[46] The single-stranded region J8/7 forms a minor-groove triple helix with the P1 substrate helix as well as minor-groove triple interactions with the P4 helix and a major-groove triple interaction with the P3 helix, which organizes the catalytic core of this ribozyme.

5. Summary and Outlook

RNA, being both a catalyst and a carrier of heritable information, provides an excellent opportunity to probe structure–function relationships and the macromolecular folding problem through the use of biochemical techniques not available to the protein chemist. Advances in our understanding of RNA structure provide a basis for exploring the RNA-folding problem as well as developing robust protocols for modeling large RNA structures in the absence of crystallographic or NMR spectroscopic data. The challenge


that lies ahead is to extend the lessons learned from the well-studied systems of tRNA, the Tetrahymena group I intron, the hammerhead ribozyme, and RNase P to other more difficult systems such as the group II self-splicing intron. Furthermore, most RNA in the cellular environment interacts with numerous proteins to create ribonucleoprotein (RNP) enzymes such as the ribosome, spliceosome, telomerase, and signal recognition particle. Our understanding of the structure, folding, assembly, and functional mechanisms of these RNPs lags far behind that of the ribozymes discussed here. Future work will focus increasingly on RNA–protein interactions, and will be guided by themes that have emerged from these studies of RNA structure and function.

The authors would like to thank Elizabeth Doherty, Patrick Zarrinkar, and Scott Strobel for helpful discussions. Support for this work was provided by a postdoctoral fellowship to R.T.B. from the Jane Coffin Childs Memorial Medical Research Fund and through grants from the Beckman Foundation, the Packard Foundation, the National Institutes, and the Howard Hughes Medical Institute (J.A.D.).

Received: December 8, 1998 [A 317 IE]
German version: Angew. Chem. 1999, 111, 2472–2491

[1] K. Kruger, P. J. Grabowski, A. J. Zaug, J. Sands, D. E. Gottschling, T. R. Cech, Cell 1982, 31, 147–157.
[2] C. Guerrier-Takada, K. Gardiner, T. Marsh, N. Pace, S. Altman, Cell 1983, 35, 849–857.
[3] C. Levinthal, J. Chim. Phys. Phys. Chim. Biol. 1968, 65, 44–45.
[4] D. J. Lane, B. Pace, G. J. Olsen, D. A. Stahl, M. Sogin, N. Pace, Proc. Natl. Acad. Sci. USA 1985, 82, 6955–6959.
[5] J. A. Kowalak, S. C. Pomerantz, P. F. Crain, J. A. McCloskey, Nucleic Acids Res. 1993, 21, 4577–4585.
[6] M. Chastain, I. Tinoco, Jr., Prog. Nucleic Acid Res. Mol. Biol. 1991, 41, 131–177.
[7] Review: D. H. Turner, N. Sugimoto, S. M. Freier, Annu. Rev. Biophys. Biophys. Chem. 1988, 17, 167–192.
[8] Review: C. R. Woese, N. R. Pace, The RNA World (Eds.: R. Gesteland, J. F. Atkins), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993, pp. 91–117.
[9] Review: C. Ehresmann, F. Baudin, M. Mougel, P. Romby, J.-P. Ebel, B. Ehresmann, Nucleic Acids Res. 1987, 15, 53–72.
[10] J. F. Milligan, D. R. Groebe, G. W. Witherell, O. C. Uhlenbeck, Nucleic Acids Res. 1987, 15, 8783–8798.
[11] S. R. Holbrook, RNA structure and function (Eds.: R. W. Simons, M. Grunberg-Manago), Cold Spring Harbor Press, Cold Spring Harbor, NY, 1998, pp. 147–174.
[12] E. V. Puglisi, J. D. Puglisi, RNA structure and function (Eds.: R. W. Simons, M. Grunberg-Manago), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1998, pp. 117–146.
[13] S. H. Kim, F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman, A. Wang, N. C. Seeman, A. Rich, Science 1974, 185, 435–440.
[14] J. D. Robertus, J. E. Ladner, J. T. Finch, D. Rhodes, R. D. Brown, B. F. C. Clark, A. Klug, Nature 1974, 250, 546–551.
[15] H. W. Pley, K. M. Flaherty, D. B. McKay, Nature 1994, 372, 68–74.
[16] W. G. Scott, J. T. Finch, A. Klug, Cell 1995, 81, 991–1002.
[17] J. H. Cate, A. R. Gooding, E. Podell, K. Zhou, B. L. Golden, C. E. Kundrot, T. R. Cech, J. A. Doudna, Science 1996, 273, 1678–1685.
[18] A. R. Ferré-D'Amaré, K. Zhou, J. A. Doudna, Nature 1998, 395, 567–574.
[19] E. Westhof, F. Michel, RNA-Protein interactions (Eds.: K. Nagai, I. W. Mattaj), IRL Press, New York, 1994, pp. 25–51.
[20] J. E. Wedekind, D. B. McKay, Annu. Rev. Biophys. Biomol. Struct. 1998, 27, 475–502.
[21] W. Saenger, Principles of Nucleic Acid Structure, Springer, New York, 1984.
[22] S. H. Kim, Prog. Nucleic Acid Res. Mol. Biol. 1976, 17, 181–216.
[23] A. Jack, J. E. Ladner, A. Klug, J. Mol. Biol. 1976, 108, 619–649.
[24] J. B. Murray, D. P. Terwey, L. Maloney, A. Karpeisky, N. Usman, L. Beigelman, W. G. Scott, Cell 1998, 92, 665–673.
[25] G. S. Bassi, N.-E. Mollegaard, A. I. H. Murchie, E. von Kitzing, D. M. J. Lilley, Nat. Struct. Biol. 1995, 2, 45–55.
[26] G. S. Bassi, A. I. H. Murchie, D. M. J. Lilley, RNA 1996, 2, 756–768.
[27] T. Tuschl, C. Gohlke, T. M. Jovin, E. Westhof, F. Eckstein, Science 1994, 266, 785–789.
[28] K. M. A. Amiri, P. J. Hagerman, Biochemistry 1994, 33, 13172–13177.
[29] J. B. Murray, A. A. Seyhan, N. G. Walter, J. M. Burke, W. G. Scott, Chem. Biol. 1998, 5, 587–595.
[30] Z. Shen, P. J. Hagerman, J. Mol. Biol. 1994, 241, 415–430.
[31] D. R. Duckett, A. I. H. Murchie, D. M. Lilley, Cell 1995, 83, 1027–1036.
[32] J. W. Orr, P. J. Hagerman, J. R. Williamson, J. Mol. Biol. 1997, 275, 453–464.
[33] R. T. Batey, J. R. Williamson, RNA 1998, 4, 984–997.
[34] J. H. Cate, A. R. Gooding, E. Podell, K. Zhou, B. L. Golden, A. A. Szewczak, C. E. Kundrot, T. R. Cech, J. A. Doudna, Science 1996, 273, 1696–1699.
[35] G. R. Zimmerman, R. D. Jenison, C. L. Wick, J.-P. Simorre, A. Pardi, Nat. Struct. Biol. 1997, 4, 644–649.
[36] M. Costa, F. Michel, EMBO J. 1997, 16, 3289–3302.
[37] S. E. Lietzke, C. L. Barnes, J. A. Berglund, C. E. Kundrot, Structure 1996, 4, 917–930.
[38] S. R. Holbrook, C. Cheong, I. Tinoco, Jr., S.-H. Kim, Nature 1991, 353, 579–581.
[39] K. J. Baeyens, H. L. De Bondt, S. R. Holbrook, Nat. Struct. Biol. 1995, 2, 56–62.
[40] G. A. Leonard, K. E. McAuley-Hecht, S. Ebel, D. M. Lough, T. Brown, W. N. Hunter, Structure 1994, 2, 483–494.
[41] M. C. Wahl, S. T. Rao, M. Sundaralingam, Nat. Struct. Biol. 1996, 3, 24–31.
[42] R. D. Blake, J. Massoulié, J. R. Fresco, J. Mol. Biol. 1967, 30, 291–308.
[43] J. Massoulié, Eur. J. Biochem. 1968, 3, 439–447.
[44] R. Klinck, J. Liquier, E. Taillandier, C. Gouyette, T. Huynh-Dinh, E. Guittet, Eur. J. Biochem. 1995, 233, 544–553.
[45] Review: J. D. Puglisi, J. R. Williamson, The RNA World, 2nd ed. (Eds.: R. F. Gesteland, T. R. Cech, J. F. Atkins), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1999, pp. 403–425.
[46] A. A. Szewczak, L. Ortoleva-Donnelly, S. P. Ryder, E. Moncoeur, S. A. Strobel, Nat. Struct. Biol. 1998, 5, 1037–1042.
[47] C. Turek, P. Gauss, C. Thermes, D. R. Groebe, M. Gayle, N. Guild, G. Stormo, Y. D'Aubenton-Carafa, O. C. Uhlenbeck, I. Tinoco, Jr., E. N. Brody, L. Gold, Proc. Natl. Acad. Sci. USA 1988, 85, 1364–1368.
[48] C. R. Woese, S. Winker, R. R. Gutell, Proc. Natl. Acad. Sci. USA 1990, 87, 8467–8471.
[49] M. Molinaro, I. Tinoco, Jr., Nucleic Acids Res. 1995, 23, 3056–3063.
[50] H. Heus, A. Pardi, Science 1991, 253, 191–194.
[51] F. M. Jucker, H. A. Heus, P. F. Yip, E. H. Moors, A. Pardi, J. Mol. Biol. 1996, 264, 968–980.
[52] L. Jaeger, F. Michel, E. Westhof, J. Mol. Biol. 1994, 236, 1271–1276.
[53] F. L. Murphy, T. R. Cech, J. Mol. Biol. 1994, 236, 49–63.
[54] M. Costa, F. Michel, EMBO J. 1995, 14, 1276–1285.
[55] H. Pley, K. Flaherty, D. McKay, Nature 1994, 372, 111–113.
[56] D. L. Abramovitz, A. M. Pyle, J. Mol. Biol. 1997, 266, 493–506.
[57] J. H. Cate, R. L. Hanna, J. A. Doudna, Nat. Struct. Biol. 1997, 4, 553–558.
[58] J. A. Ippolito, T. A. Steitz, Proc. Natl. Acad. Sci. USA 1998, 95, 9819–9824.
[59] M. Zacharias, P. J. Hagerman, Proc. Natl. Acad. Sci. USA 1995, 92, 6052–6056.
[60] M. Zacharias, P. J. Hagerman, J. Mol. Biol. 1995, 247, 486–500.
[61] M. S. Chapman, M. G. Rossmann, Structure 1995, 3, 151–162.

[62] V. Lehnert, L. Jaeger, F. Michel, E. Westhof, Chem. Biol. 1996, 3, 993–1009.
[63] J. P. Marino, R. S. Gregorian, Jr., G. Csankovszki, D. M. Crothers, Science 1995, 268, 1448–1454.
[64] A. J. Lee, D. M. Crothers, Structure 1998, 6, 993–1005.
[65] K.-Y. Chang, I. Tinoco, Jr., Proc. Natl. Acad. Sci. USA 1994, 91, 8705–8709.
[66] K.-Y. Chang, I. Tinoco, Jr., J. Mol. Biol. 1997, 269, 52–66.
[67] R. S. Gregorian, Jr., D. M. Crothers, J. Mol. Biol. 1995, 248, 968–984.
[68] C. C. Correll, B. Freeborn, P. B. Moore, T. A. Steitz, Cell 1997, 91, 705–712.
[69] A. A. Szewczak, P. B. Moore, J. Mol. Biol. 1995, 247, 81–98.
[70] J. D. Puglisi, J. R. Wyatt, I. Tinoco, Jr., J. Mol. Biol. 1990, 214, 437–453.
[71] J. R. Wyatt, J. D. Puglisi, I. Tinoco, Jr., J. Mol. Biol. 1990, 214, 455–470.
[72] L. X. Shen, I. Tinoco, Jr., J. Mol. Biol. 1995, 247, 963–978.
[73] X. Chen, H. Kang, L. X. Shen, M. Chamorro, H. E. Varmus, I. Tinoco, Jr., J. Mol. Biol. 1996, 260, 479–483.
[74] M. H. Kolk, M. van der Graaf, S. S. Wijmenga, C. W. A. Pleij, H. A. Heus, C. W. Hilbers, Science 1998, 280, 434–438.
[75] L. Su, L. Chen, M. Egli, J. M. Berger, A. Rich, Nat. Struct. Biol. 1999, 6, 285–292.
[76] S. P. Rosenstein, M. D. Been, Biochemistry 1990, 29, 8011–8016.
[77] J. B. Smith, G. Dinter-Gottlieb, Nucleic Acids Res. 1991, 19, 1285–1289.
[78] J. A. Latham, T. R. Cech, Science 1989, 245, 276–282.
[79] D. C. Celander, T. R. Cech, Biochemistry 1990, 29, 1355–1361.
[80] B. Laggerbauer, F. L. Murphy, T. R. Cech, EMBO J. 1994, 13, 2669–2676.
[81] F. L. Murphy, T. R. Cech, Biochemistry 1993, 32, 5291–5300.
[82] E. L. Christian, M. Yarus, J. Mol. Biol. 1992, 228, 743–758.
[83] E. L. Christian, M. Yarus, Biochemistry 1993, 32, 4477–4480.
[84] E. A. Doherty, J. A. Doudna, Biochemistry 1997, 36, 3159–3169.
[85] G. Van Der Horst, A. Christian, T. Inoue, Proc. Natl. Acad. Sci. USA 1991, 88, 184–188.
[86] E. A. Doherty, D. Herschlag, J. A. Doudna, Biochemistry 1999, 38, 2982–2990.
[87] J. A. Doudna, T. R. Cech, RNA 1995, 1, 36–45.
[88] C. Guerrier-Takada, S. Altman, Science 1984, 223, 285–286.
[89] W. D. Hardt, J. M. Warnecke, V. A. Erdmann, R. K. Hartmann, EMBO J. 1995, 14, 2935–2944.
[90] W. D. Hardt, V. A. Erdmann, R. K. Hartmann, RNA 1996, 2, 1189–1198.
[91] T. Pan, A. Loria, K. Zhong, Proc. Natl. Acad. Sci. USA 1995, 92, 12510–12514.
[92] F. Conrad, A. Hanne, R. K. Gaur, G. Krupp, Nucleic Acids Res. 1995, 23, 1845–1853.
[93] T. Pan, Biochemistry 1995, 34, 902–909.
[94] A. Loria, T. Pan, RNA 1996, 2, 551–563.
[95] P. Z. Qin, A. M. Pyle, Curr. Opin. Struct. Biol. 1998, 8, 301–308.
[96] P. E. Cole, D. M. Crothers, Biochemistry 1972, 11, 4368–4374.
[97] D. M. Crothers, P. E. Cole, C. W. Hilbers, R. G. Shulman, J. Mol. Biol. 1974, 87, 63–88.
[98] C. W. Hilbers, G. T. Robillard, R. G. Shulman, R. D. Blake, P. K. Webb, R. Fresco, D. Riesner, Biochemistry 1976, 15, 1874–1882.
[99] E. R. Hawkins, S. H. Chang, W. L. Mattice, Biopolymers 1977, 16, 1557–1566.
[100] A. Stein, D. M. Crothers, Biochemistry 1976, 15, 160–168.
[101] A. Stein, D. M. Crothers, Biochemistry 1976, 15, 157–160.
[102] A. R. Banerjee, J. A. Jaeger, D. H. Turner, Biochemistry 1993, 32, 153–163.
[103] B. Sclavi, S. Woodson, M. Sullivan, M. R. Chance, M. Brenowitz, J. Mol. Biol. 1997, 266, 144–159.
[104] B. Sclavi, M. Sullivan, M. R. Chance, M. Brenowitz, S. A. Woodson, Science 1998, 279, 1940–1943.
[105] P. P. Zarrinkar, J. R. Williamson, Nucleic Acids Res. 1996, 24, 854–858.
[106] P. P. Zarrinkar, J. R. Williamson, Science 1994, 265, 918–924.
[107] P. P. Zarrinkar, J. R. Williamson, Nat. Struct. Biol. 1996, 3, 432–438.
[108] A. R. Banerjee, D. H. Turner, Biochemistry 1995, 34, 6504–6512.
[109] W. D. Downs, T. R. Cech, RNA 1996, 2, 718–732.
[110] D. K. Treiber, M. S. Rook, P. P. Zarrinkar, J. R. Williamson, Science 1998, 279, 1943–1946.
[111] M. S. Rook, D. K. Treiber, J. R. Williamson, J. Mol. Biol. 1998, 281, 609–620.
[112] J. S. Weissman, P. S. Kim, Science 1991, 253, 1386–1393.
[113] J. Pan, D. Thirumalai, S. A. Woodson, J. Mol. Biol. 1997, 273, 7–13.
[114] T. Pan, T. R. Sosnick, Nat. Struct. Biol. 1997, 4, 931–938.
[115] D. Thirumalai, S. A. Woodson, Acc. Chem. Res. 1996, 29, 433–439.
[116] K. A. Dill, H. S. Chan, Nat. Struct. Biol. 1997, 4, 10–19.
[117] J. Pan, S. A. Woodson, J. Mol. Biol. 1998, 280, 597–609.
[118] S. R. Price, N. Ito, C. Oubridge, J. M. Avis, K. Nagai, J. Mol. Biol. 1995, 249, 398–408.
[119] A. R. Ferré-D'Amaré, J. A. Doudna, Nucleic Acids Res. 1996, 24, 977–978.
[120] J. A. Doudna, C. Grosshans, A. Gooding, C. E. Kundrot, Proc. Natl. Acad. Sci. USA 1993, 90, 7829–7833.
[121] W. G. Scott, J. T. Finch, F. Grenfell, J. Fogg, T. Smith, M. J. Gait, A. Klug, J. Mol. Biol. 1995, 250, 327–332.
[122] F. Michel, E. Westhof, J. Mol. Biol. 1990, 216, 585–610.
[123] M. A. Tanner, T. R. Cech, Science 1997, 275, 847–849.
[124] E. Westhof, J. Mol. Struct. (THEOCHEM) 1993, 286, 203–210.
[125] B. L. Golden, A. R. Gooding, E. R. Podell, T. R. Cech, Science 1998, 282, 259–264.
[126] N. K. Tanner, S. Schaff, G. Thill, E. Petit-Koskas, A. Crain-Denoyelle, E. Westhof, Curr. Biol. 1994, 4, 488–498.
[127] G. Gish, F. Eckstein, Science 1988, 240, 1520–1522.
[128] R. K. Gaur, G. Krupp, Nucleic Acids Res. 1993, 21, 21–26.
[129] S. A. Strobel, K. Shetty, Proc. Natl. Acad. Sci. USA 1997, 94, 2903–2908.
[130] S. A. Strobel, Biopolymers, in press.
[131] L. Ortoleva-Donnelly, A. A. Szewczak, R. R. Gutell, S. A. Strobel, RNA 1998, 4, 498–519.
[132] S. A. Strobel, L. Ortoleva-Donnelly, S. P. Ryder, J. H. Cate, E. Moncoeur, Nat. Struct. Biol. 1998, 5, 60–66.
[133] S. A. Strobel, T. R. Cech, Biochemistry 1993, 32, 13593–13604.
[134] S. A. Strobel, T. R. Cech, Science 1995, 267, 675–679.
[135] D. Söll, The RNA world (Eds.: R. Gesteland, J. F. Atkins), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993, pp. 157–184.
[136] J. L. Sussman, S. R. Holbrook, R. W. Warrant, G. M. Church, S. H. Kim, J. Mol. Biol. 1978, 123, 607–630.
[137] M. Carson, J. Appl. Crystallogr. 1991, 47, 110.
[138] S. H. Damberger, R. R. Gutell, Nucleic Acids Res. 1994, 22, 3508–3510.
[139] J. S. Kieft, I. Tinoco, Jr., Structure 1997, 5, 713–721.
[140] D. E. Draper, Nat. Struct. Biol. 1996, 3, 397–400.
