Tertiary Motifs in RNA Structure
RNA plays a critical role in mediating every step of the cellular information transfer pathway from DNA-encoded genes to functional proteins. Its diversity of biological functions stems from the ability of RNA to act as a carrier of genetic information and to adopt complex three-dimensional folds that create sites for chemical catalysis. Atomic-resolution structures of several large RNA molecules, determined by X-ray crystallography, have elucidated some of the means by which a global fold is achieved. Within these RNAs are tertiary structural motifs that enable the highly anionic double-stranded helices to tightly pack together to create a globular architecture. In this article we present an overview of the structures of these motifs and their contribution to the organization of large, biologically active RNAs. Base stacking, participation of the ribose 2'-hydroxyl groups in hydrogen-bonding interactions, binding of divalent metal cations, noncanonical base pairing, and backbone topology all serve to stabilize the global structure of RNA and play critical roles in guiding the folding process. Studies of the RNA-folding problem, which is conceptually analogous to the protein-folding problem, have demonstrated that folding primarily proceeds through a hierarchical pathway in which domains assemble sequentially. Formation of the proper tertiary interactions between these domains leads to discrete intermediates along this pathway. Advances in the understanding of RNA structure have facilitated improvements in the techniques that are utilized in modeling the global architecture of biologically interesting RNAs that have been resistant to atomic-resolution structural analysis.

Keywords: molecular modeling · nucleic acids · ribozymes · RNA · structure elucidation
Angew. Chem. Int. Ed. 1999, 38, 2326–2343 WILEY-VCH Verlag GmbH, D-69451 Weinheim, 1999 1433-7851/99/3816-2327 $ 17.50+.50/0
REVIEWS J. A. Doudna et al.
Jennifer A. Doudna was born in 1964 in Washington DC (USA). She received her B.A. in
Chemistry with honors from Pomona College in Claremont, CA, in 1985, and her Ph.D. in
Biochemistry from Harvard University in 1989. At Harvard she worked with Prof. Jack
Szostak on the design of self-replicating RNA, which led to her interest in RNA structure and
folding. She was awarded a Lucille P. Markey Scholar fellowship in 1991 to study with Prof.
Tom Cech at the University of Colorado, Boulder. During the next three years she developed
methods for the crystallization of RNA in collaboration with Prof. Craig Kundrot. She joined
the Yale faculty in the Department of Molecular Biophysics and Biochemistry in 1994, and she
was promoted to Associate Professor in 1997 and to Professor in 1999. She received the
Johnson Foundation Prize for Innovative Research in 1996 for her work on the crystal
structure of the P4–P6 domain of the Tetrahymena self-splicing intron. She is currently an
Assistant Investigator at the Howard Hughes Medical Institute.
RNAs have a variety of specialized interhelical packing motifs, the strictly helical nature of duplexes does not present much chemical diversity with which to create these tertiary interactions. The interhelical contacts created in these lattices, though, may provide insights into how helices pack in some natural RNAs. In most of the crystals observed thus far, the duplex oligonucleotides coaxially stack to create "pseudo-infinite" helices that pack into a three-dimensional lattice in two distinct ways.

In the case of the RNA dodecamer 5'-GGCGCUUGCGUC-3' the quasi-continuous helices pack such that the backbone of one helix contacts the shallow minor groove of a perpendicular helix.[37] At each site of contact, extensive direct hydrogen bonding occurs between 2'-hydroxyl groups, the 3'-oxygen atom, and phosphate oxygen atoms of the backbone of one helix and the pyrimidine O2 atom, the exocyclic amine of guanosine, and 2'-hydroxyl groups in the minor groove of the adjacent helix (Figure 6). The majority of the hydrogen bonds are mediated by the 2'-hydroxyl groups: 8 out of 13 in the first contact site and 5 out of 8 in the second contact site between helices. Similar sets of contacts are observed in crystals with the same mode of helical packing.[38, 39]

Figure 6. The perpendicular packing arrangement of pseudo-infinite RNA helices. The backbone of one helix interacts with nucleotide bases in the wide, shallow minor groove of an adjacent helix in the crystal lattice. This figure was adapted from the crystal structure of the oligonucleotide 5'-GGCGCUUGCGUC (Lietzke et al.;[37] PDB accession number 280D).

This set of intimate contacts between helices is in stark contrast to crystal structures in which the pseudo-infinite helices are oriented parallel with respect to one another. In these crystals the backbones of the RNA duplexes make a limited number of interhelical contacts, primarily through water-mediated backbone–backbone contacts and 2'-hydroxyl groups of the phosphate contacts.[40, 41] The dominance of 2'-hydroxyl groups in mediating helical packing in both arrangements is reflected in many of the tertiary interactions found in natural RNAs, such as the ribose zipper motif (see Section 2.3.4).

2.3. Interactions Between Helical and Unpaired Motifs

2.3.1. Base Triples and Triplexes

Early biophysical studies of heteroduplexes of poly(A) (homopolymeric adenosine) and poly(U) (homopolymeric uridine) demonstrated that the duplexes converted into poly(U)-poly(A)-poly(U) triple helices and single-stranded poly(A) at high ionic strength.[42, 43] Strands of poly(A) and poly(U) form standard Watson–Crick base pairs in the triplex, while the other poly(U) strand is placed in the major groove of the RNA duplex to form Hoogsteen-type pairs with the poly(A) strand (Figure 7 a). Similarly, a protonated cytosine can hydrogen bond with the Hoogsteen face of a guanosine involved in a Watson–Crick GC pair to create an isosteric (C·G-C) triple (Figure 7 b). Structural characterization of a model RNA triplex containing both types of triple base pairs by NMR spectroscopy revealed that the third strand can be placed into the deep major groove of RNA without significant conformational distortion from ideal A-form helical geometry.[44]

Despite the stability of the extended pyrimidine-purine-pyrimidine triplex, this motif has not been observed in any naturally occurring RNA. Instead, many RNAs employ this motif in a limited fashion as isolated major and minor groove base triples. Structural examples exist for both types of pairing. Major-groove triples are observed within the crystal structure of tRNAPhe, where the D-arm forms two consecutive triple base pairs.[13, 14, 23] Unlike the model triplexes, in which only pyrimidines are used in the Hoogsteen pairing strand, both triple base pairs involve a purine interacting with the Hoogsteen face of a purine involved in a Watson–Crick pair (Figure 7 c). Minor groove triples are also observed in tRNAPhe, where A21, which is coplanar to a U8·A14 reverse-Hoogsteen base pair, interacts through the minor groove by forming a hydrogen bond between its N1 and the 2'-hydroxyl group of U8 (Figure 7 d). Minor groove triple base pairs are found within the Th-intron in the GAAA tetraloop–tetraloop interaction[17] (see Section 2.3.2) and the triple-helical scaffold, a structure critical for proper alignment of the P4–P6 and P3–P7 domains. Triple base pairs also play a critical role in the interactions of many small molecule ligands and peptides with RNA.[45]

The diverse morphology of major- and minor-groove triples found in biologically active RNAs suggests that the nature of triplexes in these RNAs is quite different from that of the model pyrimidine-purine-pyrimidine triple helices. Instead, extended single-stranded regions appear to tie helical regions together through the formation of base-triple interactions. The variable loop of tRNAPhe (Figure 2 b, c) forms two base triples with the major groove of the D-arm, as well as pairing interactions with the D-loop and the coaxial stack between the D- and anticodon arms. In the hepatitis delta ribozyme (Figure 8) a stretch of four single-stranded nucleotides (J4/2) forms extensive interactions with the two sets of coaxially stacked helices in the region that is the proposed catalytic pocket.[18] Two invariant adenosines in this single-stranded region interact with a helix through a minor groove triple with a Watson–Crick GC pair and a ribose zipper motif (see Section 2.3.4). Cytosine 75 (shown in red in Figure 8), which is critical for the catalytic activity of this ribozyme, forms a hydrogen bond between its N4 atom and a phosphate group within the other set of coaxially stacked helices. There is extensive biochemical evidence that a minor-groove triple helix formed by J8/7 within the catalytic core of the Th-intron is critical for bridging the P1 substrate helix and the P3 and P4 helices that form the heart of the catalytic core (see Figure 17).[46] Thus, it appears from these examples that single-stranded regions often serve to bind helical regions together through the formation of triple interactions between bases in many large RNAs.

Figure 7. Examples of triple base pairs found in RNA. a) Hoogsteen-type U·A-U triple base pair, with the second uracil hydrogen bonding to the major-groove face of a standard Watson–Crick AU pair. b) A protonated Hoogsteen-type C·G-C pair. c) A purine triple base pair found in the major groove of tRNAPhe. d) A triple base pair found in tRNAPhe in which an adenosine hydrogen bonds with a 2'-hydroxyl group on the minor groove face of a reverse-Hoogsteen A·U base pair.

2.3.2. The Tetraloop Motif

Among the most prevalent motifs observed in natural RNAs are several four nucleotide loop sequences (tetraloops) found in the 16S- and 23S-ribosomal RNAs (rRNAs), groups I and II self-splicing introns, ribonuclease P, and bacteriophage T4 messenger RNA.[47] For instance, 16S-rRNA tetraloops account for about 55 % of all hairpin loops in eubacteria, while five nucleotide loops, the next most prevalent loop size, account for 13 % of the loops.[48] The majority of 16S-rRNA tetraloops fall into two sequence categories: the "UNCG" motif and the "GNRA" motif. The UNCG-tetraloop motif (N indicates that the second nucleotide position in this sequence motif may be any nucleotide) has been shown to confer unusually high thermodynamic stability to an RNA hairpin,[47, 49] but has not yet been implicated in the formation of tertiary interactions.

The GNRA sequence motif, in which any nucleotide is accommodated in the second position and the third position is a purine base (R = A or G), is the most prevalent tetraloop motif in naturally occurring RNAs.[48] Structural analyses of several of these tetraloops (GAGA, GCAA, and GAAA) by NMR spectroscopy reveal a network of hydrogen-bonding and base-stacking interactions that create a stable structure (Figure 9 a).[50, 51] In all of the tetraloops the position 1 guanosine (G1) and the adenosine in position 4 (A4) form a sheared base pair in which G1–N3 hydrogen bonds with A4–N6, and G1–N2 forms a hydrogen bond with A4–N7. In the GCAA and GAAA tetraloops the G1–N3 to A4–N6 distance is too long to be a direct hydrogen bond, and is thus likely to be water mediated.[51] In the GAAA tetraloop this pair is surrounded by hydrogen bonds between the phosphate of A4 and the Watson–Crick face of G1, and the 2'-hydroxyl group of G1 interacting with the Hoogsteen face of G/A3 and A4–N6. In the GAGA and GAAA tetraloops the nucleotide in position 2 is stacked upon the nucleotide in position 3, whereas in the GCAA loop the cytosine in position 2 is disordered. This variability in the hydrogen-bonding and stacking interactions between different members of the GNRA family does not significantly alter their thermodynamic stability, but may be critically important for the ability of this motif to form tertiary interactions with a diverse array of RNA motifs.

Covariation analysis of the groups I and II introns and RNase P has revealed several RNA motifs that form tertiary interactions with the GNRA-type tetraloop. The structure of an internal loop motif, termed a tetraloop receptor, which interacts with the GAAA tetraloop with high affinity and specificity[52–54] was elucidated as part of the Th-intron P4–P6
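
The sequence definitions given in Section 2.3.2 (N = any nucleotide, R = A or G) can be expressed as simple patterns. The following is a minimal illustrative sketch, not a method from the studies cited: the function name classify_tetraloop and the "other" label are hypothetical, and the code tests sequence only, so it says nothing about whether a given loop adopts the tetraloop fold.

```python
import re

# Sequence families named in the text.
# N = any nucleotide, R = purine (A or G).
TETRALOOP_FAMILIES = {
    "GNRA": re.compile(r"^G[ACGU][AG]A$"),
    "UNCG": re.compile(r"^U[ACGU]CG$"),
}


def classify_tetraloop(loop: str) -> str:
    """Return 'GNRA', 'UNCG', or 'other' for a four-nucleotide loop.

    Hypothetical helper for illustration: it checks the sequence
    pattern only and makes no structural prediction.
    """
    seq = loop.upper().replace("T", "U")  # accept DNA-style input
    if len(seq) != 4:
        raise ValueError("a tetraloop has exactly four nucleotides")
    for family, pattern in TETRALOOP_FAMILIES.items():
        if pattern.match(seq):
            return family
    return "other"


if __name__ == "__main__":
    for loop in ("GAGA", "GCAA", "GAAA", "UUCG", "AAAA"):
        print(loop, classify_tetraloop(loop))
```

The three loops whose NMR structures are discussed above (GAGA, GCAA, and GAAA) all match the GNRA pattern.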
each ion.[57] The nucleotide bases splayed out from the bulge form tertiary contacts to two regions of the RNA. A183 and A184 interact with functional groups within the minor groove of the P4 helix by using the ribose zipper motif (see Section 2.3.4). A186 extensively hydrogen bonds with the minor groove face of the C137–G181 base pair of P5a and G164 in the sheared A139–G164 pair found at the top of P5b. Stacking interactions between A186 and the A139–G164 pair facilitate the P5a–5b coaxial stack at this junction. The fourth adenosine of the bulge, A187, forms a noncanonical base pair with U135. Thus, the adenosine-rich bulge presents nucleotide bases in a conformation that allows helix P4 to be tightly docked against the P5a–5b helical coaxial stack, which, along with the GAAA tetraloop–tetraloop receptor, firmly locks the two sets of helices comprising the P4–P6 domain together to form the overall fold of this RNA.

The crystal structure of the trans-activation response region (TAR) of the human immunodeficiency virus (HIV) displays a similar motif.[58] In this RNA a three nucleotide bulge with the sequence UCU binds three calcium ions. These ions coordinate phosphate oxygen atoms in the backbone of the RNA in and around the bulged nucleotides. As a consequence the three pyrimidine bases face away from the RNA helix and allow the stems flanking the bulge to coaxially stack upon one another. It has been demonstrated by using transient electric birefringence measurements that Mg2+ also binds this pyrimidine-rich bulge, which causes the two helices to coaxially stack. If the bulge sequence is changed to three adenosines then magnesium ions no longer effect this change, which indicates that the nucleotide sequence affects the ability of the backbone to specifically bind metal ions.[59, 60] This motif has also been observed in the recognition of single-stranded DNA of the canine parvovirus by the viral capsid protein through chelation of a divalent metal ion by the DNA phosphate backbone, which organizes the nucleotide bases for recognition by the protein.[61] Thus, this motif appears to be a

Figure 11. The hydrogen-bond network between 2'-hydroxyl groups and nucleotide bases within the ribose zipper motif found between two adenosines in the A-rich bulge and the minor-groove face of the P4 helix of the Tetrahymena group I intron P4–P6 domain.

intermolecular interaction of a GAAA tetraloop and the minor groove in the hammerhead crystal structure[55] and in the crystal structure of the hepatitis delta virus ribozyme.[18] This interaction might be quite pervasive within many biological RNAs because it can potentially pack RNA strands and helices together with few sequence-specific requirements. However, the lack of sequence specificity of the ribose zipper makes it extremely difficult to predict by covariation analysis or many biochemical probing techniques.

2.4. Tertiary Interactions Between Unpaired Regions

2.4.1. Loop–Loop Interactions

Hairpin loops present rich potential for the formation of tertiary contacts through pairing interactions between their nucleotide bases to create new helices. This type of interaction is represented in a limited form in the structure of tRNAPhe, in which nucleotides in the D- and T-loops form two base pairs as part of the extensive tertiary contacts that form the elbow of the molecule (see Figure 2 b, c). In the Th-intron helices P13 and P14 are formed through the formation of five to seven contiguous Watson–Crick base pairs between complementary hairpin loops, which tie together distinct structural domains (see Figure 3 a).[62]
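
Whether two hairpin loops could in principle form such a run of contiguous Watson–Crick pairs can be checked from sequence alone. The sketch below is an illustration under stated assumptions, not a procedure taken from the cited work: the function names are hypothetical, only A-U and G-C pairs are counted (no G·U wobble pairs), and the example sequences are invented rather than taken from P13 or P14. It finds the longest stretch of one loop that matches the reverse complement of the other, which is the antiparallel pairing geometry of a helix.

```python
# Strict Watson-Crick partners for RNA (wobble pairs deliberately excluded).
WC_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(WC_PARTNER[base] for base in reversed(seq.upper()))


def longest_wc_run(loop_a: str, loop_b: str) -> int:
    """Length of the longest contiguous antiparallel Watson-Crick helix
    that loop_a and loop_b (both written 5'->3') could form.

    Implemented as a longest-common-substring search between loop_a
    and the reverse complement of loop_b.
    """
    a = loop_a.upper()
    target = reverse_complement(loop_b)
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if a[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the current run
                best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    # Invented example loops sharing a five-nucleotide complementary core.
    print(longest_wc_run("GCACGAA", "AUCGUGA"))
```

A sequence match of this kind is only a necessary condition; whether the helix actually forms depends on the loop geometry and the surrounding tertiary structure.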
The structural nature of complementary loop–loop interactions has been investigated in detail with NMR spectroscopy, using the "kiss" complex formed between two RNAs involved in the regulation of Col E1 plasmid replication[63, 64] and the TAR loop sequence of HIV-2[65, 66] as model systems. Each complex is a single composite coaxially stacked helix comprising the two original hairpin stems and a new helix formed between them created by the Watson–Crick base pairing of the nucleotides in the complementary loops (Figure 12 a, b). Notably, all of the nucleotides in each loop are stacked on the 3'-side of the central helix and are involved in pairing interactions. Stable formation of this stacked structure, like the coaxial stacks at junction motifs, requires magnesium ions.[67] The Col E1 kiss complex is further stabilized by a purine–purine cross-strand stack,[64] an RNA motif which has also been found in 5S rRNA,[68] the sarcin–ricin loop,[69] the hammerhead ribozyme,[15, 16] and the P4–P6 domain.[17] This motif involves the six-membered ring of a purine stacking upon that of a purine in the opposite strand of the duplex, as opposed to stacking upon its same-strand neighbor, as is found in a regular A-form RNA duplex.

Figure 12. a) Secondary and b) tertiary structure of the kissing complex between stem loops found in the genomic RNA of the HIV-2 trans-activation region (PDB accession number 1kis).[66]

Mg2+ ions or a high concentration of Na+ ions for complete stabilization.[71] Additionally, many pseudoknots have bends at the coaxial stack caused by unpaired nucleotides intercalating between the two helices,[72] which is an essential feature for the in vivo function of one retroviral mRNA pseudoknot.[73]

The loops (L1 and L2) that span the two helices of the pseudoknot are inequivalent in that L1 crosses the major groove of the 3'-stem 2 (S2) and L2 crosses the minor groove of the 5'-stem (S1). In a recent structure of the pseudoknot from the tRNA-like genomic TYMV RNA, nucleotides from both loops were observed to interact with the helices (Figure 13 b, c).[74] Loop 2, which crosses the minor groove, displays hydrogen-bonding interactions between some of its nucleotides and functional groups within the minor groove of stem 1. An adenosine within this loop, which is highly conserved among plant viral RNAs, is proposed to make hydrogen-bonding contacts between its N1 atom and the exocyclic amino group of a stem 2 guanosine and between the adenosine exocyclic amino group and the N3 atom of an adjacent guanosine in the helix. The adjacent cytosine in loop 2 also makes a hydrogen-bonding contact between its exocyclic amine and a 2'-hydroxyl group in the minor groove of stem 2. This suggests that pseudoknots may employ triple-helical buttressing by the loops to further stabilize the structure. A very recent high-resolution crystal structure of a pseudoknot from the beet western yellow virus mRNA demonstrates extensive tertiary interactions in both the major and the minor grooves.[75]

A remarkable example of the pseudoknot motif being utilized to direct the global architecture of an RNA is demonstrated in the structure of the hepatitis delta ribozyme.[18] This ribozyme is exceptionally stable; both the
Deletion of P5abc from the intron (ΔP5abc-intron) significantly impairs its activity under standard splicing conditions (5 mM MgCl2, pH 7.5), but can be rescued at higher ionic strengths (15 mM MgCl2, 2 mM spermine, pH 7.5).[85] Wild-type activity can be restored by supplying the P5abc subdomain to the ΔP5abc intron in a trans arrangement (a trans association involves the binding of two separate RNA molecules, in contrast to a cis association in which two interacting domains are on the same RNA chain). Formation of this bimolecular complex is mediated entirely through three tertiary interactions: the tetraloop–tetraloop receptor, the A-rich bulge and the minor groove of P4, and the kissing-loop interaction between L5c and L2. Formation of this intermolecular complex is magnesium dependent and extraordinarily tight; at 10 mM MgCl2 the apparent dissociation constant (Kd) is 100 pM.[86] The Th-intron can also be divided into three separate pieces consisting of the P4–P6 domain, P1–P3 substrate, and the P3–P7/P9 domain, which are all capable of associating solely through tertiary interactions to form a functional ribozyme.[87]

It appears that a general feature of most large biological RNAs is that they are constructed from independently folding secondary structural domains that can associate in the trans form, primarily through tertiary-type interactions rather than base pairing, to form functionally active molecules. The catalytic component of eubacterial RNase P, a ribonucleoprotein enzyme responsible for post-transcriptional processing of small cellular RNAs, resides entirely within the RNA.[2, 88] In the absence of the protein subunit RNase P RNA (P RNA) binds the precursor tRNA substrate by specifically recognizing the coaxially stacked T-stem/acceptor stem of tRNA through tertiary interactions involving a number of 2'-hydroxyl and phosphate groups in both the P RNA[89, 90] and tRNA substrate.[91, 92] P RNA consists of two independently folding domains that show similar patterns of protection when separated from each other as they do in the context of the intact ribozyme, as evident from probing experiments with Fe(II)–EDTA in the presence of magnesium ions.[93, 94] Although these domains are individually unable to catalyze tRNA processing, they associate through tertiary interactions to form a catalytically competent complex. Thus, interdomain interactions and substrate recognition occur through tertiary interactions in the same manner as the Th-intron. Similarly, some of the six phylogenetically conserved domains of group II introns are capable of efficiently binding through a trans association to form a functional RNA (for an excellent review see reference [95]).

3.2. The Hierarchical Folding Model of RNA

Pioneering studies on tRNA in the 1970s elucidated some of the fundamental features of RNA folding pathways. By using temperature-jump relaxation measurements and NMR spectroscopy it was shown that under conditions of moderate ionic strength (174 mM Na+, no Mg2+) E. coli tRNAfMet undergoes five distinct transitions during thermal unfolding.[96] The lowest temperature transitions involve the disruption of tertiary interactions between the D- and T-loops followed by the weak secondary structural elements in the D-stem.[97] Higher temperature transitions involve the melting of the secondary structural elements corresponding to the T-, anticodon, and acceptor stems (Figure 14). Since thermal unfolding is a reversible reaction in tRNA it follows that its structure forms hierarchically in the folding pathway, with almost all of the secondary structure forming prior to the tertiary structure. Similar behavior is observed in the thermal unfolding of two other tRNAs, which indicates that this is a general mechanism for tRNA folding.[98, 99]

Under conditions containing 3 mM Mg2+ E. coli tRNAfMet displays a single, cooperative unfolding transition at high temperature such that tertiary and secondary structure are disrupted simultaneously.[100] This transition reflects the binding of a single magnesium ion to tRNA with high affinity (Kd = 33 µM) along with a number of weakly bound ions[101] associated with the formation of tertiary structure.[100] This uptake of magnesium ions by RNA during the folding process primarily affects the stabilization of tertiary structure rather than the formation of secondary structure. Since tertiary structure formation involves the close juxtaposition of the highly negatively charged backbone in several regions in tRNA it is not surprising that the formation of tertiary structure creates regions for high affinity multivalent cation binding sites. The structure of tRNAPhe reveals several specifically bound magnesium ions at sites of tertiary interaction, although no specific cation can be attributed to the stabilization observed in the thermal melting profile.

Figure 14. The unfolding pathway of transfer RNA as determined from thermal denaturation studies with approximate time constants for each step of the folding reaction. (Adapted from references [97, 140].)
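
To put the 100 pM dissociation constant quoted above for the P5abc complex on an energy scale, it can be converted into a standard binding free energy with ΔG° = RT ln(Kd / 1 M). The sketch below is a worked illustration only: the temperature of 298.15 K is an assumption (the text does not state the measurement temperature), the function names are hypothetical, and the fraction-bound expression is the simple single-site isotherm, valid only when the binding partner is in large excess so that its free and total concentrations are nearly equal.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol, dG = RT ln(Kd / 1 M).

    temperature_k defaults to 298.15 K, which is an assumed value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def fraction_bound(partner_molar: float, kd_molar: float) -> float:
    """Single-site binding isotherm, assuming the partner is in excess."""
    return partner_molar / (partner_molar + kd_molar)


if __name__ == "__main__":
    kd = 100e-12  # 100 pM, the apparent Kd quoted in the text
    print(f"dG = {binding_free_energy(kd):.1f} kcal/mol at the assumed 298.15 K")
    # Hypothetical partner concentrations spanning the Kd.
    for conc in (10e-12, 100e-12, 1e-9):
        print(f"{conc:.0e} M partner: {fraction_bound(conc, kd):.2f} bound")
```

Under these assumptions the value comes out near -13.6 kcal/mol, which gives a sense of how much stabilization the three tertiary contacts supply together.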
P5abc accelerates the rate of formation of the P3–P7 domain

that a tertiary interaction which is also present in the native state causes a kinetic trap in the folding pathway.

Further evidence that native tertiary interactions within the intron create kinetic barriers in the folding pathway was obtained by measuring the rate of folding of wild-type and the fast-folding mutants in the presence of urea.[110, 111] Chemical denaturants act to increase the folding rate of proteins[112] and RNA[113, 114] by disrupting the formation of interactions that disfavor rapid progression towards the fully folded, native state. The folding of the wild-type intron is accelerated by increasing the concentrations of urea up to 3 M, but the folding rates of the selected faster folding mutants are affected to a significantly lesser extent.[110, 111] A detailed study of the effects of urea and mutations on the temperature dependence of the folding rate revealed that they both act to reduce the activation enthalpy for P3–P7 folding, which further supports the theory that a primary kinetic trap in the folding of the wild-type intron involves native tertiary interactions.[111]

Rather than folding being a simple process involving the sequential formation of intermediates as implied by the hierarchical folding model, it is more accurately described as an ensemble of molecules following parallel folding pathways, as represented in a "folding energy landscape".[115, 116] This is indicated by the observation that in a given population of group I RNAs, there is a small fraction capable of rapidly folding to reach the native state, while the rest of the molecules slowly fold, kinetically trapped in various intermediates.[113] Moderate amounts of chemical denaturant alter the distribution of molecules following parallel folding pathways by increasing the fraction of the population capable of evading kinetic traps to rapidly fold.[113] Thus, the kinetic intermediates described in the hierarchical folding model represent the most populated pathway in the folding energy landscape under standard folding conditions. Studies of the folding of the Th-intron have revealed that both native structures (such as the metal core of P4–P6)[110, 111] and non-native interactions (such as incorrect base-pair formation)[113, 117] play a significant role in shaping this landscape. Changes in temperature, solvent conditions, and nucleotide sequence have a profound effect upon the energy landscape; they have revealed new kinetic barriers and traps and allow alternative pathways of folding to become significantly populated.[111]

4. Developing Working Models of Large RNAs

4.1. Modeling the Tertiary Folding of RNA

Despite significant advances in the ability to prepare and crystallize RNA,[118–121] it remains difficult to obtain crystals that yield high-resolution structural information. In the absence of these structures, modeling the three-dimensional architecture of RNA has been a useful alternative. By far the most successful RNA modeling effort to date has been that for the Th-intron developed by Michel and Westhof.[62, 122] Comparative sequence analysis of 35 sequences of subgroup IC1 introns (to which the Tetrahymena intron belongs) provided crucial information for generating a consensus secondary structure and some types of tertiary interactions, and revealed invariant nucleotides likely to be essential for the formation of the correct fold or for biochemical function. These data were augmented by constraints provided by chemical and enzymatic probing of the intron RNA, and site-directed mutagenesis experiments verified the existence of base-specific tertiary interactions such as base-triples[123] and loop–loop interactions.[62] From these data a three-dimensional model of the Tetrahymena intron was constructed by using an approach based upon the hierarchical nature of RNA folding. Elementary substructures and motifs were built based upon known structures (such as double-stranded helices, hairpin loops, and pseudoknots), then hooked together by using interactive computer graphics, and finally the model was geometrically and stereochemically refined with a least-squares refinement routine.[124] The resulting model (Figure 16) has provided a framework for the interpretation of chemical activity and folding, has withstood extensive biochemical testing, and agrees with the overall fold of a group I intron catalytic core determined at low resolution (5–6 Å) by X-ray crystallography.[125]

Figure 16. a) Model of the Tetrahymena group I intron developed by Lehnert et al.[62] The ribbon traces the backbone of the molecule, with the P4–P6 domain shown in black, the P3–P7 domain in light gray, and the P2 and P9 peripheral extensions in gray. b) A 90° rotation of the model, highlighting the packing of the P3–P7 domain between the P4–P6 and the peripheral extensions.

Producing accurate RNA models requires prior identification of as many tertiary interactions as possible, since these provide critical constraints on the global fold of the RNA that cannot be obtained from the secondary structure. For example, the model of the hepatitis delta virus ribozyme[126] has a global fold different from that seen in the 2.3 Å resolution crystal structure.[18] This difference stems mainly from the failure to predict the helix P1.1, which is essential for establishing the double-pseudoknot fold of the ribozyme. This interaction was hard to predict from the secondary structure of the ribozyme, as there are only two sequences (the genomic and antigenomic variants) with which to perform comparative analysis. Therefore, the key to better models of large RNAs lies in developing biochemical techniques for identifying tertiary contacts.
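
The parallel-pathway picture of folding described earlier, in which a small fraction of molecules folds rapidly while the remainder escapes slowly from kinetic traps, corresponds to the simplest kinetic partitioning model: a sum of two exponential phases. The sketch below is a minimal illustration under that assumption. The function name and every numerical value (fast fraction, rate constants, time point) are hypothetical and are not taken from the experiments cited; the second call merely mimics the reported qualitative effect of moderate denaturant, which increases the fraction of molecules that evade the traps.

```python
import math


def fraction_folded(t: float, fast_fraction: float,
                    k_fast: float, k_slow: float) -> float:
    """Fraction of native molecules at time t in a two-population model.

    fast_fraction : share of molecules on the rapid pathway (0..1)
    k_fast, k_slow: first-order rate constants for the rapid pathway
                    and for escape from kinetic traps (same time units as t)
    """
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must lie between 0 and 1")
    fast = fast_fraction * (1.0 - math.exp(-k_fast * t))
    slow = (1.0 - fast_fraction) * (1.0 - math.exp(-k_slow * t))
    return fast + slow


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the curve.
    t = 5.0
    standard = fraction_folded(t, fast_fraction=0.1, k_fast=1.0, k_slow=0.01)
    with_urea = fraction_folded(t, fast_fraction=0.5, k_fast=1.0, k_slow=0.01)
    print(f"folded at t={t}: {standard:.2f} (standard), {with_urea:.2f} (more fast folders)")
```

In this picture the single pathway of the hierarchical model is recovered as the limiting case in which one population dominates.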
that lies ahead is to extend the lessons learned from the well studied systems of tRNA, the Tetrahymena group I intron, the hammerhead ribozyme, and RNase P to other more difficult systems such as the group II self-splicing intron. Furthermore, most RNA in the cellular environment interacts with numerous proteins to create ribonucleoprotein (RNP) enzymes such as the ribosome, spliceosome, telomerase, and signal recognition particle. Our understanding of the structure, folding, assembly, and functional mechanisms of these RNPs lags far behind that of the ribozymes discussed here. Future work will focus increasingly on RNA–protein interactions, and will be guided by themes that have emerged from these studies of RNA structure and function.

The authors would like to thank Elizabeth Doherty, Patrick Zarrinkar, and Scott Strobel for helpful discussions. Support for this work was provided by a postdoctoral fellowship to R.T.B. from the Jane Coffin Childs Memorial Medical Research Fund and through grants from the Beckman Foundation, the Packard Foundation, the National Institutes, and the Howard Hughes Medical Institute (J.A.D.).

Received: December 8, 1998 [A 317 IE]
German version: Angew. Chem. 1999, 111, 2472–2491

[1] K. Kruger, P. J. Grabowski, A. J. Zaug, J. Sands, D. E. Gottschling, T. R. Cech, Cell 1982, 31, 147–157.
[2] C. Guerrier-Takada, K. Gardiner, T. Marsh, N. Pace, S. Altman, Cell 1983, 35, 849–857.
[3] C. Levinthal, J. Chim. Phys. Phys. Chim. Biol. 1968, 65, 44–45.
[4] D. J. Lane, B. Pace, G. J. Olsen, D. A. Stahl, M. Sogin, N. Pace, Proc. Natl. Acad. Sci. USA 1985, 82, 6955–6959.
[5] J. A. Kowalak, S. C. Pomerantz, P. F. Crain, J. A. McCloskey, Nucleic Acids Res. 1993, 21, 4577–4585.
[20] J. E. Wedekind, D. B. McKay, Annu. Rev. Biophys. Biomol. Struct. 1998, 27, 475–502.
[21] W. Saenger, Principles of Nucleic Acid Structure, Springer, New York, 1984.
[22] S. H. Kim, Prog. Nucleic Acid Res. Mol. Biol. 1976, 17, 181–216.
[23] A. Jack, J. E. Ladner, A. Klug, J. Mol. Biol. 1976, 108, 619–649.
[24] J. B. Murray, D. P. Terwey, L. Maloney, A. Karpeisky, N. Usman, L. Beigelman, W. G. Scott, Cell 1998, 92, 665–673.
[25] G. S. Bassi, N.-E. Mollegaard, A. I. H. Murchie, E. von Kitzing, D. M. J. Lilley, Nat. Struct. Biol. 1995, 2, 45–55.
[26] G. S. Bassi, A. I. H. Murchie, D. M. J. Lilley, RNA 1996, 2, 756–768.
[27] T. Tuschl, C. Gohlke, T. M. Jovin, E. Westhof, F. Eckstein, Science 1994, 266, 785–789.
[28] K. M. A. Amiri, P. J. Hagerman, Biochemistry 1994, 33, 13172–13177.
[29] J. B. Murray, A. A. Seyhan, N. G. Walter, J. M. Burke, W. G. Scott, Chem. Biol. 1998, 5, 587–595.
[30] Z. Shen, P. J. Hagerman, J. Mol. Biol. 1994, 241, 415–430.
[31] D. R. Duckett, A. I. H. Murchie, D. M. Lilley, Cell 1995, 83, 1027–1036.
[32] J. W. Orr, P. J. Hagerman, J. R. Williamson, J. Mol. Biol. 1997, 275, 453–464.
[33] R. T. Batey, J. R. Williamson, RNA 1998, 4, 984–997.
[34] J. H. Cate, A. R. Gooding, E. Podell, K. Zhou, B. L. Golden, A. A. Szewczak, C. E. Kundrot, T. R. Cech, J. A. Doudna, Science 1996, 273, 1696–1699.
[35] G. R. Zimmerman, R. D. Jenison, C. L. Wick, J.-P. Simorre, A. Pardi, Nat. Struct. Biol. 1997, 4, 644–649.
[36] M. Costa, F. Michel, EMBO J. 1997, 16, 3289–3302.
[37] S. E. Lietzke, C. L. Barnes, J. A. Berglund, C. E. Kundrot, Structure 1996, 4, 917–930.
[38] S. R. Holbrook, C. Cheong, I. Tinoco, Jr., S.-H. Kim, Nature 1991, 353, 579–581.
[39] K. J. Baeyens, H. L. De Bondt, S. R. Holbrook, Nat. Struct. Biol. 1995, 2, 56–62.
[40] G. A. Leonard, K. E. McAuley-Hecht, S. Ebel, D. M. Lough, T. Brown, W. N. Hunter, Structure 1994, 2, 483–494.
[41] M. C. Wahl, S. T. Rao, M. Sundaralingam, Nat. Struct. Biol. 1996, 3, 24–31.
[42] R. D. Blake, J. Massoulié, J. R. Fresco, J. Mol. Biol. 1967, 30, 291–308.
[43] J. Massoulié, Eur. J. Biochem. 1968, 3, 439–447.
[6] M. Chastain, I. Tinoco, Jr., Prog. Nucleic Acids Res. Mol. Biol. 1991, [44] R. Klinck, J. Liquier, E. Taillandier, C. Gouyette, T. Huynh-Dinh, E.
41, 131 ± 177. Guittet, Eur. J. Biochem. 1995, 233, 544 ± 553.
[7] Review: D. H. Turner, N. Sugimoto, S. M. Freir, Annu. Rev. Biophys. [45] Review: J. D. Puglisi, J. R. Williamson, The RNA World, 2nd ed.
Biophys. Chem. 1988, 17, 167 ± 192. (Eds.: R. F. Gesteland, T. R. Cech, J. F. Atkins), Cold Spring Harbor
[8] review: C. R. Woese, N. R. Pace, The RNA World (Eds.: R. Geste- Laboratory Press, Cold Spring Harbor, NY, 1999, pp. 403 ± 425.
land, J. F. Atkins), Cold Spring Harbor Laboratory Press, Cold [46] A. A. Szewczak, L. Ortoleva-Donnelly, S. P. Ryder, E. Moncoeur,
Spring Harbor, NY, 1993, pp. 91 ± 117. S. A. Strobel, Nat. Struct. Biol. 1998, 5, 1037 ± 1042.
[9] Review: C. Ehresmann, F. Baudin, M. Mougel, P. Romby, J.-P. Ebel, [47] C. Turek, P. Gauss, C. Thermes, D. R. Groebe, M. Gayle, N. Guild, G.
B. Ehresmann, Nucleic Acids Res. 1987, 15, 53 ± 72. Stormo, Y. DAubenton-Carafa, O. C. Uhlenbeck, I. Tinoco, Jr.,
[10] J. F. Milligan, D. R. Groebe, G. W. Witherell, O. C. Uhlenbeck, E. N. Brody, L. Gold, Proc. Natl. Acad. Sci. USA 1988, 85, 1364 ±
Nucleic Acids Res. 1987, 15, 8783 ± 8798. 1368.
[11] S. R. Holbrook, RNA structure and function (Eds.: R. W. Simons, M. [48] C. R. Woese, S. Winker, R. R. Gutell, Proc. Natl. Acad. Sci. USA
Grunberg-Manago), Cold Spring Harbor Press, Cold Spring Harbor, 1990, 87, 8467 ± 8471.
NY, 1998, pp. 147 ± 174. [49] M. Molinaro, I. Tinoco, Jr., Nucleic Acids Res. 1995, 23, 3056 ± 3063.
[12] E. V. Puglisi, J. D. Puglisi, RNA structure and function (Eds.: R. W. [50] H. Heus, A. Pardi, Science 1991, 253, 191 ± 194.
Simons, M. Grunberg-Manago), Cold Spring Harbor Laboratory [51] F. M. Jucker, H. A. Heus, P. F. Yip, E. H. Moors, A. Pardi, J. Mol.
Press, Cold Spring Harbor, NY, 1998, pp. 117 ± 146. Biol. 1996, 264, 968 ± 980.
[13] S. H. Kim, F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman, [52] L. Jaeger, F. Michel, E. Westhof, J. Mol. Biol. 1994, 236, 1271 ± 1276.
A. Wang, N. C. Seeman, A. Rich, Science 1974, 185, 435 ± 440. [53] F. L. Murphy, T. R. Cech, J. Mol. Biol. 1994, 236, 49 ± 63.
[14] J. D. Robertus, J. E. Ladner, J. T. Finch, D. Rhodes, R. D. Brown, [54] M. Costa, F. Michel, EMBO J. 1995, 14, 1276 ± 1285.
B. F. C. Clark, A. Klug, Nature 1974, 250, 546 ± 551. [55] H. Pley, K. Flaherty, D. McKay, Nature 1994, 372, 111 ± 113.
[15] H. W. Pley, K. M. Flaherty, D. B. McKay, Nature 1994, 372, [56] D. L. Abramovitz, A. M. Pyle, J. Mol. Biol. 1997, 266, 493 ± 506.
68 ± 74. [57] J. H. Cate, R. L. Hanna, J. A. Doudna, Nat. Struct. Biol. 1997, 4, 553 ±
[16] W. G. Scott, J. T. Finch, A. Klug, Cell 1995, 81, 991 ± 1002. 558.
[17] J. H. Cate, A. R. Gooding, E. Podell, K. Zhou, B. L. Golden, C. E. [58] J. A. Ippolito, T. A. Steitz, Proc. Natl. Acad. Sci. USA 1998, 95,
Kundrot, T. R. Cech, J. A. Doudna, Science 1996, 273, 1678 ± 1685. 9819 ± 9824.
[18] A. R. FerreÂ-DAmareÂ, K. Zhou, J. A. Doudna, Nature 1998, 395, [59] M. Zacharias, P. J. Hagerman, Proc. Natl. Acad. Sci. USA 1995, 92,
567 ± 574. 6052 ± 6056.
[19] E. Westhof, F. Michel, RNA-Protein interactions (Eds.: K. Nagai, [60] M. Zacharias, P. J. Hagerman, J. Mol. Biol. 1995, 247, 486 ± 500.
I. W. Mattaj), IRL Press, New York, 1994, pp. 25 ± 51. [61] M. S. Chapman, M. G. Rossmann, Structure 1995, 3, 151 ± 162.