
HHS Public Access

Author manuscript
Nat Rev Neurosci. Author manuscript; available in PMC 2022 October 17.
Published in final edited form as:


Nat Rev Neurosci. 2021 August; 22(8): 472–487. doi:10.1038/s41583-021-00479-z.

Navigating for reward


Marielena Sosa1,*, Lisa M Giocomo1,*
1Department of Neurobiology, Stanford University School of Medicine, Stanford CA USA.

Abstract
An organism’s survival can depend on its ability to recall and navigate to spatial locations
associated with rewards, such as food or a home. Accumulating research has revealed that
computations of reward and its prediction occur on multiple levels across a complex set of
interacting brain regions, including those that support memory and navigation. Yet, how the brain
coordinates the encoding, recall, and use of reward information to guide navigation remains
incompletely understood. In this Review, we propose that the brain’s classical navigation centres
— the hippocampus (HPC) and entorhinal cortex (EC) — are ideally suited to coordinate this
larger network, by representing both physical and mental space as a series of states. These
states may be linked to reward via neuromodulatory inputs to the HPC–EC system. Hippocampal
outputs can then broadcast sequences of states to the rest of the brain to store reward associations
or to facilitate decision-making, potentially engaging additional value signals downstream. This
proposal is supported by recent advances in both experimental and theoretical neuroscience. By
discussing the neural systems traditionally tied to navigation and reward at their intersection, we
aim to offer an integrated framework for understanding navigation to reward as a fundamental feature of many cognitive processes.

Introduction
Strong recall of rewarding experiences is crucial for survival. To navigate to a remembered
food source or safe home, the brain must search in memory to retrieve predictions about
where reward is located, given environmental features and the animal’s past experience.
At the same time, human reward memory can become pathological in mental illnesses
such as drug addiction1. For example, the spatial context of an initial drug experience can
trigger relapse to drug use2. Investigating how spatial experience becomes associated with
reward has therefore been a traditional pursuit of the addiction field, often focusing on the
midbrain dopaminergic system. Although understanding spatial reward memory is key to the
treatment of addiction and other mental illnesses, it is also crucial for our basic knowledge
of how the brain amplifies specific information for future use.

*[email protected] (L.M.G.); [email protected] (M.S.).
Author contributions
The authors contributed equally to all aspects of the article.
Competing interests
The authors declare no competing interests.

To localize reward within a given experience, the brain must create a neural map of the
external environment3. One brain region thought to play a critical role in forming this
neural map is the medial entorhinal cortex (MEC)4,5. Neurons in the MEC encode variables
required for the computation of an animal’s position and movement in a spatial reference
frame, including: spatial position4,6, head direction7-9, movement speed10, relative proximity
to objects11 and environmental borders12,13. Complementing the MEC neural code, neurons
in the lateral entorhinal cortex (LEC) encode variables such as time14 and the presence
or absence of objects15,16. Among these physiologically defined ‘cell types’, grid cells
[G] in the MEC seem particularly poised to support navigation, as they tile environments
via periodic, hexagonally organized firing fields4. An animal’s position can be precisely
encoded with only a handful of grid cells17, and an intact grid network is crucial for
optimally performing path integration — the process of calculating direction and distance
travelled based on perceived self-motion18,19. Grid cells have been observed in species
ranging from rodents to humans (for review, see ref. 20), suggesting that MEC neurons may
provide an evolutionarily conserved coding scheme to support navigation and the creation of
spatial memories.
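To make the path-integration computation concrete, the sketch below integrates perceived self-motion (speed and heading) into an estimated position. This is a minimal illustration rather than a model of grid cell circuitry; the variable names, sampling interval and simple Euler update are assumptions for demonstration only.

```python
import numpy as np

def path_integrate(speeds, headings, dt=0.02, start=(0.0, 0.0)):
    """Estimate position by integrating perceived self-motion.

    speeds   : running speed (cm/s) at each time step of length dt
    headings : head direction (radians) at the same time steps
    Returns the estimated (x, y) trajectory. Because each step adds a
    small error, the estimate drifts over time, which is one reason an
    intact grid network may be needed for accurate path integration.
    """
    x, y = start
    trajectory = [(x, y)]
    for v, theta in zip(speeds, headings):
        x += v * np.cos(theta) * dt
        y += v * np.sin(theta) * dt
        trajectory.append((x, y))
    return np.array(trajectory)

# Example: 5 s of running at 10 cm/s while turning slowly.
t = np.arange(0, 5, 0.02)
estimate = path_integrate(speeds=np.full_like(t, 10.0), headings=0.1 * t)
```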

The MEC and LEC are highly interconnected with the hippocampus (HPC). Hippocampal
subregions CA3, CA1 and the dentate gyrus (DG) primarily receive input from the
superficial layers of EC (layers II and III), whereas CA1 and subiculum send outputs
primarily to the deep layers of EC (layers V and VI). The microcircuitry between layers
of the EC completes the HPC–EC loop21. In the HPC, many neurons are maximally active
at one or few specific spatial locations in an environment, earning them the name place
cells [G] 22. Place cells form tightly organized sequences that represent specific trajectories
through space during two main behavioural modes. During movement, spike timing is
sequentially organized across place cells in theta sequences [G] relative to the hippocampal
theta rhythm23-25. During immobility, high-frequency oscillations known as sharp-wave ripples (SWRs) [G] 26 contain bursts of sequential spikes that reactivate spatial trajectories in
replay events [G] 27-30. Multiple works have linked the activity of place cells31-33, theta
sequences34-36 and replay events37-41 to memory-dependent behaviours. Recent work
has further demonstrated that optogenetically stimulating specific place cell populations
representing either reward locations or trajectory starts evokes behaviours associated with
those locations42. Together, these findings establish a causal role for hippocampal place cell
activity in the execution of navigational behaviours.

In recent years, accumulating evidence has demonstrated that hippocampal and MEC
activity generalizes to non-spatial tasks. HPC and MEC neurons encode sequential structure
in time43-49, providing a potential neural code for elapsed time. HPC and MEC neurons also
fire at discrete points along progressions of sensory stimuli that change with the proximity
to a goal (such as particular frequencies of sound), without requiring changes in physical
position50-52. Moreover, evidence from human imaging53 and computational modelling54,55
suggests that MEC and HPC coding schemes generalize to relationships between elements
of abstract spaces. It remains unclear to which abstract spaces this generalization applies, as
well as how individual neurons might compute such abstract codes. Nevertheless, the HPC–
EC system is clearly engaged in cognitive function beyond spatial navigation, consistent
with evidence that the HPC–EC system in humans encodes the order of events in episodic
memory56.
In this Review, we focus on spatial navigation as a model for the mnemonic function of
the HPC–EC system in associating rewards with the locations and events that surround
them. We advocate that the HPC–EC system is ideally situated to connect outcomes to a
sequence of states [G] 54 that discretize an experience, in which each state is an instance
in physical or abstract space. We suggest that the HPC–EC system specifically encodes
the order of these states as a general sequence of events in an episodic experience. We
review several lines of experimental evidence that the HPC–EC is involved in transforming
these event sequences into predictions and memories of reward at the physiological and
representational levels. First, neuronal representations of reward are present in multiple
forms within the HPC and EC. Second, neuromodulatory centres, including those of the
dopaminergic system, directly innervate the HPC and EC, shaping local plasticity and
representations of space and reward. Third, the unique laminar circuitry of the HPC,
especially the recurrent excitatory connectivity in CA3, allows the HPC–EC network to
rapidly generate sequences of neuronal firing that correspond to both remembered and
hypothetical orders of events. These bursts of activity can quickly propagate across the brain
in both task-engaged and resting states, broadcasting sequences of events to downstream
structures for decision-making computations or memory formation. Although reward-related
firing patterns and neuromodulatory innervation are common to many brain areas, their
union with sequence generation is unique to the HPC–EC system. The assignment of reward
value [G] to these sequences may occur locally or through associative firing in downstream
targets such as the prefrontal cortex (PFC) or striatum. With these features in place, the
HPC–EC system is ideally poised to store and retrieve reward-related signals at multiple
levels throughout the brain.
Reward in the hippocampal formation


Single-cell representations of reward in the hippocampus.
A challenge to understanding how the hippocampus represents reward is that reward can
be represented in multiple ways. For example, reward signals in dopaminergic neurons
often relate to reward consumption or to reward-predicting cues. In the hippocampus,
however, neurons that fire selectively during reward consumption have been elusive. This
is partly because the prevalence of SWRs is enhanced by reward57 in both CA357,58 and
CA159-61. As SWRs excite neurons across the hippocampal circuit and occur during times
of reward consumption, hippocampal spiking specific to reward consumption is difficult to
dissociate from spiking during SWRs. Moreover, spatial tasks often lack sensory cues that
directly predict reward, or involve linearized environments in which the animal’s movement
direction cannot be dissociated from reward-prediction. Nevertheless, hippocampal reward
signals have been observed at various behavioural timepoints with respect to when an animal
receives reward. For the purposes of this Review, we subdivide these behavioural timepoints
as follows: goal approach, goal arrival, time at the goal location (which may include reward
consumption) and signals of reward history following reward consumption (Fig. 1a).


One of the earliest demonstrations of reward-related hippocampal coding was the description
of ‘goal-approach cells’, which increased their firing rate when rats moved toward odour
cue and reward ports during an odour-discrimination task62. Running toward a known goal
location was subsequently shown to induce place-specific firing along paths to goals that
was distinct from random foraging in the same environment63. This goal-approach activity
occurs irrespective of the direction from which the animal approaches the goal63,64 (Fig.
1b) and persists temporarily even after goals are removed64. However, it remains difficult
to distinguish goal-approach related firing from prospective firing, the modulation of the in-
field firing rate of a place cell according to the animal’s future route65-67. Prospective firing
is stronger in CA1 than CA368, and CA1 place cells that fire prospectively additionally
migrate their fields toward reward locations across behavioural trials69. This CA1 firing
activity is further modulated by the motivational state of the animal70, as well as the
probabilistic value [G] 71 and novelty of the goal72. An approach signal is encoded more
explicitly in hippocampal cells of bats, via a vectorial representation of direction and distance
to goals during flight73. Thus, signals of goal approach, as well as the predictive value of
locations that precede reward, may be layered onto a representation of the animal’s intended
destination.

One of the most robustly reported effects of reward on the hippocampal map occurs at times
encompassing both goal approach and goal arrival: place fields often cluster near reward
locations, resulting in an overrepresentation of those locations by the neural population32,74
(Fig. 1c). Reward-related place field clustering appears specific to hippocampal subregion
CA1 compared with CA332 and is observed across different types of environments and
different rewards, including food32,75,76, water77-80, intracranial stimulation of the medial
forebrain bundle81,82 and an escape opportunity from the water maze74. Similar to goal-
approach signals71, place field clustering is influenced by the probability with which reward
is delivered at known goal locations, with large, unexpected rewards yielding a greater
overrepresentation83. Overrepresentation of goals requires learning, during which existing
place fields shift toward reward locations32,72,79,80,84 and non-place cells are newly recruited
to represent the reward site84,85. After learning, place cells near the reward location are
selectively stabilized84,85, and certain place cells will respond to multiple goals as opposed
to a single goal location86. The learning-dependent increase in place field density near goals
suggests that the hippocampal map retains a prediction of where reward will be located.

Notably, place field clustering related to goal arrival is not observed in tasks that dissociate
the location of the goal from reward delivery. Instead, the goal location elicits an increase
in firing outside the primary field of a place cell, during the delay between goal arrival
and reward delivery87,88. This out-of-field firing occurs in both CA1 and CA388 and is
probably distinct from the firing that occurs during SWRs28,29,57, as the rate of SWRs tends
to be lowest during delays in which an animal waits for reward59,60,86,87. One potential
explanation for the absence of place field clustering in such tasks is the lack of a clear
predictive spatial relationship between the goal and the reward location, as the reward pellet
in these studies was released randomly into the environment87,88. This hypothesis points
to the certainty of the spatial location of the reward as an important driver in reorganizing
place fields. Another possibility is that the reward itself acts as the primary trigger for
hippocampal place cell reorganization, in which case place field clustering would not be
observed in tasks with reward locations that randomly vary.

Evidence of firing specific to times of reward consumption at the goal location has
been limited. Recently however, two-photon imaging uncovered a small population of
hippocampal neurons in CA1 and subiculum that seem to be specialized for encoding
reward89. These cells fired selectively at rewards regardless of the reward’s spatial location
or the environmental context, distinguishing them from traditional place cells, which shift
their fields but remain context-specific (Fig. 1d). The reward-specific activity was not
restricted to the period of reward consumption, but instead spanned the period from goal
arrival to departure89, dissociating the reward site activity from spatially specific firing
during immobility90 and probably also from firing that occurs during SWRs26. However,
when multiple reward sites are present, even highly reward-specific cells tend to fire for only
one reward site, suggesting these cells signal a combination of reward and position rather
than exclusively reward91.

Evidence of reward history signals following reward consumption has also been limited, and
may be more prominent in downstream areas that receive hippocampal input60. As a notable
exception, hippocampal cells modulate their firing activity after probabilistic reward delivery
and after departure from the reward site, according to the reward outcome (Fig. 1e) and
its probabilistic value71,92. As with place field reorganization32, this value-modulated firing
is observed primarily in CA1 but not CA392. Similar to firing rate changes during reward
approach, this reward-history-dependent firing reflects the animal’s choice71. Together, this
collection of work indicates that the hippocampus processes reward-related signals across
multiple behavioural epochs that surround navigation to goals.
Representations of reward across hippocampal subcircuits.


How reward-related dynamics vary across hippocampal subregions and heterogeneous
subcircuits93 remains an active area of investigation. Within CA1, cells in the deep
pyramidal sublayer shift and restabilize their fields during goal-directed learning, whereas
superficial cells maintain their spatial selectivity77. Subpopulations of inhibitory CA1
interneurons are likewise differentially engaged in goal-directed behaviours, reorganizing
their activity to coordinate with newly learned pyramidal cell patterns94. For example,
interneurons expressing vasoactive intestinal polypeptide (VIP) show modulation near
learned goal locations that is crucial for reward-related shifts in the fields of pyramidal
cells78. Local circuit dynamics are therefore likely to critically shape CA1 representations of
reward.
In the DG, two subpopulations of excitatory cells known to exhibit distinct spatial coding
properties95-97 — granule cells and mossy cells — have recently been shown to also exhibit
distinct properties related to reward. DG granule cells respond to reward-predicting olfactory
stimuli98 and are also required for the reward-dependent enhancement of SWR reactivation
in CA3, particularly in a working memory task58. Mossy cells expressing the dopamine
D2 receptor in the DG hilus [G], the region between the granule cell layers, respond to
food cues and can suppress food intake when active99. Together, these findings suggest that
receipt of reward engages much of the dorsal hippocampal network.


An open question remains as to what degree dorsal and ventral HPC act as distinct circuits in
navigation and memory processes. Historically, the dorsal HPC (dHPC) has been proposed
to primarily encode spatial details, whereas the ventral hippocampus (vHPC) has been
considered a centre for emotion and valence processing. These ideas are reinforced by
denser innervation of the vHPC by catecholaminergic inputs, as well as stronger anatomical
outputs from the vHPC to regions implicated in reward processing, such as the PFC
and nucleus accumbens (NAc; for review, see refs 100,101). In addition, vHPC place cells
show modulation of firing around reward locations more often than dHPC cells do102,103,
and manipulations of vHPC projections to the NAc can drive or suppress reward-seeking
behaviours104-106. However, the behavioural effects of dHPC or vHPC inactivation on reward
memory are mixed107,108, and recent evidence has suggested a strong role for the dHPC in
processing reward information. For example, reward increases the rate of SWRs only in the
dHPC60, and the dHPC seems to engage reward-related activity patterns in the NAc60,109-111
more so than the vHPC does60. It is worth noting, however, that the vHPC is incredibly
heterogeneous in its cell types and targeting of downstream structures93,112. The possibility
of vHPC subcircuits dedicated to rewarding or aversive aspects of experience100,101 remains
to be investigated further.

Single-cell representations of reward in the medial entorhinal cortex.


As both a primary input and output of the HPC, the medial entorhinal cortex (MEC) is
uniquely poised to supply fundamental components of the hippocampal code and read
out the transformation of these components. These functions of the MEC have been
classically considered in the context of physical navigation. Very few studies, particularly
in rodents, have investigated MEC coding with respect to reward or higher cognitive
functions. However, recent work has revealed that the navigational codes of MEC neurons
are flexible depending on movement state113 and can reflect future destinations and past
route origins66,114,115 (Fig. 1f), thus implying the flexibility to encode goal approach and
reward history. Moreover, the firing of MEC grid cells does not explicitly require physical
movement, allowing for goal-related firing during immobility. Grid cells in non-human
primates respond to changes in visuospatial attention116, and both hippocampal place cells
and MEC grid cells in rats respond at discrete points along a manipulable auditory tone
axis50. Importantly, these responses are absent during passive tone playback without reward, but
show weak tuning to constant tones as long as reward is subsequently provided50. Taken
together, these findings indicate that engagement in a rewarded task substantially contributes
to MEC firing patterns, even in circumstances outside spatial navigation.

Recent studies have advanced this understanding, demonstrating changes in MEC firing
related to goal arrival during spatial memory tasks117,118. When reward is delivered in an
unmarked zone of an open field, grid cells increase the firing rates of their fields near the
reward zone (Fig. 1g), and non-grid spatial cells change the locations where they are active,
likewise yielding a population increase in firing near the reward location117. When reward
is delivered in multiple remembered locations of a cheeseboard maze [G], grid cells instead
shift their firing fields toward the reward locations118 (Fig. 1h). Although many factors
may contribute to the differences between results of these studies, one possibility is that
the increased stereotypy of behavioural trajectories in the cheeseboard maze yields a shift
much like the shift in place fields toward reward32, whereas approach to an unmarked zone
from multiple directions yields a greater number of spikes near reward without changing the
overall spatial distribution of firing fields (for example, as in ref. 88). In addition, the holes in
the cheeseboard maze provide location cues that themselves seem to distort the grid pattern
seen in open arenas118. Despite these differences, both patterns of modulation could serve to
amplify representations of specific locations in spatial reward memories.

Reward coding in the dopamine system


The reward-related modulation of hippocampal and entorhinal neurons naturally raises
the question of where this reward information originates. Among other neuromodulatory
inputs119, the midbrain dopaminergic system is a clear candidate for supplying reward
signals to the navigational system. Here, we briefly review the current understanding of the
dopaminergic system (for reviews, see refs 120-124) to provide context for how these reward
computations might play out in reward-directed navigation and spatial memory.

Dopamine neurons of the ventral tegmental area (VTA) are well known to signal reward
prediction error [G] (RPE)125-128, defined as the difference between expected and received
reward129. RPE serves as a fundamental teaching signal in a type of reinforcement learning
[G] known as temporal difference learning [G] (TD-RL)130, which can model the shift in
dopamine neuron firing from the reward to a given predictor over time (Fig. 2a). With each
outcome, the value of the predictor — how much it is ‘worth’ — is updated by RPE to
improve the accuracy of future performance120,122. This predictor can be a sensory stimulus,
choice, action, location or environmental context. In addition to RPE, VTA neurons encode
different levels of confidence about the reward outcome131,132, representing a distribution of
possible expected rewards as a population133. Moreover, dopamine neurons encode various
task parameters and decision variables not immediately evident from classical conditioning
tasks134,135.
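For readers less familiar with TD-RL, the following sketch shows the standard temporal-difference update in which the RPE is the discrepancy between the obtained outcome and the current value estimate. It is a generic textbook formulation applied to a toy cue-then-reward task; the state labels and parameter values are illustrative assumptions, not taken from any study discussed here.

```python
import numpy as np

def td_update(V, s, s_next, reward, alpha=0.1, gamma=0.9):
    """One temporal-difference (TD-RL) learning step.

    The reward prediction error (RPE) compares what was obtained
    (reward plus the discounted value of the next state) with what was
    expected (the value of the current state); the RPE then nudges the
    current state's value estimate toward the outcome.
    """
    rpe = reward + gamma * V[s_next] - V[s]
    V[s] += alpha * rpe
    return V, rpe

# Toy task: a cue state (0) is always followed by a reward state (1),
# after which the trial ends in a terminal state (2) whose value stays 0.
V = np.zeros(3)
for trial in range(200):
    V, rpe_cue = td_update(V, s=0, s_next=1, reward=0.0)   # cue, no reward yet
    V, rpe_rew = td_update(V, s=1, s_next=2, reward=1.0)   # reward delivered
# Across trials V[0] grows toward gamma * V[1] and rpe_rew shrinks toward
# zero: the prediction (and the error signal) migrates from the reward to
# its predictor, as in Fig. 2a.
```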

RPE signals are also reflected in dopamine release in downstream areas such as the NAc136,
where neurons encode rewards and the value of stimuli and actions that lead to reward (for
review see ref. 137). Concentrations of dopamine in the NAc ramp up as animals get closer
to reward in time and space, reflecting a reward expectation signal138-140. Additionally,
however, there is strong evidence for roles of dopamine in movement and motivation [G] to
work for future rewards121,123. Dopamine release increases at the onset of reward-seeking
actions141,142 and is tightly correlated with both the value of each action and the vigour
with which those actions are executed138,143 (Fig. 2b). These release dynamics seem to
be dissociable from VTA spiking143 (but see ref. 140), perhaps owing to local regulation
of release via receptors on dopaminergic axon terminals in the NAc (for review see ref.
144). In the context of navigation, dopaminergic RPE signals could facilitate learning in
response to changes in reward presence or location, whereas value signals could help
invigorate performance of learned routes. As others have eloquently described, however,
value (how much reward is expected in each state) and RPE (change in value between
adjacent states) are difficult to distinguish121,140. Further work remains to reconcile these
distinctions across tasks, as the relative contribution of value-specific and RPE-specific
computations to dopaminergic activity may vary greatly by task and subcircuit121.


Although substantial work has focused on the VTA, the VTA is not the only source of
dopamine in the brain. In addition to the substantia nigra pars compacta123, dopamine is
co-released with noradrenaline from neurons of the locus coeruleus (LC)145, a noradrenergic
centre commonly implicated in arousal, salience detection and cognitive flexibility (for
review see refs 146,147). Similar to VTA dopamine neurons, LC neurons respond to
unexpected reward, rapidly shift their firing to reward predictors148,149 and show firing
correlated with movement effort150,151. LC neurons also respond selectively to relevant
stimuli when the rules of a task change148,150, facilitating behavioural adaptation146.

Finally, dopamine is not the sole arbiter of reward information in the brain, nor is it the only
neurotransmitter released from dopaminergic neurons152. Multiple other neuromodulators
have roles in reward processing and memory119, including opioids153, noradrenaline146,147,
serotonin154-156 and acetylcholine157. In the remainder of this Review, we focus on the role
of dopamine in the HPC–EC system given the more extensive research on this subject,
but note that other neuromodulators could be equally fundamental to spatial memory and
deserve further study.

Dopamine in plasticity and learning


Dopamine is known to modulate hippocampal and entorhinal synaptic plasticity (Box 1),
and infusion of dopamine receptor antagonists into the HPC prevents rapid learning of
novel locations and contexts158-160. Yet, the principal source of dopaminergic input remains
unclear. Both the VTA and the LC innervate the HPC161,162 and EC163. Although little is
known about how these inputs affect functions of the EC, both VTA and LC inputs have
been shown to regulate hippocampal memory retention158,159,164. VTA axons to the dHPC
are sparse (compared with the dense innervation of the vHPC) and primarily target the CA1
and CA2 pyramidal layer and stratum oriens [G] 161,165,166 (Fig. 2c). LC axon terminals are
prominent in the dHPC and are uniformly distributed across the hippocampal laminae, most
densely innervating CA3158,159,162,164 (Fig. 2c). Complementing VTA dopamine, dopamine
co-released with noradrenaline from LC axons145 provides a large fraction of hippocampal
dopaminergic tone164. Moreover, dopamine mediates the memory changes observed after
LC manipulations, as these effects are blocked by dopamine receptor antagonists but not
noradrenaline receptor antagonists158,159,164. The relative contribution of VTA and LC
dopamine to hippocampal plasticity is still unclear, and may vary based on cellular target or
behavioural demand167.

Both VTA and LC dopaminergic inputs to the HPC have been implicated in shaping
and stabilizing spatial representations. Stability of the hippocampal map is facilitated by
VTA axon stimulation168 and reduced by inactivation of the VTA166 and LC159 as well
as by dopamine receptor antagonists31,169 (Fig. 2d). Complementing a role in stability,
dopaminergic transmission is required to flexibly update the hippocampal map to reflect
information most relevant to the current task169. Blockade of hippocampal dopamine
receptors during learning impairs memory of reward–location associations170 and prevents
animals from learning to find reward relative to a new set of sensory cues169, whereas
activation of VTA input enhances goal location memory168. Supporting this memory
function, dopaminergic input is involved in shifting place cell representations toward learned
reward locations (Fig. 2d). Optogenetic activation of VTA axons can shift the firing of
place cells toward the location of stimulation, whereas inhibition of these axons tends to
shift place fields away75. Recently, LC axon activity in the HPC was shown to signal
upcoming reward when the reward location moved79, in line with previous reports that LC
neurons signal unexpected reward contingencies148,171. Activating this LC projection can
reorganize place fields around a fixed reward location, but does not cause reorganization
at unrewarded locations or when the reward location is unpredictable, as with random
foraging79. Consistent with findings that place field reorganization can occur without
VTA input166, this suggests that dopaminergic input alone is not sufficient to drive the
hippocampal representation. Together, these studies point to multiple sources of dopamine
converging with spatial learning demands to shape hippocampal representations.

Hippocampal sequences in reward memory


How hippocampal activity is influenced by neuromodulation remains to be explored further.
However, the HPC can generate sequences that support reward-driven navigation regardless
of how reward is represented locally. Hippocampal network oscillations organize sequential
activity on multiple timescales (for reviews, see refs 172,173), including theta sequences
during movement and replay events during immobility. Such sequential activity is thought to
support the brain’s ability to store reward-related memories and retrieve them for decision-
making.

Theta sequences.
As an animal moves through space, hippocampal cells fire at progressively earlier phases
of the theta rhythm, a phenomenon called theta phase precession23,24. Spikes from cells
with overlapping place fields are nested into a theta sequence25,174,175, such that the neural
representation of space ‘sweeps’ from behind to ahead of the animal within a theta cycle
(Fig. 3a). Theta sequences have been observed in spatial and non-spatial tasks51 and are
thought to provide a mechanism for deliberation during navigation.

Hippocampal theta yields two patterns before navigational decisions. First, when the animal
pauses at a maze junction and looks side to side, theta sequences sweep ahead of the animal
in one direction and then in the other176, putatively helping the animal evaluate each path
before making a decision. These sequences extend the future locations represented, known
as the ‘look ahead distance’, predicting the chosen goal177. Second, during movement
before a maze junction, future choices are represented on alternating theta cycles178 (Fig.
3b). Future-signalling spikes occur on late phases of theta, whereas spikes signalling the
current178 or past179 location occur on the early phases. On the same theta time-scale,
neural representations can also alternate between current and previous goal configurations in
MEC118. MEC cells that fire on alternate theta cycles also tend to have distinct directional
preferences180, which could help represent bifurcating navigational choices. Moreover, an
ability to look ahead along future paths to goals has been proposed for MEC grid cells181,
which tend to represent locations just in front of the animal10. Together, these phenomena
imply that the HPC–EC system has access to a snapshot of the present, the immediate past
and a hypothetical future scenario, all within a time window of approximately 125 ms. Such
compression of experience would allow rapid predictions about upcoming states dependent
on recent experience182. An important avenue of future study is to understand how theta
sequences that alternate between future choices engage neurons representing the value of
each choice.

Sharp-wave ripples and replay.


During pauses in movement and during sleep, SWRs26 encompass approximately 50–
200 ms-long183 bursts of hippocampal spiking, ‘replaying’ sequences of place cells
that recapitulate paths taken through space (for reviews, see refs 184,185). SWRs during
immobility in awake behaviour have been causally linked to accurate memory retrieval and
learning37,38. Recent work further demonstrated that replay of a specific environment during
sleep is required to subsequently recall goal locations in that environment39.

The order and content of replay at different moments in goal-directed behaviour may depend
on task demands. Replay events occur in both forward and reverse directions relative to the
order of neuronal firing during the original experience28,29 (Fig. 3c). Forward replay occurs
more often before goal-directed trajectories and during pauses at decision points29,72,186,187,
suggesting a role in retrieving past experience to inform current decisions (Fig. 3d). Reverse
replay occurs primarily following receipt of reward after the completion of a path29,59,72
and when working memory is required72, suggesting a role in storing associations learned
from recent experience. Consistent with a role in decision-making, greater amounts of replay
of future alternatives predict better performance188 and the replayed sequence can predict
specific routes taken to goals186. However, pre-decision replay may instead reflect how
often and how far in the past the replayed trajectory was rewarded, rather than reflect
planning per se189. Consistent with a role for memory maintenance, in tasks with divergent
motivational demands (such as a choice between food and water), replay corresponds to the
unchosen option190. Collectively, these studies demonstrate a dynamic interplay between the
requirements of goal-oriented tasks and the structure of hippocampal replay.

After receipt of reward, an increased rate of SWRs57 in dHPC60 coincides with reward-
evoked dopamine release, which could cement associations between recently taken paths
and their reward outcomes28. Consistent with this hypothesis, larger reward sizes augment
the rate of reverse replay59, and reward enhances how closely forward replay sequences
match the experienced sequence61. These studies suggest that replay aligns with a dopamine
signal (Box 2) to link place cells along rewarded paths with the outcome. Replay has also
been recently suggested to facilitate the inferred association of reward-predicting cues and
outcomes191, suggesting that the ability of SWRs to link reward with preceding events
generalizes to non-spatial tasks.
The characteristics of MEC replay are not as well understood as those of HPC replay. Coherent
replay between place and grid cells has been sparsely observed in MEC layers V/VI192 (but
see ref. 193). By contrast, superficial MEC layers exhibit replay in both the forward and
reverse directions independently of the HPC114. At the same time, there is evidence that
coordinated activation of MEC layer III is required for extended SWRs in CA1194, which
have been shown to contain longer replay sequences183. These findings suggest that MEC
activity may help propagate hippocampal replay sequences over longer representational
distances.

Extrahippocampal interactions
Hippocampal interactions with downstream structures enable the navigational code to be
combined with additional behaviourally relevant information. In this section, we focus
on studies using simultaneous recordings across brain regions to examine how sequences
broadcast by the dHPC, where spatial specificity is highest100, are coordinated with neural
activity patterns related to goal-directed behaviour in downstream regions (for reviews, see
refs 195,196).

Neocortex.
Hippocampal SWRs activate widespread neocortical regions197, including neurons in the
primary visual and auditory cortices during sleep198,199. These cortices may in turn provide
an upstream input to SWRs199,200, as sound-responsive auditory cells can bias the HPC
to replay spatial information associated with a sound cue199,201. Cortical cells encoding
reward-predicting cues could therefore influence the HPC to reactivate cue representations
with spatial sequences and reward outcomes, forming a complete memory of a rewarding
experience.

Hippocampal sequences also strongly engage the PFC, a set of cortical regions implicated
in decision-making202. PFC cells show goal-related coding in spatial tasks203-205, spiking
near remembered goal locations even when goals are dissociated from reward delivery206. In
addition, PFC cells encode both spatial information and behaviourally relevant similarities
across spatial trajectories, such as junctions or endpoints203-205,207. This activity can
be thought of as representing discrete task states208 along trajectories to goals, which
may be important for generalizing task knowledge across similar experiences. During
periods of working memory preceding spatial choices, PFC cells align their spiking
to the hippocampal theta rhythm203,209,210, exhibiting theta phase precession211. This
theta coordination is enhanced by dopamine210, perhaps reflecting reward-predictive
computations preceding choice points. Recent work supports this possibility: HPC and PFC
theta sequences concurrently represent spatial trajectories and goals212,213, with the goal
location represented in the PFC predicting upcoming spatial choices214. These behaviourally
relevant PFC sequences reactivate during SWRs in both sleep and wake205,215. Coordinated
HPC–PFC replay during wake may help associate spatial locations to the more generalized
states represented by the PFC and retrieve these associations for decision-making205.
Consistent with this hypothesized function, cohesive replay across HPC–PFC ensembles
predicts an animal’s upcoming or recently traversed path to a greater degree than
hippocampal replay alone216.

Subcortical structures.
Both theta oscillations and SWRs have been shown to engage subcortical structures
implicated in reward and value processing. In the VTA, reward-encoding neurons spike
during SWRs in sleep, coordinated with hippocampal neurons representing reward
location217. These VTA neurons are preferentially reactivated during replay sequences that
move away from the animal’s current position217, consistent with the idea that replay
propagates reward value information backwards across locations218.

In the NAc, neurons that fire at reward sites during behaviour are likewise reactivated
following hippocampal replay in sleep219. During awake immobility, however, dHPC SWRs
instead reactivate NAc neurons that encode relative distances along spatial paths to goals60.
Similar to neurons in the PFC (for example, see ref. 207), these NAc neurons putatively
encode generalized states that may be coupled to goal-directed actions, such as trajectory
initiation. NAc neurons reactivated during SWRs additionally exhibit a reward history
signal, firing more along spatial trajectories after the animal has received a reward60. The
firing of these neurons may be modulated by local dopamine release that tracks reward
history143. Many task-responsive NAc neurons also fire according to hippocampal theta
phase60,220,221 or show theta phase precession111. This theta coordination of HPC–NAc
ensembles is well suited to associate spatial and reward information across the circuit
during spatial exploration. Rewarded contexts enhance this theta coupling109,110, and HPC–
NAc neuron pairs active together during a rewarded experience are reactivated together
in post-experience replay60,109,219. Upon re-exposure to a rewarded context, direct dorsal
CA1 innervation of NAc neurons, especially fast-spiking interneurons, is needed to organize
and reinstate the spiking of NAc ensembles associated with the reward memory110. These
ensembles may be recruited at times of decision-making, as hippocampal theta sequences at
choice points recruit the spiking of reward-related NAc neurons222, potentially facilitating
predictions of upcoming reward given each spatial choice.

Additional subcortical circuits are candidates for linking hippocampal sequences to reward
information. SWRs engage neurons in the lateral septum223 and basolateral amygdala
(BLA)224, preferentially recruiting a subpopulation of lateral septal neurons that respond
to reward and reward-predicting cues223. Of note, the BLA, lateral septum and NAc
each constitute a potential conduit that translates hippocampal inputs into outputs to the
VTA225, facilitating a loop between VTA and the HPC–EC system226. How computations
are transformed at each step of this loop remains to be explored. The collective
evidence suggests that hippocampal sequences join representations of space to generalized
representations of task states and reward outcomes.

A model for reward and navigation


Recent computational work provides a model for the intersection of spatial navigation
and reward prediction. This work posits that the HPC–EC system encodes a ‘successor
representation’ (SR)54, which quantifies the extent to which the current state predicts that
the animal will occupy other states in the environment, discounted by how far in the
future the other states are54,227,228 (Fig. 4a,b). The SR model is a generalized form of
temporal difference reinforcement learning (TD-RL), in that it uses prediction errors about
the occupancy of states to update transition probabilities between states, just as TD-RL uses
errors in predicted reward. Under the SR model, the value function [G] learned in TD-RL
is decomposed into the SR matrix of predictive states, multiplied by the reward expected
in each state228 (Fig. 4c). This factorization allows changes in reward in any given state to
easily propagate to the entire series of connected states. Further, the transition probabilities
between states can be learned even in the absence of reward (such as exploring a spatial
environment before receiving food in it)227. The SR framework can therefore model both
the dopaminergic system and the HPC–EC navigational system.
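A minimal sketch of the SR idea, assuming a small discrete state space, is given below: the successor matrix accumulates discounted expected occupancy of future states, and value is recovered by multiplying this matrix by a reward vector. The toy environment, learning rule and parameters follow the general form described in refs 54,227,228 but are simplified for illustration rather than reproducing any specific implementation.

```python
import numpy as np

n_states, gamma, alpha = 5, 0.9, 0.1        # toy linear track with 5 states
M = np.eye(n_states)                         # successor matrix: M[s, s'] estimates the
                                             # discounted future occupancy of s' from s
R = np.zeros(n_states)
R[-1] = 1.0                                  # reward is available only in the last state

def sr_td_update(M, s, s_next, alpha=alpha, gamma=gamma):
    """TD-style update of the successor representation: the prediction
    error is over future state occupancy rather than over reward."""
    target = np.eye(len(M))[s] + gamma * M[s_next]
    M[s] += alpha * (target - M[s])
    return M

# Transition structure can be learned from experience alone, even before
# any reward is encountered (here, repeated left-to-right runs).
for run in range(200):
    for s in range(n_states - 1):
        M = sr_td_update(M, s, s + 1)

# Value then factorizes as V = M @ R (Fig. 4c): changing R in one state
# immediately propagates new values to every state that predicts it.
V = M @ R
```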

The SR model is compatible with multiple experimental findings in describing the HPC–
EC system53,54. Hippocampal place cells are proposed to encode SR as a rate code,
reaching their peak firing rate when the animal is physically located in the state that is
best predicted by a given cell (Fig. 4c). The SR model accurately predicts a higher density
of place fields near goals74 and may explain prospective firing rate changes based on future
destination65,66, as states on the centre stem of an alternation task could be dissociable
based on how much they predict states just past the junction on a given trial. The EC
is proposed to encode a low-dimensional readout of the SR, representing the underlying
correlation structure of the relationships between states54,229. Importantly, the SR develops
through learning: the more a state is occupied over experience, the more it is predicted230.
This means that, in open environments where reward is randomly scattered, the structure
of the SR is represented as an evenly spaced grid, because the animal does not repeatedly
visit any particular trajectory (that is, any particular set of transitions between states)230.
If certain trajectories are navigated many times, for example when running between fixed
goal locations, increased occupancy of states along the trajectory shifts grid fields closer to
the goals118,230. If certain states are occupied more but not via repeated trajectories, such
as when navigating to a hidden goal in the open field, transition probabilities only increase
locally and may be reflected as an increased firing rate of grid cell fields117.

SR theory is useful in conceptualizing how state transitions (that is, the order of episodic
events) learned in the HPC–EC could be used to make value predictions, consistent with
a role for the HPC in value-based decision making231-234. Replay has been proposed to
support the assignment of value to sequences of states218. Under the ‘prioritized replay’
proposal, forward replays are prioritized at moments of decision-making to compute the
value of states along upcoming possible routes, increasing the animal’s probability of
making a correct choice218. Theta sequences could also perform this function by serially
sampling alternative trajectories and estimating the value of each one196,235. Once reward
is received, reverse replay is prioritized to propagate any positive RPE backwards to update
the value of preceding states218. Alternatively, estimating values and storing newly updated
estimates may be simultaneous processes that occur in both types of replay184,185.
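The sketch below illustrates the 'prioritized replay' intuition for reverse replay, assuming a single rewarded path through a discrete state space: sweeping backwards from the outcome lets value reach early states in one pass. The full-backup update and parameter values are simplifying assumptions for illustration, not the algorithm proposed in ref. 218.

```python
import numpy as np

def reverse_replay_update(V, path, reward, gamma=0.9):
    """Propagate the outcome of a just-completed path backwards, as
    hypothesized for reverse replay at reward. States nearest the reward
    are updated first, so each earlier state immediately 'sees' the new
    value of its successor (a full backup is used here for illustration)."""
    next_value = 0.0                          # the trial ends after the reward
    for s in reversed(path):
        r = reward if s == path[-1] else 0.0  # reward only at the final state
        V[s] = r + gamma * next_value
        next_value = V[s]
    return V

V = np.zeros(6)
path = [0, 1, 2, 3, 4, 5]                     # a run ending at a rewarded state (5)
V = reverse_replay_update(V, path, reward=1.0)
# After a single reverse replay, value has spread all the way back to the
# start of the path (V[0] = 0.9**5, about 0.59); with one-step updates made
# only during behaviour, the same spread would require many traversals.
```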

How value gets assigned to each state in the SR at the neural level remains an open question.
If hippocampal cells solely represent SRs, they would need to combine their firing with
an external reward prediction signal to compute value. Dopaminergic release in the NAc
and the spiking of reward-related NAc or VTA neurons coincident with HPC–EC sequences
may link states to value representations across the neural circuit. In this scenario, sequences
broadcast by the HPC would be evaluated by basal ganglia circuitry in the process of
selecting an action to achieve the desired outcome196,236. To apply learned state–reward
predictions in similar contexts, the spiking of PFC neurons may help to generalize value
assignments from individual spatial states to task states that share similar features53,208,230.
Alternatively, reward-related firing patterns in the HPC–EC suggest that value could also
be computed locally, perhaps via dopaminergic modulation. For example, neuromodulatory
inputs could potentiate the synapses between CA3 and CA1 neurons to amplify the CA1
spike rate for higher-valued sequences237. In either case, the HPC–EC serves as an interface
in a network of brain structures to link individual successor states (place), value and task
states to facilitate the learning and performance of goal-directed actions (Fig. 4d).

Conclusion
Here, we hypothesize that a key role of the HPC–EC system in rewarded navigation is to
generate sequences of events to link the beginning of an experience with its outcome. We
propose that neuromodulatory inputs may sculpt HPC–EC representations to either provide
local value information or shift the distribution of states, ‘weighting’ rewarded regions of
space more heavily than unrewarded regions. Subsequently, hippocampal sequences may
broadcast these states in a compressed manner to downstream regions, to be linked with
episodic details in memory formation, generalized across task knowledge or utilized for
action generation.

Framing HPC–EC activity as encoding states allows the known navigational codes of this
system to be applied more flexibly in non-spatial domains235. Yet, many open questions
remain (Box 3). Moreover, reward is intrinsically difficult to disentangle from studies of
HPC–EC physiology, as laboratory animals are unlikely to traverse spatial environments
without receiving reward. This necessity of motivation for behaviour may help to explain
why signals of reward and value seem redundant throughout the brain. In this way, reward
is reminiscent of other signals critical for survival, such as thirst and movement, which
modulate activity in nearly all brain areas238-240. Despite these challenges to understanding
reward, the tractable nature of navigational codes makes the HPC–EC circuit a candidate
model system to learn how reward drives both adaptive and maladaptive memory-dependent
behaviours.

Acknowledgements
The authors thank A. Mohebi for feedback on the manuscript, M. Plitt for insightful discussions and E. Duvelle
for helpful correspondence. This work was supported by the US Office of Naval Research N00141812690, Simons
Foundation 542987SPI, the Vallee Foundation and James S McDonnell Foundation to L.M.G.

Glossary:
Grid cell
An entorhinal cortical cell that fires in triangularly spaced fields that tile the whole
environment.
Place cell
A hippocampal cell that fires maximally in one or few discrete regions of space (its ‘place
field’).

Theta sequences
Sequential spikes of multiple place cells that together encode a trajectory through space,
ordered by the theta phase of each spike. Theta sequences occur during times of high theta
power, typically during movement.

Sharp-wave ripples
(SWRs). High-frequency oscillations (about 150–250 Hz) in the local field potential (LFP)
coincident with a sharp, low-frequency deflection in the LFP. These events reflect the
coincident activation of many hippocampal cells in a short time period (about 50–200 ms)
and typically occur during immobility.

Replay events
Sequential spikes of multiple place cells that typically occur locked to SWRs during
immobility, and that together encode a trajectory through space. In high-fidelity replay
events, place cells in the sequence are reactivated according to the order in which they fired
during a previous run.

State
A snapshot of a situation, discretizing a longer continuous process that comprises an
experience. As an analogy, if this snapshot were taken by a camera, the duration of the
state would be the exposure time and would vary depending on the situation (for example,
how dark it is outside).

Value
How much an outcome, or state that predicts an outcome, is ‘worth’. This worth includes the
amount and likelihood of reward predicted.

Probabilistic value
The probability that reward will be delivered given a certain choice. Even if a choice is
correct according to the task, changing the probability of reward delivery can modulate the
value of preceding states.

Cheeseboard maze
A spatial task in which rewards are hidden in a subset of holes or wells in the floor of
an open arena. This task is used as a spatial memory paradigm because the animal has to
remember which wells are rewarded based on their position in the environment, and the
reward locations can change across sessions or days.

Reward prediction error
The difference between reward received and reward expected. Positive RPEs indicate larger
rewards than expected (including reward when none was expected), whereas negative RPEs
indicate smaller rewards than expected (including the absence of reward when it was
expected).

Reinforcement learning
A set of computational theories, often used for machine learning, to describe how states and
actions are assigned values that inform how an agent can receive maximal reward.

Temporal difference learning
A type of reinforcement learning in which values are updated by a reward prediction error
between temporally adjacent states, such that states preceding the reward receive a ‘cached’
value prediction.

Motivation
The impetus an agent feels to perform reward-seeking actions. Value is used to inform
motivation and invigorate reward-seeking actions (make them faster and more efficient).

Value function
A function of adjacent states, or states paired with actions, that computes the expected future
reward in each state.

References
1. Robinson TE & Berridge KC The psychology and neurobiology of addiction: an incentive-
sensitization view. Addiction 95 Suppl 2, S91–117, (2000). [PubMed: 11002906]


2. Crombag HS & Shaham Y Renewal of drug seeking by contextual cues after prolonged extinction in
rats. Behavioral neuroscience 116, 169–173, (2002). [PubMed: 11895178]
3. O'Keefe J & Nadel L The hippocampus as a cognitive map. (Oxford University Press, 1978).
4. Hafting T, Fyhn M, Molden S, Moser MB & Moser EI Microstructure of a spatial map in the
entorhinal cortex. Nature 436, 801–806, (2005). [PubMed: 15965463]
5. McNaughton BL, Battaglia FP, Jensen O, Moser EI & Moser MB Path integration and the neural
basis of the 'cognitive map'. Nat Rev Neurosci 7, 663–678, (2006). [PubMed: 16858394]
6. Diehl GW, Hon OJ, Leutgeb S & Leutgeb JK Grid and nongrid cells in medial entorhinal cortex
represent spatial location and environmental features with complementary coding schemes. Neuron
94, 83–92, (2017). [PubMed: 28343867]
7. Taube JS The head direction signal: origins and sensory-motor integration. Annu Rev Neurosci 30,
181–207, (2007). [PubMed: 17341158]
8. Taube JS, Muller RU & Ranck JB Jr. Head-direction cells recorded from the postsubiculum in freely
moving rats. II. Effects of environmental manipulations. J Neurosci 10, 436–447, (1990). [PubMed:
2303852]
9. Sargolini F et al. Conjunctive representation of position, direction, and velocity in entorhinal cortex.
Science 312, 758–762, (2006). [PubMed: 16675704]
10. Kropff E, Carmichael JE, Moser MB & Moser EI Speed cells in the medial entorhinal cortex.
Nature 523, 419–424, (2015). [PubMed: 26176924]
11. Hoydal OA, Skytoen ER, Andersson SO, Moser MB & Moser EI Object-vector coding in the
medial entorhinal cortex. Nature 568, 400–404, (2019). [PubMed: 30944479]
12. Solstad T, Boccara CN, Kropff E, Moser MB & Moser EI Representation of geometric borders in
the entorhinal cortex. Science 322, 1865–1868, (2008). [PubMed: 19095945]
13. Savelli F, Yoganarasimha D & Knierim JJ Influence of boundary removal on the spatial
representations of the medial entorhinal cortex. Hippocampus 18, 1270–1282, (2008). [PubMed:
19021262]
14. Tsao A et al. Integrating time from experience in the lateral entorhinal cortex. Nature 561, 57–62,
(2018). [PubMed: 30158699]


15. Deshmukh SS & Knierim JJ Representation of non-spatial and spatial information in the lateral
entorhinal cortex. Front Behav Neurosci 5, doi:10.3389/fnbeh.2011.00069, (2011).
16. Tsao A, Moser MB & Moser EI Traces of experience in the lateral entorhinal cortex. Curr Biol 23,
399–405, (2013). [PubMed: 23434282]
17. Fiete IR, Burak Y & Brookings T What Grid Cells Convey About Rat Location. J Neurosci 28,
6858–6871, (2008). [PubMed: 18596161]
18. Allen K et al. Impaired path integration and grid cell spatial periodicity in mice lacking GluA1-
containing AMPA receptors. J Neurosci 34, 6245–6259, (2014). [PubMed: 24790195]

19. Gil M et al. Impaired path integration in mice with disrupted grid cell firing. Nature Neuroscience
21, 81–91, (2018).
20. Rowland DC, Roudi Y, Moser MB & Moser EI Ten years of grid cells. Annu Rev Neurosci 39,
19–40, (2016). [PubMed: 27023731]
21. Burwell RD & Witter MP in The Parahippocampal Region: organization and role in cognitive
function (eds Witter MP & Wouterlood FG) (Oxford University Press, 2002).
22. O'Keefe J & Dostrovsky J The hippocampus as a spatial map. Preliminary evidence from unit
activity in the freely-moving rat. Brain research 34, 171–175, (1971). [PubMed: 5124915]
23. O'Keefe J & Recce ML Phase relationship between hippocampal place units and the EEG theta
rhythm. Hippocampus 3, 317–330, (1993). [PubMed: 8353611]
24. Skaggs WE, McNaughton BL, Wilson MA & Barnes CA Theta phase precession in hippocampal
neuronal populations and the compression of temporal sequences. Hippocampus 6, 149–172,
(1996). [PubMed: 8797016]
25. Dragoi G & Buzsaki G Temporal encoding of place sequences by hippocampal cell assemblies.
Neuron 50, 145–157, (2006). [PubMed: 16600862]
26. Buzsaki G Hippocampal sharp wave-ripple: A cognitive biomarker for episodic memory and
planning. Hippocampus 25, 1073–1188, (2015). [PubMed: 26135716]


27. Wilson MA & McNaughton BL Reactivation of hippocampal ensemble memories during sleep.
Science 265, 676–679, (1994). [PubMed: 8036517]
28. Foster DJ & Wilson MA Reverse replay of behavioural sequences in hippocampal place cells
during the awake state. Nature 440, 680–683, (2006). [PubMed: 16474382]
29. Diba K & Buzsaki G Forward and reverse hippocampal place-cell sequences during ripples. Nat.
Neurosci 10, 1241–1242, (2007). [PubMed: 17828259]
30. Lee AK & Wilson MA Memory of sequential experience in the hippocampus during slow wave
sleep. Neuron 36, 1183–1194, (2002). [PubMed: 12495631]
31. Kentros CG, Agnihotri NT, Streater S, Hawkins RD & Kandel ER Increased attention to spatial
context increases both place field stability and spatial memory. Neuron 42, 283–295, (2004).
[PubMed: 15091343]
32. Dupret D, O'Neill J, Pleydell-Bouverie B & Csicsvari J The reorganization and reactivation of
hippocampal maps predict spatial memory performance. Nat Neurosci 13, 995–1002, (2010).
[PubMed: 20639874] This landmark study established that the clustering of hippocampal
place fields near reward locations requires plasticity during learning to retain the reorganized
representation during memory retrieval, and that reward memory is supported by reactivation of
the reorganized representation during sharp-wave ripples.
33. de Lavilleon G, Lacroix MM, Rondi-Reig L & Benchenane K Explicit memory creation during
sleep demonstrates a causal role of place cells in navigation. Nat Neurosci 18, 493–495, (2015).
[PubMed: 25751533]
34. Robbe D & Buzsaki G Alteration of theta timescale dynamics of hippocampal place cells by
a cannabinoid is associated with memory impairment. J Neurosci 29, 12597–12605, (2009).
[PubMed: 19812334]
35. Petersen PC & Buzsaki G Cooling of Medial Septum Reveals Theta Phase Lag Coordination of
Hippocampal Cell Assemblies. Neuron 107, 731–744 e733, (2020). [PubMed: 32526196]
36. Bolding KA, Ferbinteanu J, Fox SE & Muller RU Place cell firing cannot support navigation
without intact septal circuits. Hippocampus 30, 175–191, (2020). [PubMed: 31301167]
37. Jadhav SP, Kemere C, German PW & Frank LM Awake hippocampal sharp-wave ripples support
spatial memory. Science 336, 1454–1458, (2012). [PubMed: 22555434]


38. Fernandez-Ruiz A et al. Long-duration hippocampal sharp wave ripples improve memory. Science
364, 1082–1086, (2019). [PubMed: 31197012]
39. Gridchyn I, Schoenenberger P, O'Neill J & Csicsvari J Assembly-Specific Disruption of
Hippocampal Replay Leads to Selective Memory Deficit. Neuron, (2020).
40. Ego-Stengel V & Wilson MA Disruption of ripple-associated hippocampal activity during rest
impairs spatial learning in the rat. Hippocampus 20, 1–10, (2010). [PubMed: 19816984]

41. Girardeau G, Benchenane K, Wiener SI, Buzsaki G & Zugaro MB Selective suppression of
hippocampal ripples impairs spatial memory. Nat Neurosci 12, 1222–1223, (2009). [PubMed: 19749750]
42. Robinson NTM et al. Targeted Activation of Hippocampal Place Cells Drives Memory-Guided
Spatial Behavior. Cell 183, 2041–2042, (2020). [PubMed: 33357402]
43. Heys JG & Dombeck DA Evidence for a subcircuit in medial entorhinal cortex representing
elapsed time during immobility. Nat Neurosci 21, 1574–1582, (2018). [PubMed: 30349104]
44. Sun C, Yang W, Martin J & Tonegawa S Hippocampal neurons represent events as transferable
units of experience. Nat Neurosci, (2020).
45. Taxidis J et al. Differential Emergence and Stability of Sensory and Temporal Representations
in Context-Specific Hippocampal Sequences. Neuron 108, 984–998 e989, (2020). [PubMed:
32949502]
46. MacDonald CJ, Lepage KQ, Eden UT & Eichenbaum H Hippocampal "time cells" bridge the gap
in memory for discontiguous events. Neuron 71, 737–749, (2011). [PubMed: 21867888]
47. Pastalkova E, Itskov V, Amarasingham A & Buzsaki G Internally generated cell assembly
sequences in the rat hippocampus. Science 321, 1322–1327, (2008). [PubMed: 18772431]
48. Kraus BJ et al. Grid cells are time cells. SFN Neurosci. Abstr 769.19, (2013).
49. Shimbo A, Izawa EI & Fujisawa S Scalable representation of time in the hippocampus. Sci Adv 7,
(2021).
50. Aronov D, Nevers R & Tank DW Mapping of a non-spatial dimension by the hippocampal-
entorhinal circuit. Nature 543, 719–722, (2017). [PubMed: 28358077]
51. Terada S, Sakurai Y, Nakahara H & Fujisawa S Temporal and Rate Coding for Discrete Event
Sequences in the Hippocampus. Neuron 94, 1248–1262 e1244, (2017). [PubMed: 28602691]
52. Radvansky BA & Dombeck DA An olfactory virtual reality system for mice. Nat Commun 9, 839,
(2018). [PubMed: 29483530]
53. Behrens TEJ et al. What Is a Cognitive Map? Organizing Knowledge for Flexible Behavior.
Neuron 100, 490–509, (2018). [PubMed: 30359611]
54. Stachenfeld KL, Botvinick MM & Gershman SJ The hippocampus as a predictive map. Nat
Neurosci 20, 1643–1653, (2017). [PubMed: 28967910] This computational modelling paper
proposes that the hippocampal-entorhinal system encodes a successor representation of predicted
future states, unifying findings made during spatial navigation studies with a reinforcement
learning framework.
55. Klukas M, Lewis M & Fiete I Efficient and flexible representation of higher-dimensional cognitive
variables with grid cells. PLoS Comput Biol 16, e1007796, (2020). [PubMed: 32343687]
56. Burgess N, Maguire EA & O’Keefe J The human hippocampus and spatial and episodic memory.
Neuron 35, 625–641, (2002). [PubMed: 12194864]
57. Singer AC & Frank LM Rewarded outcomes enhance reactivation of experience in the
hippocampus. Neuron 64, 910–921, (2009). [PubMed: 20064396] This key set of findings
demonstrated a specific enhancement of hippocampal sharp-wave ripples by receipt of reward
in the awake state, with reward increasing both the prevalence of sharp-wave ripple events and the
reactivation of place cells involved in the task.
58. Sasaki T et al. Dentate network activity is necessary for spatial working memory by supporting
CA3 sharp-wave ripple generation and prospective firing of CA3 neurons. Nat Neurosci 21, 258–
269, (2018). [PubMed: 29335604]
59. Ambrose RE, Pfeiffer BE & Foster DJ Reverse Replay of Hippocampal Place Cells Is Uniquely
Modulated by Changing Reward. Neuron 91, 1124–1136, (2016). [PubMed: 27568518]


60. Sosa M, Joo HR & Frank LM Dorsal and Ventral Hippocampal Sharp-Wave Ripples Activate
Distinct Nucleus Accumbens Networks. Neuron 105, 725–741 e728, (2020). [PubMed: 31864947]
61. Bhattarai B, Lee JW & Jung MW Distinct effects of reward and navigation history on hippocampal
forward and reverse replays. Proc Natl Acad Sci U S A 117, 689–697, (2020). [PubMed:
31871185]
62. Eichenbaum H, Kuperstein M, Fagan A & Nagode J Cue-sampling and goal-approach correlates of
hippocampal unit-activity in rats performing an odor-discrimination task. Journal Of Neuroscience
7, 716–732, (1987). [PubMed: 3559709]

63. Markus EJ et al. Interactions between location and task affect the spatial and directional firing of
hippocampal neurons. J Neurosci 15, 7079–7094, (1995). [PubMed: 7472463]
64. Aoki Y, Igata H, Ikegaya Y & Sasaki T The Integration of Goal-Directed Signals onto Spatial
Maps of Hippocampal Place Cells. Cell Rep 27, 1516–1527 e1515, (2019). [PubMed: 31042477]
65. Wood ER, Dudchenko PA, Robitsek RJ & Eichenbaum H Hippocampal neurons encode
information about different types of memory episodes occurring in the same location. Neuron
27, 623–633, (2000). [PubMed: 11055443]
66. Frank LM, Brown EN & Wilson M Trajectory encoding in the hippocampus and entorhinal cortex.
Neuron 27, 169–178, (2000). [PubMed: 10939340] This study is one of the first (see also Wood et
al. 2000) to demonstrate prospective and retrospective coding in both the hippocampus and medial
entorhinal cortex, indicating that cells previously thought to code only spatial locations can reflect
mnemonic processing of the animal’s future or past route.
67. Grieves RM, Wood ER & Dudchenko PA Place cells on a maze encode routes rather than
destinations. Elife 5, (2016).
68. Ito HT, Zhang S, Witter MP, Moser EI & Moser MB A prefrontal-thalamo-hippocampal circuit for
goal directed spatial navigation. Nature 522, 50–55, (2015). [PubMed: 26017312]
69. Lee I, Griffin AL, Zilli EA, Eichenbaum H & Hasselmo ME Gradual translocation of spatial
correlates of neuronal firing in the hippocampus toward prospective reward locations. Neuron 51,
639–650, (2006). [PubMed: 16950161]
70. Kennedy PJ & Shapiro ML Motivational states activate distinct hippocampal representations to
guide goal-directed behaviors. Proc. Natl. Acad. Sci. USA 106, 10805–10810, (2009). [PubMed:
19528659]
71. Lee H, Ghim JW, Kim H, Lee D & Jung M Hippocampal neural correlates for values of
experienced events. J Neurosci 32, 15053–15065, (2012). [PubMed: 23100426]
72. Xu H, Baracskay P, O'Neill J & Csicsvari J Assembly Responses of Hippocampal CA1 Place Cells
Predict Learned Behavior in Goal-Directed Spatial Tasks on the Radial Eight-Arm Maze. Neuron
101, 119–132 e114, (2019). [PubMed: 30503645]
73. Sarel A, Finkelstein A, Las L & Ulanovsky N Vectorial representation of spatial goals in the
hippocampus of bats. Science 355, 176–180, (2017). [PubMed: 28082589]
74. Hollup SA, Molden S, Donnett JG, Moser MB & Moser EI Accumulation of hippocampal place
fields at the goal location in an annular watermaze task. J Neurosci 21, 1635–1644, (2001).
[PubMed: 11222654] This paper was the first to clearly demonstrate that hippocampal place fields
cluster near goal locations, using a ring-shaped water maze.
75. Mamad O et al. Place field assembly distribution encodes preferred locations. PLoS Biol 15,
e2002365, (2017). [PubMed: 28898248] This study found that optogenetic manipulation of ventral
tegmental area inputs to the dorsal hippocampus can drive a behavioural place preference as well
as a shift in place fields toward the location of stimulation.
76. Xiao Z, Lin K & Fellous JM Conjunctive reward-place coding properties of dorsal distal CA1
hippocampus cells. Biol Cybern 114, 285–301, (2020). [PubMed: 32266474]
77. Danielson NB et al. Sublayer-specific coding dynamics during spatial navigation and learning in
hippocampal area CA1. Neuron 91, 652–665, (2016). [PubMed: 27397517]
78. Turi GF et al. Vasoactive Intestinal Polypeptide-Expressing Interneurons in the Hippocampus
Support Goal-Oriented Spatial Learning. Neuron 101, 1150–1165 e1158, (2019). [PubMed:
30713030]
79. Kaufman AM, Geiller T & Losonczy A A Role for the Locus Coeruleus in Hippocampal CA1
Place Cell Reorganization during Spatial Reward Learning. Neuron 105, 1018–1026 e1014,
(2020). [PubMed: 31980319] This elegant two-photon imaging work demonstrated for the first
time that the activity of locus coeruleus axons in the dorsal hippocampus signals changes in
a reward location, and that manipulating these inputs can modify the hippocampal population
representation of reward. Together with the work by Mamad et al. 2017, these studies implicate
dopaminergic inputs in reorganizing the hippocampal map around reward sites.
80. Zaremba JD et al. Impaired hippocampal place cell dynamics in a mouse model of the 22q11.2
deletion. Nat Neurosci 20, 1612–1623, (2017). [PubMed: 28869582]

81. Kobayashi T, Nishijo H, Fukuda M, Bures J & Ono T Task-dependent representations in rat
hippocampal place neurons. J. Neurophysiol 78, 597–613, (1997). [PubMed: 9307098]
82. Kobayashi T, Tran AH, Nishijo H, Ono T & Matsumoto G Contribution of hippocampal place
cell activity to learning and formation of goal-directed navigation in rats. Neuroscience 117, 1025–
1035, (2003). [PubMed: 12654354]
83. Tryon VL et al. Hippocampal neural activity reflects the economy of choices during goal-directed
navigation. Hippocampus 27, 743–758, (2017). [PubMed: 28241404]
84. Mizuta K, Nakai J, Hayashi Y & Sato M Multiple coordinated cellular dynamics mediate CA1 map
plasticity. Hippocampus 31, 235–243, (2021). [PubMed: 33452849]
85. Sato M et al. Distinct Mechanisms of Over-Representation of Landmarks and Rewards in the
Hippocampus. Cell Rep 32, 107864, (2020). [PubMed: 32640229]
86. McKenzie S, Robinson NT, Herrera L, Churchill JC & Eichenbaum H Learning causes
reorganization of neuronal firing patterns to represent related experiences within a hippocampal
schema. J Neurosci 33, 10243–10256, (2013). [PubMed: 23785140]
87. Hok V et al. Goal-related activity in hippocampal place cells. J. Neurosci 27, 472–482, (2007).
[PubMed: 17234580]
88. Duvelle E et al. Insensitivity of Place Cells to the Value of Spatial Goals in a Two-Choice Flexible
Navigation Task. J Neurosci 39, 2522–2541, (2019). [PubMed: 30696727]
89. Gauthier JL & Tank DW A dedicated population for reward coding in the hippocampus.
Neuron 99, 179–193, (2018). [PubMed: 30008297] This two-photon imaging study uncovered
a subpopulation of hippocampal neurons specialized for encoding reward locations despite
changes in location or environmental context, suggesting that a hippocampal reward signal can
be dissociated from place firing.
90. Kay K et al. A hippocampal network for spatial coding during immobility and sleep. Nature 531,
185–190, (2016). [PubMed: 26934224]
91. Lee JS, Briguglio JJ, Cohen JD, Romani S & Lee AK The Statistical Structure of the Hippocampal
Code for Space as a Function of Time, Context, and Value. Cell 183, 620–635 e622, (2020).
[PubMed: 33035454]
92. Lee SH et al. Neural Signals Related to Outcome Evaluation Are Stronger in CA1 than CA3. Front
Neural Circuits 11, 40, (2017). [PubMed: 28638322]
93. Cembrowski MS & Spruston N Heterogeneity within classical cell types is the rule: lessons from
hippocampal pyramidal neurons. Nat Rev Neurosci 20, 193–204, (2019). [PubMed: 30778192]
94. Dupret D, O'Neill J & Csicsvari J Dynamic reconfiguration of hippocampal interneuron circuits
during spatial learning. Neuron 78, 166–180, (2013). [PubMed: 23523593]
95. Danielson NB et al. In Vivo Imaging of Dentate Gyrus Mossy Cells in Behaving Mice. Neuron 93,
552–559 e554, (2017). [PubMed: 28132825]
96. Senzai Y & Buzsaki G Physiological Properties and Behavioral Correlates of Hippocampal
Granule Cells and Mossy Cells. Neuron 93, 691–704 e695, (2017). [PubMed: 28132824]
97. GoodSmith D et al. Spatial representations of granule cells and mossy cells of the dentate gyrus.
Neuron 93, 677–690, (2017). [PubMed: 28132828]
98. Woods NI et al. The Dentate Gyrus Classifies Cortical Representations of Learned Stimuli. Neuron
107, 173–184 e176, (2020). [PubMed: 32359400]
99. Azevedo EP et al. A Role of Drd2 Hippocampal Neurons in Context-Dependent Food Intake.
Neuron 102, 873–886 e875, (2019). [PubMed: 30930044]
100. Strange BA, Witter MP, Lein ES & Moser EI Functional organization of the hippocampal
longitudinal axis. Nat Rev Neurosci 15, 655–669, (2014). [PubMed: 25234264]
101. Bryant KG & Barker JM Arbitration of Approach-Avoidance Conflict by Ventral Hippocampus.
Front Neurosci 14, 615337, (2020). [PubMed: 33390895]
102. Royer S, Sirota A, Patel J & Buzsaki G Distinct representations and theta dynamics in dorsal and
ventral hippocampus. J. Neurosci 30, 1777–1787, (2010). [PubMed: 20130187]
103. Ciocchi S, Passecker J, Malagon-Vina H, Mikus N & Klausberger T Brain computation. Selective
information routing by ventral hippocampal CA1 projection neurons. Science 348, 560–563,
(2015). [PubMed: 25931556]

104. Britt JP et al. Synaptic and behavioral profile of multiple glutamatergic inputs to the nucleus
accumbens. Neuron 76, 790–803, (2012). [PubMed: 23177963]
105. LeGates TA et al. Reward behaviour is regulated by the strength of hippocampus-nucleus
accumbens synapses. Nature 564, 258–262, (2018). [PubMed: 30478293]
106. Zhou Y et al. A ventral CA1 to nucleus accumbens core engram circuit mediates conditioned
place preference for cocaine. Nat Neurosci 22, 1986–1999, (2019). [PubMed: 31719672]
107. Meyers RA, Zavala AR & Neisewander JL Dorsal, but not ventral, hippocampal lesions disrupt
cocaine place conditioning. Neuroreport 14, 2127–2131, (2003). [PubMed: 14600510]
108. Riaz S, Schumacher A, Sivagurunathan S, Van Der Meer M & Ito R Ventral, but not dorsal,
hippocampus inactivation impairs reward memory expression and retrieval in contexts defined by
proximal cues. Hippocampus 27, 822–836, (2017). [PubMed: 28449268]
109. Sjulson L, Peyrache A, Cumpelik A, Cassataro D & Buzsaki G Cocaine Place Conditioning
Strengthens Location-Specific Hippocampal Coupling to the Nucleus Accumbens. Neuron 98,
926–934.e925, (2018). [PubMed: 29754750]
110. Trouche S et al. A Hippocampus-Accumbens Tripartite Neuronal Motif Guides Appetitive
Memory in Space. Cell 176, 1393–1406 e1316, (2019). [PubMed: 30773318]
111. van der Meer MA & Redish AD Theta phase precession in rat ventral striatum links place and
reward information. J. Neurosci 31, 2843–2854, (2011). [PubMed: 21414906]
112. Gergues MM et al. Circuit and molecular architecture of a ventral hippocampal network. Nat
Neurosci 23, 1444–1452, (2020). [PubMed: 32929245]
113. Hardcastle K, Maheswaranathan N, Ganguli S & Giocomo LM A multiplexed, heterogeneous,
and adaptive code for navigation in medial entorhinal cortex. Neuron 94, 375–387, (2017).
[PubMed: 28392071]
114. O'Neill J, Boccara CN, Stella F, Schoenenberger P & Csicsvari J Superficial layers of the medial
entorhinal cortex replay independently of the hippocampus. Science 355, 184–188, (2017).
[PubMed: 28082591]
115. Lipton PA, White JA & Eichenbaum H Disambiguation of overlapping experiences by neurons in
the medial entorhinal cortex. J Neurosci 27, 5787–5795, (2007). [PubMed: 17522322]
116. Wilming N, Konig P, Konig S & Buffalo EA Entorhinal cortex receptive fields are modulated by
spatial attention, even without movement. Elife 7, (2018).
117. Butler WN, Hardcastle K & Giocomo LM Remembered reward locations restructure entorhinal
spatial maps. Science 363, 1447–1452, (2019). [PubMed: 30923222]
118. Boccara CN, Nardin M, Stella F, O'Neill J & Csicsvari J The entorhinal cognitive map is attracted
to goals. Science 363, 1443–1447, (2019). [PubMed: 30923221] Using a memory-guided
cheeseboard maze, this study found that individual fields of MEC grid cells can shift toward
reward locations through learning, indicating that grid cells are more dynamically modulated by
task demands than previously appreciated (see also Butler, Hardcastle & Giocomo, 2019).
119. Palacios-Filardo J & Mellor JR Neuromodulation of hippocampal long-term synaptic plasticity.
Curr Opin Neurobiol 54, 37–43, (2019). [PubMed: 30212713]
120. Watabe-Uchida M, Eshel N & Uchida N Neural Circuitry of Reward Prediction Error. Annu Rev
Neurosci 40, 373–394, (2017). [PubMed: 28441114]
121. Berke JD What does dopamine mean? Nat Neurosci 21, 787–793, (2018). [PubMed: 29760524]
122. Keiflin R & Janak PH Dopamine Prediction Errors in Reward Learning and Addiction: From
Theory to Neural Circuitry. Neuron 88, 247–263, (2015). [PubMed: 26494275]
123. Bromberg-Martin ES, Matsumoto M & Hikosaka O Dopamine in motivational control: rewarding,
aversive, and alerting. Neuron 68, 815–834, (2010). [PubMed: 21144997]
124. Fields HL, Hjelmstad GO, Margolis EB & Nicola SM Ventral tegmental area neurons in
learned appetitive behavior and positive reinforcement. Annu Rev Neurosci 30, 289–316, (2007).
[PubMed: 17376009]
125. Schultz W, Apicella P & Ljungberg T Responses of monkey dopamine neurons to reward and
conditioned stimuli during successive steps of learning a delayed response task. J Neurosci 13,
900–913, (1993). [PubMed: 8441015]

126. Schultz W, Dayan P & Montague PR A neural substrate of prediction and reward. Science 275,
1593–1599, (1997). [PubMed: 9054347]
127. Cohen JY, Haesler S, Vong L, Lowell BB & Uchida N Neuron-type-specific signals for reward
and punishment in the ventral tegmental area. Nature 482, 85–88, (2012). [PubMed: 22258508]
128. Fiorillo CD, Tobler PN & Schultz W Discrete coding of reward probability and uncertainty by
dopamine neurons. Science 299, 1898–1902, (2003). [PubMed: 12649484]
129. Montague PR, Dayan P & Sejnowski TJ A framework for mesencephalic dopamine systems
based on predictive Hebbian learning. J Neurosci 16, 1936–1947, (1996). [PubMed: 8774460]
130. Sutton RS & Barto AG Reinforcement learning (adaptive computation and machine learning).
(MIT Press, 1998).
131. Starkweather CK, Babayan BM, Uchida N & Gershman SJ Dopamine reward prediction
errors reflect hidden-state inference across time. Nat Neurosci 20, 581–589, (2017). [PubMed:
28263301]
132. Lak A, Nomoto K, Keramati M, Sakagami M & Kepecs A Midbrain Dopamine Neurons
Signal Belief in Choice Accuracy during a Perceptual Decision. Curr Biol 27, 821–832, (2017).
[PubMed: 28285994]
133. Dabney W et al. A distributional code for value in dopamine-based reinforcement learning.
Nature 577, 671–675, (2020). [PubMed: 31942076]
134. Engelhard B et al. Specialized coding of sensory, motor and cognitive variables in VTA dopamine
neurons. Nature 570, 509–513, (2019). [PubMed: 31142844]
135. Morris G, Nevet A, Arkadir D, Vaadia E & Bergman H Midbrain dopamine neurons encode
decisions for future action. Nat Neurosci 9, 1057–1063, (2006). [PubMed: 16862149]
136. Day JJ, Roitman MF, Wightman RM & Carelli RM Associative learning mediates dynamic shifts
in dopamine signaling in the nucleus accumbens. Nat Neurosci 10, 1020–1028, (2007). [PubMed:
17603481]
137. Floresco SB The nucleus accumbens: an interface between cognition, emotion, and action. Annu
Rev Psychol 66, 25–52, (2015). [PubMed: 25251489]
138. Hamid AA et al. Mesolimbic dopamine signals the value of work. Nat Neurosci 19, 117–126,
(2016). [PubMed: 26595651]
139. Howe MW, Tierney PL, Sandberg SG, Phillips PE & Graybiel AM Prolonged dopamine
signalling in striatum signals proximity and value of distant rewards. Nature 500, 575–579,
(2013). [PubMed: 23913271]
140. Kim HR et al. A Unified Framework for Dopamine Signals across Timescales. Cell 183, 1600–
1616 e1625, (2020). [PubMed: 33248024]
141. Phillips PE, Stuber GD, Heien ML, Wightman RM & Carelli RM Subsecond dopamine release
promotes cocaine seeking. Nature 422, 614–618, (2003). [PubMed: 12687000]
142. Wassum KM, Ostlund SB & Maidment NT Phasic mesolimbic dopamine signaling precedes
and predicts performance of a self-initiated action sequence task. Biol Psychiatry 71, 846–854,
(2012). [PubMed: 22305286]
143. Mohebi A et al. Dissociable dopamine dynamics for learning and motivation. Nature 570, 65–70,
(2019). [PubMed: 31118513]
144. Nolan SO et al. Direct dopamine terminal regulation by local striatal microcircuitry. J Neurochem
155, 475–493, (2020). [PubMed: 32356315]
145. Smith CC & Greene RW CNS dopamine transmission mediated by noradrenergic innervation. J
Neurosci 32, 6072–6080, (2012). [PubMed: 22553014]


146. Poe GR et al. Locus coeruleus: a new look at the blue spot. Nat Rev Neurosci 21, 644–659,
(2020). [PubMed: 32943779]
147. Sara SJ & Bouret S Orienting and reorienting: the locus coeruleus mediates cognition through
arousal. Neuron 76, 130–141, (2012). [PubMed: 23040811]
148. Bouret S & Sara SJ Reward expectation, orientation of attention and locus coeruleus-medial
frontal cortex interplay during learning. The European journal of neuroscience 20, 791–802,
(2004). [PubMed: 15255989]

149. Bouret S & Richmond BJ Sensitivity of locus ceruleus neurons to reward value for goal-directed
actions. J Neurosci 35, 4005–4014, (2015). [PubMed: 25740528]
150. Xiang L et al. Behavioral correlates of activity of optogenetically identified locus coeruleus
noradrenergic neurons in rats performing T-maze tasks. Sci Rep 9, 1361, (2019). [PubMed:
30718532]
151. Varazzani C, San-Galli A, Gilardeau S & Bouret S Noradrenaline and dopamine neurons in
the reward/effort trade-off: a direct electrophysiological comparison in behaving monkeys. J
Neurosci 35, 7866–7877, (2015). [PubMed: 25995472]
152. Trudeau LE et al. The multilingual nature of dopamine neurons. Prog Brain Res 211, 141–164,
(2014). [PubMed: 24968779]
153. Fields HL & Margolis EB Understanding opioid reward. Trends Neurosci 38, 217–225, (2015).
[PubMed: 25637939]
154. Fischer AG & Ullsperger M An Update on the Role of Serotonin and its Interplay with Dopamine
for Reward. Frontiers in Human Neuroscience 11, (2017).
155. Teixeira CM et al. Hippocampal 5-HT Input Regulates Memory Formation and Schaffer
Collateral Excitation. Neuron 98, 992–1004 e1004, (2018). [PubMed: 29754752]
156. Luchetti A et al. Two Functionally Distinct Serotonergic Projections into Hippocampus. J
Neurosci 40, 4936–4944, (2020). [PubMed: 32414785]
157. Hangya B, Ranade SP, Lorenc M & Kepecs A Central Cholinergic Neurons Are Rapidly
Recruited by Reinforcement Feedback. Cell 162, 1155–1168, (2015). [PubMed: 26317475]
158. Takeuchi T et al. Locus coeruleus and dopaminergic consolidation of everyday memory. Nature
537, 357–362, (2016). [PubMed: 27602521]
159. Wagatsuma A et al. Locus coeruleus input to hippocampal CA3 drives single-trial learning of a
novel context. Proc Natl Acad Sci U S A 115, E310–E316, (2018). [PubMed: 29279390]
160. O'Carroll CM, Martin SJ, Sandin J, Frenguelli B & Morris RG Dopaminergic modulation of
the persistence of one-trial hippocampus-dependent memory. Learning & memory (Cold Spring
Harbor, N.Y.) 13, 760–769, (2006).
161. Gasbarri A, Packard MG, Campana E & Pacitti C Anterograde and retrograde tracing of
projections from the ventral tegmental area to the hippocampal formation in the rat. Brain Res.
Bull 33, 445–452, (1994). [PubMed: 8124582]
162. Loughlin SE, Foote SL & Grzanna R Efferent projections of nucleus locus coeruleus:
morphologic subpopulations have different efferent targets. Neuroscience 18, 307–319, (1986).
[PubMed: 3736861]
163. Fallon JH, Koziell DA & Moore RY Catecholamine innervation of the basal forebrain. II.
Amygdala, suprarhinal cortex and entorhinal cortex. The Journal of comparative neurology 180,
509–532, (1978). [PubMed: 659673]
164. Kempadoo KA, Mosharov EV, Choi SJ, Sulzer D & Kandel ER Dopamine release from the locus
coeruleus to the dorsal hippocampus promotes spatial learning and memory. Proc Natl Acad Sci
U S A 113, 14835–14840, (2016). [PubMed: 27930324]
165. Rosen ZB, Cheung S & Siegelbaum SA Midbrain dopamine neurons bidirectionally regulate
CA3-CA1 synaptic drive. Nat Neurosci 18, 1763–1771, (2015). [PubMed: 26523642]
166. Martig AK & Mizumori SJ Ventral tegmental area disruption selectively affects CA1/CA2 but not
CA3 place fields during a differential reward working memory task. Hippocampus 21, 172–184,
(2011). [PubMed: 20082295]
167. McNamara CG & Dupret D Two sources of dopamine for the hippocampus. Trends Neurosci 40,
383–384, (2017). [PubMed: 28511793]


168. McNamara CG, Tejero-Cantero A, Trouche S, Campo-Urriza N & Dupret D Dopaminergic
neurons promote hippocampal reactivation and spatial memory persistence. Nat Neurosci 17,
1658–1660, (2014). [PubMed: 25326690] This paper found that optogenetic stimulation of
ventral tegmental area axons in dorsal hippocampus increases the reactivation of place cell
ensembles in subsequent sharp-wave ripples during sleep, improving memory for reward
locations.
169. Retailleau A & Morris G Spatial Rule Learning and Corresponding CA1 Place Cell Reorientation
Depend on Local Dopamine Release. Curr Biol 28, 836–846 e834, (2018). [PubMed: 29502949]

170. Bethus I, Tse D & Morris RG Dopamine and memory: modulation of the persistence of memory
for novel hippocampal NMDA receptor-dependent paired associates. J Neurosci 30, 1610–1618,
(2010). [PubMed: 20130171]


171. Sara SJ & Segal M Plasticity of sensory responses of locus-ceruleus neurons in the behaving rat -
implications for cognition. Progress In Brain Research 88, 571–585, (1991). [PubMed: 1813935]
172. Sosa M, Gillespie AK & Frank LM Neural Activity Patterns Underlying Spatial Coding in
the Hippocampus. Current topics in behavioral neurosciences 37, 43–100, (2018). [PubMed:
27885550]
173. Buzsaki G & Tingley D Space and Time: The Hippocampus as a Sequence Generator. Trends
Cogn Sci 22, 853–869, (2018). [PubMed: 30266146]
174. Gupta AS, van der Meer MA, Touretzky DS & Redish AD Segmentation of spatial experience by
hippocampal theta sequences. Nat Neurosci 15, 1032–1039, (2012). [PubMed: 22706269]
175. Foster DJ & Wilson MA Hippocampal theta sequences. Hippocampus 17, 1093–1099, (2007).
[PubMed: 17663452]
176. Johnson A & Redish AD Neural ensembles in CA3 transiently encode paths forward of the
animal at a decision point. J Neurosci 27, 12176–12189, (2007). [PubMed: 17989284]
177. Wikenheiser AM & Redish AD Hippocampal theta sequences reflect current goals. Nat Neurosci
18, 289–294, (2015). [PubMed: 25559082] This key study established theta sequences as a
putative mechanism in spatial planning, finding that when an animal initiates approach to goals
at different distances, theta sequences flexibly extend their ‘look ahead distance’ to predict the
animal’s chosen goal.
178. Kay K et al. Constant Sub-second Cycling between Representations of Possible Futures in the
Hippocampus. Cell 180, 552–567 e525, (2020). [PubMed: 32004462]
179. Wang M, Foster DJ & Pfeiffer BE Alternating sequences of future and past behavior encoded
within hippocampal theta oscillations. Science 370, 247–250, (2020). [PubMed: 33033222]
180. Brandon MP, Bogaard AR, Schultheiss NW & Hasselmo ME Segregation of cortical head
direction cell assemblies on alternating theta cycles. Nat Neurosci 16, 739–748, (2013).
[PubMed: 23603709]
181. Kubie JL & Fenton AA Linear look-ahead in conjunctive cells: an entorhinal mechanism for
vector-based navigation. Front Neural Circuits 6, 20, (2012). [PubMed: 22557948]
182. Hasselmo ME, Bodelon C & Wyble BP A proposed function for hippocampal theta rhythm:
Separate phases of encoding and retrieval enhance reversal of prior learning. Neural Computation
14, 793–817, (2002). [PubMed: 11936962]
183. Davidson TJ, Kloosterman F & Wilson MA Hippocampal replay of extended experience. Neuron
63, 497–507, (2009). [PubMed: 19709631]
184. Joo HR & Frank LM The hippocampal sharp wave-ripple in memory retrieval for immediate use
and consolidation. Nat Rev Neurosci 19, 744–757, (2018). [PubMed: 30356103]
185. Findlay G, Tononi G & Cirelli C The evolving view of replay and its functions in wake and sleep.
Sleep Adv 1, zpab002, (2021). [PubMed: 33644760]
186. Pfeiffer BE & Foster DJ Hippocampal place-cell sequences depict future paths to remembered
goals. Nature 497, 74–79, (2013). [PubMed: 23594744] This impressive study found that in
a two-dimensional environment, hippocampal replay events can flexibly predict the animal’s
subsequent trajectory to remembered reward locations, providing evidence for a possible role of
replay in planning.
187. Karlsson MP & Frank LM Awake replay of remote experiences in the hippocampus. Nat Neurosci
12, 913–918, (2009). [PubMed: 19525943]


188. Singer AC, Carr MF, Karlsson MP & Frank LM Hippocampal SWR Activity Predicts Correct
Decisions during the Initial Learning of an Alternation Task. Neuron 77, 1163–1173, (2013).
[PubMed: 23522050]
189. Gillespie AK et al. Hippocampal replay reflects specific past experiences rather than a plan for
subsequent choice. bioRxiv, 2021.03.09.434621, (2021).
190. Carey AA, Tanaka Y & van der Meer MAA Reward revaluation biases hippocampal replay
content away from the preferred outcome. Nat Neurosci 22, 1450–1459, (2019). [PubMed:
31427771]

191. Barron HC et al. Neuronal Computation Underlying Inferential Reasoning in Humans and Mice.
Cell 183, 228–243 e221, (2020). [PubMed: 32946810]
192. Ólafsdóttir HF, Carpenter F & Barry C Coordinated grid and place cell replay during rest. Nat
Neurosci 19, 792–794, (2016). [PubMed: 27089021]
193. Trimper JB, Trettel SG, Hwaun E & Colgin LL Methodological caveats in the detection of
coordinated replay between place cells and grid cells. Front Syst Neurosci 11, doi 10.3389,
(2017).
194. Yamamoto J & Tonegawa S Direct Medial Entorhinal Cortex Input to Hippocampal CA1
Is Crucial for Extended Quiet Awake Replay. Neuron 96, 217–227 e214, (2017). [PubMed:
28957670]
195. Todorova R & Zugaro M Hippocampal ripples as a mode of communication with cortical and
subcortical areas. Hippocampus 30, 39–49, (2018). [PubMed: 30069976]
196. Pezzulo G, van der Meer MA, Lansink CS & Pennartz CM Internally generated sequences in
learning and executing goal-directed behavior. Trends Cogn Sci 18, 647–657, (2014). [PubMed:
25156191]
197. Logothetis NK et al. Hippocampal-cortical interaction during periods of subcortical silence.
Nature 491, 547–553, (2012). [PubMed: 23172213]


198. Ji D & Wilson MA Coordinated memory replay in the visual cortex and hippocampus during
sleep. Nat Neurosci 10, 100–107, (2007). [PubMed: 17173043]
199. Rothschild G, Eban E & Frank LM A cortical-hippocampal-cortical loop of information
processing during memory consolidation. Nat Neurosci 20, 251–259, (2017). [PubMed:
27941790]
200. Abadchi JK et al. Spatiotemporal patterns of neocortical activity around hippocampal sharp-wave
ripples. Elife 9, (2020).
201. Bendor D & Wilson MA Biasing the content of hippocampal replay during sleep. Nat. Neurosci
15, 1439–1444, (2012). [PubMed: 22941111]
202. Eichenbaum H Prefrontal-hippocampal interactions in episodic memory. Nat Rev Neurosci 18,
547–558, (2017). [PubMed: 28655882]
203. Hyman JM, Zilli EA, Paley AM & Hasselmo ME Medial prefrontal cortex cells show dynamic
modulation with the hippocampal theta rhythm dependent on behavior. Hippocampus 15, 739–
749, (2005). [PubMed: 16015622]


204. Jung MW, Qin Y, McNaughton BL & Barnes CA Firing characteristics of deep layer neurons in
prefrontal cortex in rats performing spatial working memory tasks. Cerebral cortex (New York,
N.Y. : 1991) 8, 437–450, (1998).
205. Jadhav SP, Rothschild G, Roumis DK & Frank LM Coordinated Excitation and Inhibition of
Prefrontal Ensembles during Awake Hippocampal Sharp-Wave Ripple Events. Neuron 90, 113–
127, (2016). [PubMed: 26971950]
206. Hok V, Save E, Lenck-Santini PP & Poucet B Coding for spatial goals in the prelimbic/
infralimbic area of the rat frontal cortex. Proc Natl Acad Sci U S A 102, 4602–4607, (2005).
[PubMed: 15761059]
207. Yu JY, Liu DF, Loback A, Grossrubatscher I & Frank LM Specific hippocampal representations
are linked to generalized cortical representations in memory. Nat Commun 9, 2209, (2018).
[PubMed: 29880860]
208. Niv Y Learning task-state representations. Nat Neurosci 22, 1544–1553, (2019). [PubMed:
31551597]
209. Siapas AG, Lubenov EV & Wilson MA Prefrontal phase-locking to hippocampal theta
oscillations. Neuron 46, 141–151, (2005). [PubMed: 15820700]
210. Benchenane K et al. Coherent theta oscillations and reorganization of spike timing in
the hippocampal-prefrontal network upon learning. Neuron 66, 921–936, (2010). [PubMed:
20620877]
211. Jones MW & Wilson MA Phase precession of medial prefrontal cortical activity relative to the
hippocampal theta rhythm. Hippocampus 15, 867–873, (2005). [PubMed: 16149084]

212. Zielinski MC, Shin JD & Jadhav SP Coherent Coding of Spatial Position Mediated by Theta
Oscillations in the Hippocampus and Prefrontal Cortex. J Neurosci 39, 4550–4565, (2019).
[PubMed: 30940717]
213. Hasz BM & Redish AD Spatial encoding in dorsomedial prefrontal cortex and hippocampus is
related during deliberation. Hippocampus 30, 1194–1208, (2020). [PubMed: 32809246]
214. Tang W, Shin JD & Jadhav SP Multiple time-scales of decision making in the hippocampus and
prefrontal cortex. Elife 10, (2021).
215. Peyrache A, Khamassi M, Benchenane K, Wiener SI & Battaglia FP Replay of rule-learning
related neural patterns in the prefrontal cortex during sleep. Nat. Neurosci 12, 919–926, (2009).
[PubMed: 19483687]
216. Shin JD, Tang W & Jadhav SP Dynamics of Awake Hippocampal-Prefrontal Replay for
Spatial Learning and Memory-Guided Decision Making. Neuron 104, 1110–1125 e1117, (2019).
[PubMed: 31677957]
217. Gomperts SN, Kloosterman F & Wilson MA VTA neurons coordinate with the hippocampal
reactivation of spatial experience. Elife 4, (2015).
218. Mattar MG & Daw ND Prioritized memory access explains planning and hippocampal replay.
Nat Neurosci 21, 1609–1617, (2018). [PubMed: 30349103] This work provides an innovative
computational framework for how forward and reverse replay events could assign values to states
along spatial trajectories depending on the agent’s behavioral needs.
219. Lansink CS, Goltstein PM, Lankelma JV, McNaughton BL & Pennartz CM Hippocampus
leads ventral striatum in replay of place-reward information. PLoS Biol. 7, e1000173, (2009).
[PubMed: 19688032]
220. Lansink CS et al. Reward Expectancy Strengthens CA1 Theta and Beta Band Synchronization
and Hippocampal-Ventral Striatal Coupling. J Neurosci 36, 10598–10610, (2016). [PubMed:
27733611]
221. Berke JD, Okatan M, Skurski J & Eichenbaum HB Oscillatory entrainment of striatal neurons in
freely moving rats. Neuron 43, 883–896, (2004). [PubMed: 15363398]
222. van der Meer MA & Redish AD Covert Expectation-of-Reward in Rat Ventral Striatum at
Decision Points. Front. Integr. Neurosci 3, 1, (2009). [PubMed: 19225578]
223. Wirtshafter HS & Wilson MA Locomotor and Hippocampal Processing Converge in the Lateral
Septum. Curr Biol 29, 3177–3192 e3173, (2019). [PubMed: 31543450]


224. Girardeau G, Inema I & Buzsaki G Reactivations of emotional memory in the hippocampus-
amygdala system during sleep. Nat Neurosci 20, 1634–1642, (2017). [PubMed: 28892057]
225. Mizumori SJ & Tryon VL Integrative hippocampal and decision-making neurocircuitry during
goal-relevant predictions and encoding. Prog Brain Res 219, 217–242, (2015). [PubMed:
26072241]
226. Lisman JE & Grace AA The hippocampal-VTA loop: controlling the entry of information into
long-term memory. Neuron. 46, 703–713, (2005). [PubMed: 15924857]
227. Gershman SJ The Successor Representation: Its Computational Logic and Neural Substrates. J
Neurosci 38, 7193–7200, (2018). [PubMed: 30006364]
228. Dayan P Improving generalization for temporal difference learning: The successor representation.
Neural Computation 5, 613–624, (1993).
229. Dordek Y, Soudry D, Meir R & Derdikman D Extracting grid cell characteristics from place
cell inputs using non-negative principal component analysis. Elife 5, e10094, (2016). [PubMed:
26952211]
230. Momennejad I Learning Structures: Predictive Representations, Replay, and Generalization.
Current Opinion in Behavioral Sciences 32, 155–166, (2020). [PubMed: 35419465]
231. Bakkour A et al. The hippocampus supports deliberation during value-based decisions. Elife 8,
(2019).
232. Biderman N, Bakkour A & Shohamy D What Are Memories For? The Hippocampus Bridges Past
Experience with Future Decisions. Trends Cogn Sci 24, 542–556, (2020). [PubMed: 32513572]
233. Vikbladh OM et al. Hippocampal Contributions to Model-Based Planning and Spatial Memory.
Neuron 102, 683–693 e684, (2019). [PubMed: 30871859]

234. Jeong Y et al. Role of the hippocampal CA1 region in incremental value learning. Sci Rep 8,
9870, (2018). [PubMed: 29959363]
235. McNamee DC, Stachenfeld KL, Botvinick MM & Gershman SJ Flexible modulation of sequence
generation in the entorhinal-hippocampal system. Nat. Neurosci, (2021).
236. Johnson A, van der Meer MA & Redish AD Integrating hippocampus and striatum in decision-
making. Curr Opin Neurobiol 17, 692–697, (2007). [PubMed: 18313289]
237. Jung MW, Lee H, Jeong Y, Lee JW & Lee I Remembering rewarding futures: A simulation-
selection model of the hippocampus. Hippocampus 28, 913–930, (2018). [PubMed: 30155938]
238. Allen WE et al. Thirst regulates motivated behavior through modulation of brainwide neural
population dynamics. Science 364, 253, (2019). [PubMed: 30948440]
239. Stringer C et al. Spontaneous behaviors drive multidimensional, brainwide activity. Science 364,
255, (2019). [PubMed: 31000656]
240. Musall S, Kaufman MT, Juavinett AL, Gluf S & Churchland AK Single-trial neural dynamics
are dominated by richly varied movements. Nat Neurosci 22, 1677–1686, (2019). [PubMed:
31551604]
241. Otmakhova NA & Lisman JE D1/D5 dopamine receptor activation increases the magnitude of
early long-term potentiation at CA1 hippocampal synapses. J Neurosci 16, 7478–7486, (1996).
[PubMed: 8922403]
242. Li S, Cullen WK, Anwyl R & Rowan MJ Dopamine-dependent facilitation of LTP induction in
hippocampal CA1 by exposure to spatial novelty. Nat. Neurosci 6, 526–531, (2003). [PubMed:
12704392]
243. Huang YY & Kandel ER D1/D5 receptor agonists induce a protein synthesis-dependent late
potentiation in the CA1 region of the hippocampus. Proc. Natl. Acad. Sci 92, 2446–2450, (1995).
[PubMed: 7708662]
244. Batallán-Burrowes AA & Chapman CA Dopamine suppresses persistent firing in layer III lateral
entorhinal cortex neurons. Neuroscience Letters 674, 70–74, (2018). [PubMed: 29524644]
245. Rosenkranz JA & Johnston D Dopaminergic regulation of neuronal excitability through
modulation of Ih in layer V entorhinal cortex. J Neurosci 26, 3229–3244, (2006). [PubMed:
16554474]
246. Caruana DA, Sorge RE, Stewart J & Chapman CA Dopamine has bidirectional effects on synaptic
responses to cortical inputs in layer II of the lateral entorhinal cortex. J Neurophysiol 96, 3006–
3015, (2006). [PubMed: 17005616]
247. Glovaci I, Caruana DA & Chapman CA Dopaminergic enhancement of excitatory synaptic
transmission in layer II entorhinal neurons is dependent on D(1)-like receptor-mediated
signaling. Neuroscience 258, 74–83, (2014). [PubMed: 24220689]
248. Pralong E & Jones RS Interactions of dopamine with glutamate- and GABA-mediated synaptic
transmission in the rat entorhinal cortex in vitro. The European journal of neuroscience 5, 760–
767, (1993). [PubMed: 7903191]
249. Hutter JA & Chapman CA Exposure to cues associated with palatable food reward results in a
dopamine D(2) receptor-dependent suppression of evoked synaptic responses in the entorhinal
cortex. Behav Brain Funct 9, 37, (2013). [PubMed: 24093833]
250. Jin X et al. Dopamine D2 receptors regulate the action potential threshold by modulating T-type
calcium channels in stellate cells of the medial entorhinal cortex. J Physiol 597, 3363–3387,
(2019). [PubMed: 31049961]
251. Stenkamp K, Heinemann U & Schmitz D Dopamine suppresses stimulus-induced field potentials
in layer III of rat medial entorhinal cortex. Neurosci Lett 255, 119–121, (1998). [PubMed:
9835229]
252. Mayne EW, Craig MT, McBain CJ & Paulsen O Dopamine suppresses persistent network activity
via D(1) -like dopamine receptors in rat medial entorhinal cortex. The European journal of
neuroscience 37, 1242–1247, (2013). [PubMed: 23336973]
253. Cilz NI, Kurada L, Hu B & Lei S Dopaminergic modulation of GABAergic transmission in the
entorhinal cortex: concerted roles of alpha1 adrenoreceptors, inward rectifier K(+), and T-type
Ca(2)(+) channels. Cerebral cortex (New York, N.Y. : 1991) 24, 3195–3208, (2014).

254. Li HB, Lin L, Yang LY & Xie C Dopaminergic facilitation of GABAergic transmission in layer
III of rat medial entorhinal cortex. Chin J Physiol 58, 46–54, (2015). [PubMed: 25687491]
255. Burak Y & Fiete IR Accurate path integration in continuous attractor network models of grid
cells. PLoS Comput Biol 5, e1000291, (2009). [PubMed: 19229307]
256. Couey JJ et al. Recurrent inhibitory circuitry as a mechanism for grid formation. Nat Neurosci 16,
318–324, (2013). [PubMed: 23334580]
257. Silva D, Feng T & Foster DJ Trajectory events across hippocampal place cells require previous
experience. Nat Neurosci 18, 1772–1779, (2015). [PubMed: 26502260]
258. O'Neill J, Senior TJ, Allen K, Huxter JR & Csicsvari J Reactivation of experience-dependent cell
assembly patterns in the hippocampus. Nat Neurosci 11, 209–215, (2008). [PubMed: 18193040]
259. Roux L, Hu B, Eichler R, Stark E & Buzsaki G Sharp wave ripples during learning stabilize the
hippocampal spatial map. Nat Neurosci 20, 845–853, (2017). [PubMed: 28394323]
260. Sabatini BL & Tian L Imaging Neurotransmitter and Neuromodulator Dynamics In Vivo with
Genetically Encoded Indicators. Neuron 108, 17–32, (2020). [PubMed: 33058762]

Box 1 ∣
Hippocampal and entorhinal dopamine-mediated plasticity


The hippocampus expresses both D1-type dopamine receptors (including D1 and D5
receptors) and D2-type receptors (including D2 and D4 receptors), with CA1 primarily
expressing D1-type receptors226. In slice physiology experiments, dopamine and D1-type
receptor agonists amplify long-term potentiation at CA3 to CA1 synapses165,241-243
without increasing the excitability of CA1 neurons241, suggesting that dopamine
works together with glutamatergic inputs to augment plasticity. These plasticity effects
depend on the temporal dynamics of dopamine transmission. Tonic activation of
ventral tegmental area (VTA) inputs to the hippocampus depresses CA3–CA1 synapses
by recruiting local interneurons165. Phasic activation instead enhances CA3–CA1
excitation165, suggesting that phasic reward prediction error (RPE)-like activity in the
VTA may facilitate new associations in the hippocampus. Intriguingly, VTA input seems
not to affect entorhinal cortex (EC) to CA1 synapses, even though dopamine depresses
EC–CA1 synapses165,241. These results indicate that dopaminergic afferents differentially
affect distinct hippocampal pathways.

The effect of dopamine on synaptic transmission in the EC is both concentration-dependent and
lamina-dependent. In the lateral EC, low concentrations of dopamine
reduce excitability in layers III244 and V245 but increase excitation in layer II via D1-type
receptors246,247. High concentrations of dopamine increase excitability in layer V248 and
reduce excitation in layer II via D2-type receptors246,249. This suppression in layer II is
hypothesized to reduce the strength of sensory inputs during times of high dopamine
release, boosting the signal-to-noise ratio of the most relevant inputs or preventing
competition with ongoing memory processes244,249. In the medial EC, high dopamine
concentrations reduce the excitability of principal cells in layers II250 and III251,252,
primarily through the D2-type receptor. The suppressive effects of dopamine may be
facilitated in part by a dopamine-evoked excitation of layer III interneurons253, resulting
in increased inhibition onto principal cells in both layer III254 and layer II253. Dopamine
is therefore well positioned to moderate the recurrent activity between excitatory and
inhibitory MEC neurons that is thought to give rise to grid cell firing5,255,256. However,
the functional consequences of EC dopamine for spatial navigation and memory remain
unclear.

Box 2 ∣
Interactions between dopamine, plasticity and replay


Replay events rely on plasticity during experience to accurately replicate past
episodes257. The more that place cells overlap in their fields and fire together during
movement, the more they reactivate in the correct order during subsequent sharp-wave
ripples (SWRs)258. Accordingly, the shifting of place fields toward reward during
learning requires plasticity and increases the co-firing of neurons near reward locations,
allowing for cells representing reward locations to be reactivated more often during
replay32. Place field clustering may therefore increase the granularity of spatial reward
memories that are consolidated through SWRs. Dopamine is likely to play a substantial
role in this plasticity, as stimulation of axons projecting from the ventral tegmental
area to the hippocampus during learning supports the shifting of place fields75 and
enhances the fidelity of replayed place cell ensembles during sleep168. In turn, the
reward-enhanced reactivation of place field maps during sleep supports the subsequent
expression of those maps and the animal’s memory of the task32,168. Whether dopamine
further strengthens hippocampal synapses during reactivation in sleep is unknown.
However, this possibility is suggested by evidence that stimulation of the medial
forebrain bundle coupled to place cell spikes during sleep (many of which occur during
SWRs) causes a behavioural preference for the field of that place cell during subsequent
exploration of the environment33. Dopamine release at the time of wake replay may
have an additional role in solidifying the reactivated place cell maps by strengthening
their synaptic connections. Consistent with this possibility, SWRs during wake at
reward locations are required for stabilizing place fields over learning259. The effects
of dopamine on hippocampal plasticity are well suited to influence what information
gets stored during experience and to increase the probability that this information will be
consolidated into a stable representation that guides behaviour.

Box 3 ∣
Open questions and future directions


Does hippocampal reward coding drive reward-related changes in the medial entorhinal
cortex (MEC), or vice versa? Why have reward coding in both structures? It remains
unknown whether reward-related changes in MEC grid cell activity indeed reflect a
low-dimensional readout of hippocampal reorganization around reward sites. To better
understand what might be unique versus universal across the hippocampus (HPC) and
MEC in terms of reward coding, future work could use simultaneous recordings across
these areas or develop computational models in the context of complex, goal-oriented
tasks.

How do other neuromodulators, such as noradrenaline, serotonin and acetylcholine, influence
reward-related changes in the HPC and entorhinal cortex (EC)? Do these neuromodulators work in
concert with dopamine or exert independent effects on hippocampal plasticity and reward
coding? Are different neuromodulators acting in the EC versus in the
HPC, or perhaps engaged at different stages of learning? With the development of new
receptor-based fluorescent sensors for these neuromodulators260, some of these questions
can now be addressed with optical imaging techniques.

Are there dedicated subcircuits for reward processing in the ventral (or dorsal) HPC,
distinct from those dedicated to coding for aversive experiences or environmental
context? In what circumstances are the dorsal or ventral HPC most engaged in processing
rewarding experiences?

To what degree do task demands drive reward representations in the HPC or MEC? Do
the firing patterns that we describe here truly reflect ‘reward’ per se, or do they instead
reflect task engagement more broadly? Although some studies have tried to address this
issue with manipulations of reward size and probability71,83,88,92,234, additional clarity
could be gained by perturbing outcome valence and value across a range of tasks.

Does the HPC compute a successor representation alone (state occupancy), or additionally
compute its own value function? The assignment of reward value to
hippocampal sequences may occur locally, by modifying the spike rate, timing or
synaptic strength of hippocampal neurons in the sequence, or it may occur through
associated firing in downstream targets such as the prefrontal cortex or striatum. Further
experimental and computational work is needed to understand whether HPC–EC firing
patterns compute value locally or merely reflect state occupancy.
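
As an illustration of the distinction between state occupancy and value raised by this question, the sketch below is our own minimal example, following the successor representation formalism discussed in refs 54, 227 and 228 rather than any analysis from this Review; the toy ring-shaped environment and all variable names are assumptions chosen for illustration.

```python
# Minimal illustrative sketch: the successor representation (SR) separates
# expected discounted state occupancy from reward, so value can be read out
# by combining the SR with a reward vector (cf. refs 54, 227, 228).
import numpy as np

n_states = 4
gamma = 0.9

# Transition matrix T[s, s'] for a simple ring of states under a fixed policy
T = np.zeros((n_states, n_states))
for s in range(n_states):
    T[s, (s + 1) % n_states] = 1.0

# SR: M[s, s'] = expected discounted number of future visits to s' from s
M = np.linalg.inv(np.eye(n_states) - gamma * T)

# Reward placed at state 2; value is a linear readout of the SR
R = np.array([0.0, 0.0, 1.0, 0.0])
V = M @ R

print(np.round(M, 2))  # state-occupancy predictions alone
print(np.round(V, 2))  # value obtained by pairing occupancy with reward
```

Because the occupancy predictions (M) and the reward vector (R) are stored separately in this scheme, the value function can be recomputed immediately when reward locations change, without relearning the transition structure.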

Fig. 1 ∣. Modulations of hippocampal-entorhinal activity at reward-related behavioural timepoints.
a ∣ Timepoints and associated behaviours surrounding reward acquisition. The magenta star
indicates the goal location throughout the figure. Warmer colors indicate higher firing rates
except where noted. In example spike plots, grey points indicate positions of the animal;
coloured points indicate spikes. b ∣ Example hippocampal cell firing pattern during goal-
approach to the east reward well in a 2D environment64. c ∣ Place-field clustering in CA1
near three goal locations (white dots). Left: Place maps for an example cell before learning
(pre), at the end of learning and during a probe session (post)32. Right: density of population
place field centres (scale indicates proportion of cells). Because this overrepresentation of
goals is characterized as a change in the time-averaged hippocampal activity over the course
of a session, its specificity to goal approach versus goal arrival is not clear. d ∣ Example
CA1 or subiculum cells showing reward-specific firing (right) or place firing (left). Red
lines indicate reward locations. ‘A’ and ‘B’ denote distinct virtual environments. Each plot
shows mean calcium activity across trials89. e ∣ Left: continuous T-maze task, in which a
rodent must choose between left and right goals that have different probabilities of reward.
Right: Example CA1 cell showing increased firing rate based on reward history at the
right goal (R+) compared with unrewarded times (R−) and left goals (L−, L+). Top: Spike
raster for each outcome. Middle: Total occupancy of each spatial bin. Bottom: Average
firing rate for each outcome71. f ∣ Goal approach activity in an example medial entorhinal
cortex grid cell. Firing patterns in a 2D environment and continuous T-maze are shown. The

cell exhibited a higher firing rate on the centre stem on right choice trials114. g ∣ Increase
in grid cell firing rates near a hidden goal zone (red box) when food is delivered inside
the zone (right) versus during random foraging for scattered food (left)117. h ∣ Shift of grid
cell fields toward three reward locations (black dots) (similar format to part c). Red circle
highlights the field that moves the most across learning118. Part b adapted from ref. [64],
with permission. Part c adapted from ref. [32], with permission. Part d adapted from ref.
[89], with permission. Part e adapted from ref. [71], with permission. Part f adapted from
ref. [114], with permission. Part g adapted from ref. [117], with permission. Part h adapted
from ref. [118], with permission.

Fig. 2 ∣. Dopaminergic signalling and innervation of the hippocampus.


a ∣ Reward prediction error (RPE) signalling. Dopaminergic neurons of the ventral tegmental
area (VTA), which typically maintain a tonic firing rate, fire phasically in response to
unexpected reward (positive RPE). As the reward becomes more predictable over learning,
firing decreases for reward and increases for the reward-predictive cue, scaling with
the degree of expectation and the value predicted128. After extended learning, firing is
suppressed if the expected reward is omitted (negative RPE) and increased if reward is larger
than expected126. b ∣ Cartoon of value or motivation signalling in the nucleus accumbens
(NAc), which appears similar whether measured as dopamine concentration or as VTA axon activity (putative time
course based on refs 138,140,143). The example task here involves a movement to initiate the
trial, such as a nosepoke, followed by a reward-predictive cue just before reward delivery,
such as a feeder click. Phasic and ramping signals before reward delivery scale with recent
rate of reward, which approximates value and increases motivation to perform the task.
Note that RPE signals layer on top of this value signal, but here the reward delivered is
as expected. c ∣ Distribution of VTA and locus coeruleus (LC) axons in the hippocampus.
Darker yellow shading indicates greater LC axon density in CA3. d ∣ Summarized effects
of dopaminergic input inactivation or activation on four hippocampal place cells (coloured
blobs). Left: LC or VTA axon inhibition (colours as in part c), or dopamine antagonism
in the hippocampus, destabilizes place fields in sequential exposures to the same square
environment. Right: LC or VTA axon activation with optogenetics (shown as a blue light)
promotes the shift of place fields toward a goal location (magenta star). SLM, stratum
lacunosum moleculare; SO, stratum oriens; SR, stratum radiatum; Sub, subiculum.
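The RPE dynamics summarized in part a can be reproduced with a minimal temporal-difference learning rule; in the Python sketch below, the learning rate, reward size and trial count are arbitrary choices used only for illustration.

# Minimal temporal-difference sketch of RPE transfer from reward to cue.
alpha, r = 0.1, 1.0
V_cue = 0.0                      # learned value of the reward-predictive cue
for trial in range(200):
    rpe_cue = V_cue - 0.0        # cue arrives unpredicted: error = value jump
    rpe_reward = r - V_cue       # delivered outcome versus cue-based expectation
    V_cue += alpha * rpe_reward  # learning shifts the expectation toward r
    if trial in (0, 199):
        print(f"trial {trial:3d}: RPE at cue = {rpe_cue:.2f}, "
              f"at reward = {rpe_reward:.2f}")
# After extended learning, omitting the reward produces a negative RPE (a dip):
print(f"omission RPE = {0.0 - V_cue:.2f}")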


Fig. 3 ∣. Hippocampal theta sequences and replay.


a ∣ A rodent running along a linear track engages theta sequences. An example
theta trace (local field potential filtered at 5–11 Hz) is shown below the track. Place
cells with overlapping fields spanning just behind to just ahead of the animal’s position
spike sequentially within each theta cycle (spikes are shown as vertical ticks, theta cycles
separated by dashed vertical lines). Early phases of theta (0 to π radians) contain spikes
corresponding to past and present positions, whereas late phases (π to 2π) contain more spikes
corresponding to future positions. b ∣ A ‘W-maze’ alternation task (for example as in refs
178,187,188) illustrating right and left choices represented as single spikes of place cells
(green and yellow fields) on alternating theta cycles178. Note that spikes occur on the late
phases of opposite theta cycles (same example theta trace as in part a). On the W-maze,
the animal is rewarded for visiting the opposite side arm from the previously visited arm
when coming from the centre. Thus theta alternation could act as a mode of deliberation,
with retrieval of information relevant to future experience taking place in the second half
of the cycle182. c ∣ In periods of immobility such as during food consumption, sequences
of place cells replay during sharp-wave ripples (SWRs). The same example SWR (local
field potential filtered at 150–250 Hz) is shown to illustrate both forward and reverse replay
events. d ∣ In the same W-maze task shown in part b, a rodent exhibits forward replay of
both alternate trajectories while immobile, before beginning a run. Separate replay events
(same SWR used for illustration purposes) are shown, displaying replay of leftward and
rightward place cell sequences, putatively allowing the animal to evaluate possible future
outcomes188.


Fig. 4 ∣. Hypothesized interactions between brain systems in navigating to reward.


a ∣ A sequence of hippocampal place fields interpreted as a sequence of 5 states (s1–s5) that
discretize forward movement on a linear track, with expected reward in each state (r1–r5). b
∣ The successor representation (SR) matrix for the 5 states depicted in part a. Hypothetical
transition probabilities arise from the assumption that the hippocampal representation is
mostly unidirectional on the linear track (that is, states in this sequence predict past states
with very low probability and future states with high probability that decays with increased
distance). Purple arrows indicate the firing field for hippocampus (HPC) cell 1 (column 1)
and its SR (row 1). c ∣ Left to right, first: The successor representation vector M(si,:) for
all states given trajectories initiated in state i for i = [1:5] (rows of the SR matrix). Darker
colours indicate higher predicted occupancy. Second: The firing rates of each hippocampal
cell in 5 spatial bins (that is, the 5 states) derived from the columns of the SR matrix.
Darker colours indicate higher firing rates. Third: Each hippocampal cell is hypothetically
coupled with a reward function that provides the expected reward in each state, here shown
as a ramp of dopamine release peaking at the reward location. This coupling could occur
via dopaminergic innervation of the HPC, or via spike coupling of HPC cells with nucleus
accumbens (NAc) neurons, for example, which receive ramping dopamine. Fourth: The
SR and reward are multiplied to estimate the value function for each state (combined
colours). d ∣ In this simplified hypothesis, dopaminergic and other neuromodulatory systems
convey reward prediction information to the HPC–entorhinal cortex (EC) system, which
helps assign these values to discrete states that compose an experience. ‘States’ here are
synonymous with spatial representations of the HPC–EC. State representations are sent
to downstream areas (yellow), which layer additional information onto these states, such
as task requirements and sensory features. No reciprocal arrow is shown for the basal
ganglia because there is no known direct return projection, but the basal ganglia (including
the NAc) help to use state values for action invigoration. The HPC–EC, frontal cortices
and basal ganglia each project back to the dopaminergic system directly or indirectly,
putatively providing updates about predicted outcomes and value changes to individual
states. Interactions in this network contribute to memory storage, decision-making and
action generation.
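As a rough numerical companion to parts a–c, the Python sketch below builds the SR matrix for the 5-state track, reads out its columns as hypothetical place-field rate maps and multiplies the SR by a ramping reward vector (standing in for the dopamine ramp) to estimate the value of each state; the discount factor and the shape of the ramp are illustrative assumptions rather than values taken from the cited work.

import numpy as np

# SR-to-value sketch for the 5-state linear track in this figure.
n_states, gamma = 5, 0.6

# Transition matrix T: forward movement along the track; the run ends after
# the final (rewarded) state, so its row is left at zero.
T = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    T[s, s + 1] = 1.0

# SR matrix M: row i is M(si,:), the discounted expected occupancy of every
# state for trajectories initiated in state i.
M = np.linalg.inv(np.eye(n_states) - gamma * T)

# Columns of M read out as spatial rate maps: column j gives the activity of
# 'cell j' in each of the 5 spatial bins.
rate_maps = M.T

# Reward function: a ramp peaking at the goal, standing in for ramping
# dopamine release; value = SR multiplied by reward.
r = np.linspace(0.0, 1.0, n_states)
V = M @ r

print(np.round(M, 2))          # predictive map from each start state
print(np.round(rate_maps, 2))  # hypothetical place-field rate maps
print(np.round(V, 2))          # value ramps up toward the rewarded state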