Neural systems involved in delay and risk assessment in the rat
A dissertation submitted for the degree of Doctor of Medicine
Rudolf Nicholas Cardinal
St John’s College, Cambridge
May 2006
Abstract
This thesis investigated the
contribution of the nucleus accumbens core (AcbC) and the hippocampus
(H) to choice and learning involving reinforcement that was delayed or
unlikely. Animals must frequently act to influence the world even when
the reinforcing outcomes of their actions are delayed. Learning with
action–outcome delays is a complex problem, and little is known
of the neural mechanisms that bridge such delays. Impulsive choice, one
aspect of impulsivity, is characterized by an abnormally high
preference for small, immediate rewards over larger delayed rewards,
and is a feature of attention-deficit/hyperactivity disorder (ADHD),
addiction, mania, and certain personality disorders. Furthermore, when
animals choose between alternative courses of action, seeking to
maximize the benefit obtained, they must also evaluate the likelihood
of the available outcomes. Little is known of the neural basis of this
process, or what might predispose individuals to be overly conservative
or to take risks excessively (avoiding or preferring uncertainty,
respectively), but risk taking is another aspect of the personality
trait of impulsivity and is a feature of a number of psychiatric
disorders, including pathological gambling and some personality
disorders.
The AcbC, part of the ventral striatum, is
required for normal preference for a large, delayed reward over a
small, immediate reward (self-controlled choice) in rats, but the
reason for this is unclear. Chapter 3 investigated the role of the AcbC
in learning a free-operant instrumental response using delayed
reinforcement, performance of a previously learned response for delayed
reinforcement, and assessment of the relative magnitudes of two
different rewards. Groups of rats with excitotoxic or sham lesions of
the AcbC acquired an instrumental response with different delays (0,
10, or 20 s) between the lever-press response and reinforcer delivery.
A second (inactive) lever was also present, but responding on it was
never reinforced. The delays retarded learning in normal rats. AcbC
lesions did not hinder learning in the absence of delays, but
AcbC-lesioned rats were impaired in learning when there was a delay,
relative to sham-operated controls. Rats were subsequently trained to
discriminate reinforcers of different magnitudes. AcbC-lesioned rats
were more sensitive to differences in reinforcer magnitude than
sham-operated controls, suggesting that the deficit in self-controlled
choice previously observed in such rats was a consequence of reduced
preference for delayed rewards relative to immediate rewards, not of
reduced preference for large rewards relative to small rewards. AcbC
lesions also impaired the performance of a previously learned
instrumental response in a delay-dependent fashion. These results
demonstrate that the AcbC contributes to instrumental learning and
performance by bridging delays between subjects’ actions and the
ensuing outcomes that reinforce behaviour.
When outcomes are delayed, they may be
attributed to the action that caused them, or mistakenly attributed to
other stimuli, such as the environmental context. Consequently, animals
that are poor at forming context–outcome associations might learn
action–outcome associations better with delayed reinforcement
than normal animals. The hippocampus contributes to the representation
of environmental context, being required for aspects of contextual
conditioning. It was therefore hypothesized that animals with H lesions
would be better than normal animals at learning to act on the basis of
delayed reinforcement. Chapter 4 tested the ability of H-lesioned rats
to learn a free-operant instrumental response using delayed
reinforcement, and their ability to exhibit self-controlled choice.
Rats with sham or excitotoxic H lesions acquired an instrumental
response with different delays (0, 10, or 20 s) between the response
and reinforcer delivery. H-lesioned rats responded slightly less than
sham-operated controls in the absence of delays, but they became better
at learning (relative to shams) as the delays increased; delays
impaired learning less in H-lesioned rats than in shams. In contrast,
H-lesioned rats exhibited impulsive choice, preferring an immediate,
small reward to a delayed, larger reward, even though they preferred
the large reward when it was not delayed. These results support the
view that the H hinders action–outcome learning with delayed
outcomes, perhaps because it promotes the formation of
context–outcome associations instead. However, although lesioned
rats were better at learning with delayed reinforcement, they were
worse at choosing it, suggesting that self-controlled choice and
learning with delayed reinforcement tax different psychological
processes.
Chapter 5 examined the effects of excitotoxic
lesions of the AcbC on probabilistic choice in rats. Rats chose between
a single food pellet delivered with certainty (probability p = 1) and
four food pellets delivered with varying degrees of uncertainty (p = 1,
0.5, 0.25, 0.125, and 0.0625) in a discrete-trial task, with the
large-reinforcer probability decreasing or increasing across the
session. Subjects were trained on this task and then received
excitotoxic or sham lesions of the AcbC before being retested. After a
transient period during which AcbC-lesioned rats exhibited relative
indifference between the two alternatives compared to controls,
AcbC-lesioned rats came to exhibit risk-averse choice, choosing the
large reinforcer less often than controls when it was uncertain, to the
extent that they obtained less food as a result. Rats behaved as if
indifferent between a single certain pellet and four pellets at p =
0.32 (sham-operated) or at p = 0.70 (AcbC-lesioned) by the end of
testing. When the probabilities did not vary across the session,
AcbC-lesioned rats and controls strongly preferred the large reinforcer
when it was certain, and strongly preferred the small reinforcer when
the large reinforcer was very unlikely (p = 0.0625), with no
differences between AcbC-lesioned and sham-operated groups. These
results suggest that the AcbC contributes to action selection by
promoting the choice of uncertain, as well as delayed, reward.
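As an illustrative aside, not part of the thesis's own analysis, the reported indifference points can be set against a simple expected-value benchmark. The sketch below assumes a hypothetical chooser that values each option only by its expected number of pellets; the function names are invented for this illustration. Such a chooser would be indifferent between one certain pellet and four uncertain pellets at p = 0.25, so indifference at a higher probability corresponds to risk-averse choice under this assumption.

```python
# Illustrative sketch only: expected-value benchmark for the probabilistic
# choice task described above. The reinforcer magnitudes, probabilities, and
# indifference points come from the abstract; treating expected pellets as
# the benchmark is an assumption made for this example.

SMALL_PELLETS = 1  # small reinforcer, delivered with certainty (p = 1)
LARGE_PELLETS = 4  # large reinforcer, delivered with probability p
PROBABILITIES = [1, 0.5, 0.25, 0.125, 0.0625]


def expected_pellets(magnitude, probability):
    """Mean number of pellets obtained per choice of an option."""
    return magnitude * probability


def risk_neutral_indifference(small, large):
    """Probability at which the large option matches the certain small one."""
    return small / large


benchmark = risk_neutral_indifference(SMALL_PELLETS, LARGE_PELLETS)
print(f"Risk-neutral indifference probability: {benchmark}")

# Expected yield of the large reinforcer at each probability used in the task.
for p in PROBABILITIES:
    print(f"p = {p}: {expected_pellets(LARGE_PELLETS, p)} pellets expected")

# Expected yield of the large reinforcer at each group's indifference point.
for group, p in [("sham-operated", 0.32), ("AcbC-lesioned", 0.70)]:
    ev = expected_pellets(LARGE_PELLETS, p)
    print(f"{group}: indifferent at p = {p}, i.e. {ev:.2f} expected pellets")
```

On this benchmark, sham-operated rats treated about 1.28 expected pellets from the uncertain option as equivalent to one certain pellet, whereas AcbC-lesioned rats required about 2.8.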
Key words:
delay
uncertainty
impulsivity
addiction
nucleus accumbens
hippocampus