Science - USA (2022-01-07)


9 to 16 were delivered on the usual VR5 sched-
ule but were diluted by a factor of 4. This
caused a disruption in the established goal-
directed behavior without affecting consump-
tion behavior. Mice exhibited significantly
longer action period duration, interbehavioral
sequence duration, and latency to reward con-
sumption (Fig. 5, A and B) caused by increased
non-task-related behaviors both inside and
between behavioral sequences (fig. S14, A to C).
Eventually, all mice obtained and consumed
the maximum number of available outcomes.
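
The session structure described above can be sketched in a few lines of Python. This is an illustration only, not the authors' task code. It assumes that VR5 denotes a variable-ratio schedule with a mean requirement of five actions per outcome, that requirements are drawn uniformly from 3 to 7, and that outcomes 1 to 8 are delivered undiluted. The function names and the `spread` parameter are hypothetical.

```python
import random


def vr_requirements(n_outcomes, mean_ratio=5, spread=2, seed=0):
    """Draw one action requirement per outcome.

    Assumption: requirements are uniform on
    [mean_ratio - spread, mean_ratio + spread]; the source does not
    state the distribution behind the VR5 schedule.
    """
    rng = random.Random(seed)
    return [rng.randint(mean_ratio - spread, mean_ratio + spread)
            for _ in range(n_outcomes)]


def outcome_value_violation_session(n_outcomes=16, first_diluted=9,
                                    dilution=4, seed=0):
    """Build a session in which outcomes from `first_diluted` onward
    are diluted by `dilution` while the action requirement is unchanged."""
    session = []
    requirements = vr_requirements(n_outcomes, seed=seed)
    for index, required in enumerate(requirements, start=1):
        concentration = 1.0 / dilution if index >= first_diluted else 1.0
        session.append({
            "outcome": index,
            "actions_required": required,
            "relative_concentration": concentration,
        })
    return session


session = outcome_value_violation_session()
for trial in session:
    print(trial)
```

Only the outcome concentration changes at outcome 9 in this sketch; the action requirement keeps following the same schedule, which mirrors the statement that diluted outcomes were still delivered on the usual VR5 schedule.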
We then tested the effect of the violation of
the outcome value on the neuronal represen-
tations of action and consumption at both the
single-neuron level and the population level
(326 neurons from four mice). First, whereas
action and consumption neurons detected
during the initial period showed significantly
reduced responses during the perturbed period
(Fig. 5, C and D), additional consumption and
action neurons emerged during the perturbed
period (fig. S15, C and D). Similarly, immedi-
ately after the start of outcome value violation
(i.e., after consumption of the first diluted re-
ward) new action- and consumption-associated
neuronal activity patterns emerged that re-
placed the action- and consumption-associated
neuronal activity patterns of the initial period
(Fig. 5, E and F, and fig. S15F). This effect was
also observed when considering only neuronal
activity patterns associated with the first or
the last action in an action period, indicating
that it was not a consequence of the increased
action period duration (fig. S15G, and see com-
parison with neuronal activity pattern stability
on day 5 in fig. S16).
In the violation of action-outcome contingency
paradigm, the initial eight rewards
were delivered contingent on the actions on
the usual VR5 schedule. Subsequently, outcomes
9 to 20 were delivered noncontingently, that is,
independently of the animal's actions. Whereas at the
behavioral level, this test disrupted goal-directed
behavior similarly to the violation of outcome
value (Fig. 5, G and H, and fig. S14, D to F), the
impact on BLA neuronal activity was different
(434 neurons from five mice). Action neurons
detected during the initial period showed a
reduction of their responses during the entire
perturbed period (Fig. 5I). However, unlike the
immediate effect of violating outcome value,
this change developed slowly after the start of
the perturbation and was not accompanied by
the emergence of additional action neurons
(fig. S15H). In accordance with this finding,
at the population level, the action-associated
neuronal activity patterns gradually lost their
correlation with the action-associated neuro-
nal activity patterns of the initial period, but
no new action-associated neuronal activity pat-
terns emerged during the perturbed period
(Fig. 5K and fig. S15J; also see the results for
the first and last action periods in fig. S15L). In
clear contrast to the impact of the violation of
outcome value, consumption-associated activity
established during the initial period remained
stable throughout the entire perturbed period after
violation of action-outcome contingency (Fig. 5, J to
L, and fig. S15K). As previously reported (17),
we were able to detect a population of con-
sumption neurons that gradually increased
their activity over the course of unexpected
reward deliveries (fig. S15I). However, the over-
all population activity pattern during consump-
tion remained highly correlated throughout
the entire session. Although changing outcome
value resulted in the immediate emergence
of new action and outcome representations,
action-outcome contingency violation resulted
in a gradual loss of the representation of the
action with no emergence of a new stable ac-
tion representation and no effect on reward
consumption representation.
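
The population-level comparison summarized above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. It assumes that each behavioral sequence is summarized by one population vector (one value per neuron) and that similarity to the initial period is measured as the Pearson correlation with the mean initial-period vector. All function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def mean_vector(vectors):
    """Average population vectors neuron by neuron."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def correlation_to_initial(initial, perturbed):
    """Correlate each perturbed-period vector with the initial template."""
    template = mean_vector(initial)
    return [pearson(template, vector) for vector in perturbed]


# Toy data: four neurons, two initial sequences, two perturbed sequences.
initial = [[1.0, 0.2, 0.8, 0.1],
           [0.9, 0.3, 0.7, 0.2]]
perturbed = [[1.9, 0.5, 1.5, 0.3],   # same pattern, scaled: stays correlated
             [0.1, 0.9, 0.2, 0.8]]   # different pattern: correlation is lost
correlations = correlation_to_initial(initial, perturbed)
print(correlations)
```

In this toy example, a pattern that keeps its shape remains strongly correlated with the initial template, whereas a rearranged pattern does not. A gradual loss of representation would appear as correlations that decline across successive perturbed sequences.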

Discussion
Our findings support a crucial role for the BLA
in the motivational control of goal-directed
behavior. At the time of goal-directed action,
BLA PNs integrated and encoded pursued
outcome identity, pursued outcome value, and
action-outcome contingency information. The
maintenance of such prospective, outcome-specific,
and updatable neuronal activity reflects
a specific motivational state (18, 19) that
differs from a primary motivational state
known to energize goal-directed actions in an
unspecific manner (1, 20). Furthermore, this
state could be retrieved upon re-exposure to
the task, but its maintenance depended on
continued reinforcement. At the time of con-
sumption, BLA PNs represented current out-
come identity and value but not licking
behavior. This is consistent with the observa-
tions that BLA neurons encode distinct reward
features (21), including value (22), magnitude

Courtin et al., Science 375, eabg7277 (2022) 7 January 2022 5 of 13


[Fig. 4 image not reproduced. Panels A to D; groups: GFP, ArchT action, ArchT cons.; conditions: laser-OFF and laser-ON; axes: Action #, Actions per min, Duration (s), Beh. sequence, Time (s); behavioral epochs: Action 1, Action 2, Unrewarded lick 1, Unrewarded lick 2, Rewarded lick 1, Rewarded lick 2, Transition, Idle time, Exploration.]

Fig. 4. BLA PN activity is necessary for the maintenance of goal-directed actions. (A) Examples of
the effects of optogenetic manipulations on goal-directed behaviors on day 5 for GFP- and ArchT-expressing
mice. Black dots indicate individual actions; green dots, action period onset; vertical dashed lines, outcome
delivery. Colors indicate behavioral epochs. Shaded yellow areas indicate laser deliveries; double-headed
arrows, laser-ON periods (behavioral sequences 9 to 16). (B) Left to right, actions per minute, action period
duration, action to lick latency, and interbehavioral sequence duration during laser-OFF (behavioral sequences
1 to 8, white bars) and laser-ON periods, respectively (yellow bars; two-sided paired t test for actions per
minute and two-sided Wilcoxon signed-rank tests for the other variables comparing laser-OFF and laser-ON
periods, *P < 0.05; **P < 0.01). GFP, N = 20 mice in three cohorts × two outcomes (N = 12 and N = 8 mice
with laser stimulation during action periods and consumption periods, respectively, with data pooled). ArchT
action, N = 12 mice in two cohorts × two outcomes (laser delivered during action periods); ArchT cons.,
N = 8 mice in two cohorts × two outcomes (laser delivered during consumption periods). Box-and-whisker
plots indicate median, interquartile range, extreme data values, and outliers of the data distribution. (C) Proportions
of different behavioral epochs during laser-ON periods (colors denote behavioral epochs as in (A); black
dots denote a significant increase in behavioral epoch duration; P < 0.01, two-sided Wilcoxon signed-rank tests
comparing laser-OFF and laser-ON periods; see raw values in fig. S11). (D) Mouse behavior during laser-OFF
and laser-ON periods while milk outcome was available (left). Colors denote behavioral epochs as in (A).
Also shown is the duration of the distinct behavioral epochs during the behavioral sequences for each
sequence along laser-OFF and laser-ON periods (right, two-sided Wilcoxon signed-rank test).


RESEARCH | RESEARCH ARTICLE
