Optogenetic Blockade of Dopamine Transients Prevents Learning Induced by Changes in Reward Features

Maria Gonzalez Di Tillio, Geoffrey Schoenbaum, Chun Yun Chang, Matthew Gardner


Prediction errors are critical for associative learning [1, 2]. Transient changes in dopamine neuron activity correlate with positive and negative reward prediction errors and can mimic their effects [3–15]. However, although causal studies show that dopamine transients of 1–2 s are sufficient to drive learning about reward, these studies do not address whether they are necessary (but see [11]). Further, the precise nature of this signal is not yet fully established. Although it has been equated with the cached-value error signal proposed to support model-free reinforcement learning, cached-value errors are typically confounded with errors in the prediction of reward features [16]. Here, we used optogenetic and transgenic approaches to prevent transient changes in midbrain dopamine neuron activity during the critical error-signaling period of two unblocking tasks. In one, learning was unblocked by increasing the number of rewards, a manipulation that induces errors in predicting both value and reward features. In the other, learning was unblocked by switching from one reward to another of equal value, a manipulation that induces errors only in reward feature prediction. Preventing dopamine neurons in the ventral tegmental area from firing for 5 s, beginning before and continuing until after the changes in reward, prevented unblocking of learning in both tasks. A suppression of similar duration did not induce extinction when delivered during an expected reward, indicating that it did not act independently as a negative prediction error. This result suggests that dopamine transients play a general role in error signaling rather than being restricted to signaling errors in value.

Publisher URL: http://www.cell.com/current-biology/fulltext/S0960-9822(17)31247-2

DOI: 10.1016/j.cub.2017.09.049

