CBN (Computational Biology and Neurocomputing) seminars

A liquid striatal microcircuit model for trajectory learning

by Carlos Toledo Suarez (University of Freiburg, Bernstein Center Freiburg, Germany)

Europe/Stockholm
RB35

Description
In reinforcement learning (RL) theories of the basal ganglia, dopamine is assumed to act as an error signal that guides the update of state values during learning. Although it has been shown that a realistic dopaminergic error signal can drive the variant of RL known as temporal-difference learning [1], that study relied on a predefined partitioning of the environment into discrete states, each encoded by the firing rate of a disjoint set of neurons. A more likely scenario is that neurons take part in encoding multiple states through their spike patterns, and that an appropriate partitioning of the environment is learned on the basis of the actions leading to the highest cumulative reward, such that patterns associated with the same actions are classified together. This is equivalent to reducing the effective number of states involved. Here I present a microcircuit model of the striatum that reproduces experimentally observed activity statistics [2], and I use its transient high-dimensional dynamics [3], i.e. liquid state dynamics, for the supervised learning of trajectories on a flat surface with only four simple linear readouts. I show performance and generalization scans over the scales of cortico-striatal versus intra-striatal synaptic weights for multiple instantiations of the circuit, and compare them against a measure of the circuit's sensitivity to small input perturbations, i.e. chaotic behavior.
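
As a rough illustration of the readout idea only, the sketch below trains four linear readouts on the transient states of a generic random rate-based reservoir. It is not the spiking striatal microcircuit presented in the talk. Everything specific here is an assumption made for the example: the leaky tanh units, the ridge-regression fit, the figure-eight target, the choice of position and velocity as the four readout channels, and the names `in_scale` and `rec_scale`, which merely stand in for the cortico-striatal and intra-striatal weight scales.

```python
import numpy as np

def make_reservoir(n_units, n_inputs, in_scale, rec_scale, rng):
    """Random weights; rec_scale sets the spectral radius of the recurrent matrix."""
    w_in = in_scale * rng.uniform(-1.0, 1.0, size=(n_units, n_inputs))
    w_rec = rng.normal(size=(n_units, n_units))
    w_rec *= rec_scale / np.max(np.abs(np.linalg.eigvals(w_rec)))
    bias = rng.uniform(-0.5, 0.5, size=n_units)  # breaks the odd symmetry of tanh
    return w_in, w_rec, bias

def run_reservoir(inputs, w_in, w_rec, bias, leak=0.3):
    """Drive leaky tanh units with the input and record the state at every step."""
    state = np.zeros(w_rec.shape[0])
    states = np.empty((len(inputs), state.size))
    for step, u in enumerate(inputs):
        drive = np.tanh(w_in @ u + w_rec @ state + bias)
        state = (1.0 - leak) * state + leak * drive
        states[step] = state
    return states

def with_bias_column(states):
    """Append a constant column so each readout has an offset term."""
    return np.hstack([states, np.ones((len(states), 1))])

def fit_readouts(states, targets, ridge=1e-6):
    """Ridge regression; each column of the result is one linear readout."""
    x = with_bias_column(states)
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ targets)

rng = np.random.default_rng(0)
n_units, n_steps, washout = 200, 600, 100

# Assumed task: the input is a phase signal, the target a figure-eight on the plane.
phase = np.linspace(0.0, 6.0 * np.pi, n_steps)
inputs = np.stack([np.sin(phase), np.cos(phase)], axis=1)
targets = np.stack(
    [np.sin(phase), np.sin(2 * phase),       # position x, y
     np.cos(phase), 2 * np.cos(2 * phase)],  # velocity dx, dy
    axis=1,
)

w_in, w_rec, bias = make_reservoir(n_units, 2, in_scale=1.0, rec_scale=0.9, rng=rng)
states = run_reservoir(inputs, w_in, w_rec, bias)

# Discard the initial transient, then fit the four readouts on the remaining states.
w_out = fit_readouts(states[washout:], targets[washout:])
prediction = with_bias_column(states[washout:]) @ w_out
train_mse = float(np.mean((prediction - targets[washout:]) ** 2))

# Crude sensitivity probe: nudge the first input and compare the final states.
perturbed = inputs.copy()
perturbed[0] += 1e-6
final_perturbed = run_reservoir(perturbed, w_in, w_rec, bias)[-1]
divergence = float(np.linalg.norm(final_perturbed - states[-1]))

print(f"training MSE: {train_mse:.2e}, final-state divergence: {divergence:.2e}")
```

Repeating the last part over a grid of `in_scale` and `rec_scale` values, and over several random seeds, would give a toy analogue of the performance and sensitivity scans mentioned in the abstract, although the error reported here is training error only and says nothing about generalization.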