Speaker
Bert Kappen
(Radboud University Nijmegen)
Description
Computing a course of action in the presence of uncertainty
is the topic of stochastic optimal control theory. Such
computations require solving complex partial differential
equations and become intractable for most problems. I will
introduce a class of control problems that can be expressed
as the minimization of a KL divergence and that can be
mapped onto a graphical model inference problem.
In this talk, we show how to apply this theory in the context
of a delayed choice task and for collaborating agents. We
first introduce the KL control framework. Then we show that,
in a delayed reward task where the future is uncertain, it is
optimal to delay the timing of your decision. We show
preliminary results on human subjects that confirm this
prediction. Subsequently, we discuss two-player games, such
as the stag-hunt game, where collaboration can improve or
worsen as a result of recursive reasoning about the
opponent's actions. The Nash equilibria appear as local
minima of the optimal cost-to-go, but may disappear when
monetary gain decreases. This behaviour is in agreement with
experimental findings in humans. We subsequently extend the
setting to delayed rewards and show how cooperation develops
as a result of recursive reasoning. Suboptimal cooperation
arises as local minima of the objective function.