Complex behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning to reach subjective goals. A fundamental challenge in neuroscience is how to use behavior and neural activity to understand this internal model and its dynamic latent variables. Here we interpret behavioral data by assuming that an agent behaves rationally: it takes actions that optimize its subjective reward according to its understanding of the task and its relevant causal variables. We apply a new method, Inverse Rational Control (IRC), to learn an agent's internal model and reward function by maximizing the likelihood of its measured sensory observations and actions. The method thereby extracts rational, interpretable thoughts of the agent from its behavior. We also provide a framework for interpreting the encoding, recoding, and decoding of neural data in light of this rational model of behavior. When applied to behavioral and neural data from simulated agents performing suboptimally on a naturalistic foraging task, the method successfully recovers their internal model and reward function, as well as the Markovian computational dynamics within the neural manifold that represents the task. This work lays a foundation for discovering how the brain represents and computes with dynamic latent variables.
1. Distinguish rational behavior from optimal behavior
2. Identify how to use estimates of latent mental variables to predict brain dynamics