Open Problems
Various problems for which I would like to gain a deeper understanding.
- Bounded filtering: How do you perform approximate inference over sequences when you have limited time and memory to process each new piece of information, as in online learning?
- Bayesian inference with incorrect models: What happens when the data-generating process is not in the hypothesis space, and how should an agent cope? In an RL context, to what degree is it “acceptable” to have an incorrect model?
- Rational irrationality: As Upton Sinclair said, “It is difficult to get a man to understand something, when his salary depends on his not understanding it.” In certain situations, such as multi-agent environments, it might be rational for an agent to possess incorrect beliefs. What are the implications of this for RL?
- Timescale-invariant models: Humans are able to reason abstractly and make predictions about events that are arbitrarily far into the future. This is in stark contrast to sequence predictors like WaveNet and char-rnn, which predict only one step ahead. Can we come up with NN architectures that can do the same?
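The bounded-filtering question can be made concrete with a fixed-size particle filter: per-step compute and memory stay constant no matter how long the sequence grows. This is a minimal sketch under assumed toy dynamics (a 1D Gaussian random walk with Gaussian observations); the model, constants, and function name are illustrative, not from the text.

```python
import math
import random

def bounded_filter_step(particles, observation, n_particles=100,
                        process_std=1.0, obs_std=1.0):
    """One online update of a fixed-size particle filter.

    Memory and compute per step are O(n_particles), independent of
    sequence length -- a crude form of bounded filtering.
    """
    # Propagate: sample the next latent state for each particle.
    moved = [p + random.gauss(0.0, process_std) for p in particles]
    # Weight each particle by the likelihood of the new observation.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in moved]
    if sum(weights) == 0.0:
        return moved  # degenerate case: keep particles unweighted
    # Resample with replacement so the particle count stays fixed.
    return random.choices(moved, weights=weights, k=n_particles)

# Track a latent random walk online from noisy observations.
random.seed(0)
state, particles = 0.0, [0.0] * 100
for _ in range(50):
    state += random.gauss(0.0, 1.0)          # latent state drifts
    obs = state + random.gauss(0.0, 1.0)     # noisy observation
    particles = bounded_filter_step(particles, obs)

estimate = sum(particles) / len(particles)   # posterior mean estimate
```

The open question is what happens when, unlike here, the fixed budget is too small to represent the posterior well, and how to degrade gracefully.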