The remarkable success of modern machine learning has arguably been due to the ability of algorithms to combine powerful models, such as neural networks, with large-scale datasets. This data-driven paradigm has been applied to a variety of applications, from computer vision and speech processing to machine translation and question answering. However, the majority of these successes have been in prediction problems, such as supervised learning. In contrast, many real-world applications of machine learning involve decision-making problems, in which one must leverage learned models to select actions that maximize some objective of interest. Unfortunately, learned models can fail in these settings due to issues such as distribution shift and model exploitation. This thesis proposes methods and algorithms designed to address these shortcomings in modern machine learning in order to produce reliable decision-making agents. We begin in the area of reinforcement learning, where we study robust algorithms for offline reinforcement learning and model-based reinforcement learning. We then discuss considerations for benchmarking offline reinforcement learning and off-policy evaluation, and propose a variety of domains and datasets designed to stress-test state-of-the-art algorithms in the area. Finally, we study the more general problem of model-based optimization and show how information-theoretic principles can guide the construction of uncertainty-aware models that mitigate exploitation.