Data-driven and learning-based methodologies have been very popular in modern decision-making systems. In order to make optimal use of data and computational resources, these problems require theoretically sound procedures for choosing between estimators, tuning their parameters, and understanding bias/variance trade-offs. In many settings, asymptotic and/or worst-case theory fails to provide the relevant guidance.

In this dissertation, I present some recent advances that involve a more refined approach, one that leads to non-asymptotic and instance-optimal guarantees. Focusing on function approximation methods for policy evaluation in reinforcement learning, in Part I, I describe a novel class of optimal oracle inequalities for projected Bellman equations, as well as computationally efficient algorithms achieving them. In contrast to corresponding results for ordinary regression, the approximation pre-factor depends on the geometry of the problem, and can be much larger than unity. In Part II, I discuss optimal procedures for estimating linear functionals from observational data. Our theory reveals a rich spectrum of behavior beyond the asymptotic semi-parametric efficiency bound. It also highlights the fundamental roles of geometry, and provides concrete guidance on practical procedures and parameter choices.




Download Full History