Description
AI applications have advanced greatly in recent years, so much so that the people who design and build them often have difficulty understanding, predicting and controlling what they do. To make progress on these fundamental challenges, I believe that developing a solid mathematical foundation for AI is both beneficial and possible. The research presented in this dissertation is an attempt to chip away at a few small aspects of that endeavor.
The dissertation is divided into three parts.
Part I addresses a question at the core of learning theory: "how much data is necessary for supervised learning?" Concretely, Chapter 2 considers statistical settings, in which training data is drawn from an unknown distribution. Here, we answer a question posed by Antos and Lugosi (1996) concerning the shape of learning curves. Chapter 3 considers adversarial settings, in which few or no assumptions are made regarding the source of the training data. Specifically, we chart the landscape of transductive online learning.
Part II investigates whether it is possible to verify the optimality of a machine learning outcome offered by an untrusted party at a cost significantly lower (in terms of compute, or the quantity or quality of training data) than the cost of running a trusted machine learning system. This question has many parallels in the theory of computation, and it also has tangible implications for the economics of selling machine learning as a service.
Part III studies notions of stability in machine learning. We offer a taxonomy that can help make sense of an assortment of seemingly unrelated stability definitions that have appeared in the learning theory literature.
The dissertation consists of five chapters, each of which is self-contained and can be read independently. However, a small number of basic notions crop up repeatedly in different guises. Taken together, the chapters showcase the richness, versatility and unity of learning theory.