Description
We begin in the setting of machine-learning-based design. The goal in this setting is to propose novel objects, such as proteins, small molecules, or materials, that have desired properties, guided by machine-learning models of those properties. To make such models more trustworthy for design, we propose (i) a method for learning models that accounts for the distribution shifts inherent to design, and (ii) a method for constructing statistically valid confidence sets for the properties of objects designed using machine learning.
Finally, we examine the trustworthy use of machine learning for drawing scientific conclusions. In particular, we consider the increasingly relevant setting of treating predictions made by machine-learning systems as "data" in estimating quantities of scientific interest. We propose prediction-powered inference, a novel statistical framework for constructing valid confidence sets in this setting, which enables researchers to incorporate evidence from machine-learning systems into their scientific inquiry in a standardized and principled way.
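To make the idea concrete, the following is a minimal sketch of a prediction-powered confidence interval for the simplest target, a population mean. It is an illustration under stated assumptions, not a definitive implementation of the framework: the function name `ppi_mean_ci` and its parameters are hypothetical, the labeled and unlabeled samples are assumed to be drawn i.i.d. from the same distribution, and the interval relies on a large-sample normal approximation. The estimate averages the model's predictions on the unlabeled data and then corrects it by the average prediction error measured on the labeled data.

```python
from math import sqrt
from statistics import NormalDist, fmean, variance


def ppi_mean_ci(y_labeled, preds_labeled, preds_unlabeled, alpha=0.05):
    """Sketch of a prediction-powered confidence interval for a mean.

    Hypothetical helper for illustration only. Assumes the labeled and
    unlabeled samples are i.i.d. from the same distribution and that both
    are large enough for a normal approximation to be reasonable.
    """
    n = len(y_labeled)        # number of labeled examples
    N = len(preds_unlabeled)  # number of unlabeled examples

    # Rectifier: per-example prediction error on the labeled data.
    residuals = [y - f for y, f in zip(y_labeled, preds_labeled)]

    # Point estimate: mean prediction on unlabeled data plus mean error.
    theta_hat = fmean(preds_unlabeled) + fmean(residuals)

    # Standard error combines the uncertainty of the two independent averages.
    se = sqrt(variance(preds_unlabeled) / N + variance(residuals) / n)

    # Two-sided normal quantile for the requested coverage level.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * se, theta_hat + z * se


# Toy usage with made-up numbers (not data from this work).
lower, upper = ppi_mean_ci(
    y_labeled=[1.0, 2.0, 3.0, 4.0],
    preds_labeled=[1.5, 2.5, 3.5, 4.5],
    preds_unlabeled=[2.0, 3.0, 4.0, 3.0],
)
print(lower, upper)
```

When the predictions are accurate, the residuals have small variance and the interval is driven mainly by the large unlabeled sample. When they are poor, the residual term dominates and the interval widens rather than becoming invalid.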