Description
The recent surge in highly successful, but opaque, machine-learning models has given rise to a dire need for interpretability. This work addresses the problem of interpretability with novel definitions, methodology, and scientific investigations, ensuring that interpretations are useful by grounding them in the context of real-world problems and their audiences. We begin by defining what we mean by interpretability, along with some desiderata surrounding it, emphasizing the underappreciated role of context. We then develop novel methods for interpreting and improving neural-network models, focusing on how best to score, use, and distill feature interactions. Next, we turn from neural networks to relatively simple rule-based models, investigating how to improve predictive performance while keeping the model extremely concise. Finally, we conclude with work on open-source software and data for facilitating interpretable data science. In each case, the work is grounded in a specific context that motivates the proposed methodology, with applications ranging from cosmology to cell biology to medicine. Code for everything is available at github.com/csinva.
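
As a rough illustration of what it means to score a feature interaction, the sketch below estimates a pairwise interaction for a black-box model by ablation, comparing the change in the model's output when two features are removed together versus separately. This is a generic ablation-based score, not the specific method developed in this work; the function name `interaction_score` and the zero baseline are illustrative assumptions.

```python
import numpy as np

def interaction_score(f, x, i, j, baseline=0.0):
    """Ablation-based pairwise interaction score for a black-box model f.

    Measures how much the joint effect of features i and j on f(x)
    deviates from the sum of their individual effects; a score of 0
    means the two features contribute (approximately) additively.
    """
    x = np.asarray(x, dtype=float)

    def ablate(features):
        x_abl = x.copy()
        x_abl[list(features)] = baseline  # replace features with a baseline value
        return f(x_abl)

    # f(x) - f(x \ {i}) - f(x \ {j}) + f(x \ {i, j})
    return f(x) - ablate({i}) - ablate({j}) + ablate({i, j})

# Toy model with a multiplicative interaction between the first two features:
model = lambda x: x[0] * x[1] + x[2]
x = np.array([2.0, 3.0, 1.0])
print(interaction_score(model, x, 0, 1))  # nonzero: x0 and x1 interact
print(interaction_score(model, x, 0, 2))  # zero: x0 and x2 are additive
```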
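
Similarly, to make concrete what an "extremely concise" rule-based model can look like, here is a minimal sketch using scikit-learn as a stand-in (not the models proposed in this work): a decision tree capped at a handful of leaves, small enough to print and read in its entirety.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X, y = data.data, data.target

# Cap the number of leaves so the entire model stays human-readable
model = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0)
model.fit(X, y)

# The full model prints as a handful of if-then rules
print(export_text(model, feature_names=list(data.feature_names)))
```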