Description
In this work, we approach Machine Learning Safety from four angles: robustness, anomaly detection, alignment, and systemic safety. Concretely, we introduce PixMix to comprehensively improve performance on robustness, calibration, consistency, and monitoring. We curate the Species dataset for large-scale anomaly detection. We create the Jiminy Cricket game environments to measure how well ML agents understand morality and act in accordance with it. We collect a large suite of emotionally evocative videos to show traction on preference learning. Additionally, we curate the MMLU benchmark to measure large language models' knowledge across 57 different domains, and a forecasting benchmark to measure their ability to predict future trends and events.