Description
Convolutional neural networks (CNNs) for image classification are trained to learn compact and informative representations from high-dimensional images to make accurate predictions. CNNs are burdened with learning to distinguish between information relevant for classification and noise. By training CNNs on perceptually compressed images, we show several benefits of removing irrelevant content from the data prior to training. We generate compressed data sets using JPEG and learned compression models at various quality levels and use them to train image classification models. First, we explore these classifiers' ability to maintain high accuracy when trained on compressed data. Next, we compare the performance of classifiers trained on compressed and uncompressed data as the number of trainable parameters is reduced. Finally, we show that compressed data can serve as a form of data augmentation, yielding gains in robustness to high- and low-frequency image distortions while preserving accuracy on the original data. By attending more closely to the quality of information contained in our training data, we can reduce reliance on large, high-quality image data sets, avoid highly parameterized neural networks, and train robust models. We hope these findings will motivate the future use of compressed data for training deep learning models.
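The core preprocessing step described above, generating JPEG-compressed variants of training images at several quality levels, can be sketched as follows. This is a minimal illustration using Pillow; the function name, quality levels, and synthetic input image are assumptions for demonstration, not the exact pipeline used in this work.

```python
from io import BytesIO

import numpy as np
from PIL import Image


def jpeg_compress(img: Image.Image, quality: int) -> Image.Image:
    """Round-trip an image through JPEG at the given quality level (1-95).

    Lower quality discards more perceptual detail, which is the effect
    exploited when building compressed training sets.
    """
    buf = BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


# Synthetic 64x64 RGB image as a stand-in for a training example.
rng = np.random.default_rng(0)
original = Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))

# Compressed variants at several (illustrative) quality levels, as one might
# do when constructing a compressed data set or augmenting the original one.
variants = {q: jpeg_compress(original, q) for q in (10, 50, 90)}
```

Each variant keeps the original spatial dimensions and label, so the compressed copies can be mixed directly into a training set as augmentation.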