The success of AI has spurred the rise of Machine Learning as a Service (MLaaS), in which companies develop, maintain, and serve general-purpose models, such as object detectors and image classifiers, to users who pay a fixed rate per inference. As more organizations incorporate AI into their operations, the MLaaS market is set to expand, making cost optimization for these services necessary, particularly in high-volume applications. We explore a simple yet effective way to increase model efficiency: aggregating multiple images into a grid before inference. Gridding significantly reduces the number of inferences required to process a batch of images, with varying drops in accuracy. To counter the slight decrease in object detection accuracy, we introduce ImGrid, a technique that decides when to reprocess gridded images at a higher resolution based on model confidence and bounding box area. Experiments on open-source and commercial models show that ImGrid reduces the number of inferences by 50% with little impact on mean Average Precision (mAP) for the Pascal VOC object detection task.
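
The gridding and reprocessing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 2x2 grid layout, the detection format `(x1, y1, x2, y2, score)`, and the thresholds `min_conf` and `min_area_frac` are all assumptions chosen for illustration, and the actual ImGrid decision rule may differ.

```python
import numpy as np


def make_grid(images, rows=2, cols=2):
    """Tile rows*cols equally sized HxWxC images into one image."""
    if len(images) != rows * cols:
        raise ValueError("expected rows * cols images")
    strips = [
        np.concatenate(images[r * cols:(r + 1) * cols], axis=1)
        for r in range(rows)
    ]
    return np.concatenate(strips, axis=0)


def assign_to_cells(detections, cell_h, cell_w, rows=2, cols=2):
    """Map grid-level detections back to their source image.

    Each detection is assigned to the cell containing its box center,
    and its coordinates are shifted into that cell's frame.
    """
    per_cell = [[] for _ in range(rows * cols)]
    for x1, y1, x2, y2, score in detections:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        c = min(int(cx // cell_w), cols - 1)
        r = min(int(cy // cell_h), rows - 1)
        ox, oy = c * cell_w, r * cell_h
        per_cell[r * cols + c].append((x1 - ox, y1 - oy, x2 - ox, y2 - oy, score))
    return per_cell


def needs_reprocessing(cell_dets, cell_h, cell_w,
                       min_conf=0.5, min_area_frac=0.05):
    """Hypothetical rule: reprocess an image on its own if any detection
    is low-confidence or covers only a small fraction of the cell."""
    cell_area = cell_h * cell_w
    for x1, y1, x2, y2, score in cell_dets:
        area_frac = (x2 - x1) * (y2 - y1) / cell_area
        if score < min_conf or area_frac < min_area_frac:
            return True
    return False


def inference_count(num_images, num_reprocessed, rows=2, cols=2):
    """One inference per grid plus one per individually reprocessed image."""
    grids = -(-num_images // (rows * cols))  # ceiling division
    return grids + num_reprocessed
```

Under these assumptions, a batch is first run as grids, detections are mapped back to their source images, and only the images flagged by `needs_reprocessing` are sent again individually. The total cost is then `inference_count(num_images, num_reprocessed)` rather than one inference per image.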



