Large labeled datasets are vital for applying supervised learning to computer vision tasks, but making these datasets takes time and effort. Human-machine collaboration has the potential to mitigate this cost. This work introduces a general-purpose framework for human-to-human and human-machine collaboration on image data. We show that by treating machine learning models as virtual users, multi-user synchronization can support versatile human-machine interaction; in other words, all you need is sync. In order to achieve synchronization behavior that seems correct to users, while also maintaining real-time editing speeds and supporting undo-redo, we adapt operational transformation [8] to the image labeling context. An open-source implementation of the collaboration system is presented as an in-progress addition to Scalabel, an annotation tool for visual data that was used to create the BDD100K dataset [24]. Finally, we give an example of how the collaboration feature can improve the labeling process by integrating Polygon-RNN++ [2] with Scalabel.



