We propose a motion segmentation algorithm that aims to break a scene into its most prominent moving groups. Instead of identifying point correspondences between image frames, the idea is to find which groups of pixels are transformed from one image frame to another. To do this, we treat the image sequence as a three-dimensional spatiotemporal data set and construct a weighted graph by taking each pixel as a node and connecting pixels that lie in each other's spatiotemporal neighborhood. We define a motion profile vector at each image pixel that captures the probability distribution of the image velocity at that point. By defining a distance between the motion profiles at two pixels, we can assign a weight to the graph edge connecting them. Using normalized cuts, we find the most salient partitions of the spatiotemporal volume formed by the image sequence. Each partition, itself a spatiotemporal volume, corresponds to a group of pixels moving coherently in space and time. Normalized cuts can be computed efficiently by solving a generalized eigenvalue problem.
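As a minimal sketch of the final step, the normalized-cut bipartition can be obtained by solving the generalized eigenvalue problem (D - W) y = λ D y on the graph's weight matrix W and degree matrix D, then thresholding the eigenvector with the second-smallest eigenvalue. The toy weight matrix below (two small cliques joined by one weak edge, standing in for two coherently moving pixel groups) is an illustrative assumption, not data from the paper:

```python
import numpy as np
from scipy.linalg import eigh

def normalized_cut(W):
    """Bipartition a graph given its symmetric weight matrix W.

    Solves the generalized eigenvalue problem (D - W) y = lambda * D * y
    and thresholds the second-smallest eigenvector at zero.
    """
    d = W.sum(axis=1)
    D = np.diag(d)
    L = D - W                       # graph Laplacian
    # eigh solves the generalized problem; eigenvalues come back ascending.
    vals, vecs = eigh(L, D)
    y = vecs[:, 1]                  # second-smallest eigenvector
    return y >= 0                   # boolean partition labels

# Toy graph: two 3-node cliques (weight 1 inside each) joined by a
# single weak edge of weight 0.1 between nodes 2 and 3.
W = np.zeros((6, 6))
for i in range(3):
    for j in range(3):
        if i != j:
            W[i, j] = W[i + 3, j + 3] = 1.0
W[2, 3] = W[3, 2] = 0.1
labels = normalized_cut(W)   # separates {0,1,2} from {3,4,5}
```

In the full algorithm, W would hold the motion-profile distances between spatiotemporally neighboring pixels rather than hand-set weights; the eigenvector computation is the same.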
For segmenting long image sequences, we have developed a recursive update procedure that incorporates knowledge of the segmentation in previous frames to efficiently find the group correspondence in the new frame. It speeds up the segmentation significantly when there is no major scene change, while in the worst case, when there is a major scene change, the algorithm performs correctly as if no prior knowledge were used. Experimental results on various synthetic and real image sequences are presented.