Light-field cameras have recently become available to the consumer market. An array of micro-lenses captures enough information that one can refocus images after acquisition, as well as shift one's viewpoint within the sub-apertures of the main lens, effectively obtaining multiple views. Thus, depth cues from defocus, correspondence, specularity, and shading are available simultaneously in a single capture. Previously, defocus cues could be obtained only through multiple exposures focused at different depths; correspondence and specularity cues required multiple exposures at different viewpoints or multiple cameras; and shading cues required carefully controlled scenes and low-noise data. Moreover, all four cues could not easily be obtained together. In this thesis, we present a novel framework that decodes the light-field images from a consumer Lytro camera and uses the decoded image to compute a dense depth estimate from the four depth cues: defocus, correspondence, specularity, and shading. By using both defocus and correspondence cues, depth estimation becomes more robust to noisy consumer-grade data than in previous work. Shading cues from light-field data enable us to better regularize depth and estimate shape. By using specularity cues, we formulate a new depth measure that is robust to specular reflections, making it suitable for glossy scenes. Combining the cues yields a high-quality depth map that is suitable for a variety of complex computer vision applications.



