Unsupervised algorithms, which do not make use of labels, are common in computer vision and apply to a wide range of problem settings. When expert-labeled ground truth is available, however, these algorithms are suboptimal, and altering an unsupervised model to include labels is not always a straightforward modification. In this dissertation, we explore various ways to incorporate human supervision. We start with the task of visual sequence recognition and demonstrate ways to make effective use of temporal information. Next, we tackle the problem of scene segmentation and devise a novel framework to discriminatively train a generative hierarchical model with nonparametric Bayesian priors; the methodology can be readily applied to other nonparametric Bayesian models. Finally, we approach the difficult problem of object segmentation and describe how shape priors can be incorporated into a generative Bayesian segmentation model. We demonstrate the effectiveness of our models and algorithms on datasets that are widely used by the research community and widely regarded as difficult. The dissertation concludes with avenues for future research.