Description
In this thesis, I present algorithms for transferring learned information between visual data sources and across visual tasks, all with limited human supervision. I both formally and empirically analyze the adaptation of visual models within the classical domain adaptation setting, and I extend adaptive algorithms to facilitate information transfer between visual tasks and across image modalities.
Most visual recognition systems learn concepts directly from a large collection of manually annotated images or videos. A pedestrian detector, for example, requires a human to manually go through thousands or millions of images and mark every instance of a pedestrian. Such a model is susceptible to biases in the labeled data and often fails to generalize to new scenarios: a detector trained in Palo Alto may perform poorly in Rome, and a detector trained in sunny weather may fail in the snow. Rather than requiring human supervision for each new task or scenario, this work draws on deep learning, transformation learning, and convex-concave optimization to produce novel optimization frameworks that transfer information from large curated databases to real-world scenarios.