Understanding visual scenes is a crucial component of many artificial intelligence applications, ranging from autonomous vehicles and household robotic navigation to automatic image captioning for the blind. Reliably extracting high-level semantic information from the visual world in real time is key to solving these critical tasks safely and correctly. Existing approaches based on specialized recognition models are prohibitively expensive or intractable due to limitations in dataset collection and annotation. Facilitating the sharing of learned information between recognition models makes these applications solvable: multiple tasks can regularize one another, redundant information can be reused, and novel tasks can be learned faster and more easily.
In this thesis, I present algorithms for transferring learned information between visual data sources and across visual tasks, all with limited human supervision. I analyze, both formally and empirically, the adaptation of visual models within the classical domain adaptation setting, and I extend the use of adaptive algorithms to facilitate information transfer between visual tasks and across image modalities.
Most visual recognition systems learn concepts directly from a large collection of manually annotated images or videos. For example, a model that detects pedestrians requires a human to go through thousands or millions of images and mark every instance of a pedestrian. Such a model is susceptible to biases in the labeled data and often fails to generalize to new scenarios: a detector trained in Palo Alto may have degraded performance in Rome, and a detector trained in sunny weather may fail in the snow. Rather than requiring human supervision for each new task or scenario, this work draws on deep learning, transformation learning, and convex-concave optimization to produce novel optimization frameworks that transfer information from large curated databases to real-world scenarios.
Title
Adaptive Learning Algorithms for Transferable Visual Recognition
Published
2016-08-08
Full Collection Name
Electrical Engineering & Computer Sciences Technical Reports
Other Identifiers
EECS-2016-139
Type
Text
Extent
187 p
Archive
The Engineering Library