Classifying handwritten digits has been heavily researched, and it serves as a clearly defined shape-matching problem. One contribution of this thesis is a prototype-based classifier that drastically reduces the number of prototypes needed for the nearest-neighbor technique.
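As a minimal illustration of prototype-based classification (the thesis's own prototype-selection procedure is not described here, so class-wise k-means centroids stand in as the prototypes, and Euclidean distance stands in for the shape distance):

```python
import numpy as np

def fit_prototypes(X, y, per_class=3, iters=20, seed=0):
    """Condense a training set into a few prototypes per class via k-means.

    Hypothetical sketch: centroids of each class serve as prototypes,
    replacing the full training set for nearest-neighbor classification.
    """
    rng = np.random.default_rng(seed)
    protos, labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        # Initialize centroids from random training points of this class.
        centers = Xc[rng.choice(len(Xc), per_class, replace=False)]
        for _ in range(iters):
            # Assign each point to its nearest centroid, then re-estimate.
            d = np.linalg.norm(Xc[:, None] - centers[None], axis=2)
            assign = d.argmin(axis=1)
            for j in range(per_class):
                if np.any(assign == j):
                    centers[j] = Xc[assign == j].mean(axis=0)
        protos.append(centers)
        labels += [c] * per_class
    return np.vstack(protos), np.array(labels)

def predict_nearest_prototype(protos, labels, x):
    """1-NN classification against the reduced prototype set."""
    return labels[np.linalg.norm(protos - x, axis=1).argmin()]
```

With `per_class` prototypes instead of the full training set, the classifier's memory and query cost shrink by the same factor, which is the point of prototype reduction.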
A second contribution of this thesis is a technique for determining which parts of a shape are most informative for classification. This notion can be formalized in the machine-learning framework of feature selection. Studying discriminative power through feature selection yields insight into (1) the usefulness of individual parts of the shape and (2) the classification process of a general linear-model classifier on shape data.
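One common way to make this concrete (a sketch only; the thesis's actual selection criterion is not specified here) is to treat each shape part as one feature and rank parts by the magnitude of a trained linear classifier's weights:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def rank_shape_features(X, y, top=5):
    """Rank features by discriminative power under a linear classifier.

    Hypothetical sketch: each column of X is a measurement for one shape
    part; larger absolute weights in the fitted linear model indicate
    parts that contribute more to the class decision.
    """
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    # For multi-class problems, aggregate absolute weights across classes.
    scores = np.abs(clf.coef_).sum(axis=0)
    return np.argsort(scores)[::-1][:top]
```

The ranking doubles as a diagnostic: inspecting which parts the linear model leans on reveals how it separates the classes, the second kind of insight mentioned above.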
The most significant contribution of this thesis is a new learning technique for visual categorization, "SVM-KNN", which draws on aspects of two well-known techniques: support vector machines (SVM) and K-nearest neighbor. The basic idea is to find the close neighbors of a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. This technique is well matched to the distinctive challenges of general visual recognition: a very large number of classes, few training examples per class, and high intra-class variation. The approach is also quite flexible, permitting recognition based on color, texture, and particularly shape, in a homogeneous framework.
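The basic idea can be sketched as follows. This is a simplified illustration, not the thesis's implementation: Euclidean distance stands in for the task-specific distance functions, and an off-the-shelf RBF kernel stands in for the distance-derived kernel.

```python
import numpy as np
from sklearn.svm import SVC

def svm_knn_predict(X_train, y_train, x_query, k=10, C=1.0):
    """Classify one query with a local SVM trained on its k nearest neighbors.

    Hypothetical sketch of SVM-KNN: Euclidean distance selects the
    neighborhood; a distance-based (RBF) kernel stands in for a kernel
    derived from the task's own distance function.
    """
    d = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(d)[:k]
    X_nb, y_nb = X_train[idx], y_train[idx]
    # If all k neighbors agree, plain k-NN already decides; skip the SVM.
    if len(np.unique(y_nb)) == 1:
        return y_nb[0]
    # Otherwise train an SVM on the neighborhood only and classify the query.
    clf = SVC(kernel="rbf", C=C, gamma="scale").fit(X_nb, y_nb)
    return clf.predict(x_query.reshape(1, -1))[0]
```

Because the SVM is trained on only k samples per query, the expensive multi-class training step is deferred to query time and kept local, which is what makes the hybrid tractable on problems with many classes.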
Our hybrid method has reasonable computational complexity both in training and at run time, and it yields excellent results in practice. A wide variety of distance functions can be used, and experiments show state-of-the-art performance on several benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101, the technique achieved a correct classification rate of 62.42% ± 0.41% using only fifteen training examples per class, outperforming other published approaches at the time.