Multi-stream automatic speech recognition (ASR) systems consisting of an ensemble of classifiers working together, each with its own feature vector, are popular in the research literature. Published work on feature selection for such systems has dealt with indivisible blocks of features. I break from this tradition by investigating feature selection at the level of individual features. I use the OGI ISOLET and Numbers speech corpora, including noisy versions I created using a variety of noises and signal-to-noise ratios. I have made these noisy versions available for use by other researchers, along with my ASR and feature selection scripts.
I start with the random subspace method of ensemble feature selection, in which each feature vector is simply chosen randomly from the feature pool. Using ISOLET, I obtain performance improvements over baseline in almost every case where there is a statistically significant performance difference, but there are many cases with no such difference.
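The random subspace step described above can be illustrated with a minimal sketch. The function name, the pool size of 39 features, the three streams, and the subset size of 13 are all hypothetical values chosen for illustration; the abstract does not state the actual configuration.

```python
import random


def random_subspaces(num_features, num_streams, subset_size, seed=0):
    """Draw one random feature subset per stream from a shared feature pool.

    All parameter values are illustrative assumptions, not taken from the report.
    """
    rng = random.Random(seed)
    pool = range(num_features)
    # Sampling is without replacement inside a stream, so a stream never
    # repeats a feature, but different streams may share features.
    return [sorted(rng.sample(pool, subset_size)) for _ in range(num_streams)]


# Hypothetical setup: 39 pooled features, 3 streams, 13 features per stream.
subsets = random_subspaces(num_features=39, num_streams=3, subset_size=13)
```

Each returned list holds the feature indices that one classifier in the ensemble would receive as its feature vector.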
I then try hill-climbing, a wrapper approach that changes a single feature at a time when the change improves a performance score. With ISOLET, hill-climbing gives performance improvements in most cases for noisy data, but no improvement for clean data. I then move to Numbers, for which much more data is available to guide hill-climbing. When using either the clean or noisy Numbers data, hill-climbing gives performance improvements over multi-stream baselines in almost all cases, although it does not improve over the best single-stream baseline. For noisy data, these performance improvements are present even for noise types that were not seen during the hill-climbing process. In mismatched condition tests involving mismatch between clean and noisy data, hill-climbing outperforms all baselines when Opitz's scoring formula is used. I find that this scoring formula, which blends single-classifier accuracy and ensemble diversity, works better for me than ensemble accuracy as a performance score for guiding hill-climbing.
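The hill-climbing loop described above might look roughly like the following sketch. Several details are assumptions rather than facts from the report: the "single change" is modeled as toggling one feature in or out of a stream's subset, the `score` callable is a hypothetical stand-in for training and evaluating the ensemble, and `blended_score` assumes the accuracy/diversity blend is a simple weighted sum, since the abstract says only that the formula blends the two.

```python
import random


def blended_score(accuracy, diversity, weight=1.0):
    # Assumed form of an accuracy/diversity blend: a weighted sum.
    # The weight value is hypothetical.
    return accuracy + weight * diversity


def hill_climb(subsets, num_features, score, num_passes=1, seed=0):
    """Greedy single-feature hill-climbing over per-stream feature subsets.

    subsets: iterable of feature-index collections, one per stream.
    score:   hypothetical callable(subsets, stream_index) -> float,
             where higher is better.
    """
    rng = random.Random(seed)
    subsets = [set(s) for s in subsets]
    for _ in range(num_passes):
        for i in range(len(subsets)):
            best = score(subsets, i)
            candidates = list(range(num_features))
            rng.shuffle(candidates)
            for f in candidates:
                subsets[i] ^= {f}  # toggle feature f in or out of stream i
                if not subsets[i]:
                    subsets[i] ^= {f}  # never leave a stream with no features
                    continue
                new = score(subsets, i)
                if new > best:
                    best = new  # keep the change
                else:
                    subsets[i] ^= {f}  # revert the change
    return subsets
```

In practice `score` would wrap a full train-and-evaluate cycle on held-out data, for example computing `blended_score` from a stream's accuracy and its disagreement with the rest of the ensemble, which is why the amount of available data matters for guiding the search.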
Title
Ensemble Feature Selection for Multi-Stream Automatic Speech Recognition
Published
2008-12-15
Full Collection Name
Electrical Engineering & Computer Sciences Technical Reports
Other Identifiers
EECS-2008-160
Type
Text
Extent
129 p
Archive
The Engineering Library
Usage Statement
Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).