Structured Approaches to Data Selection for Speaker Recognition

Lei, Howard Hao; EECS Department, University of California

Description

In this work, I investigated structured approaches to data selection for speaker recognition, with an emphasis on information-theoretic approaches, as well as approaches based on speaker-specific differences that arise from speech production. These approaches rely on speaker discriminability measures that detect the speech regions yielding high speaker differentiation. I also attempted to understand why certain data regions result in better speaker recognition system performance.

The knowledge gained from the speaker discriminability measures was used to implement an effective data selection procedure that allows for the prediction of how well a speaker recognition system will perform without actually implementing the system. The use of speaker discriminability measures also leads to data reduction in speaker recognition training and testing, allowing for faster modeling and easier data storage, given that the latest speaker recognition corpora occupy hundreds of gigabytes.

In particular, I focused on Gaussian Mixture Model (GMM)-based speaker recognition systems, which comprise the majority of current state-of-the-art speaker recognition systems. I investigated methods to make the speaker discriminability measures easy to obtain, so that extracting these measures from the data requires significantly fewer computational resources than running entire speaker recognition systems to determine which regions of speech are speaker-discriminative.

Upon selecting the speech data using these measures, I created new speech units based on the data selected. The speaker recognition performance of the new speech units was compared to that of the existing units (mainly mono-phones and words), both standalone and in combination. I found that, in general, the new speech units are more speaker-discriminative than the existing ones, and speaker recognition systems that use the new speech units as data generally outperformed systems using the existing speech units. This work therefore outlines an effective, easy-to-implement approach for selecting speaker-discriminative regions of data for speaker recognition.

Details

Title

Structured Approaches to Data Selection for Speaker Recognition

Creator

Lei, Howard Hao, Author
EECS Department, University of California, Publisher

Published

2010-12-08

Full Collection Name

Electrical Engineering & Computer Sciences Technical Reports

Other Identifiers

EECS-2010-150

Type

Text

Format

technical reports

Extent

90 p

Archive

The Engineering Library

Usage Statement

Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).

Collection

EECS Technical Reports
