Speech Technology and Research (STAR) Laboratory Seminar Series
Upcoming Talks
Speaker: Sibel Yaman, ICSI, Berkeley
Time: Thursday, November 5th, 2009, 11:00 am
Venue: STAR Lab, EJ 124
Title: Single- and Multi-Objective Programming-Based Approaches to Automatic Spoken Language Identification
Abstract:
Automatic language identification (LID) is the problem of identifying the language being spoken by an unknown speaker from a sample of speech. LID is often used as a front end to a language-specific speech recognition system for applications such as directory assistance, machine translation, and multilingual information retrieval. Existing approaches fall into four categories, which differ in the information sources they use: (i) acoustic approaches, (ii) phonotactic approaches, (iii) approaches that use prosodic and duration information, and (iv) discriminative approaches.
In my PhD research, we developed a single-objective programming (SOP)-based and a multi-objective programming (MOP)-based approach to LID, in which the standard detection performance measures, namely the false-rejection (FR, or miss) and false-acceptance (FA, or false alarm) rates for a number of languages, are minimized simultaneously. To obtain an approximation of the empirical FR and FA rates, we followed the minimum classification error (MCE) framework using linear discriminant functions (LDFs). When the LID task is defined as detecting the language spoken in a speech utterance, the actual goal is to minimize the FA and FR rates for each of the target languages rather than their average. Therefore, we formulated LID as an MOP problem with a total of 2M objectives, where M is the number of target languages/dialects and each objective is either the FA or the FR rate for one of them. The MOP-based approach attempts to make the individual error rates as small as possible without significant degradation in any one of them.
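As a rough illustration of the 2M-objective formulation described above, the following Python sketch computes per-language FR and FA rates from linear discriminant scores, together with a smoothed approximation of each. The abstract does not specify the smoothing function or the decision rule, so the sigmoid smoothing (a common choice in MCE-style training), the zero decision threshold, the `alpha` parameter, and all function names are assumptions made for illustration only, not the method presented in the talk.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ldf_scores(X, W, b):
    # One linear discriminant function per language: g_m(x) = w_m . x + b_m
    # X: (N, D) utterance features, W: (M, D), b: (M,) -> scores: (N, M)
    return X @ W.T + b

def empirical_error_rates(X, y, W, b):
    # Hard counts, assuming language m is "detected" when g_m(x) > 0.
    # Assumes every language has at least one target and one non-target sample.
    s = ldf_scores(X, W, b)
    M = W.shape[0]
    fr = np.array([(s[y == m, m] <= 0).mean() for m in range(M)])  # misses
    fa = np.array([(s[y != m, m] > 0).mean() for m in range(M)])   # false alarms
    return fr, fa

def smoothed_error_rates(X, y, W, b, alpha=1.0):
    # Differentiable stand-ins for the hard counts (assumed sigmoid smoothing);
    # larger alpha makes them approach the empirical rates.
    s = ldf_scores(X, W, b)
    M = W.shape[0]
    fr = np.array([sigmoid(-alpha * s[y == m, m]).mean() for m in range(M)])
    fa = np.array([sigmoid(alpha * s[y != m, m]).mean() for m in range(M)])
    return fr, fa

def objective_vector(X, y, W, b, alpha=1.0):
    # The 2M objectives: M smoothed FR rates followed by M smoothed FA rates.
    fr, fa = smoothed_error_rates(X, y, W, b, alpha)
    return np.concatenate([fr, fa])
```

A single-objective formulation would collapse this vector into one scalar (for example, its mean), whereas a multi-objective formulation keeps the 2M entries separate so that no individual rate is traded away for a better average.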