Consistency Modeling



SRI's speaker-independent, continuous-speech recognition system (DECIPHER(TM)) is based on hidden Markov models (HMMs). Optimum HMM state clustering, Gaussian mixture modeling, and statistical language modeling are combined to give state-of-the-art performance. In addition, robustness to varying acoustic conditions, such as different channels, noise, and nonnative speakers, is achieved through noise-robust feature extraction and acoustic adaptation technology. New research ideas are implemented within the DECIPHER software so that they are available across different projects. Results of such research are incorporated into applications through the use of the Nuance recognizer, developed by Nuance Communications, a spin-off from SRI's Speech Technology and Research (STAR) Laboratory. Close collaboration between the STAR Laboratory and Nuance Communications facilitates the quick integration of novel research conducted by STAR Laboratory researchers into the Nuance recognizer.

We are working on the following areas:

Recognition Accuracy

Recognition accuracy will be improved by robust training. This will be accomplished by developing techniques to determine the number of model parameters that can be robustly estimated. New parameter-sharing techniques will be developed that will result in fewer, more robustly estimated parameters. Methods based on our previous work in acoustic adaptation will be used to robustly train large recognition models.

Recognition Speed and Memory

Recognition speed will be increased and memory decreased by developing methods to remove the large amount of redundancy that exists in current modeling techniques. This will result in a significant speed-up, while decreasing the number of model parameters and increasing recognition accuracy.

Acoustic Adaptation

Acoustic adaptation algorithms will be developed to port speech models to new, unseen domains with only a small amount of target-specific data. For adaptation, the information from the small amount of target-specific data will be augmented by using correlations with the large amount of available training data.
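
The text above does not name a specific adaptation algorithm. As a rough illustration of the general idea of combining a small amount of target data with a model trained on a large corpus, the sketch below uses maximum a posteriori (MAP) adaptation of a single Gaussian mean. This is a stand-in technique chosen for illustration, not necessarily the method used in DECIPHER. The function name, the prior weight tau, and the toy values are all assumptions.

```python
def map_adapt_mean(prior_mean, target_frames, tau=10.0):
    """MAP estimate of a Gaussian mean (illustrative stand-in technique).

    prior_mean    -- mean estimated from the large training set
    target_frames -- feature vectors from the small target-domain set
    tau           -- hypothetical prior weight: larger values trust the
                     original model more, smaller values trust the new data
    """
    n = len(target_frames)
    dim = len(prior_mean)
    # Per-dimension sum of the target-domain observations.
    sums = [sum(frame[d] for frame in target_frames) for d in range(dim)]
    # Weighted interpolation between the prior mean and the target data.
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


# Toy usage with invented numbers: two target frames pull the mean
# halfway toward the new data when tau equals the frame count.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], tau=2.0)
```

With no target frames the function returns the prior mean unchanged, and as the number of frames grows the result approaches the target-data average, which is the behavior one would want when only a little target-specific data is available.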

Named-Entity Recognition

A statistical class-based language modeling approach will be studied for named-entity recognition. The classes in the grammar will correspond to the entities to be recognized. This will allow both the word string and the named entities to be simultaneously recognized.
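
A minimal sketch of the class-based idea, assuming a bigram model over classes in which each class emits words: the joint probability of a word string and its class labels is the product of P(class | previous class) and P(word | class). The class inventory, the probability tables, and the function name below are invented for illustration and are not taken from the SRI system. A full recognizer would fold these scores into the search over word hypotheses; here the word string is given, and only the best class sequence is found with a Viterbi search.

```python
import math

# Hypothetical toy model: every number below is invented for illustration.
CLASSES = ["OTHER", "PERSON", "CITY"]

# P(class | previous class); "<s>" marks the start of the sentence.
TRANS = {
    "<s>":    {"OTHER": 0.6, "PERSON": 0.3, "CITY": 0.1},
    "OTHER":  {"OTHER": 0.6, "PERSON": 0.2, "CITY": 0.2},
    "PERSON": {"OTHER": 0.7, "PERSON": 0.2, "CITY": 0.1},
    "CITY":   {"OTHER": 0.8, "PERSON": 0.1, "CITY": 0.1},
}

# P(word | class); words missing from a class get a small floor value.
EMIT = {
    "OTHER":  {"flew": 0.4, "to": 0.5, "paris": 0.01, "smith": 0.01},
    "PERSON": {"smith": 0.6, "paris": 0.2},
    "CITY":   {"paris": 0.7, "boston": 0.3},
}
FLOOR = 1e-6


def tag_entities(words):
    """Return (log score, class sequence) of the best joint labeling."""
    # best[c] = (log score of the best path ending in class c, that path)
    best = {"<s>": (0.0, [])}
    for word in words:
        nxt = {}
        for c in CLASSES:
            emit = math.log(EMIT[c].get(word, FLOOR))
            # Pick the best predecessor class for class c at this word.
            nxt[c] = max(
                ((score + math.log(TRANS[prev][c]) + emit, path + [c])
                 for prev, (score, path) in best.items()),
                key=lambda item: item[0],
            )
        best = nxt
    return max(best.values(), key=lambda item: item[0])


score, labels = tag_entities(["smith", "flew", "to", "paris"])
print(labels)  # ['PERSON', 'OTHER', 'OTHER', 'CITY']
```

Because each class label is decided jointly with its neighbors, "paris" is labeled CITY after "to" even though the toy tables also allow it as a PERSON.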