

	Speech Technology and Research Laboratory

	Information and Computing Sciences Division

LVCSR

LVCSR: Large Vocabulary Conversational Speech Recognition

Investigators

Andreas Stolcke (PI)
Harry Bratt
Horacio Franco
Ramana Rao Gadde
Colleen Richey
Elizabeth Shriberg
Kemal Sönmez
Dimitra Vergyri

Former collaborators

Mitch Weintraub
Françoise Beaufays
Yochai Konig
Ananth Sankar

Project Summary

The goal of the LVCSR projects is to develop all aspects of speech recognition for spontaneous, human-human conversational speech (as opposed to planned or read speech, or human-machine dialog). This includes feature extraction, acoustic modeling, language modeling, and speech understanding. Most of our research uses the Switchboard and CallHome/CallFriend conversational telephone speech corpora.

Research Efforts

We are presently focusing on a number of fundamental research problems that have to be solved in order to attain the ultimate goal of conversational speech understanding.

Front end/Feature extraction: We seek to develop new front-end features that are trained automatically to improve discrimination and, in turn, word recognition. Research is carried out in collaboration with the Speaker Recognition project.

Discriminative modeling: We are exploring new training methods for acoustic and language models that improve recognition accuracy by explicitly optimizing discrimination between correct and incorrect hypotheses.

Word spotting and confidence measures: LVCSR methods can be used to improve limited-vocabulary word spotting, and conversely, word-spotting-like techniques can be employed to optimize LVCSR word error. Related to this are methods for estimating the confidence in word recognition results.

Conversational speech phenomena: In collaboration with the Disfluencies and Hidden Event Modeling projects, we aim to model and detect events that are characteristic of spontaneous speech, such as hesitations, self-repairs, and covert sentence boundaries. Explicit modeling of such events is important for effective speech understanding, and it also improves word recognition accuracy.

Duration and prosody modeling for recognition: We have recently started to explore the durational and other prosodic properties of speech for improved word recognition in conversational speech.

Language modeling: We investigate language modeling techniques specifically for conversational speech, especially in the context of the general research topics above. For example, we have developed discriminative LM training methods and LMs that capitalize on conversational speech patterns. Much of the SRI Language Modeling Toolkit (SRILM) was developed as a by-product of LVCSR research, and SRI often provides language modeling support for other sites in the LVCSR community.

Publications and Presentations

LVCSR research publications and presentations by SRI staff.

Presentations from the 2001 LVCSR post-evaluation workshop at NIST.

©2011 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation.
Last modified Aug 23, 2022