Speech Translation Research at SRI International



Full Spontaneous Translation

SRI's newest translation technology permits bidirectional, voice-to-voice machine translation of spontaneous utterances.

Unlike the Phraselator or BPTS, the full spontaneous translation system is not restricted to prerecorded translations. It can translate a wider range of utterances, including novel utterances it has never encountered.

Our most advanced translation system is IraqComm™, which has been in use in Iraq since early 2006. The sections below describe our earlier work on a similar system for Pashto, a major language of Afghanistan.

Speech synthesis output

In the full translation system, the computer-generated translations (in both directions) are played through a speech synthesizer. While the synthesized speech used in the full translation system is smooth and fluent, it is necessarily of lower quality than the prerecorded human translations used in the Phraselator and BPTS.

The speech synthesis technology in our translation systems is provided by Cepstral LLC. Translations into English are synthesized in Cepstral's off-the-shelf English voice. Foreign-language voices are custom-built by Cepstral specially for this project based on data that we provide.

Translation software

The full translation system relies entirely on computer-generated translations. This makes it more flexible than the Phraselator and BPTS, whose translations are hand-crafted in advance.

We currently have two translation technologies: the SRInterp statistical machine translation engine and the Gemini interlingual translation engine. SRInterp is SRI's cross-platform, large-scale statistical machine translation (SMT) decoder. It supports state-of-the-art translation techniques, including phrase-based, hierarchical, syntax-based, and string-to-dependency translation models. SRInterp has been used in major SRI projects, including GALE and TRANSTAC, and the latest IraqComm speech-to-speech translation system uses SRInterp technology.

Gemini is an interlingual machine translation system developed in SRI's Artificial Intelligence Center. It can both interpret and generate natural language utterances, which makes it well suited to automatic translation work. Gemini's translation abilities rely on sophisticated grammars developed by linguists for both the source and target languages. Our grammars of English and Pashto each contain thousands of words and hundreds of grammatical rules. Gemini can generate high-quality translations when the grammars cover the application domain, and it can work in a complementary fashion with statistical translation.
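The page does not say how the grammar-based and statistical engines are combined. One plausible arrangement, sketched below purely as an illustration, is to try the grammar-based engine first and fall back to the statistical engine when the grammar does not cover the input. Every name here (RULE_BASED_TABLE, WORD_TABLE, rule_based_translate, statistical_translate, translate) is hypothetical, the lookup tables are toy stand-ins for real grammars and models, and the English-to-French data is invented for the example. None of this reflects the actual Gemini or SRInterp interfaces.

```python
from typing import Optional, Tuple

# Hypothetical toy "grammar": whole sentences the rule-based engine covers.
RULE_BASED_TABLE = {
    "where is the hospital": "où est l'hôpital",
}

# Hypothetical toy "statistical model": a word-for-word lookup table.
WORD_TABLE = {"where": "où", "is": "est", "the": "le", "doctor": "médecin"}


def rule_based_translate(text: str) -> Optional[str]:
    """Return a translation if the toy grammar covers the input, else None."""
    return RULE_BASED_TABLE.get(text.lower().strip())


def statistical_translate(text: str) -> str:
    """Stand-in for a statistical engine: translate word by word,
    leaving unknown words unchanged."""
    return " ".join(WORD_TABLE.get(word, word) for word in text.lower().split())


def translate(text: str) -> Tuple[str, str]:
    """Try the rule-based engine first; fall back to the statistical one.

    Returns the translation and the name of the engine that produced it.
    """
    result = rule_based_translate(text)
    if result is not None:
        return result, "rule-based"
    return statistical_translate(text), "statistical"
```

In this sketch, an in-domain sentence such as "Where is the hospital" is handled by the rule-based table, while an uncovered sentence such as "where is the doctor" goes to the word-level fallback.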

Full translation proceeds as follows. First, the Dynaspeak speech recognition system sends a transcribed utterance in the source language to a translation engine, which finds a grammatical, natural-sounding, and semantically equivalent utterance in the target language based on its translation model or rules. This target-language translation is then output through the speech synthesizer.
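The three stages described above (recognition, translation, synthesis) can be sketched as a simple pipeline. This is a minimal illustration, not SRI's implementation: the class SpeechTranslationPipeline, the type aliases, and the toy components are all hypothetical, and the toy recognizer and synthesizer merely convert between bytes and text so that the example runs without any speech software.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces for the three stages (not real SRI APIs).
Recognizer = Callable[[bytes], str]    # audio -> source-language text
Translator = Callable[[str], str]      # source text -> target-language text
Synthesizer = Callable[[str], bytes]   # target text -> audio


@dataclass
class SpeechTranslationPipeline:
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes) -> bytes:
        source_text = self.recognize(audio)        # speech recognition
        target_text = self.translate(source_text)  # machine translation
        return self.synthesize(target_text)        # speech synthesis


# Toy stand-ins: "audio" is just UTF-8 encoded text in this sketch.
def toy_recognizer(audio: bytes) -> str:
    return audio.decode("utf-8")


TOY_LEXICON = {"hello": "bonjour", "thank you": "merci"}


def toy_translator(text: str) -> str:
    # Unknown inputs pass through unchanged.
    return TOY_LEXICON.get(text.lower(), text)


def toy_synthesizer(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechTranslationPipeline(toy_recognizer, toy_translator, toy_synthesizer)
```

A bidirectional system would use two such pipelines, one per translation direction, each with its own recognizer, translation engine, and synthesized voice.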

History of full spontaneous translation

Work on the full translation system began in early 2002. The use of Gemini for producing computer-generated translations was inspired by a previous SRI project called the Spoken Language Translator, which ran from 1992 until 1999. The Spoken Language Translator, one of the first and most successful projects in the area of automatic speech translation, was able to translate among English, French, and Swedish in the domain of air travel planning. At the heart of the Spoken Language Translator was a natural language processing system called the Core Language Engine, a predecessor of Gemini.

More information about the Spoken Language Translator

©2011 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation.
Last modified Dec 16, 2010