Modeling Disfluencies in Spontaneous Speech

Funding Information

  • Sponsor: National Science Foundation (NSF)
  • NSF program: Interactive Systems
  • NSF program official: Dr. Gary W. Strong
  • Grant No.: IRI-9314967
  • Award Period: February 14, 1994 - February 28, 1998

Principal Investigator

Co-Investigators

Project Summary

Spoken language is the medium used first and foremost by humans for accurate and efficient interactive problem solving. As an input modality for human-computer interaction, spoken language can offer: (1) accessibility to an increasing number of people, including those with little or no training, (2) increased access to a growing set of data resources via telephone without a computer terminal, (3) increased power for those already familiar with computer technology, (4) an additional communication channel for more robust communication, for use in unusual environments, and for devices for the disabled, (5) flexibility of modality and use of computers by humans generally, and (6) increased applications and job opportunities in areas that will grow out of increased exposure of people to the potential of technology.

Although significant work has been devoted to some spontaneous speech phenomena, such as "slips of the tongue," other, much more frequent types of spontaneous speech "disfluencies" (e.g., false starts, hesitations, filled pauses, and related phenomena) have been largely ignored. Such disfluencies are highly prevalent in normal human communication. Although disfluencies are less frequent in human-machine dialog, the causes and costs (e.g., in terms of cognitive load on the user) of this discrepancy are unknown. Further, because current speech understanding systems do not model disfluencies well, the disfluencies that do occur are correlated with speech recognition and understanding errors. As spoken language systems evolve to allow more natural human-machine dialog, the rate of disfluencies is likely to rise toward the rates observed in human-human communication. A better understanding of the interdisciplinary aspects of disfluencies is critical to the development of a principled treatment of these highly frequent attributes of spontaneous speech.

This project models disfluencies at lexical, syntactic, and acoustic-prosodic levels. The goal is to gain insight into human communication and to develop algorithms that robustly recognize speech containing disfluencies. The approach involves analysis of disfluencies in existing, digitized corpora and in speech collected in controlled experiments. The investigation is undertaken by a team representing expertise in different, complementary disciplines, including linguistics, psycholinguistics, and cognitive psychology. As the project enters its final phase, recent efforts at SRI have investigated how results of the descriptive research can be integrated into SRI's speech understanding system. In particular, SRI has developed methods for automatically detecting disfluencies, using acoustic-prosodic information combined with specialized language models. Related studies at Stanford have focused on syntactic properties of disfluencies and on functional aspects. Additional related work at MIT aims to understand the articulatory mechanisms involved in self-interruption, as well as the relationship between speech errors and sentence prosody.

Collaborators

  • Becky Bates, Boston University
  • John Bear, SRI International
  • Laura Dilley, MIT
  • Jean Fox Tree, University of California at Santa Cruz
  • Astrid Hagen, Univ. Erlangen / MIT
  • Gerald McRoberts, Stanford University
  • Mari Ostendorf, Boston University
  • Ken Stevens, MIT
  • Andreas Stolcke, SRI International
  • Tom Wasow, Stanford University

Reports

 

©2011 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation.

Last modified Jul 20, 2006