The Cleanup Model

The central assumption incorporated in our DF language model is that probability estimates for words after a DF are more accurate if conditioned on the intended fluent word sequence. A secondary assumption is that DFs themselves can be modeled as word-like events, each having a probability conditioned on the context. A standard language model, by contrast, would look only at the surface string of words and assign word probabilities in a strictly sequential manner.

Because of the central assumption, we call our DF model the "Cleanup Model." It is implemented as a standard backoff trigram model with the following three modifications to account for DFs.

  1. Words following a DF event are conditioned on the cleaned-up, fluent version of the context. Filled pauses are removed from contexts, as is the sequence of extraneous words in repetitions and deletions.

    For example, the probability estimate for "WANT" following "BECAUSE I I" would be

    $$P(\mathrm{WANT} \mid \mathrm{BECAUSE\ I\ REP1}) := P(\mathrm{WANT} \mid \mathrm{BECAUSE\ I})$$

    where REP1 denotes a repetition event. The repeated "I" is deleted from the context.

  2. Disfluencies are represented by probabilistic events occurring within the word stream, some of which are hidden from direct observation. For simplicity, we model only the most prevalent subtypes for each DF class, namely filled pauses UH and UM, repetitions of one or two words (REP1, REP2), deletions at the beginning of a sentence (SDEL), and other one- or two-word deletions (DEL1, DEL2).
  3. Like words, DFs are treated as events that are assigned probabilities conditioned on their context. The contexts themselves are subject to DF cleanup as described above. For example, $P(\mathrm{REP1} \mid \mathrm{BECAUSE\ I})$ is the probability of repeating "I" after "BECAUSE."
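
The context cleanup described in the three points above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stream encoding is an assumption made for illustration: filled pauses appear as the tokens UH and UM, a REP1 or REP2 tag stands in for the repeated copy, a DEL1 or DEL2 tag directly follows the one or two deleted words, and SDEL follows the words deleted at the start of a sentence. The function names and the `<s>` sentence-start marker are hypothetical.

```python
# Illustrative sketch only: the tag placement conventions are assumptions,
# not taken from the original model description.
FILLED_PAUSES = {"UH", "UM"}
REPETITIONS = {"REP1", "REP2"}          # assumed to stand in for the repeated copy
DELETIONS = {"DEL1": 1, "DEL2": 2}      # assumed to follow the deleted words
SENTENCE_START = "<s>"                  # hypothetical sentence-start marker


def clean_context(tokens):
    """Return the fluent version of a token history containing DF events."""
    cleaned = []
    for tok in tokens:
        if tok in FILLED_PAUSES or tok in REPETITIONS:
            # Filled pauses and repetition events leave no trace in the context.
            continue
        if tok in DELETIONS:
            # Remove the deleted words that precede the event.
            for _ in range(DELETIONS[tok]):
                if cleaned and cleaned[-1] != SENTENCE_START:
                    cleaned.pop()
            continue
        if tok == "SDEL":
            # Remove everything back to the start of the current sentence.
            while cleaned and cleaned[-1] != SENTENCE_START:
                cleaned.pop()
            continue
        cleaned.append(tok)
    return cleaned


def trigram_context(tokens):
    """Last two tokens of the cleaned history, as used by a trigram model."""
    return tuple(clean_context(tokens)[-2:])
```

Under these assumptions, `trigram_context(["BECAUSE", "I", "REP1"])` yields the same context as `trigram_context(["BECAUSE", "I"])`, so "WANT" would be predicted from the fluent history in both cases.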

By representing DFs simply as another type of N-gram event, we allow DFs to be conditioned on specific lexical contexts, so that simple word-based regularities in their distribution can be captured. Furthermore, because of its simple N-gram character, the model does not embody specific assumptions or constraints about the distribution of DF events.
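
To make the idea of DFs as ordinary N-gram events concrete, the following sketch counts words and DF tags alike against cleaned-up two-word contexts and derives relative-frequency estimates. It is a simplification under stated assumptions: backoff and smoothing are omitted, the tag encoding (REP tags stand in for the repeated copy, DEL tags follow the deleted words, SDEL clears the sentence so far) is assumed for illustration, and all names are hypothetical.

```python
from collections import Counter

# Number of preceding words each DF tag removes from the context.
# The encoding is an assumption; None means "clear the sentence so far".
DROP_BEFORE = {"UH": 0, "UM": 0, "REP1": 0, "REP2": 0,
               "DEL1": 1, "DEL2": 2, "SDEL": None}


def clean(tokens):
    """Strip DF events and the extraneous words they mark."""
    out = []
    for tok in tokens:
        if tok not in DROP_BEFORE:
            out.append(tok)
            continue
        n = DROP_BEFORE[tok]
        if n is None:
            out.clear()
        elif n:                 # guard: del out[-0:] would empty the list
            del out[-n:]
    return out


def count_events(sentences):
    """Count (context, event) pairs; events are words and DF tags alike."""
    event_counts, context_counts = Counter(), Counter()
    for sent in sentences:
        for i, event in enumerate(sent):
            context = tuple(clean(sent[:i])[-2:])
            event_counts[(context, event)] += 1
            context_counts[context] += 1
    return event_counts, context_counts


def estimate(event, context, event_counts, context_counts):
    """Relative-frequency estimate of P(event | context), no backoff."""
    total = context_counts[tuple(context)]
    return event_counts[(tuple(context), event)] / total if total else 0.0
```

For a toy corpus such as `[["BECAUSE", "I", "REP1", "WANT"], ["BECAUSE", "I", "WANT"]]`, the REP1 event and both occurrences of "WANT" are all counted against the same context ("BECAUSE", "I"), which is what lets word-based regularities in DF placement be captured by ordinary N-gram statistics.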



Andreas Stolcke
Fri Jun 28 19:31:43 PDT 1996