
Filled pauses and utterance segmentation

 

As shown above, the Cleanup Model as applied to filled pauses yields a higher overall perplexity than the baseline trigram model. This is largely attributable to poorer word probability estimates at locations immediately following a filled pause. In prior work, Shriberg [9] observed that filled pauses tend to occur at linguistic segment (e.g., clause) boundaries. Since the standard LM test utterances are segmented according to acoustic criteria, filled pauses at linguistic boundaries can fall in the middle of acoustic utterance segments. At such locations the assumptions of the Cleanup Model are grossly violated, since the words preceding the filled pause belong to a different linguistic segment. The baseline model, on the other hand, can still produce reasonable predictions, because the filled pause itself serves as an indicator of the boundary.
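The difference in conditioning context can be made concrete with a small sketch. It is a simplified reading of the cleanup assumption for illustration only, not the model's actual implementation: the filled-pause inventory, the function names, and the example utterance are all assumptions introduced here.

```python
# Assumed filled-pause inventory; the actual set used in the experiments is not given here.
FILLED_PAUSES = {"uh", "um"}


def baseline_context(words, i, order=3):
    """History used by a standard N-gram model at position i.

    Filled pauses are treated like any other token, so they stay in the history.
    """
    n = order - 1
    return tuple(words[max(0, i - n):i]) if n > 0 else ()


def cleanup_context(words, i, order=3):
    """History under the cleanup assumption at position i.

    Filled pauses are deleted from the preceding words before the last
    (order - 1) tokens are taken as the history.
    """
    n = order - 1
    cleaned = [w for w in words[:i] if w not in FILLED_PAUSES]
    return tuple(cleaned[-n:]) if n > 0 else ()


# Hypothetical acoustic segment spanning a clause boundary marked by "uh".
utterance = "i went to the store uh we bought some milk".split()
position = utterance.index("we")

print(baseline_context(utterance, position))  # ('store', 'uh')
print(cleanup_context(utterance, position))   # ('the', 'store')
```

In this example the cleaned-up history ("the store") comes entirely from the previous clause, whereas the baseline history retains the filled pause that marks the boundary.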

To test this hypothesis, we compared the perplexities of both models on a subset of the test data that had been hand-annotated for linguistic segmentation and re-segmented accordingly (10250 words in 1325 segments). Specifically, we compared the perplexities of words following medial filled pauses, i.e., filled pauses not occurring as the first or last word in a linguistic segment. Results are shown in Table 5.

Table 5: Local perplexities after medial filled pauses

We see that the Cleanup Model is the better predictor for words following medial FPs, the reverse of the result for acoustically segmented utterances. That is, the cleanup assumption holds for medial FPs if one models utterances based on linguistic, rather than acoustic, segments.
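The selection of medial filled pauses and the local perplexity measure used in this comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the filled-pause inventory and function names are hypothetical, the per-word log probabilities are assumed to come from whichever model is being evaluated, and a filled pause directly following another is not treated specially.

```python
import math

# Assumed filled-pause inventory; not taken from the original experiments.
FILLED_PAUSES = {"uh", "um"}


def medial_fp_targets(segment):
    """Return indices of words that immediately follow a medial filled pause.

    A filled pause is medial if it is neither the first nor the last word
    of the (linguistic) segment.
    """
    last = len(segment) - 1
    return [
        i + 1
        for i, word in enumerate(segment)
        if word in FILLED_PAUSES and 0 < i < last
    ]


def local_perplexity(log_probs):
    """Perplexity over a selected set of positions.

    log_probs holds natural-log probabilities that a model assigned to the
    words at those positions.
    """
    if not log_probs:
        raise ValueError("no positions selected")
    return math.exp(-sum(log_probs) / len(log_probs))


# Hypothetical segment: only the "um" at index 2 is medial.
segment = ["uh", "i", "um", "think", "so", "uh"]
print(medial_fp_targets(segment))  # [3]
```

Running each model over the re-segmented data, collecting its log probabilities at the indices returned by medial_fp_targets, and passing them to local_perplexity gives one local perplexity per model for comparison.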



Andreas Stolcke
Fri Jun 28 19:31:43 PDT 1996