classes-format

NAME

classes-format - File format for word class definitions

SYNOPSIS

class [p] word1 word2 ...

DESCRIPTION

Various programs dealing with word classes use this format to define the possible expansions of classes and their respective probabilities. Each expansion appears on a separate line as in the synopsis, where class names a word class, p gives the probability of the class expansion, and word1 word2 ... defines the word string that the class expands to. If p is omitted it is assumed to be 1. (All expansion probabilities for a given class should sum to one. The software does not necessarily enforce this, and probabilities that do not sum to one would lead to improper models.)
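The line layout and the sum-to-one condition described above can be illustrated with a minimal Python sketch. It is not the parser used by any of the programs listed under SEE ALSO. The function names parse_classes and unnormalized_classes, the sample data, and the tolerance are hypothetical. The sketch also assumes that a second field is the probability p whenever it looks like a decimal number, and that blank lines are skipped; the format description states neither rule.

```python
import re
from collections import defaultdict

# Assumption: a second field that looks like a decimal number is the probability p.
_NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")


def parse_classes(text):
    """Return {class_name: [(prob, [word, ...]), ...]} from classes-format text."""
    expansions = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        name, rest = fields[0], fields[1:]
        prob = 1.0  # p defaults to 1 when omitted
        if rest and _NUMBER.match(rest[0]):
            prob = float(rest[0])
            rest = rest[1:]
        if not rest:
            raise ValueError(f"expansion for {name!r} has no words")
        expansions[name].append((prob, rest))
    return dict(expansions)


def unnormalized_classes(expansions, tol=1e-6):
    """Return {class_name: total} for classes whose probabilities do not sum to 1."""
    totals = {name: sum(p for p, _ in exps) for name, exps in expansions.items()}
    return {name: t for name, t in totals.items() if abs(t - 1.0) > tol}


# Hypothetical sample data: CITY sums to 0.9, GREETING omits p.
SAMPLE = """\
CITY 0.5 new york
CITY 0.3 boston
CITY 0.1 chicago
GREETING hello
"""

classes = parse_classes(SAMPLE)
bad = unnormalized_classes(classes)
```

In this sample, bad reports only CITY, because its three expansions total 0.9, while GREETING has a single expansion with the default probability of 1.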

Note that the concept of word class here is generalized to include "multi-words", or phrases consisting of more than one word. All expansions must have at least one word. Certain models might impose more restrictive formats.
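As an illustration of multi-word expansions and the one-word minimum, the following Python sketch separates the words of an expansion from the optional probability and rejects empty expansions. The helper names expansion_words and is_multiword and the sample lines are hypothetical, and the rule that a number-like second field is p is an assumption rather than part of the format description.

```python
import re

# Assumption: a second field that looks like a decimal number is the probability p.
_NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")


def expansion_words(line):
    """Return (class_name, words) for one classes-format line, dropping p if present."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    name, rest = fields[0], fields[1:]
    if rest and _NUMBER.match(rest[0]):
        rest = rest[1:]  # skip the optional probability
    if not rest:
        raise ValueError(f"class {name!r}: expansion must have at least one word")
    return name, rest


def is_multiword(line):
    """True if the expansion on this line consists of more than one word."""
    return len(expansion_words(line)[1]) > 1


# Hypothetical sample lines.
lines = ["CITY 0.6 new york", "CITY 0.4 boston", "GREETING good morning"]
multi = [expansion_words(l) for l in lines if is_multiword(l)]
```

A model that accepts only single-word expansions could use a check like is_multiword to reject such lines before loading a class file.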

SEE ALSO

ngram(1), ngram-class(1), disambig(1), training-scripts(1), pfsg-scripts(1).

AUTHOR

Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1999 SRI International