nbest-pron-score

NAME

nbest-pron-score - score pronunciations and pauses in N-best hypotheses

SYNOPSIS

nbest-pron-score [ -help ] option ...

DESCRIPTION

nbest-pron-score reads N-best lists and computes log probability scores for the pronunciations and pauses contained in them. Pronunciation scoring requires that the N-best lists contain phone backtraces, as provided by the "NBestList2.0" format described in nbest-format(5).

Pronunciation scores are computed from the probabilities in a dictionary. Pauses are binned into three length classes (none, short, long) and scored according to a trigram language model that conditions the pause length on the left and right neighboring words, in that order (so that bigram backoff uses the left neighbor only).

OPTIONS

Each filename argument can be an ASCII file, a compressed file (name ending in .Z or .gz), or "-" to indicate stdin/stdout.

-help

Print option summary.

-version

Print version information.

-debug level

Controls the amount of output (the higher the level, the more).

-tolower

Map all vocabulary to lowercase. Useful if case conventions for text/counts and language model differ.

-multiwords

Deal with N-best lists containing multiwords joined by underscores. This only affects pause scoring: if a word adjacent to a pause is a multiword and is not in the vocabulary of the pause LM, then it is split and only the component closest to the pause is conditioned on.

-multi-char C

Character used to delimit component words in multiwords (an underscore character by default).
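
The multiword handling for pause scoring can be pictured with the following Python sketch. It is an illustration of the rule described under -multiwords, not the tool's actual code; the function name, the "left"/"right" side argument, and the use of a plain set as the pause LM vocabulary are all assumptions made for the example.

```python
def pause_context_word(word, vocab, side, multi_char="_"):
    """Pick the word that a pause adjacent to `word` would be conditioned on.

    side is "left" if `word` precedes the pause, "right" if it follows it.
    (Hypothetical helper; names and signature are illustrative only.)
    """
    # Words known to the pause LM, or words that are not multiwords, are used as-is.
    if word in vocab or multi_char not in word:
        return word
    parts = [p for p in word.split(multi_char) if p]
    if not parts:
        return word
    # Keep only the component closest to the pause:
    # the last component of a left neighbor, the first of a right neighbor.
    return parts[-1] if side == "left" else parts[0]
```

For example, with a vocabulary containing "to" but not "going_to", a pause following "going_to" would be conditioned on "to".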

-nbest file

Score the N-best hypotheses in file.

-rescore file

Same as -nbest.

-nbest-files file

Process all N-best list filenames listed in file.

-max-nbest n

Limits the number of hypotheses read from an N-best list. Only the first n hypotheses are processed.

-dictionary file

Enable pronunciation scoring, using the pronunciation dictionary file. Each line contains a pronunciation in the format

	word [p] phone ...

The optional value p is the pronunciation probability. If the second field in a line is not a number, the pronunciation is assumed to have probability one.
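
As a rough illustration of the dictionary line format, the Python sketch below parses one line into a word, a probability, and a phone list. It assumes whitespace-separated fields and straight probabilities (not the -intlogs scaling), and it relies on Python's float() to decide whether the second field is a number, which may differ from the tool's own test. The function name and the phone symbols in the usage note are made up for the example.

```python
def parse_pron_line(line):
    """Parse 'word [p] phone ...' into (word, probability, phones).

    Hypothetical helper for illustration; not part of nbest-pron-score.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty dictionary line")
    word = fields[0]
    prob = 1.0          # default when no numeric second field is present
    phones = fields[1:]
    if len(fields) > 1:
        try:
            prob = float(fields[1])
            phones = fields[2:]
        except ValueError:
            # Second field is not a number, so it is the first phone.
            pass
    return word, prob, phones
```

Under these assumptions, parse_pron_line("tomato 0.6 t ah m ey t ow") yields a probability of 0.6, while parse_pron_line("tomato t ah m aa t ow") falls back to probability one.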

-intlogs

Interpret probabilities in the dictionary as intlog-scaled log probabilities (as used in the SRI Decipher(TM) system), rather than straight probabilities.

-pause-lm file

Enable pause scoring, using the pause LM in file.

-no-pause tag

The word used to represent the absence of a pause in the pause LM.

-short-pause tag

The word used to represent a short pause in the pause LM.

-long-pause tag

The word used to represent a long pause in the pause LM.

-min-pause-dur T

The minimum duration, in seconds, for a non-speech region to be considered a (short) pause.

-long-pause-dur T

The duration, in seconds, above which a non-speech region is considered a "long" pause.

The default values for pause tags and duration thresholds are printed by the -help option.
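
The three-way binning controlled by -min-pause-dur and -long-pause-dur can be sketched in Python as follows. This is only an illustration: the thresholds are passed in explicitly because the defaults are not stated here, the returned labels stand in for the tags set by -no-pause, -short-pause and -long-pause, and the treatment of durations exactly equal to a threshold is an assumption read off the option descriptions.

```python
def pause_class(duration, min_pause_dur, long_pause_dur):
    """Bin a non-speech duration (in seconds) into 'none', 'short' or 'long'.

    Hypothetical helper; boundary handling is assumed, not confirmed.
    """
    if duration < min_pause_dur:
        return "none"    # too short to count as a pause
    if duration > long_pause_dur:
        return "long"    # exceeds the long-pause threshold
    return "short"
```

With made-up thresholds of 0.1 and 0.5 seconds, a 0.2-second gap would be classed as a short pause and a 0.8-second gap as a long one.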

-pron-score-dir dir

Write pronunciation scores to dir when processing multiple N-best lists, using output filenames derived from the input files.

-pause-score-dir dir

Write pause scores to dir when processing multiple N-best lists, using output filenames derived from the input files.

-pause-score-weight W

Add pause LM scores to the pronunciation scores after multiplying them by W. This creates a single weighted combination of both models. Pause scores can still be output separately by specifying -pause-score-dir.
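
The weighted combination described for -pause-score-weight amounts to a simple linear sum of log scores. The Python sketch below shows the arithmetic only; the function and argument names are hypothetical, and the example values are invented.

```python
def combined_score(pron_logprob, pause_logprob, pause_score_weight):
    """Pronunciation log score plus the pause LM log score scaled by W.

    Hypothetical helper illustrating the arithmetic only.
    """
    return pron_logprob + pause_score_weight * pause_logprob
```

For instance, a pronunciation score of -4.0 and a pause score of -2.0 with W = 0.5 combine to -5.0.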

BUGS

The binning of pause lengths into three classes should be generalized.