lattice-tool

NAME

lattice-tool - manipulate word lattices

SYNOPSIS

lattice-tool [ -help ] option ...

DESCRIPTION

lattice-tool performs operations on word lattices in pfsg-format(5) or in HTK Standard Lattice format (SLF). Operations include size reduction, pruning, null-node removal, weight assignment from language models, lattice word error computation, and decoding of the best hypotheses.

Each input lattice is processed in turn, and a series of optional operations is performed in a fixed sequence (regardless of the order in which corresponding options are specified). The sequence of operations is as follows:

1. Read input lattice.
2. Score pronunciations (if dictionary was supplied).
3. Split multiword word nodes.
4. Posterior- and density-based pruning (before reduction).
5. Write word posterior lattice.
6. Viterbi-decode and output 1-best hypothesis (using either the original or updated language model scores, see -old-decoding).
7. Generate and output N-best list (using either the original or updated language model scores, see -old-decoding).
8. Compute lattice density.
9. Check lattice connectivity.
10. Compute node entropy.
11. Compute lattice word error.
12. Output reference word posteriors.
13. Remove null nodes.
14. Lattice reduction.
15. Posterior- and density-based pruning (after reduction).
16. Remove pause nodes.
17. Lattice reduction (post-pause removal).
18. Language model replacement or expansion.
19. Pause recovery or insertion.
20. Lattice reduction (post-LM expansion).
21. Multiword splitting (post-LM expansion).
22. Merging of same-word nodes.
23. Lattice algebra operations (or, concatenation).
24. Perform word-posterior based decoding.
25. Write word mesh (confusion network).
26. Compute and output N-gram counts.
27. Compute and output N-gram index.
28. Word posterior computation.
29. Lattice-LM perplexity computation.
30. Writing output lattice.

The following options control which of these steps actually apply.

OPTIONS

Each filename argument can be an ASCII file, or a compressed file (name ending in .Z or .gz), or ``-'' to indicate stdin/stdout.
-help
Print option summary.
-version
Print version information.
-debug level
Set the debugging output level (0 means no debugging output). Debugging messages are sent to stderr.
-in-lattice file
Read input lattice from file.
-in-lattice2 file
Read additional input lattice (for binary lattice operations) from file.
-in-lattice-list file
Read list of input lattices from file. Lattice operations are applied to each filename listed in file.
-set-lattice-names
Modify the lattice names embedded inside the lattice file to reflect the input filename. This allows the input filename information to be propagated to the output in cases where the embedded names are not informative.
-out-lattice file
Write result lattice to file.
-out-lattice-dir dir
Write result lattices from processing of -in-lattice-list to directory dir.
-read-mesh
Assume input lattices are in word mesh (confusion network) format, as described in wlat-format(5). Word posterior probabilities are converted to transition probabilities. If the input mesh contains acoustic information (time offsets, scores, pronunciations) that information is attached to words and links and output with -write-htk, as are the word posterior probabilities. (Use -htk-words-on-nodes to output word start times since HTK format supports times only on nodes.)
-write-internal
Write output lattices with internal node numbering instead of compact, consecutive numbering.
-overwrite
Overwrite existing output lattice files.
-vocab file
Initialize the vocabulary to words listed in file. This is useful in conjunction with -limit-vocab (see below).
-limit-vocab
Discard LM parameters on reading that do not pertain to the words specified in the vocabulary. The default is that words used in the LM are automatically added to the vocabulary. This option can be used to reduce the memory requirements for large LMs; to this end, -vocab typically specifies the set of words used in the lattices to be processed (which has to be generated beforehand, see pfsg-scripts(1)).
-vocab-aliases file
Reads vocabulary alias definitions from file, consisting of lines of the form
	alias word
This causes all tokens alias to be mapped to word.
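The alias mapping can be illustrated with a short Python sketch. It is not part of lattice-tool; the function names and the sample alias pairs are hypothetical, and skipping incomplete lines is an assumption about how such a file might reasonably be read.

```python
def load_aliases(lines):
    """Parse lines of the form 'alias word' into an alias -> word mapping."""
    aliases = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:          # skip blank or incomplete lines (assumed behavior)
            aliases[fields[0]] = fields[1]
    return aliases


def apply_aliases(tokens, aliases):
    """Replace each token that is a known alias by its target word."""
    return [aliases.get(token, token) for token in tokens]


# Hypothetical alias definitions, as they might appear in the alias file.
alias_lines = ["colour color", "grey gray"]
aliases = load_aliases(alias_lines)
print(apply_aliases(["the", "grey", "colour"], aliases))
```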
-unk
Map lattice words not contained in the known vocabulary to the unknown word tag. This is useful if the rescoring LM contains a probability for the unknown word (i.e., is an open-vocabulary LM). The known vocabulary is given by what is specified by the -vocab option, as well as all words in the LM used for rescoring.
-map-unk word
Map out-of-vocabulary words to word, rather than the default <unk> tag.
-keep-unk
Treat out-of-vocabulary words as <unk> but preserve their labels in lattice output.
-print-sent-tags
Preserve begin/end sentence tags in output lattice format. The default is to represent these as NULL node labels, since the begin/end of sentence is implicit in the lattice structure.
-tolower
Map all vocabulary to lowercase.
-nonevents file
Read a list of words from file that are used only as context elements, and are not predicted by the LM, similar to ``<s>''. If -keep-pause is also specified then pauses are not treated as nonevents by default.
-max-time T
Limit processing time per lattice to T seconds.

Options controlling lattice operations:

-write-posteriors file
Compute the posteriors of lattice nodes and transitions (using the forward-backward algorithm) and write out a word posterior lattice in wlat-format(5). This and other options based on posterior probabilities make most sense if the input lattice contains combined acoustic-language model weights.
-write-posteriors-dir dir
Similar to the above, but posterior lattices are written to separate files in directory dir, named after the utterance IDs.
-write-mesh file
Construct a word confusion network ("sausage") from the lattice and write it to file. If reference words are available for the utterance (specified by -ref-file or -ref-list) their alignment will be recorded in the sausage.
-write-mesh-dir dir
Similar, but write sausages to files in dir named after the utterance IDs.
-init-mesh file
Initialize the word confusion network by reading an existing sausage from file. This effectively aligns the lattice being processed to the existing sausage.
-acoustic-mesh
Preserve word-level acoustic information (times, scores, and pronunciations) in sausages, encoded as described in wlat-format(5).
-posterior-prune P
Prune lattice nodes with posteriors less than P times the highest posterior path.
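As a rough illustration of the threshold used by -posterior-prune, the following Python sketch keeps only nodes whose posterior reaches P times the posterior of the best path. The function name, the dictionary representation, and the sample values are hypothetical; the actual implementation operates on the lattice structure and may differ in detail.

```python
def posterior_prune(node_posteriors, best_path_posterior, p):
    """Keep nodes whose posterior is at least p times the best path posterior."""
    threshold = p * best_path_posterior
    return {node: q for node, q in node_posteriors.items() if q >= threshold}


# Hypothetical node posteriors and best-path posterior.
posteriors = {"a": 0.9, "b": 0.05, "c": 0.0004}
kept = posterior_prune(posteriors, best_path_posterior=0.6, p=0.001)
print(sorted(kept))   # "c" falls below the threshold 0.001 * 0.6 and is dropped
```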
-density-prune D
Prune lattices such that the lattice density (non-null words per second) does not exceed D.
-nodes-prune N
Prune lattices such that the total number of non-null, non-pause nodes does not exceed N.
-fast-prune
Choose a faster pruning algorithm that does not recompute posteriors after each iteration.
-write-ngrams file
Compute posterior expected N-gram counts in lattices and output them to file. The maximal N-gram length is given by the -order option (see below). The counts from all lattices processed are accumulated and output in sorted order at the end (suitable for ngram-merge(1)).
-write-ngram-index file
Output an index file of all N-gram occurrences in the lattices processed, including their start times, durations, and posterior probabilities. The maximal N-gram length is given by the -order option (see below).
-min-count C
Prune N-grams with count less than C from output with -write-ngrams and -write-ngram-index. In the former case, the threshold applies to the aggregate occurrence counts; in the latter case, the threshold applies to the posterior probability of an individual occurrence.
-max-ngram-pause T
Index only N-grams that contain internal pauses (between words) not exceeding T seconds (assuming time stamps are recorded in the input lattice).
-ngrams-time-tolerance T
Merge N-gram occurrences less than T seconds apart for indexing purposes (posterior probabilities are summed).
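One possible reading of this merging rule is sketched below in Python. It assumes that "apart" refers to the difference in start times and that the merged entry keeps the earliest start time; neither detail is stated here, and the function name and data layout are hypothetical.

```python
def merge_occurrences(occurrences, tolerance):
    """Merge (start_time, posterior) pairs of one N-gram that start within `tolerance` seconds.

    Assumption: an occurrence is merged into the previous entry when its start
    time is less than `tolerance` seconds after that entry's start time.
    """
    merged = []
    for start, posterior in sorted(occurrences):
        if merged and start - merged[-1][0] < tolerance:
            prev_start, prev_posterior = merged[-1]
            merged[-1] = (prev_start, prev_posterior + posterior)  # sum posteriors
        else:
            merged.append((start, posterior))
    return merged


# Hypothetical occurrences of a single N-gram: (start time in seconds, posterior).
hits = [(1.00, 0.4), (1.05, 0.3), (2.00, 0.2)]
print(merge_occurrences(hits, tolerance=0.1))
```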
-posterior-scale S
Scale the transition weights by dividing by S for the purpose of posterior probability computation. If the input weights represent combined acoustic-language model scores then this should be approximately the language model weight of the recognizer in order to avoid overly peaked posteriors (the default value is 8).
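The effect of the posterior scale can be seen in a small Python sketch that normalizes a handful of path scores into posteriors. This is an illustration only: it treats the lattice as an explicit list of paths with natural-log scores, whereas lattice-tool uses the forward-backward algorithm on the lattice itself, and the scores shown are hypothetical.

```python
import math


def path_posteriors(log_scores, scale=8.0):
    """Turn path log scores (natural log assumed) into posteriors after dividing by `scale`."""
    scaled = [s / scale for s in log_scores]
    top = max(scaled)                            # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical combined acoustic-language model scores for three paths.
scores = [-100.0, -104.0, -108.0]
print(path_posteriors(scores, scale=1.0))   # unscaled: mass concentrates on the best path
print(path_posteriors(scores, scale=8.0))   # scaled: a flatter distribution
```

With a scale of 1 nearly all probability mass lands on the best path; dividing by the scale spreads it across competing paths, which is the "overly peaked" behavior the option is meant to counteract.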
-write-vocab file
Output the list of all words found in the lattice(s) to file.
-reduce
Reduce lattice size by a single forward node merging pass.
-reduce-iterate I
Reduce lattice size by up to I forward-backward node merging passes.
-overlap-ratio R
Perform approximate lattice reduction by merging nodes that share more than a fraction R of their incoming or outgoing nodes. The default is 0, i.e., only exact lattice reduction is performed.
-overlap-base B
If B is 0 (the default), then the overlap ratio R is taken relative to the smaller set of transitions being compared. If the value is 1, the ratio is relative to the larger of the two sets.
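The interaction of -overlap-ratio and -overlap-base can be sketched in Python as follows. The function name and the use of plain sets of neighbor node IDs are hypothetical simplifications; the sketch only shows how the choice of base changes the ratio compared against R.

```python
def overlap_ratio(neighbors_a, neighbors_b, base=0):
    """Fraction of shared neighbors, relative to the smaller (base=0) or larger (base=1) set."""
    a, b = set(neighbors_a), set(neighbors_b)
    if not a or not b:
        return 0.0
    shared = len(a & b)
    denominator = min(len(a), len(b)) if base == 0 else max(len(a), len(b))
    return shared / denominator


# Hypothetical incoming-node sets of two candidate nodes.
incoming_a = {1, 2, 3}
incoming_b = {2, 3, 4, 5}
print(overlap_ratio(incoming_a, incoming_b, base=0))   # 2 shared out of 3
print(overlap_ratio(incoming_a, incoming_b, base=1))   # 2 shared out of 4
```

With R set to 0.6, for example, these two nodes would qualify for merging under base 0 but not under base 1.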
-reduce-before-pruning
Perform lattice reduction before posterior-based pruning. The default order is to first prune, then reduce.
-pre-reduce-iterate I
Perform iterative reduction prior to lattice expansion, but after pause elimination.
-post-reduce-iterate I
Perform iterative reduction after lattice expansion and pause node recovery. Note: this is not recommended as it changes the weights assigned from the specified language model.
-no-nulls
Eliminate NULL nodes from lattices.
-no-pause
Eliminate pause nodes from lattices (and do not recover them after lattice expansion).
-compact-pause
Use compact encoding of pause nodes that saves nodes but allows optional pauses where they might not have been included in the original lattice.
-loop-pause
Add self-loops on pause nodes.
-insert-pause
Insert optional pauses after every word in the lattice. The structure of inserted pauses is affected by -compact-pause and -loop-pause.
-collapse-same-words
Perform an operation on the final lattices that collapses all nodes with the same words, except null nodes, pause nodes, or nodes with noise words. This can reduce the lattice size dramatically, but also introduces new paths.
-connectivity
Check the connectedness of lattices.
-compute-node-entropy
Compute the node entropy of lattices.
-compute-posteriors
Compute node posterior probabilities (which are included in HTK lattice output).
-density
Compute and output lattice densities.
-ref-list file
Read reference word strings from file. Each line starts with a sentence ID (the basename of the lattice file name), followed by the words. This or the next option triggers computation of lattice word errors (minimum word error counts of any path through a lattice).
-ref-file file
Read reference word strings from file. Lines must contain reference words only, and must be matched to input lattices in the order processed.
-write-refs file
Write the references back to file (for validation).
-add-refs P
Add the reference words as an additional path to the lattice, with probability P. Unless -no-pause is specified, optional pause nodes between words are also added. Note that this operation is performed before lattice reduction and expansion, so the new path can be merged with existing ones, and the probabilities for the new path can be reassigned from an LM later.
-noise-vocab file
Read a list of ``noise'' words from file. These words are ignored when computing lattice word errors, when decoding the best word sequence using -viterbi-decode or -posterior-decode, or when collapsing nodes with -collapse-same-words.
-keep-pause
Causes the pause word ``-pau-'' to be treated like a regular word. It prevents pause from being implicitly added to the list of noise words.
-ignore-vocab file
Read a list of words that are to be ignored in lattice operations, similar to pause tokens. Unlike noise words (see above) they are also skipped during LM evaluation. With this option and -keep-pause, pause words are not ignored by default.
-split-multiwords
Split lattice nodes with multiwords into a sequence of non-multiword nodes. This option is necessary to compute lattice error of multiword lattices against non-multiword references, but may be useful in its own right.
-split-multiwords-after-lm
Perform multiword splitting after lattice expansion using the specified LM. This should be used if the LM uses multiwords, but the final lattices are not supposed to contain multiwords.
-multiword-dictionary file
Read a dictionary from file containing multiword pronunciations and word boundary markers (a ``|'' phone label). Specifying such a dictionary allows the multiword splitting options to infer accurate time marks and pronunciation information for the multiword components.
-multi-char C
Designate C as the character used for separating multiword components. The default is an underscore ``_''.
-operation O
Perform a lattice algebra operation O on the lattice or lattices processed, with the second operand specified by -in-lattice2. Operations currently supported are concatenate and or, for serial and parallel lattice combination, respectively, and are applied after all other lattice manipulations.
-viterbi-decode
Print out the word sequence corresponding to the highest probability path.
-posterior-decode
Print out the word sequence with lowest expected word error.
-output-ctm
Output word sequences in NIST CTM (conversation time mark) format. Note that word start times will be relative to the lattice start time, the first column will contain the lattice name, and the channel field is always 1. The word confidence field contains posterior probabilities if -posterior-decode is in effect. This option also implies -acoustic-mesh.
-hidden-vocab file
Read a subvocabulary from file and constrain word meshes to only align those words that are either all in or outside the subvocabulary. This may be used to keep ``hidden event'' tags from aligning with regular words.
-dictionary-align
Use the dictionary pronunciations specified with -dictionary to induce a word distance metric used for word mesh alignment. See the nbest-lattice(1) -dictionary option.
-nbest-decode N
Generate up to N highest-scoring paths through a lattice and write them out in nbest-format(5), along with optional additional score files to store knowledge sources encoded in the lattice. Further options are needed to specify the location of N-best lists and score files, described below under "N-BEST DECODING". Duplicate hypotheses that differ only in pauses and words specified with -ignore-vocab are removed from the N-best output. If the -multiwords option is specified, duplicates due to multiwords are also eliminated.
-old-decoding
Decode lattices (in Viterbi or N-best mode) without applying a new language model. By default, if -lm is specified, the -viterbi-decode and -nbest-decode options will use the LM to replace language model scores encoded in an HTK-formatted lattice. For PFSG lattices, the new LM scores will be added to the original scores.
-nbest-duplicates K
Allow up to K duplicate word hypotheses to be output in N-best decoding (implies -old-decoding).
-nbest-max-stack M
Limits the depth of the hypothesis stack used in N-best decoding to M entries, which may be useful for limiting memory use and runtime.
-nbest-viterbi
Use a Viterbi algorithm to generate N-best, rather than A-star. This uses less memory but may take more time (implies -old-decoding).
-decode-beamwidth B
Limits beamwidth in LM-based lattice decoding. Default value is 1e30.
-decode-max-degree D
Limits allowed in-degree in the decoding search graph for LM-based lattice decoding. Default value is 0, meaning unlimited.
-ppl file
Read sentences from file and compute the maximum probability (of any path) assigned to them by the lattice being processed. Effectively, the lattice is treated as a (deficient) language model. The output detail is controlled by the -debug option, similar to ngram -ppl output. (In particular, -debug 2 enables tracing of lattice nodes corresponding to sentence prefixes.) Pause words in file are treated as regular words and have to match pause nodes in the lattice, unless -no-pause is specified, in which case pauses in both lattice and input sentences are ignored.
-word-posteriors-for-sentences file
Read sentences from file and compute and output the word posterior probabilities according to a confusion network generated from the lattice (as with -write-mesh). If there is no path through the confusion network matching a sentence, the posteriors output will be zero.

The following options control transition weight assignment:

-order n
Set the maximal N-gram order to be used for transition weight assignment (the default is 3).
-lm file
Read N-gram language model from file. This option also triggers weight reassignment and lattice expansion.
-use-server S
Use a network LM server (typically implemented by ngram(1) with the -server-port option) as the main model. This option also triggers weight reassignment and lattice expansion. The server specification S can be an unsigned integer port number (referring to a server port running on the local host), a hostname (referring to default port 2525 on the named host), or a string of the form port@host, where port is a port number and host is either a hostname ("dukas.speech.sri.com") or IP number in dotted-quad format ("140.44.1.15").
For server-based LMs, the -order option limits the context length of N-grams queried by the client (with 0 denoting unlimited length). Hence, the effective LM order is the minimum of the client-specified value and any limit implemented in the server.
When -use-server is specified, the arguments to the options -mix-lm, -mix-lm2, etc. are also interpreted as network LM server specifications provided they contain a '@' character and do not contain a '/' character. This allows the creation of mixtures of several file- and/or network-based LMs.
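The three forms of server specification, and the rule for recognizing server specifications among -mix-lm arguments, can be summarized in a Python sketch. The function names are hypothetical, and using "localhost" to stand for the local host is an assumption; the sketch mirrors the description above rather than the tool's actual parsing code.

```python
def parse_server_spec(spec, default_port=2525, local_host="localhost"):
    """Split a server specification into (host, port) following the three documented forms."""
    if "@" in spec:                       # port@host
        port, host = spec.split("@", 1)
        return host, int(port)
    if spec.isdigit():                    # bare unsigned integer: port on the local host
        return local_host, int(spec)
    return spec, default_port             # bare hostname: default port


def is_server_spec(argument):
    """Rule applied to -mix-lm arguments when -use-server is given: has '@' and no '/'."""
    return "@" in argument and "/" not in argument


print(parse_server_spec("8000"))
print(parse_server_spec("dukas.speech.sri.com"))
print(parse_server_spec("8000@140.44.1.15"))
print(is_server_spec("8000@140.44.1.15"), is_server_spec("models/mix@v2.lm"))
```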
-cache-served-ngrams
Enables client-side caching of N-gram probabilities to eliminate duplicate network queries, in conjunction with -use-server. This may result in a substantial speedup but requires memory in the client that may grow linearly with the amount of data processed.
-no-expansion
Suppress lattice expansion when a language model is specified. This is useful if the LM is to be used only for lattice decoding (see -viterbi-decode and -nbest-decode).
-multiwords
Resolve multiwords in the lattice without splitting nodes. This is useful in rescoring lattices containing multiwords with an LM that does not use multiwords.
-zeroprob-word W
If a word token is assigned a probability of zero by the LM, look up the word W instead. This is useful to avoid zero probabilities when processing lattices with an LM that is mismatched in vocabulary.
-classes file
Interpret the LM as an N-gram over word classes. The expansions of the classes are given in file in classes-format(5). Tokens in the LM that are not defined as classes in file are assumed to be plain words, so that the LM can contain mixed N-grams over both words and word classes.
-simple-classes
Assume a "simple" class model: each word is member of at most one word class, and class expansions are exactly one word long.
-mix-lm file
Read a second N-gram model for interpolation purposes. The second and any additional interpolated models can also be class N-grams (using the same -classes definitions).
-factored
Interpret the files specified by -lm, -mix-lm, etc. as factored N-gram model specifications. See ngram(1) for more details.
-lambda weight
Set the weight of the main model when interpolating with -mix-lm. Default value is 0.5.
-mix-lm2 file
-mix-lm3 file
-mix-lm4 file
-mix-lm5 file
-mix-lm6 file
-mix-lm7 file
-mix-lm8 file
-mix-lm9 file
Up to 9 more N-gram models can be specified for interpolation.
-mix-lambda2 weight
-mix-lambda3 weight
-mix-lambda4 weight
-mix-lambda5 weight
-mix-lambda6 weight
-mix-lambda7 weight
-mix-lambda8 weight
-mix-lambda9 weight
These are the weights for the additional mixture components, corresponding to -mix-lm2 through -mix-lm9. The weight for the -mix-lm model is 1 minus the sum of -lambda and -mix-lambda2 through -mix-lambda9.
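The weight arithmetic can be checked with a short Python sketch. The function names and the probabilities are hypothetical; the sketch only restates the rule that the -mix-lm model receives whatever weight is left after -lambda and the -mix-lambda values are subtracted from 1, and then applies a plain linear interpolation.

```python
def mix_lm_weight(main_lambda=0.5, extra_lambdas=()):
    """Weight left over for the -mix-lm model: 1 minus -lambda and all -mix-lambdaN values."""
    weight = 1.0 - main_lambda - sum(extra_lambdas)
    if weight < 0.0:
        raise ValueError("mixture weights sum to more than 1")
    return weight


def interpolate(probabilities, weights):
    """Linear interpolation: weighted sum of the component model probabilities."""
    return sum(w * p for w, p in zip(weights, probabilities))


# Hypothetical setup: -lambda 0.5 and -mix-lambda2 0.2, so -mix-lm receives 0.3.
main_lambda, lambda2 = 0.5, 0.2
mix_weight = mix_lm_weight(main_lambda, [lambda2])

# Hypothetical probabilities of one word under: main LM, -mix-lm, -mix-lm2.
mixed = interpolate([0.010, 0.020, 0.040], [main_lambda, mix_weight, lambda2])
print(round(mix_weight, 6), round(mixed, 6))
```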
-loglinear-mix
Implement a log-linear (rather than linear) mixture LM, using the parameters above.
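The difference between linear and log-linear mixing can be shown on a toy example in Python. This sketch assumes the usual definition of log-linear interpolation (a weighted geometric mean of the component probabilities, renormalized over the vocabulary) and nonzero component probabilities; the two-word vocabulary, the distributions, and the function names are hypothetical.

```python
import math


def linear_mix(distributions, weights):
    """Weighted arithmetic mean of the component probabilities for each word."""
    vocab = distributions[0].keys()
    return {word: sum(lam * dist[word] for lam, dist in zip(weights, distributions))
            for word in vocab}


def loglinear_mix(distributions, weights):
    """Weighted geometric mean of the component probabilities, renormalized over the vocabulary."""
    vocab = distributions[0].keys()
    unnormalized = {
        word: math.exp(sum(lam * math.log(dist[word])
                           for lam, dist in zip(weights, distributions)))
        for word in vocab
    }
    total = sum(unnormalized.values())
    return {word: value / total for word, value in unnormalized.items()}


# Two hypothetical component distributions over a two-word vocabulary.
p1 = {"a": 0.9, "b": 0.1}
p2 = {"a": 0.5, "b": 0.5}
print(linear_mix([p1, p2], [0.5, 0.5]))
print(loglinear_mix([p1, p2], [0.5, 0.5]))
```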
-context-priors file
Read context-dependent mixture weight priors from file. Each line in file should contain a context N-gram (most recent word first) followed by a vector of mixture weights whose length matches the number of LMs being interpolated. (This and the following options currently only affect linear interpolation.)
-bayes length
Interpolate models using posterior probabilities based on the likelihoods of local N-gram contexts of length length. The -lambda values are used as prior mixture weights in this case. This option can also be combined with -context-priors, in which case the length parameter also controls how many words of context are maximally used to look up mixture weights. If -context-priors is used without -bayes, the context length used is set by the -order option and Bayesian interpolation is disabled, as when scale (see next) is zero.
-bayes-scale scale
Set the exponential scale factor on the context likelihood in conjunction with the -bayes function. Default value is 1.0.
-compact-expansion
Use a compact expansion algorithm that uses backoff nodes to reduce the size of expanded lattices (see paper reference below).
-old-expansion
Use older versions of the lattice expansion algorithms (both regular and compact), that handle only trigram models and require elimination of null and pause nodes prior to expansion. Not recommended, but useful if full backward compatibility is required.
-max-nodes M
Abort lattice expansion when the number of nodes (including null and pause nodes) exceeds M. This is another mechanism to avoid spending too much time on very large lattices.
-hyp-list file
Read 1st ASR hypothesis word strings from file. Each line starts with a sentence ID (the basename of the lattice file name), followed by the words. The hypothesized words are added into the word mesh (confusion network).
-hyp-file file
Read 1st ASR hypothesis word strings from file. Lines must contain hypothesized words only, and must be matched to input lattices in the order processed. The hypothesized words are added into the word mesh (confusion network).
-hyp2-list file
Read 2nd ASR hypothesis word strings from file. Each line starts with a sentence ID (the basename of the lattice file name), followed by the words. The hypothesized words are added into the word mesh (confusion network).
-hyp2-file file
Read 2nd ASR hypothesis word strings from file. Lines must contain hypothesized words only, and must be matched to input lattices in the order processed. The hypothesized words are added into the word mesh (confusion network).
-add-hyps P
Add the hypothesized words as an additional path to the word mesh (confusion network), with probability P.

LATTICE EXPANSION ALGORITHMS

lattice-tool incorporates several different algorithms to apply LM weights to lattices. This section explains what algorithms are applied given what options.
Compact LM expansion
This expands the nodes and transitions to be able to assign higher-order probabilities to transitions. Backoffs in the LM are exploited in the expansion, thereby minimizing the number of added nodes (Weng et al., 1998). This algorithm is triggered by -compact-expansion. For the resulting lattices to work correctly, backoff paths in the LM must have lower weight than the corresponding higher-order paths. (For N-gram LMs, this can be achieved using the ngram -prune-lowprobs option.) Pauses and null nodes are handled during the expansion and do not have to be removed and restored.
General LM expansion
This expands the lattice to apply LMs of arbitrary order, without use of backoff transitions. This algorithm is the default (no -compact-expansion).
Unigram weight replacement
This simply replaces the weights on lattice transitions with unigram log probabilities. No modification of the lattice structure is required. This algorithm is used if -old-expansion and -order 1 are specified.
Bigram weight replacement
This replaces the transition weights with bigram log probabilities. Pause and null nodes have to be eliminated prior to the operation, and are restored after weight replacement. This algorithm is used if -old-expansion and -order 2 are specified.

HTK LATTICES

lattice-tool can optionally read, process, and output lattices in HTK Standard Lattice Format. The following options control HTK lattice processing.

-read-htk
Read input lattices in HTK format. All lattices are internally represented as PFSGs; to achieve this, HTK lattice links are mapped to PFSG nodes (with attached word and score information), and HTK lattice nodes are mapped to PFSG NULL nodes. Transitions are created so as to preserve words and scores of all paths through the original lattice. On output, this mapping is reversed, so as to create a compact encoding of PFSGs containing NULL nodes as HTK lattices.
-htk-acscale S
-htk-lmscale S
-htk-ngscale S
-htk-prscale S
-htk-duscale S
-htk-x1scale S
-htk-x2scale S
...
-htk-x9scale S
-htk-wdpenalty S
These options specify the weights for acoustic, LM, N-gram, pronunciation, and duration models, up to nine extra scores, as well as word transition penalties to be used for combining the various scores contained in HTK lattices. The combined scores are then used to compute the transition weights for the internal PFSG representation. Default weights are obtained from the specifications in the lattice files themselves.
Word transition penalties are scaled according to the log base used. Values specified on the command line are scaled according to -htk-logbase, or the default 10. Word transition penalties specified in the lattice file are scaled according to the log base specified in the file, or the default e.
-htk-logzero Z
Replace HTK lattice scores that are zero (minus infinity on the log scale) by the log-base-10 score Z. This is typically used after rescoring with a language model that assigns probability zero to some words in the lattice, and allows meaningful computation of posterior probabilities and 1-best hypotheses from such lattices.
-no-htk-nulls
Eliminate NULL nodes otherwise created by the conversion of HTK lattices to PFSGs. This creates additional links and may or may not reduce the overall processing time required.
-dictionary file
Read a dictionary containing pronunciation probabilities from file, and add or replace the pronunciation scores in the lattice accordingly. This requires that the lattices contain phone alignment information.
-intlogs
Assume the dictionary contains log probabilities encoded on the int-log scale, as used by the SRI Decipher system.
-write-htk
Write output lattices in HTK format. If the input lattices were in PFSG format the original PFSG weights will be output as HTK acoustic scores. However, LM rescoring will discard the original PFSG weights and the results will be encoded as LM scores. Pronunciation scoring results will be encoded as pronunciation scores. If -compute-posteriors was used in lattice processing the output lattices will also contain node posterior probabilities. If the input lattices were in HTK format, then acoustic and duration scores are preserved from the input lattices. The score scaling factors in the lattice header will reflect the -htk-*scale options given above.
-htk-logbase B
Modify the logarithm base in HTK lattice output. The default is to use logs base 10, as elsewhere in SRILM. A value of 0 means to output probabilities instead of log probabilities. Note that the log base for input lattices is not affected by this option; it is encoded in the lattices themselves, and defaults to e according to the HTK SLF definition.
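The relationship between the log bases mentioned here is ordinary change-of-base arithmetic, sketched below in Python. The function name and the sample score are hypothetical; the sketch simply re-expresses a base-10 log probability in another base, with 0 standing for plain probabilities as described for -htk-logbase.

```python
import math


def from_log10(score, base):
    """Re-express a base-10 log probability in log base `base`; base 0 yields a plain probability."""
    if base == 0:
        return 10.0 ** score
    return score * math.log(10.0) / math.log(base)


log10_score = -2.0                        # hypothetical score: probability 0.01
print(from_log10(log10_score, 10))        # unchanged in base 10
print(from_log10(log10_score, math.e))    # natural log, the HTK SLF default for input
print(from_log10(log10_score, 0))         # plain probability
```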
-htk-words-on-nodes
Output word labels and other word-related information on HTK lattice nodes, rather than links. This option is provided only for compatibility with software that requires word information to be attached specifically to nodes.
Note:
The options -no-htk-nulls, -htk-words-on-nodes, and -htk-scores-on-nodes defeat the mapping of internal PFSG nodes back to HTK transitions, and should therefore NOT be used when a compact output representation is desired.
-htk-quotes
Enable the HTK string quoting mechanism that allows whitespace and other non-printable characters to be included in word labels and other fields. This is disabled by default since PFSG lattices and other SRILM tools don't support such word labels. It affects both input and output format for HTK lattices.

N-BEST DECODING

The option -nbest-decode triggers generation of N-best lists, according to the aggregate score of paths encoded in the lattice. The output format for N-best lists and associated additional score files is compatible with other SRILM tools that process N-best lists, such as those described in nbest-lattice(1) and nbest-scripts(1). The following options control the location of output files:
-out-nbest-dir dir
The directory to which N-best list files are written. These contain acoustic model scores, language model scores, word counts, and the word hypotheses themselves, in SRILM format as described in nbest-format(5).
-out-nbest-dir-ngram dir
Output directory for separate N-gram LM scores as may be encoded in HTK lattices.
-out-nbest-dir-pron dir
Output directory for pronunciation scores encoded in HTK lattices.
-out-nbest-dir-dur dir
Output directory for duration model scores encoded in HTK lattices.
-out-nbest-dir-xscore1 dir
-out-nbest-dir-xscore2 dir
...
-out-nbest-dir-xscore9 dir
Output score directories for up to nine additional knowledge sources encoded in HTK lattices.
-out-nbest-dir-rttm dir
N-best hypotheses in NIST RTTM format. This function is experimental and makes assumptions about the input file naming conventions to infer timing information.

SEE ALSO

ngram(1), ngram-merge(1), pfsg-scripts(1), nbest-lattice(1), pfsg-format(5), ngram-format(5), classes-format(5), wlat-format(5), nbest-format(5).
F. Weng, A. Stolcke, and A. Sankar, ``Efficient Lattice Representation and Generation.'' Proc. Intl. Conf. on Spoken Language Processing, vol. 6, pp. 2531-2534, Sydney, 1998.
S. Young et al., The HTK Book, HTK version 3.1. http://htk.eng.cam.ac.uk/prot-docs/htk_book.shtml

BUGS

Not all LM types supported by ngram(1) are handled by lattice-tool.

Care must be taken when processing multiword lattices with -unk and -multiwords or -split-multiwords. Multiwords not listed in the LM (or the explicit vocabulary specified) will be considered ``unknown'', even though their components might be in-vocabulary.

The -nbest-duplicates option does not work together with -nbest-viterbi.

When applying -viterbi-decode or -nbest-decode to PFSG lattices, the old transition weights are effectively treated as acoustic scores, and the new LM scores are added to them. There is no way to replace old LM scores that might be part of the PFSG transition weights. This is a limitation of the format, since PFSGs cannot encode separate acoustic and language scores.

Input lattices in HTK format may contain node or link posterior information. However, this information is effectively discarded; posteriors are always recomputed from scores when needed for pruning or output.

The -no-nulls, -no-pause and -compact-pause options discard the acoustic information associated with NULL and pause nodes in HTK lattice input, and should therefore not be used if equivalent HTK lattice output is intended.

The -keep-unk option currently only works for input/output in HTK lattice format.

When rescoring HTK lattices with LMs the new scores are not taken into account in subsequent operations based on word posterior probabilities (posterior decoding, word mesh building, N-gram count generation). To work around this write the rescored lattices to files and invoke the program a second time.

AUTHORS

Fuliang Weng <fuliang@speech.sri.com>
Andreas Stolcke <andreas.stolcke@microsoft.com>
Dustin Hillard <hillard@ssli.ee.washington.edu>
Jing Zheng <zj@speech.sri.com>
Copyright 1997-2011 SRI International
Copyright 2012-2013 Microsoft Corp.