B. Plank wrote:
> Dear SRILM-team,
>
> is there a parameter to get the n most frequent words out of a LM? (i.e.
> like restricting the write-vocab of "ngram -order 1" to just output the
> n-most frequent words?) I am sure there is, just now I don't see it.
>
> Thank you for any help,
> Barbara
>
>
Actually, there is no such tool. Word frequencies are not generally
available in the LM, only the unigram probabilities. Since the unigram
probabilities are usually a monotonic function of the unigram
frequencies, you could write a small script that extracts the words from
the unigram section of the LM file and sorts them by probability.
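
A minimal sketch of such a script in Python, assuming the LM is stored in
the standard ARPA text format (a "\1-grams:" section whose lines hold the
log10 probability, the word, and an optional backoff weight, separated by
whitespace). The function name top_unigrams, its skip parameter, and the
sample LM are made up for illustration and are not part of SRILM.

```python
import heapq

def top_unigrams(lines, n, skip=("<s>", "</s>")):
    """Return the n (word, log10 prob) pairs with the highest unigram probability.

    `lines` is any iterable of text lines from an ARPA-format LM.
    `skip` lists tokens to leave out (an assumption: sentence boundary
    tags are usually not wanted in a "most frequent words" list).
    """
    entries = []
    in_unigrams = False
    for line in lines:
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if not in_unigrams or not line:
            continue
        if line.startswith("\\"):
            # Next section header ("\2-grams:" or "\end\"): unigrams are done.
            break
        fields = line.split()  # logprob, word, optional backoff weight
        logprob, word = float(fields[0]), fields[1]
        if word not in skip:
            entries.append((word, logprob))
    # Highest log probability first; this stands in for "most frequent".
    return heapq.nlargest(n, entries, key=lambda e: e[1])

# Tiny made-up LM for demonstration (spaces used instead of tabs).
SAMPLE = r"""\data\
ngram 1=5
ngram 2=0

\1-grams:
-99    <s>    -0.30
-1.10  </s>
-0.70  the    -0.25
-1.40  cat    -0.10
-2.00  sat    -0.05

\2-grams:

\end\
"""

for word, logprob in top_unigrams(SAMPLE.splitlines(), 2):
    print(word, logprob)
```

For a real model, pass an open file instead of the sample, for example
top_unigrams(open("lm.arpa", encoding="utf-8"), 1000), where "lm.arpa" is
a placeholder file name. Keep in mind that the ranking follows the
smoothed probabilities, so it matches the frequency ranking only as far
as the monotonicity assumption above holds.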
Andreas