SRILM-USER Archives

Re: read/write counts in FLMs

From: Tanel Alumäe <tanel.alumae at ADDRESS HIDDEN>
Date: Fri, 10 Jun 2005 16:46:50 +0300

Hello,

As far as I understand, you need both the FLM LM file and the FLM counts
file to actually use the FLM. So you should always use both the
-write-counts and the -lm options when building an FLM.
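
For illustration, the single-step build could be scripted roughly as below.
This is only a sketch: the file names model.flm and train.factored.txt are
made up, and it assumes (per my reading of the fngram-count man page) that
-write-counts and -lm are bare flags whose output file names come from the
FLM specification given with -factor-file. The script only assembles and
prints the command line; it does not run SRILM.

```python
def fngram_count_cmd(factor_file, text_file):
    """Build one fngram-count call that writes the counts and the LM together."""
    return [
        "fngram-count",
        "-factor-file", factor_file,  # FLM specification (hypothetical file name)
        "-text", text_file,           # factored training text (hypothetical file name)
        "-write-counts",              # write the FLM counts file
        "-lm",                        # estimate and write the FLM in the same run
    ]


cmd = fngram_count_cmd("model.flm", "train.factored.txt")
# Print the command; pass `cmd` to subprocess.run() to actually execute it.
print(" ".join(cmd))
```

Keeping both options in one invocation avoids having fngram-count read its
own counts file back in, which is where the trouble seems to start.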

As for -read-counts, I believe that you could use a general counts file
there (i.e. one that counts the occurrences of tagged words rather than
the factors). You can get the general counts file from the tagged corpus
using the ngram-count program, just as for an untagged corpus.
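
A sketch of producing such a general counts file, again only assembling
the command line: -text, -order and -write are standard ngram-count
options, while the file names and the trigram order are placeholders I
picked for the example. Whether fngram-count accepts the result through
-read-counts is my guess and has not been verified.

```python
def ngram_count_cmd(text_file, counts_file, order=3):
    """Build an ngram-count call that writes plain N-gram counts for a corpus."""
    return [
        "ngram-count",
        "-text", text_file,     # tagged corpus (hypothetical file name)
        "-order", str(order),   # maximum N-gram length (assumed trigram)
        "-write", counts_file,  # output counts file (hypothetical file name)
    ]


counts_cmd = ngram_count_cmd("train.tagged.txt", "train.tagged.counts")
# Print the command; pass `counts_cmd` to subprocess.run() to actually execute it.
print(" ".join(counts_cmd))
```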

The FLM counts file uses a special format (look into it and you will see),
which probably confuses fngram-count when the file is fed back in using
-read-counts.

Hope this helps,

Tanel A.

On Wed, 2005-06-08 at 09:29 -0400, Shachi Dave wrote:
> Hi,
>
> I am trying to build a factored language model (FLM) using "fngram-count"
> in the SRILM toolkit.
>
> When I run it using "-write-counts" and "-lm" options together, it
> builds the FLM correctly. But when I try to break it down into two
> steps:
> (a) only "-write-counts" option to write the counts file
> (b) "-read-counts" and "-lm" options to build the FLM using the counts
> file
>
> it gives errors. I checked the debug output; it seems to be getting the
> count-of-counts for modified Kneser-Ney discounting wrong in step
> (b) above. The counts file generated in step (a) is identical to
> the one generated using both "-write-counts" and "-lm" options together.
> I tried these steps using a couple of different FLM specifications and
> the error is the same. Has anyone faced this problem before? I would
> appreciate it if you could help me out here.
>
> Thanks,
> Shachi
