<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#ffffff" text="#000000">

    Dear all,<br>

    <br>

        I am trying to implement the CN-based misspelling correction

    method published by Bertoldi et al. 2010 (full citation is available

    at the end of this e-mail). However, I am stuck at step 4,

    which involves generating a word-based CN with the

    lattice-tool program of the SRILM toolkit.<br>

    <br>

        Once I have all the unifilar word lattices together in SLF

    format, I call lattice-tool with this command:<br>

    <br>

        lattice-tool -in-lattice wordlattice.slf -read-htk -lm lm/en.lm 

    -write-mesh wordlattice.cn<br>

    <br>

        However, including the language model can completely destroy

    the original CN structure when the input lattice is long

    (>15 nodes). I have tried to scale down the language model's impact

    with the -htk-scale and -htk-wdpenalty options, but even with both

    options set to 0 the CN still gets destroyed. The only way I can

    preserve the CN structure is to omit the -lm option entirely, but

    then the BLEU score of the translations decreases considerably.<br>

    <br>

        Could anyone give me some clues to help me track down the

    problem? I can provide a sample SLF lattice along with

    dot-generated images of intact and destroyed CNs.<br>

    <br>
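        To give an idea of the kind of input passed to -in-lattice, here
    is a minimal Python sketch that writes a small linear lattice in SLF.
    It is only an illustration, not the actual data: the helper name
    linear_slf, the sample words and the scores are hypothetical, the
    field layout (VERSION, UTTERANCE, N/L header, I= node lines, J= link
    lines) follows the HTK SLF conventions as I understand them, and
    storing each candidate's log score in the a= field is an assumption
    rather than something the paper prescribes:<br>
    <br>
<pre>
```python
def linear_slf(slots, utterance="sample"):
    """Return HTK SLF text for a linear lattice (illustrative sketch).

    slots: list of positions; each position is a list of
    (word, log_score) alternatives, all linking node i to node i + 1.
    """
    n_nodes = len(slots) + 1
    n_links = sum(len(alts) for alts in slots)
    lines = [
        "VERSION=1.0",
        f"UTTERANCE={utterance}",
        f"N={n_nodes} L={n_links}",
    ]
    # One node per boundary between consecutive positions.
    lines += [f"I={i}" for i in range(n_nodes)]
    link_id = 0
    for start, alts in enumerate(slots):
        for word, log_score in alts:
            # Assumption: the candidate score goes into the a= field.
            lines.append(
                f"J={link_id} S={start} E={start + 1} "
                f"W={word} a={log_score:.2f}"
            )
            link_id += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up example: one fixed word followed by two spelling candidates.
    print(linear_slf([[("the", 0.0)], [("cat", -0.1), ("cut", -2.3)]]), end="")
```
</pre>
    Each position contributes one link per candidate word, so the lattice
    is already a chain of parallel alternatives before lattice-tool is
    run on it.<br>
    <br>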

        Regards,<br>

    <br>

    Lluís Formiga<br>

    <br>

    <br>

    <br>

    [Bertoldi et al. 2010] Nicola Bertoldi, Mauro Cettolo, and Marcello

    Federico. 2010. Statistical machine translation of texts with

    misspelled words. In <em>Human Language Technologies: The 2010 Annual

    Conference of the North American Chapter of the Association for

    Computational Linguistics</em> (HLT '10). Association for

    Computational Linguistics, Stroudsburg, PA, USA, 412-419.<br>


  </body>

</html>