From stolcke at speech.sri.com Thu Oct 4 12:39:09 2001
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Thu, 04 Oct 2001 12:39:09 PDT
Subject: Patch for ngram.cc
Message-ID: <200110041939.MAA20363@toulouse>

Dear All,

a rather embarrassing bug in ngram causes it to mess up the mixture weight
for the second mixture model (-mix-lm) when three or more models are
interpolated. This problem exists in all versions of SRILM prior to 1.2
(which won't be released for a while), so please apply the patch below
if this bug affects you.

--Andreas

*** /tmp/T0EFaaMN Thu Oct 4 12:32:01 2001
--- ngram.cc Wed Oct 3 19:51:14 2001
***************
*** 463,470 ****
                          mixLambda2/totalLambda);
      }
  
      mixLM = makeMixLM(mixFile, *vocab, classVocab, order, mixLM,
!                         1.0 - totalLambda - mixLambda);
  
      useLM = new BayesMix(*vocab, *useLM, *mixLM,
                          bayesLength, mixLambda, bayesScale);
--- 463,472 ----
                          mixLambda2/totalLambda);
      }
  
+     double mixLambda1 = 1.0 - totalLambda - mixLambda;
+ 
      mixLM = makeMixLM(mixFile, *vocab, classVocab, order, mixLM,
!                         mixLambda1/(mixLambda1 + totalLambda));
  
      useLM = new BayesMix(*vocab, *useLM, *mixLM,
                          bayesLength, mixLambda, bayesScale);
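The effect of the ngram.cc patch on the mixture weights can be checked with a little arithmetic. The sketch below is only an illustration for three models built as nested pairwise mixtures. It assumes (the patch itself does not confirm this) that the last argument of makeMixLM is the weight of the newly read model relative to the models already accumulated, and that BayesMix gives mixLambda to the main model. The function name and the example weights are hypothetical.

```python
def effective_weights(lam_main, lam_mix2, patched=True):
    """Return the weights that (main, mix, mix2) end up with overall.

    lam_main  -- intended weight of the main model (mixLambda in the patch)
    lam_mix2  -- intended weight of the third model (totalLambda here,
                 because only one extra model is mixed in)
    """
    total = lam_mix2
    lam_mix1 = 1.0 - total - lam_main          # intended weight of -mix-lm

    # Weight of -mix-lm inside the inner (mix, mix2) mixture.
    if patched:
        inner = lam_mix1 / (lam_mix1 + total)  # renormalised over the pair
    else:
        inner = lam_mix1                       # absolute weight, as before the patch

    rest = 1.0 - lam_main                      # mass left for the inner mixture
    return (lam_main, rest * inner, rest * (1.0 - inner))


# Intended weights: main 0.5, mix 0.3, mix2 0.2
print([round(w, 3) for w in effective_weights(0.5, 0.2, patched=True)])
# [0.5, 0.3, 0.2]
print([round(w, 3) for w in effective_weights(0.5, 0.2, patched=False)])
# [0.5, 0.15, 0.35]
```

Under these assumptions the unpatched expression gives the second model only half of its intended weight in this example, and the third model absorbs the difference.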

From stolcke at speech.sri.com Thu Nov 15 08:36:28 2001
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Thu, 15 Nov 2001 08:36:28 PST
Subject: interpolation and SRISLM
In-Reply-To: Your message of Thu, 15 Nov 2001 15:08:11 +0100.
Message-ID: <200111151636.IAA14849@huge>

In message you wrote:
> Dear Mr Stolcke,
>
> I'm trying to use SRISLM in order to build ARPA n-gram language models. Up
> to now it seems to work except that when I try to evaluate a model with
> the CMUSLM toolkit, I've got an error message (due to the use of unk
> instead of UNK and also to the fact that there's no backoff weight for the
> symbol). I also wonder whether it is possible to get interpolation
> weights using SRISLM. Is there an option to use in order to get these
> weights (the lambdas)? Thanks in advance.
>
> Best regards,
>
> jeanphi

Dear jean-philippe,

I'm sorry you encountered problems using SRILM models with the CMU toolkit,
but they are easy to fix. The case of "unk" you can just edit by hand
or with a simple text filter. The "missing" backoff weights on unigrams
are actually a feature, because backoff weights should only be needed on
unigrams that are a prefix of a longer ngram. However, because this is a
common problem, there is a script that adds "dummy" backoff weights.
The script should be in

	$SRILM/bin/$MACHINE_TYPE/add-dummy-bows

and is documented in the "lm-scripts" manual page.

As for the interpolation weights: SRILM currently only supports
interpolation of LMs at the model level, so there is a fixed lambda for
each model that you are interpolating. Given a held-out data set, you can
estimate these model lambdas to minimize the perplexity of the data.
This is done by

	$SRILM/bin/$MACHINE_TYPE/compute-best-mix

It is described in the "ppl-scripts" manual page.
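The idea of fitting one lambda per model to held-out data can be sketched with a few lines of EM (expectation-maximization). This is only an illustration of the general technique, not the compute-best-mix script itself. It assumes you already have, for every held-out word, the probability each model assigns to it. The function names and the toy numbers are made up.

```python
import math


def estimate_lambdas(probs, iterations=50):
    """Estimate model-level interpolation weights by EM.

    probs -- list with one tuple per held-out word; each tuple holds the
             probability that every model assigns to that word.
    """
    n_models = len(probs[0])
    lambdas = [1.0 / n_models] * n_models      # start from uniform weights
    for _ in range(iterations):
        totals = [0.0] * n_models
        for word_probs in probs:
            mix = sum(lam * p for lam, p in zip(lambdas, word_probs))
            for i, (lam, p) in enumerate(zip(lambdas, word_probs)):
                totals[i] += lam * p / mix     # share of this word owed to model i
        lambdas = [t / len(probs) for t in totals]
    return lambdas


def perplexity(probs, lambdas):
    """Perplexity of the interpolated model on the same word probabilities."""
    log_sum = sum(
        math.log(sum(lam * p for lam, p in zip(lambdas, word_probs)))
        for word_probs in probs
    )
    return math.exp(-log_sum / len(probs))


# Toy held-out data: (P_model_A(word), P_model_B(word)) for four words.
held_out = [(0.2, 0.05), (0.1, 0.01), (0.02, 0.3), (0.15, 0.05)]
best = estimate_lambdas(held_out)
print(best, perplexity(held_out, best))
```

Each EM iteration cannot increase the held-out perplexity, so the estimated weights are never worse than the uniform starting point on that data.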
Hope this helps,

--Andreas

From stolcke at speech.sri.com Thu Dec 6 17:13:56 2001
From: stolcke at speech.sri.com (Andreas Stolcke)
Date: Thu, 06 Dec 2001 17:13:56 PST
Subject: questions about sri toolkit
In-Reply-To: Your message of Thu, 06 Dec 2001 16:56:28 -0800. <3C1013BC.7090302@stl.research.panasonic.com>
Message-ID: <200112070113.RAA00542@huge>

In message <3C1013BC.7090302 at stl.research.panasonic.com> you wrote:
> Dear Dr. Stolcke,
>
> This is Yan from Panasonic Speech Tech Lab in Santa Barbara. I am
> trying to use SRI toolkit on BN data, but always get the complaint
> "can't allocate trie". I have 1.5G memory on my machine, which I suppose
> should be ok for this task. Could you please give me some hints or
> suggestions to fix this problem.
>
> Thank you very much!
>
> Yan

Yan,

I have no idea whether 1.5GB is enough; it depends entirely on the data.
Please tell me

1 - exactly what operation you are trying to perform (counting,
    LM estimation, LM usage),
2 - the command line for what you are trying to do,
3 - some idea of how big the input data is,
4 - what type of machine (OS etc.) you are using.

First of all, the amount of RAM is not all that matters. The size of a
program's address space is limited by the configured swap space (plus
the amount of real memory, at least on some OSs). Also, on 32-bit
processors the limit is usually either 2GB or 4GB, regardless of the
amount of swap you have.

I should say that producing trigrams for the entire BN corpus (> 100 M
words) will usually not work even with 2GB using just ngram-count and
keeping everything in memory. That's what the "merge-batch-counts" and
"make-big-lm" scripts were made for. Please consult the
"training-scripts" manual page for details.

Also, consider subscribing to the srilm-user mailing list. I will not
always be able to help (at least not right away), and other users might
be able to. See

	http://www.speech.sri.com/projects/srilm/welcome.html#srilm-user

for instructions on how to join.

--Andreas
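The reason batching helps with memory can be shown with a small sketch: count n-grams one batch at a time, keep each batch's counts sorted, then merge the sorted batches in a single streaming pass so that the full set of counts never has to sit in memory at once. This is an illustration of the general batch-and-merge idea, not the SRILM scripts themselves. The function names are hypothetical, and the batches are kept as in-memory lists only to make the example self-contained (in practice each batch would be written to a file).

```python
import heapq
from collections import Counter
from itertools import groupby


def count_ngrams(sentences, order=3):
    """Count n-grams in one batch and return them as a sorted list."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - order + 1):
            counts[tuple(words[i:i + order])] += 1
    # Sorting lets several batches be merged later in one pass.
    return sorted(counts.items())


def merge_counts(*sorted_batches):
    """Merge sorted (ngram, count) batches, summing counts of equal n-grams."""
    merged = heapq.merge(*sorted_batches)      # streams, does not load everything
    for ngram, group in groupby(merged, key=lambda item: item[0]):
        yield ngram, sum(count for _, count in group)


batch1 = count_ngrams(["a b c d", "a b c"])
batch2 = count_ngrams(["a b c", "b c d"])
for ngram, count in merge_counts(batch1, batch2):
    print(" ".join(ngram), count)
# a b c 3
# b c d 2
```

Peak memory then depends on the size of one batch rather than on the whole corpus, at the cost of extra sorting and I/O.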