Call for Workshop Papers
  HIGHER-LEVEL LINGUISTIC AND OTHER KNOWLEDGE FOR AUTOMATIC SPEECH PROCESSING
                                       
(Workshop in conjunction with NAACL/HLT 2004)

   The Park Plaza Hotel, Boston, Massachusetts
   Thursday, May 6, 2004
   
   The theme of this workshop is the use of higher-level linguistic and
   other types of knowledge for automatic speech processing, in
   particular, but not limited to, speech recognition (ASR). Most current
   state-of-the-art speech recognizers do not explicitly use linguistic
   information (with the exception of pronunciation dictionaries),
   relying mainly on information encoded in statistical N-gram language
   models. Higher-level linguistic processes such as prosody, syntax,
   semantics, and pragmatics are obviously important, but such
   information is typically harder to label, model, and integrate into
   the standard computational frameworks (such as hidden Markov models).
   In addition, high-level meta-information, such as personal information
   stored in a database or dialogue and pragmatic coherence constraints,
   can also play important roles. All these sources of information can
   potentially compensate for acoustic confusability resulting from noisy
   environments and unexpected channel and speaker mismatch, which are
   very challenging issues for automatic speech recognizers. Furthermore,
   high-level information is typically crucial when the ultimate goal is
   to interpret the spoken input, since the same sequence of words can
   mean different things depending on prosodic and syntactic features, as
   well as pragmatic constraints. Speaker recognition is another field
   that has recently recognized the importance of higher-level linguistic
   features, because speakers exhibit idiosyncratic
   prosodic, lexico-syntactic, and pragmatic patterns ("conversational
   biometrics").
   
   This workshop seeks to bring together researchers in speech, NLP, and
   linguistics, exploring novel ideas on the use of information beyond
   the low-level approaches traditionally used in speech processing
   (frame-level acoustic modeling and N-gram based language modeling).
   Many vigorous research efforts in this direction are well-established,
   and some have proven to be very successful, such as
   structured/dependency language models for speech recognition, or
   prosodic information for speaker recognition. For limited domains
   (e.g., travel reservations and financial transactions), semantic
   information has clearly been useful for improving speech recognition.
   Recently, more human knowledge resources that encode different aspects
   of syntax, semantics, ontology, and common-sense knowledge have become
   available, and could well be used to augment language models to
   improve speech recognition. Such resources may include, but are not
   limited to, annotated corpora such as the Penn Treebank and PropBank,
   as well as FrameNet, WordNet, OpenCyc, etc. One challenge is that
   conditioning a language model on such information typically leads to
   data sparseness/fragmentation, so a proper representation of such
   knowledge is absolutely critical to success. This workshop seeks to
   improve the dissemination and exchange of ideas, methods, and data
   resources that are relevant to further progress. We invite papers
   that present novel ideas on how higher-level linguistic and other
   types of information can be used for automatic speech processing,
   as well as experimental results.
   
IMPORTANT DATES

   Wed, Jan 21, 2004 Submissions due
   Fri, Feb 20, 2004 Acceptance/rejection notification
   Mon, Mar 8, 2004 Camera-ready copy due
   Thu, May 6, 2004 Workshop
   
SUBMISSION FORMAT

   The format and length requirements will be the same as for full papers
   of NAACL/HLT 2004, except that submissions need not be anonymized. For
   details, go to http://www1.cs.columbia.edu/~pablo/hlt-naacl04/callpapers.html.
   
SUBMISSION PROCEDURE

   Papers should be sent to hlt-workshop@speech.sri.com. The paper
   should be an attachment in PDF format, and the subject line of the
   email should read "PAPER SUBMISSION". Notification of acceptance or
   rejection will be sent to the originating email address.
   
PROGRAM COMMITTEE

   Yuqing Gao (Co-chair) (IBM TJ Watson Research Center)
   Hong-Kwang Jeff Kuo (Co-chair) (IBM TJ Watson Research Center)
   Andreas Stolcke (Co-chair) (SRI & ICSI)
   Jerome Bellegarda (Apple Computer)
   Ciprian Chelba (Microsoft Research)
   Jennifer Chu-Carroll (IBM TJ Watson Research Center)
   Dan Jurafsky (University of Colorado)
   Sanjeev Khudanpur (Johns Hopkins University)
   Martha Palmer (U. Penn)
   Barbara Peskin (ICSI)
   Roberto Pieraccini (IBM TJ Watson Research Center)
   Roni Rosenfeld (CMU)
   Julia Hirschberg (Columbia University)
   Stephanie Seneff (MIT)
   
CONTACT INFORMATION

   All inquiries should be sent to hlt-workshop@speech.sri.com with
   the subject line "NAACL/HLT WORKSHOP INQUIRY".