Andreas Stolcke
Publications
Publications are grouped by research area and ordered by recency.
Most papers are available in Postscript format, and where indicated in
HTML and PDF
as well.
Research topics:
Some papers are listed under more than one topic.
W. Wang, A. Stolcke, & J. Zheng (2007),
Reranking Machine Translation Hypotheses With Structured and Web-based Language
Models.
Proc. IEEE Automatic Speech Recognition and Understanding Workshop,
pp. 159-164, Kyoto.
(PDF)
M. Creutz, T. Hirsimäki, M. Kurimo, A. Puurula, J. Pylkkönen, V. Siivola,
M. Varjokallio, E. Arisoy, M. Saraclar, and A. Stolcke (2007),
Morph-based speech recognition and modeling of out-of-vocabulary words across
languages.
ACM Transactions on Speech and Language Processing 5(1),
Article 3, 29 pages.
(PDF)
W. Wang & A. Stolcke (2007),
Integrating MAP, Marginals, and Unsupervised Language Model Adaptation,
Proc. Interspeech/Eurospeech, pp. 618-621, Antwerp.
(PDF)
G. Tur & A. Stolcke (2007),
Unsupervised Language Model Adaptation for Meeting Recognition,
Proc. IEEE ICASSP,
vol. 4, pp. 173-176, Honolulu, Hawaii.
(PDF)
M. Creutz, T. Hirsimäki, M. Kurimo, A. Puurula, J. Pylkkönen, V. Siivola,
M. Varjokallio, E. Arisoy, M. Saraclar, & A. Stolcke (2007),
Analysis of Morph-Based Speech Recognition and the Modeling of
Out-of-Vocabulary Words Across Languages.
Proc. HLT/NAACL, pp. 380-387, Rochester, NY.
(PDF)
K. Kirchhoff, D. Vergyri, J. Bilmes, K. Duh, & A. Stolcke (2006),
Morphology-based language modeling for conversational Arabic speech recognition,
Computer Speech and Language 20(4), 589-608.
(PDF,
abstract)
D. Vergyri, K. Kirchhoff, K. Duh, & A. Stolcke (2004),
Morphology-Based Language Modeling for Arabic Speech Recognition.
Proc. Intl. Conf. Spoken Language Processing,
pp. 2245-2248, Jeju, Korea.
(PDF)
W. Wang, A. Stolcke, & M. P. Harper (2004),
The Use of a Linguistically Motivated Language Model in Conversational Speech
Recognition.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 261-264, Montreal.
(PDF)
I. Bulyko, M. Ostendorf, & A. Stolcke (2003),
Class-dependent Interpolation for Estimating Language Models from Multiple Text Sources,
Technical Report UWEETR-2003-0003, Dept. of Electrical Engineering,
University of Washington, Seattle.
Shorter version appeared as
Getting More Mileage from Web Text Sources for Conversational
Speech Language Modeling using Class-Dependent Mixtures
in Proc. HLT-NAACL Conference, vol. 2, pp. 7-9,
Edmonton, Canada, May 2003.
W. Wang, M. P. Harper, & A. Stolcke (2003),
The Robustness of an Almost-Parsing Language Model Given Errorful Training Data.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 240-243, Hong Kong.
(PDF)
A. Stolcke (2002),
SRILM -- An Extensible Language Modeling Toolkit.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 2, pp. 901-904, Denver.
(PDF)
A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, D. Jurafsky,
P. Taylor, R. Martin, C. Van Ess-Dykema, & M. Meteer (2000),
Dialogue Act Modeling for Automatic Tagging and Recognition of
Conversational Speech,
Computational Linguistics 26(3), 339-373.
(PDF,
abstract)
F. Weng, A. Stolcke, & M. Cohen (2000),
Language Modelling for Multilingual Speech Translation,
Chapter 16 in
M. Rayner, D. Carter, P. Bouillon, V. Digalakis, & M. Wirén (eds.),
The Spoken Language Translator,
pp. 250-264,
Cambridge University Press.
(PDF)
A. Stolcke (1998),
Entropy-based Pruning of Backoff Language Models.
Proc.
DARPA Broadcast News Transcription and Understanding Workshop,
pp. 270-274, Lansdowne, VA.
(HTML,
PDF)
NOTE: See erratum at end of postscript file.
A. Stolcke (1997),
Modeling Linguistic Segment and Turn Boundaries for N-best Rescoring of Spontaneous Speech.
Proc. EUROSPEECH, vol. 5, pp. 2779-2782,
Rhodes, Greece.
(PDF)
C. Chelba, D. Engle, F. Jelinek, V. Jimenez, S. Khudanpur, L. Mangu,
H. Printz, E. Ristad, R. Rosenfeld, A. Stolcke, & D. Wu (1997),
Structure and Performance of a Dependency Language Model.
Proc. EUROSPEECH, vol. 5, pp. 2775-2778,
Rhodes, Greece.
(PDF)
F. Weng, A. Stolcke, & A. Sankar (1997),
Hub4 Language Modeling Using Domain Interpolation and Data Clustering.
Proc.
DARPA Speech Recognition Workshop, pp. 147-151,
Chantilly, VA.
(HTML,
PDF)
A. Stolcke & E. Shriberg (1996),
Statistical language modeling for speech disfluencies.
Proc. IEEE Intl. Conf. on Acoustics,
Speech and Signal Processing, vol. 1, pp. 405-408,
Atlanta, GA.
(HTML,
PDF)
M. Weintraub, Y. Aksu, S. Dharanipragada, S. Khudanpur, H. Ney, J. Prange,
A. Stolcke, F. Jelinek, E. Shriberg (1996),
LM95 Project Report: Fast Training and Portability.
In
1995 Language Modeling Summer Research Workshop Technical Reports,
Research Note 1, Center for Language and Speech Processing, Johns Hopkins
University, Baltimore.
(PDF)
D. Jurafsky, C. Wooters, J. Segal, A. Stolcke,
E. Fosler, G. Tajchman, & N. Morgan (1995),
Using a Stochastic Context-Free Grammar as a Language Model
for Speech Recognition.
Proc. IEEE Intl. Conf. on Acoustics,
Speech and Signal Processing, vol. 1, pp. 189-192,
Detroit.
A. Stolcke & J. Segal (1994),
Precise N-gram Probabilities from Stochastic Context-free Grammars.
Proc. ACL, pp. 74-79,
Las Cruces, NM.
(HTML)
E. Shriberg & A. Stolcke (2008),
The Case for Automatic Higher-Level Features in Forensic Speaker
Recognition,
to appear in
Proc. Interspeech, Brisbane, Australia.
(PDF)
A. Stolcke, S. Kajarekar, & L. Ferrer (2008),
Nonparametric Feature Normalization for SVM-based Speaker Verification,
Proc. IEEE ICASSP, pp. 1577-1580, Las Vegas.
(PDF)
E. Shriberg, L. Ferrer, S. Kajarekar, N. Scheffer, A. Stolcke,
& M. Akbacak (2008),
Detecting Nonnative Speech Using Speaker Recognition Approaches.
Proc. Odyssey Speaker and Language Recognition Workshop,
Stellenbosch, South Africa.
(PDF)
A. Stolcke & S. Kajarekar (2008),
Recognizing Arabic Speakers with English Phones.
Proc. Odyssey Speaker and Language Recognition Workshop,
Stellenbosch, South Africa.
(PDF)
A. Stolcke, S. Kajarekar, L. Ferrer, & E. Shriberg (2007),
Speaker Recognition with Session Variability Normalization Based on
MLLR Adaptation Transforms,
IEEE Transactions on Audio, Speech, and Language Processing,
15(7), 1987-1998.
Special issue on speaker and language recognition.
(PDF,
abstract)
G. Tur, E. Shriberg, A. Stolcke, & S. Kajarekar (2007),
Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification,
Proc. Interspeech/Eurospeech, pp. 2049-2052, Antwerp.
(PDF)
S. Kajarekar & A. Stolcke (2007),
NAP and WCCN: Comparison of Approaches Using MLLR-SVM Speaker Verification
System,
Proc. IEEE ICASSP,
vol. 4, pp. 249-252, Honolulu, Hawaii.
(PDF)
A. Stolcke, E. Shriberg, L. Ferrer, S. Kajarekar, K. Sonmez, & G. Tur (2007),
Speech Recognition as Feature Extraction for Speaker Recognition,
Proc. SAFE 2007: Workshop on Signal Processing Applications for Public
Security and Forensics,
pp. 39-43, Washington, D.C.
(PDF)
M. Graciarena, S. Kajarekar, A. Stolcke, E. Shriberg (2007),
Noise Robust Speaker Identification for Spontaneous Arabic Speech,
Proc. IEEE ICASSP,
vol. 4, pp. 245-248, Honolulu, Hawaii.
(PDF)
A. O. Hatch, S. Kajarekar, & A. Stolcke (2006),
Within-Class Covariance Normalization for SVM-based Speaker Recognition.
Proc. ICSLP, pp. 1471-1474, Pittsburgh.
(PDF)
A. Stolcke, L. Ferrer, & S. Kajarekar (2006),
Improvements in MLLR-Transform-based Speaker Recognition.
Proc. IEEE Odyssey 2006 Speaker and Language Recognition Workshop,
pp. 1-6, San Juan, Puerto Rico.
(PDF)
L. Ferrer, E. Shriberg, S. S. Kajarekar, A. Stolcke, K. Sonmez,
A. Venkataraman, & H. Bratt (2006),
The Contribution of Cepstral and Stylistic Features to SRI's 2005 NIST
Speaker Recognition Evaluation System.
Proc. IEEE ICASSP, vol. 1, pp. 101-104, Toulouse.
(PDF)
A. O. Hatch & A. Stolcke (2006),
Generalized Linear Kernels for One-Versus-All Classification:
Application to Speaker Recognition.
Proc. IEEE ICASSP, vol. 5, pp. 585-588, Toulouse.
(PDF)
A. O. Hatch, A. Stolcke, & B. Peskin (2005),
Combining Feature Sets with Support Vector Machines:
Application to Speaker Recognition.
Proc. IEEE Speech Recognition and Understanding Workshop,
pp. 75-79, San Juan, Puerto Rico.
(PDF)
E. Shriberg, L. Ferrer, S. Kajarekar, A. Venkataraman, & A. Stolcke (2005),
Modeling Prosodic Feature Sequences for Speaker Recognition.
Speech Communication 46(3-4), 455-472.
Special Issue on Quantitative Prosody Modelling for Natural Speech
Description and Generation.
(abstract)
A. Stolcke, L. Ferrer, S. Kajarekar, E. Shriberg, & A. Venkataraman (2005),
MLLR Transforms as Features in Speaker Recognition.
Proc. Eurospeech, Lisbon, pp. 2425-2428.
(PDF)
S. S. Kajarekar, L. Ferrer, E. Shriberg, K. Sonmez, A. Stolcke,
A. Venkataraman, and J. Zheng (2005),
SRI's 2004 NIST Speaker Recognition Evaluation System,
Proc. IEEE ICASSP, Philadelphia, vol. 1, pp. 173-176.
(PDF)
A. O. Hatch, B. Peskin, & A. Stolcke (2005),
Improved Phonetic Speaker Recognition Using Lattice Decoding,
Proc. IEEE ICASSP, Philadelphia, vol. 1, pp. 169-172.
(PDF)
S. Kajarekar, L. Ferrer, K. Sonmez, J. Zheng, E. Shriberg, & A. Stolcke (2004),
Modeling NERFs for Speaker Recognition.
Proc. Odyssey 04 Speaker and Language Recognition Workshop,
pp. 51-56, Toledo, Spain.
(PDF)
S. Kajarekar, L. Ferrer, A. Venkataraman, K. Sonmez, E. Shriberg, A. Stolcke,
& R. R. Gadde (2003),
Speaker Recognition using Prosodic and Lexical Features.
Proc. IEEE Speech Recognition and Understanding Workshop,
pp. 19-24, St. Thomas, U.S. Virgin Islands.
(PDF)
L. Ferrer, H. Bratt, V. R. R. Gadde, S. Kajarekar, E. Shriberg, K. Sonmez,
A. Stolcke, & A. Venkataraman (2003),
Modeling Duration Patterns for Speaker Recognition.
Proc. Eurospeech,
pp. 2017-2020, Geneva.
(PDF)
S. Kajarekar, K. Sonmez, L. Ferrer, V. Gadde, A. Venkataraman, E. Shriberg,
A. Stolcke, & H. Bratt (2003),
"TalkPrinting": Improving Speaker Recognition by Modeling Stylistic Features.
In
Intelligence and Security Informatics.
First NSF/NIJ Symposium, ISI 2003,
Springer Lecture Notes in Computer Science Series,
Volume 2665,
H. Chen, R. Miranda, D.D. Zeng, C. Demchak, J. Schroeder, & T. Madhusudan,
editors, pp. 350-354.
© 2003 Springer-Verlag.
(PDF,
abstract)
F. Enos, E. Shriberg, M. Graciarena, J. Hirschberg, & A. Stolcke (2007),
Detecting Deception Using Critical Segments,
Proc. Interspeech/Eurospeech, pp. 2281-2284, Antwerp.
(PDF)
M. Graciarena, E. Shriberg, A. Stolcke, F. Enos, J. Hirschberg, S. Kajarekar
(2006),
Combining Prosodic, Lexical and Cepstral Systems for Deceptive Speech Detection.
Proc. IEEE ICASSP, vol. 1, pp. 1033-1036, Toulouse.
(PDF)
J. Hirschberg, S. Benus, J. M. Brenier, F. Enos, S. Friedman,
S. Gilman, C. Girand, M. Graciarena, A. Kathol, L. Michaelis,
B. Pellom, E. Shriberg, & A. Stolcke (2005),
Distinguishing Deceptive from Non-Deceptive Speech.
Proc. Eurospeech, Lisbon, pp. 1833-1836.
(PDF)
J. Ang, R. Dhillon, A. Krupski, E. Shriberg, & A. Stolcke (2002),
Prosody-Based Automatic Detection of Annoyance and Frustration
in Human-Computer Dialog.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 3, pp. 2037-2040, Denver.
(PDF)
A. Stolcke, X. Anguera, K. Boakye, O. Cetin, A. Janin, M. Magimai-Doss,
C. Wooters, & J. Zheng (2008),
The SRI-ICSI Spring 2007 Meeting and Lecture Recognition System,
R. Stiefelhagen, R. Bowers, and J. Fiscus (eds.),
Multimodal Technologies for Perception of Humans.
International Evaluation Workshops CLEAR 2007 and RT 2007,
Springer Lecture Notes in Computer Science 4625,
pp. 450-463.
(PDF,
abstract)
J. Zheng & A. Stolcke (2007),
fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains,
Proc. Interspeech/Eurospeech, pp. 1573-1576, Antwerp.
(PDF)
G. Tur & A. Stolcke (2007),
Unsupervised Language Model Adaptation for Meeting Recognition,
Proc. IEEE ICASSP,
vol. 4, pp. 173-176, Honolulu, Hawaii.
(PDF)
A. Janin, A. Stolcke, X. Anguera, K. Boakye, O. Cetin, J. Frankel, J. Zheng
(2006),
The ICSI-SRI Spring 2006 Meeting Recognition System.
In
Machine Learning for Multimodal Interaction:
Third International Workshop, MLMI 2006,
Springer Lecture Notes in Computer Science Series,
S. Renals, S. Bengio, & J. Fiscus, editors, pp. 444-456.
© 2006 Springer-Verlag.
(PDF,
abstract)
K. Boakye & A. Stolcke (2006),
Improved Speech Activity Detection Using Cross-Channel Features for Recognition
of Multiparty Meetings.
Proc. ICSLP, pp. 1962-1965, Pittsburgh.
(PDF)
M. Zimmermann, A. Stolcke, & E. Shriberg (2006),
Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings.
Proc. IEEE ICASSP, vol. 1, pp. 581-584, Toulouse.
(PDF)
M. Zimmermann, Y. Liu, E. Shriberg, & A. Stolcke (2005),
A* based Joint Segmentation and Classification of Dialog Acts in Multiparty
Meetings.
Proc. IEEE Speech Recognition and Understanding Workshop,
pp. 215-219, San Juan, Puerto Rico.
(PDF)
A. Stolcke, X. Anguera, K. Boakye, O. Cetin, F. Grezl, A. Janin,
A. Mandal, B. Peskin, C. Wooters, & J. Zheng (2005),
Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005
Speech-to-Text Evaluation System.
Proc. NIST MLMI Meeting Recognition Workshop, Edinburgh.
Also in
Machine Learning for Multimodal Interaction:
Second International Workshop, MLMI 2005,
Springer Lecture Notes in Computer Science Series,
Volume 3869, S. Renals & S. Bengio, editors, pp. 463-475.
© 2006 Springer-Verlag.
(PDF,
abstract)
O. Cetin & A. Stolcke (2005),
Language Modeling in the ICSI-SRI Spring 2005 Meeting Speech Recognition
Evaluation System.
Technical Report TR-05-006, International Computer Science Institute,
Berkeley, CA.
M. Zimmermann, Y. Liu, E. Shriberg, & A. Stolcke (2005),
Toward Joint Segmentation and Classification of Dialog Acts in Multiparty
Meetings,
in
Machine Learning for Multimodal Interaction:
Second International Workshop, MLMI 2005,
Springer Lecture Notes in Computer Science Series,
Volume 3869, S. Renals & S. Bengio, editors, pp. 187-193.
© 2006 Springer-Verlag.
(Abstract)
N. Mirghafori, A. Stolcke, C. Wooters, T. Pirinen, I. Bulyko,
D. Gelbart, M. Graciarena, S. Otterson, B. Peskin, & M. Ostendorf (2004),
From Switchboard to Meetings:
Development of the 2004 ICSI-SRI-UW Meeting Recognition System.
Proc. Intl. Conf. Spoken Language Processing,
pp. 1957-1960, Jeju, Korea.
(PDF)
A. Stolcke, C. Wooters, N. Mirghafori, T. Pirinen, I. Bulyko,
D. Gelbart, M. Graciarena, S. Otterson, B. Peskin, & M. Ostendorf (2004),
Progress in Meeting Recognition:
The ICSI-SRI-UW Spring 2004 Evaluation System.
NIST ICASSP 2004 Meeting Recognition Workshop, Montreal.
(PDF)
N. Morgan, D. Baron, S. Bhagat, H. Carvey, R. Dhillon, J. Edwards, D. Gelbart,
A. Janin, A. Krupski, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, &
C. Wooters (2003),
Meetings about meetings: research at ICSI on speech in multiparty conversations.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 4, pp. 740-743, Hong Kong.
D. Baron, E. Shriberg, & A. Stolcke (2002),
Automatic Punctuation and Disfluency Detection in Multi-Party
Meetings Using Prosodic and Lexical Cues.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 2, pp. 949-952, Denver.
(PDF)
T. Pfau, D.P.W. Ellis, & A. Stolcke (2001),
Multispeaker Speech Activity Detection for the ICSI Meeting Recorder.
Proc. IEEE Automatic Speech Recognition and Understanding Workshop,
pp. 107-110,
Madonna di Campiglio, Italy.
(PDF)
E. Shriberg, A. Stolcke, & D. Baron (2001),
Can Prosody Aid the Automatic Processing of Multi-Party Meetings?
Evidence from Predicting Punctuation, Disfluencies, and Overlapping Speech.
In M. Bacchiani, J. Hirschberg, D. Litman, & M. Ostendorf (eds.),
Proc. ISCA Tutorial and Research Workshop on Prosody in
Speech Recognition and Understanding, pp. 139-146, Red Bank, NJ.
(PDF)
E. Shriberg, A. Stolcke, & D. Baron (2001),
Observations on Overlap: Findings and Implications for
Automatic Processing of Multi-Party Conversation.
Proc. EUROSPEECH, vol. 2, pp. 1359-1362,
Aalborg, Denmark.
(PDF)
N. Morgan, D. Baron, J. Edwards, D. Ellis, D. Gelbart, A. Janin,
T. Pfau, E. Shriberg, & A. Stolcke (2001),
The Meeting Project at ICSI,
Proc. of
HLT 2001, First International Conference on Human
Language Technology Research, pp. 246-252, San Diego, CA.
(PDF)
Y. Liu, E. Shriberg, A. Stolcke, D. Hillard, M. Ostendorf, & M. Harper (2006),
Enriching Speech Recognition with Automatic Detection of Sentence Boundaries
and Disfluencies.
IEEE Trans. Audio, Speech and Language Processing
14(5), 1526-1540.
(PDF,
abstract)
Y. Liu, N. V. Chawla, M. P. Harper, E. Shriberg, & A. Stolcke (2006),
A study in machine learning from imbalanced data for sentence boundary
detection in speech,
Computer Speech and Language 20(4), 468-494.
(PDF,
abstract)
D. Jones, W. Shen, E. Shriberg, A. Stolcke, T. Kamm, & D. Reynolds (2005),
Two Experiments Comparing Reading with Listening for Human Processing of
Conversational Telephone Speech.
Proc. Eurospeech, Lisbon, pp. 1145-1148.
(PDF)
Y. Liu, E. Shriberg, A. Stolcke, & M. Harper (2005),
Comparing HMM, Maximum Entropy, and Conditional Random Fields for Disfluency
Detection.
Proc. Eurospeech, Lisbon, pp. 3313-3316.
(PDF)
Y. Liu, A. Stolcke, E. Shriberg, & M. Harper (2005),
Using Conditional Random Fields for Sentence Boundary Detection in Speech,
Proc. ACL, Ann Arbor, MI, pp. 451-458.
Y. Liu, E. Shriberg, A. Stolcke, B. Peskin, D. Hillard, M. Ostendorf,
M. Tomalin, P. Woodland, and M. Harper (2005),
Structural Metadata Research in the EARS Program,
Proc. IEEE ICASSP, Philadelphia, vol. 5, pp. 957-960.
(PDF)
Y. Liu, E. Shriberg, A. Stolcke, D. Hillard, M. Ostendorf, B. Peskin,
& M. Harper (2004),
The ICSI-SRI-UW Metadata Extraction System.
Proc. Intl. Conf. on Spoken Language Processing,
pp. 577-580, Jeju, Korea.
(PDF)
Y. Liu, E. Shriberg, A. Stolcke, & M. Harper (2004),
Using Machine Learning to Cope with Imbalanced Classes in Natural Speech:
Evidence from Sentence Boundary and Disfluency Detection.
Proc. Intl. Conf. on Spoken Language Processing,
pp. 1525-1528, Jeju, Korea.
(PDF)
Y. Liu, A. Stolcke, E. Shriberg, & M. Harper (2004),
Comparing and Combining Generative and Posterior Probability Models:
Some Advances in Sentence Boundary Detection in Speech.
Proc. Conf. on Empirical Methods in Natural Language Processing,
pp. 64-71, Barcelona.
(PDF)
D. Hillard, M. Ostendorf, A. Stolcke, Y. Liu, & E. Shriberg (2004),
Improving Automatic Sentence Boundary Detection with Confusion Networks.
Proc. HLT-NAACL Conference,
Short papers, pp. 69-72, Boston.
(PDF)
Y. Liu, E. Shriberg, & A. Stolcke (2003),
Automatic disfluency identification in conversational speech using multiple
knowledge sources.
Proc. Eurospeech,
pp. 957-960, Geneva.
(PDF)
A. Stolcke, E. Shriberg, R. Bates, M. Ostendorf, D. Hakkani, M. Plauche,
G. Tur, & Y. Lu (1998),
Automatic Detection of Sentence Boundaries and Disfluencies based on
Recognized Words.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 5, pp. 2247-2250,
Sydney, Australia.
(PDF)
E. Shriberg & A. Stolcke (1998),
How far do speakers back up in their repairs? A quantitative model.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 5, pp. 2183-2186,
Sydney, Australia.
(PDF)
E. Shriberg, R. Bates, & A. Stolcke (1997),
A Prosody-Only Decision-Tree Model for Disfluency Detection.
Proc. EUROSPEECH, vol. 5, pp. 2383-2386,
Rhodes, Greece.
(PDF)
A. Stolcke & E. Shriberg (1996),
Statistical language modeling for speech disfluencies.
Proc. IEEE Intl. Conf. on Acoustics,
Speech and Signal Processing, vol. 1, pp. 405-408,
Atlanta, GA.
(HTML,
PDF)
E. Shriberg & A. Stolcke (1996),
Word predictability after filled pauses: A corpus-based study.
Proc. Intl. Conf. on Spoken Language Processing, vol. 3, pp. 1868-1871,
Philadelphia, PA.
(PDF)
A. Stolcke & E. Shriberg (1996),
Automatic linguistic segmentation of conversational speech.
Proc. Intl. Conf. on Spoken Language Processing, vol. 2, pp. 1005-1008,
Philadelphia, PA.
(HTML,
PDF)
E. Shriberg, D. R. Ladd, J. Terken, & A. Stolcke (1996),
Modeling Pitch Range Variation Within and Across Speakers:
Predicting F0 Targets when "Speaking Up".
Proc. Intl. Conf. on Spoken Language Processing, Addendum, pp. 1-4,
Philadelphia, PA.
(PDF)
A. Venkataraman, Y. Liu, E. Shriberg, & A. Stolcke (2005),
Does Active Learning Help Automatic Dialog Act Tagging in Meeting Data?
Proc. Eurospeech, Lisbon, pp. 2777-2780.
(PDF)
A. Venkataraman, L. Ferrer, A. Stolcke, & E. Shriberg (2003),
Training a Prosody-based Dialog Act Tagger from Unlabeled Data.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 272-275, Hong Kong.
(PDF)
A. Venkataraman, A. Stolcke, & E. Shriberg (2002),
Automatic Dialog Act Tagging with Minimal Supervision.
Proc. 9th Australian International Conference on Speech Science
and Technology, Melbourne.
(PDF)
A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, D. Jurafsky,
P. Taylor, R. Martin, C. Van Ess-Dykema, & M. Meteer (2000),
Dialogue Act Modeling for Automatic Tagging and Recognition of
Conversational Speech,
Computational Linguistics 26(3), 339-373.
(PDF,
abstract)
E. Shriberg, R. Bates, A. Stolcke, P. Taylor, D. Jurafsky, K. Ries,
N. Coccaro, R. Martin, M. Meteer, & C. Van Ess-Dykema (1998),
Can Prosody Aid the Automatic Classification of Dialog Acts
in Conversational Speech?
Language and Speech 41(3-4), 439-487.
(PDF,
abstract)
A. Stolcke, E. Shriberg, R. Bates, N. Coccaro, D. Jurafsky, R. Martin,
M. Meteer, K. Ries, P. Taylor, & C. Van Ess-Dykema (1998),
Dialog Act Modeling for Conversational Speech.
In
Applying Machine Learning to Discourse Processing.
Papers from the 1998 AAAI Spring Symposium,
Technical Report SS-98-01, pp. 98-105.
AAAI Press, Menlo Park, CA.
(PDF)
D. Jurafsky, R. Bates, N. Coccaro, R. Martin, M. Meteer, K. Ries, E. Shriberg,
A. Stolcke, P. Taylor, & C. Van Ess-Dykema (1997),
Automatic Detection of Discourse Structure for Speech Recognition
and Understanding.
Proc. IEEE Workshop on Speech Recognition and Understanding, pp. 88-95,
Santa Barbara, CA.
(PDF)
M. Akbacak, D. Vergyri, & A. Stolcke (2008),
Open-Vocabulary Spoken Term Detection Using Graphone-Based Hybrid
Recognition Systems,
Proc. IEEE ICASSP, pp. 5240-5243, Las Vegas.
(PDF)
D. Vergyri, I. Shafran, A. Stolcke, R. R. Gadde, M. Akbacak, B. Roark,
& W. Wang (2007),
The SRI/OGI 2006 Spoken Term Detection System,
Proc. Interspeech/Eurospeech, pp. 2393-2396, Antwerp.
(PDF)
D. Gelbart, J. Bryant, A. Stolcke, R. Porzel, M. Baudis, & N. Morgan
(2006),
SmartKom-English: From Robust Recognition to Felicitous Interaction,
SmartKom: Foundations of Multimodal Dialogue Systems,
Springer, pp. 453-470.
(Abstract)
E. Shriberg & A. Stolcke (2004),
Direct Modeling of Prosody: An Overview of Applications to Automatic Speech Processing.
Proc. International Conference on Speech Prosody,
Nara, Japan.
(PDF)
E. Shriberg & A. Stolcke (2004),
Prosody Modeling for Automatic Speech Recognition and Understanding,
Mathematical Foundations of Speech and Language Processing,
M. Johnson, S. Khudanpur, M. Ostendorf, and R. Rosenfeld (eds.),
Volume 138 in IMA Volumes in Mathematics and its Applications,
pp. 105-114, Springer-Verlag.
(PDF)
L. Ferrer, E. Shriberg, & A. Stolcke (2003),
A prosody-based approach to end-of-utterance detection that does not
require speech recognition.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 608-611, Hong Kong.
(PDF)
L. Ferrer, E. Shriberg, & A. Stolcke (2002),
Is the Speaker Done Yet? Faster and More Accurate
End-of-Utterance Detection Using Prosody.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 3, pp. 2061-2064, Denver.
(PDF)
A. Stolcke & E. Shriberg (2001),
Markovian Combination of Language and Prosodic Models for better Speech
Understanding and Recognition.
Invited talk at the
IEEE Workshop on Speech Recognition and Understanding,
Madonna di Campiglio, Italy, December 2001.
(PDF)
E. Shriberg & A. Stolcke (2001),
Prosody Modeling for Automatic Speech Understanding:
An Overview of Recent Research at SRI.
In M. Bacchiani, J. Hirschberg, D. Litman, & M. Ostendorf (eds.),
Proc. ISCA Tutorial and Research Workshop on Prosody in
Speech Recognition and Understanding, pp. 13-16, Red Bank, NJ.
(PDF)
G. Tur, D. Hakkani-Tur, A. Stolcke, & E. Shriberg (2001),
Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation,
Computational Linguistics, 27(1), 31-57.
(PDF,
abstract)
E. Shriberg, A. Stolcke, D. Hakkani-Tur, & G. Tur (2000),
Prosody-Based Automatic Segmentation of Speech into Sentences and Topics,
Speech Communication 32(1-2), 127-154
(Special Issue on Accessing Information in Spoken Audio).
(PDF,
abstract)
D. Hakkani-Tur, G. Tur, A. Stolcke, & E. Shriberg (1999),
Combining Words and Prosody for Information Extraction from Speech.
Proc. EUROSPEECH, vol. 5, pp. 1991-1994, Budapest.
(PDF)
A. Stolcke, E. Shriberg, D. Hakkani-Tur, G. Tur, Z. Rivlin, K. Sonmez (1999),
Combining Words and Speech Prosody for Automatic Topic Segmentation.
Proc. DARPA
Broadcast News Workshop, pp. 61-64,
Herndon, VA.
(HTML, PDF)
Z. Rivlin, D. Appelt, R. Bolles, A. Cheyer, D. Hakkani-Tur, D. Israel,
L. Julia, D. Martin, G. Myers, K. Nitz, B. Sabata, A. Sankar, E. Shriberg,
K. Sonmez, A. Stolcke, & G. Tur (2000),
MAESTRO:
Conductor of Multimedia Analysis Technologies,
Communications of the ACM
43(2), 57-63, Special Issue on News on Demand, February 2000.
D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena,
D. Rybach, C. Gollan, R. Schlüter, K. Kirchhoff, A. Faria, & N. Morgan (2008),
Development of the SRI/Nightingale Arabic ASR system,
to appear in
Proc. Interspeech, Brisbane, Australia.
(PDF)
J. Zheng & A. Stolcke (2007),
fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains,
Proc. Interspeech/Eurospeech, pp. 1573-1576, Antwerp.
(PDF)
J. Zheng, O. Cetin, M.-Y. Hwang, X. Lei, A. Stolcke, & N. Morgan (2007),
Combining Discriminative Feature, Transform, and Model Training
for Large Vocabulary Speech Recognition,
Proc. IEEE ICASSP,
vol. 4, pp. 633-636, Honolulu, Hawaii.
(PDF)
A. Stolcke, B. Chen, H. Franco, V. R. R. Gadde, M. Graciarena, M.-Y. Hwang,
K. Kirchhoff, A. Mandal, N. Morgan, X. Lei, T. Ng, M. Ostendorf, K. Sonmez,
A. Venkataraman, D. Vergyri, W. Wang, J. Zheng, & Q. Zhu (2006),
Recent Innovations in Speech-to-Text Transcription at SRI-ICSI-UW.
IEEE Trans. Audio, Speech and Language Processing
14(5), 1729-1744.
(PDF,
abstract)
A. Mandal, M. Ostendorf, & A. Stolcke (2006),
Speaker Clustered Regression-Class Trees for MLLR Adaptation.
Proc. ICSLP, pp. 1133-1136, Pittsburgh.
(PDF)
A. Stolcke, F. Grezl, M.-Y. Hwang, X. Lei, N. Morgan, & D. Vergyri (2006),
Cross-domain and Cross-language Portability of Acoustic
Features Estimated by Multilayer Perceptrons.
Proc. IEEE ICASSP, vol. 1, pp. 321-324, Toulouse.
(PDF)
N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki,
M. Ostendorf, P. Jain, H. Hermansky, D. Ellis, G. Doddington, B. Chen,
O. Cetin, H. Bourlard and M. Athineos (2005),
Pushing the Envelope -- Aside,
IEEE Signal Processing Magazine 22(5), 81-88.
(PDF,
abstract)
Q. Zhu, A. Stolcke, B. Chen, & N. Morgan (2005),
Using MLP Features in SRI's Conversational Speech Recognition System.
Proc. Eurospeech, Lisbon, pp. 2141-2144.
(PDF)
J. Zheng & A. Stolcke (2005),
Improved Discriminative Training Using Phone Lattices.
Proc. Eurospeech, Lisbon, pp. 2125-2128.
(PDF)
A. Mandal, M. Ostendorf, & A. Stolcke (2005),
Leveraging Speaker-dependent Variation of Adaptation.
Proc. Eurospeech, Lisbon, pp. 1793-1796.
(PDF)
D. Vergyri, K. Kirchhoff, R. Gadde, A. Stolcke, & J. Zheng (2005),
Development of a Conversational Telephone Speech Recognizer for
Levantine Arabic.
Proc. Eurospeech, Lisbon, pp. 1613-1616.
(PDF)
Q. Zhu, B. Chen, N. Morgan, & A. Stolcke (2005),
Tandem Connectionist Feature Extraction for Conversational Speech Recognition,
in
Machine Learning for Multimodal Interaction.
First International Workshop, MLMI-2004,
pp. 223-231.
(Abstract)
M. Hwang, X. Lei, T. Ng, I. Bulyko, M. Ostendorf, A. Stolcke, W. Wang,
J. Zheng, V. R. R. Gadde, M. Graciarena, M. Siu, Y. Huang (2004),
Progress on Mandarin Conversational Telephone Speech Recognition.
Proc. 4th Intl. Symposium on Chinese Spoken Language Processing,
Hong Kong.
Q. Zhu, A. Stolcke, B. Chen, & N. Morgan (2004),
Incorporating Tandem/HATs MLP Features into SRI's Conversational Speech
Recognition System.
Proc. DARPA RT-04F Rich Transcription Workshop,
Palisades, New York, November 2004.
(PDF)
M. Hwang, X. Lei, T. Ng, M. Ostendorf, A. Stolcke, W. Wang, J. Zheng, & V. Gadde
(2004),
Porting Decipher from English to Mandarin.
Proc. DARPA RT-04F Rich Transcription Workshop,
Palisades, New York, November 2004.
J. Zheng, H. Franco, & A. Stolcke (2004),
Effective Acoustic Modeling for Rate-of-Speech Variation in
Large Vocabulary Conversational Speech Recognition.
Proc. Intl. Conf. on Spoken Language Processing,
pp. 401-404, Jeju, Korea.
(PDF)
A. Venkataraman, A. Stolcke, W. Wang, D. Vergyri, V. R. R. Gadde, & J. Zheng
(2004),
An Efficient Repair Procedure For Quick Transcriptions.
Proc. Intl. Conf. on Spoken Language Processing,
pp. 1961-1964, Jeju, Korea.
(PDF)
Q. Zhu, B. Chen, N. Morgan, & A. Stolcke (2004),
On Using MLP Features in LVCSR.
Proc. Intl. Conf. on Spoken Language Processing,
pp. 921-924, Jeju, Korea.
(PDF)
N. Morgan, B. Y. Chen, Q. Zhu, & A. Stolcke (2004),
TRAPping Conversational Speech: Extending TRAP/Tandem approaches to
conversational telephone speech recognition.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 536-539, Montreal.
(PDF)
M. Graciarena, H. Franco, J. Zheng, D. Vergyri, & A. Stolcke (2004),
Voicing Feature Integration in SRI's Decipher LVCSR System.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 921-924, Montreal.
(PDF)
J. Zheng, H. Franco, & A. Stolcke (2003),
Modeling word-level rate-of-speech variation in large vocabulary conversational
speech recognition.
Speech Communication 41(2-3), 273-285.
(PDF,
abstract)
B. Hodjat, H. Franco, H. Bratt, K. Precoda, A. Stolcke, A. Venkataraman,
D. Vergyri, & J. Zheng (2003),
Iterative statistical language model generation for use with an agent-oriented
natural language interface.
Proc. 10th International Conference on Human-Computer Interaction,
Crete.
(PDF)
D. Vergyri, A. Stolcke, V. R. R. Gadde, L. Ferrer, & E. Shriberg (2003),
Prosodic Knowledge Sources for Automatic Speech Recognition.
Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing,
vol. 1, pp. 208-211, Hong Kong.
(PDF)
V. R. Rao Gadde, A. Stolcke, D. Vergyri, J. Zheng, K. Sonmez, &
A. Venkataraman (2002),
Building an ASR System for Noisy Environments:
SRI's 2001 SPINE Evaluation System.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 3, pp. 1577-1580, Denver.
(PDF)
A. Sankar, V. R. Rao Gadde, A. Stolcke, & F. Weng (2002),
Improved Modeling and Efficiency for Automatic Transcription
of Broadcast News,
Speech Communication 37(1-2), 133-158.
(PDF,
abstract)
H. Franco, J. Zheng, J. Butzberger, F. Cesari, M. Frandsen, J. Arnold, R. Rao,
A. Stolcke, & V. Abrash (2002),
DynaSpeak: SRI's scalable speech recognizer for embedded and mobile systems.
Proc. Human Language Technology Conference (HLT-2002),
San Diego, CA.
(PDF)
J. Zheng, J. Butzberger, H. Franco, & A. Stolcke (2001),
Improved Maximum Mutual Information Estimation Training of Continuous
Density HMMs.
Proc. EUROSPEECH, vol. 2, pp. 679-682,
Aalborg, Denmark.
(PDF)
J. Zheng, H. Franco, & A. Stolcke (2000),
Rate-dependent Acoustic Modeling for Large Vocabulary Conversational
Speech Recognition.
Proc. ISCA Tutorial and Research Workshop on Automatic Speech Recognition:
Challenges for the new Millenium,
Paris.
(PDF)
J. Zheng, H. Franco, & A. Stolcke (2000),
Rate-dependent Acoustic Modeling for Large Vocabulary Conversational
Speech Recognition.
Proc. NIST Speech Transcription Workshop,
College Park, MD.
(Preliminary version of paper above,
HTML,
PDF)
A. Stolcke, H. Bratt, J. Butzberger, H. Franco, V. R. Rao Gadde, M. Plauche,
C. Richey, E. Shriberg, K. Sonmez, F. Weng, J. Zheng (2000),
The SRI March 2000 Hub-5 Conversational Speech Transcription System.
Proc. NIST Speech Transcription Workshop,
College Park, MD.
(HTML,
PDF)
L. Mangu, E. Brill, & A. Stolcke (2000),
Finding consensus in speech recognition:
word error minimization and other applications of confusion networks,
Computer Speech and Language 14(4), 373-400.
(PDF,
abstract)
[2003 CSL Paper Award]
A. Stolcke, E. Shriberg, D. Hakkani-Tur, & G. Tur (1999),
Modeling the Prosody of Hidden Events for Improved Word Recognition.
Proc. EUROSPEECH, vol. 1, pp. 307-310, Budapest.
(PDF)
L. Mangu, E. Brill, & A. Stolcke (1999),
Finding Consensus Among Words: Lattice-based Word Error Minimization.
Proc. EUROSPEECH, vol. 1, pp. 495-498, Budapest.
(PDF)
F. Weng, A. Stolcke, & A. Sankar (1998),
Efficient Lattice Representation and Generation.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 6, pp. 2531-2534,
Sydney, Australia.
(PDF)
A. Sankar, F. Weng, Z. Rivlin, A. Stolcke, & R. Rao Gadde (1998),
Development of SRI's 1997 Broadcast News Transcription System.
Proc.
DARPA Broadcast News Transcription and Understanding Workshop,
pp. 91-96, Lansdowne, VA.
(HTML,
PDF)
F. Weng, A. Stolcke, & A. Sankar (1998),
New Developments in Lattice-Based Search Strategies in SRI's Hub4 System.
Proc.
DARPA Broadcast News Transcription and Understanding Workshop,
pp. 138-143, Lansdowne, VA.
(HTML,
PDF)
A. Stolcke (1997),
Linguistic Knowledge and Empirical Methods in Speech Recognition.
AI Magazine 18(4): Winter 1997, pp. 13-24.
A. Stolcke, Y. Konig, & M. Weintraub (1997),
Explicit Word Error Minimization in N-best List Rescoring.
Proc. EUROSPEECH, vol. 1, pp. 163-166,
Rhodes, Greece.
(PDF)
F. Weng, H. Bratt, L. Neumeyer, & A. Stolcke (1997),
A Study on Multilingual Speech Recognition.
Proc. EUROSPEECH, vol. 1, pp. 359-362,
Rhodes, Greece.
(PDF)
M. Weintraub, F. Beaufays, Z. Rivlin, Y. Konig, & A. Stolcke (1997),
Neural-Network Based Measures of Confidence for Word Recognition.
Proc. IEEE Intl. Conf. on Acoustics,
Speech and Signal Processing, vol. 2, pp. 887-890,
Munich.
(PDF)
A. Sankar, L. Heck, & A. Stolcke (1997),
Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous
Speech Recognition System.
Proc.
DARPA Speech Recognition Workshop, pp. 127-132,
Chantilly, VA.
(HTML,
PDF)
A. Sankar, A. Stolcke, T. Chung, L. Neumeyer, M. Weintraub, H. Franco, &
F. Beaufays (1996),
Noise-resistant Feature Extraction and Model Training for
Robust Speech Recognition.
Proc. ARPA Speech Recognition Workshop,
Harriman, NY.
(PDF)
C. Wooters & A. Stolcke (1994),
Multiple-pronunciation Lexical Modeling in a
Speaker-independent Speech Understanding System.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 3, pp. 1363-1366, Yokohama.
D. Jurafsky, C. Wooters, G. Tajchman, J. Segal, A. Stolcke, E. Fosler,
& N. Morgan (1994),
The Berkeley Restaurant Project.
Proc. Intl. Conf. on Spoken Language Processing,
vol. 4, pp. 2139-2142, Yokohama.
J. Feldman, G. Lakoff, D. Bailey, S. Narayanan, T. Regier, & A. Stolcke (1996),
L0 -- The First Five Years of an Automated Language Acquisition Project.
Artificial Intelligence Review, 10(1-2), 103-129.
Special Volume on Integration of Natural Language and Vision Processing:
Grounding Representations, P. McKevitt (ed.).
(Abstract)
A. Stolcke (1994),
Bayesian Learning of Probabilistic Language Models.
Doctoral dissertation, Dept. of Electrical Engineering and
Computer Science, University of California at Berkeley.
A. Stolcke & S. Omohundro (1994),
Best-first Model Merging for Hidden Markov Model Induction.
Technical Report TR-94-003, ICSI, Berkeley, CA.
A. Stolcke & S. Omohundro (1994),
Inducing Probabilistic Grammars by Bayesian Model Merging.
In
Grammatical Inference and Applications,
R. C. Carrasco & J. Oncina, editors, Springer, pp. 106-118.
(Abstract)
A. Stolcke & S. Omohundro (1992),
Hidden Markov Model Induction by Bayesian Model Merging.
In Advances in Neural Information Processing Systems 5,
S. J. Hanson, J. D. Cowan, & C. L. Giles, editors,
Morgan Kaufmann, pp. 11-18.
A. Stolcke (1991),
Syntactic Category Formation with Vector Space Grammars.
Proc. COGSCI, pp. 908-912, Chicago.
J. A. Feldman, G. Lakoff, A. Stolcke & S. H. Weber (1990),
Miniature Language Acquisition: A touchstone for cognitive science.
Proc. COGSCI, pp. 686-693, Cambridge, MA.
A. Stolcke (1995),
An Efficient Probabilistic Context-Free Parsing Algorithm
that Computes Prefix Probabilities.
Computational Linguistics 21(2), 165-201.
(HTML,
PDF,
abstract)
A. Stolcke (1993),
An Efficient Probabilistic Context-Free Parsing Algorithm
that Computes Prefix Probabilities.
Technical Report TR-93-065, ICSI, Berkeley, CA.
(Extended version of article above.)
A. Stolcke (1990),
Gapping and Frame Semantics: A fresh look from a cognitive perspective.
Proc. COLING, vol. 2, pp. 341-346, Helsinki.
A. Stolcke & D. Wu (1992),
Tree Matching with Recursive Distributed Representations.
Technical Report TR-92-025, ICSI, Berkeley, CA.
Also in Workshop on Integrating Neural and Symbolic Processes,
AAAI, San Jose, CA.
A. Stolcke (1991),
Syntactic Category Formation with Vector Space Grammars.
Proc. COGSCI, pp. 908-912, Chicago.
A. Stolcke (1989),
Unification as Constraint Satisfaction in Structured Connectionist Networks.
Neural Computation 1(4), 559-567.
A. Stolcke (1989),
Processing Unification-based Grammars in a Connectionist Framework.
Proc. COGSCI, pp. 908-915, Ann Arbor, MI.
A. Stolcke (1988),
Generation of natural language sentences in unification-based grammars --
A connectionist approach.
Diploma thesis (in German).
Report FKI-94-88, Computer Science Dept.,
Technische Universität München.