CURRICULUM VITA
Richard M. Stern, Jr.
Department of Electrical and Computer Engineering
Carnegie Mellon University        513 Emerson Street
Pittsburgh, PA 15213              Pittsburgh, PA 15206
Tel: 412-***-****                 Cell phone: 412-***-****
FAX: 412-***-****
Email: ***@**.***.***             www.ece.cmu.edu/~rms
Birthdate: July 5, 1948           Citizenship: U.S.A.
PROFESSIONAL INTERESTS
Automatic speech recognition, auditory perception, acoustics,
signal processing, biomedical instrumentation
EDUCATION
Ph.D. (1977) Electrical Engineering and Computer Science
Massachusetts Institute of Technology, Cambridge, MA
M.S. (1972) Electrical Engineering and Computer Sciences
University of California, Berkeley, CA
S.B. (1970) Electrical Engineering
Massachusetts Institute of Technology, Cambridge, MA
EXPERIENCE
1995 - present Professor of Electrical and Computer Engineering
Carnegie Mellon University
1988 - present Associate Professor and Professor by Courtesy, Language
Technologies Institute, Computer Science Department, Biomedical
Engineering Department
2009 - present Lecturer, School of Music
Carnegie Mellon University
1995 - 2003 Associate Director of the Information Networking Institute
Carnegie Mellon University
1982 - 1995 Associate Professor of Electrical and Biomedical Engineering
Carnegie Mellon University
1985 Visiting Professor in Speech and Communication Sciences, Nippon
Telegraph and Telephone Electrical Communications Laboratory,
Tokyo, Japan
1977 - 1982 Assistant Professor of Electrical and Biomedical Engineering
Carnegie Mellon University
1979 - 1981 Adjunct Assistant Professor of Otolaryngology
University of Pittsburgh School of Medicine
1973 - 1976 Teaching and Research Assistant, Department of Electrical
Engineering, Massachusetts Institute of Technology
PROFESSIONAL ACTIVITIES (partial listing)
Distinguished Lecturer, International Speech Communication Association, 2008-2009.
General Chair, INTERSPEECH International Conference on Spoken Language Processing,
September, 2006.
Technical Program Co-Chair, IEEE Workshop on Automatic Speech Recognition and Under-
standing, December 2005.
Technical Program Chair, 141st meeting of the Acoustical Society of America, June 2002.
General Chair, DARPA Spoken Language Technologies Workshop, March, 1994.
Publications Chair, ARPA Spoken Language Technology and Applications Day, April, 1993.
Publications Chair, IEEE Workshop on Applications of Signal Processing to Audio and Acous-
tics, October, 1993.
Chair, standing DARPA Speech and Natural Language Workshop Organizing Committee, 1991 -
1992.
Secretary, ARPA Spoken Language Coordinating Committee, 1990 - 1995.
General Chair, DARPA Speech and Natural Language Workshop, June, 1990.
International Advisory Board, International Speech Communication Association, 2006 - present.
International Advisory Board, Center for Speech and Language Technologies, Tsinghua Univer-
sity, Beijing, China, 2007 - present.
Chair, Selection Committee for IEEE James L. Flanagan Speech & Audio Processing Award,
2006 - 2008.
IEEE Signal Processing Society Technical Committee on Audio and Electroacoustics, 1991 -
1995.
IEEE Signal Processing Society Technical Committee on Speech, 1993 - 1997.
Editorial board, Journal of Computer Speech and Language, 1994 - present.
Editorial board, Free Speech Journal, 1996 - 1998.
Ongoing collaborative research in binaural hearing with the Department of Otolaryngology at the
University of Connecticut Medical School, Farmington, CT.
Member of Institute of Electrical and Electronics Engineers, Acoustical Society of America,
International Speech Communication Association, Association for Research in Otolaryngology,
Audio Engineering Society
Reviewer for National Science Foundation, International Speech Communication Association,
IEEE, J. Acoust. Soc. Amer., Hearing Research, IEEE Transactions on Signal Processing, IEEE
Transactions on Speech and Language, IEEE Transactions on Systems, Man, and Cybernetics,
and Communications of the Association for Computing Machinery
HONORS AND AWARDS
Fellow, International Speech Communication Association (ISCA)
Fellow, Acoustical Society of America (ASA)
Distinguished Lecturer of the International Speech Communication Association, 2008 to 2009
Allen Newell Award for Research Excellence, Carnegie Mellon University Department of Com-
puter Science, 1992
IEEE Student Branch Award for Teacher of the Year, Carnegie Mellon University Department of
Electrical Engineering, 1979
PUBLICATIONS AND PAPERS
Papers in Archival Journals
KIM, C., and STERN, R. M. (2012). Power-normalized cepstral coefficients (PNCC) for robust
speech recognition, IEEE Trans. on Audio, Speech, and Language Processing (accepted for
publication).
STERN, R. M., and MORGAN, N. (2012). Hearing is believing: biologically-inspired methods for
robust speech recognition, IEEE Signal Processing Magazine (accepted for publication).
CHIU, Y.-H. B., RAJ, B., and STERN, R. M. (2012). Learning-based auditory encoding for
robust speech recognition, IEEE Trans. on Audio, Speech, and Language Processing 20:
900-914, March 2012.
KIM, W., and STERN, R. M. (2011). Mask classification for missing-feature reconstruction for
robust speech recognition, Speech Communication, 53: 1-11, January 2011.
PARK, H.-M., and STERN, R. M. (2009). Spatial Separation of Speech Signals using
Continuously-Variable Weighting Factors Estimated from Comparisons of Zero Crossings,
Speech Communication Journal, 51 (1): 15-25, January 2009.
SELTZER, M. L., and STERN, R. M. (2006). Subband Likelihood-Maximizing Beamforming for
Speech Recognition in Reverberant Environments, IEEE Transactions of Speech, Language,
and Audio Processing 14 (6): 2109-2121, November 2006.
RAJ, B., and STERN, R. M. (2005). Missing-Feature Methods for Robust Automatic Speech
Recognition, IEEE Signal Processing Magazine, September 2005.
KIM, N. S., LIM, W., and STERN, R. M. (2005). Feature compensation based on switching
linear dynamic model, IEEE Signal Processing Letters, 12 (6): 473-476.
SELTZER, M. L., RAJ, B., and STERN, R. M. (2004). Likelihood-Maximizing Beamforming for
Robust Hands-Free Speech Recognition, IEEE Transactions of Speech and Audio Processing,
12 (5): 489-498, September 2004.
OBUCHI, Y., HATAOKA, N., and STERN, R. M. (2004). Normalization of Time-Derivative
Parameters for Robust Speech Recognition in Small Devices, IEICE Trans. on Information and
Systems, 87-D (4): 1004-1011, April 2004.
RAJ, B., SELTZER, M. L., and STERN, R. M. (2004). Reconstruction of Missing Features for
Robust Speech Recognition, Speech Communication Journal, 43 (4): 275-296, September
2004.
SELTZER, M. L., RAJ, B., and STERN, R. M. (2004). A Bayesian Framework for
Spectrographic Mask Estimation for Missing Feature Speech Recognition, Speech Communication
Journal, 43 (4): 379-393, September 2004.
SINGH, R., RAJ, B., and STERN, R. M. (2001). Automatic Generation of Sub-Word Units for
Speech Recognition Systems, IEEE Trans. on Speech and Audio Proc. 10 (2): 89-99.
HUERTA, J. M., and STERN, R. M. (2001). Distortion-Class Modeling for Robust Speech
Recognition under GSM RPE-LTP Coding, Speech Communication Journal, 34: 213-225 (invited
paper).
MORENO, P. J., RAJ, B., and STERN, R. M. (1998). Data-Driven Environmental Compensation
for Speech Recognition: A Unified Approach, Speech Communication Journal, 24: 267-85.
STERN, R. M., and SHEAR, G. D. (1996a). Lateralization and Detection of Low-Frequency
Binaural Stimuli: Effects of Distribution of Internal Delay, J. Acoust. Soc. Amer. 100: 2278-2288.
STERN, R. M., and SHEAR, G. D. (1996b). Lateralization and Detection of Low-Frequency
Binaural Stimuli: Specification of the Extended Position-Variable Model, Physics Auxiliary
Publication Service, AIP document E-JASMA-100-2278- 0.175MB via
http://www.aip.org/epaps/epaps.html.
TRAHIOTIS, C., and STERN, R. M. (1994). Across-Frequency Interaction in Lateralization of
Complex Binaural Stimuli, J. Acoust. Soc. Amer. 96: 3804-3806 (L).
STERN, R. M., ZEPPENFELD, T., and SHEAR, G. D. (1991). Lateralization of
Rectangularly-Modulated Noise: An Explanation for Counterintuitive Reversals, J. Acoust. Soc. Amer. 90:
1901-1907.
COAST, D. A., STERN, R. M., CANO, G. G., and BRILLER, S. A. (1990). An Approach to
Cardiac Arrhythmia Analysis Using Hidden Markov Models, IEEE Trans. Biomed. Eng. 37:
826-836.
TRAHIOTIS, C., and STERN, R. M. (1989). Lateralization of Bands of Noise: Effects of
Bandwidth and Differences of Interaural Time and Phase, J. Acoust. Soc. Amer. 86: 1285-1293.
RUDNICKY, A. I., and STERN, R. M. (1989). Spoken Language Research at Carnegie Mellon,
Speech Technology Magazine 4: 38-43.
STERN, R. M., ZEIBERG, A. S., and TRAHIOTIS, C. (1988). Lateralization of Complex Binaural
Stimuli: A Weighted Image Model, J. Acoust. Soc. Amer. 84: 156-165.
STERN, R. M., and LASRY, M. J. (1987). Dynamic Speaker Adaptation for Feature-Based
Isolated Letter Recognition, IEEE Trans. on Acoustics, Speech, and Signal Processing 35:
751-763.
STERN, R. M., and COLBURN, H. S. (1985). Lateral-Position Models of Interaural
Discrimination, J. Acoust. Soc. Amer. 77: 753-755.
STERN, R. M., and COLBURN, H. S. (1985). Subjective Lateral Position and Interaural
Discrimination, Physics Auxiliary Publication Service, AIP document no. PAPS JASMA-77-753-29.
LASRY, M. J., and STERN, R. M. (1984). A Posteriori Estimation of Correlated Jointly
Gaussian Mean Vectors, IEEE Trans. on Pattern Anal. and Mach. Intel. 6: 530-535.
CROWLEY, J. L., and STERN, R. M., Jr. (1984). Fast Computation of the Difference of Low
Pass (DOLP) Transform, IEEE Transactions on Pattern Analysis and Machine Intelligence 6:
212-222.
STERN, R. M., Jr., SLOCUM, J. E., and PHILLIPS, M. S. (1983). Interaural Time and Amplitude
Discrimination in Noise, J. Acoust. Soc. Amer. 73: 1714-1722.
YOST, W. A., GRANTHAM, D. W., LUTFI, R. A., and STERN, R. M., Jr. (1982). The Phase
Angle of Addition in Temporal Masking for Diotic and Dichotic Listening Conditions, Hearing
Res. 7: 247-259.
MURTI, K. G., STERN, R. M., CANTEKIN, E. I. and BLUESTONE, C. D. (1982). Classification
of Spectral Patterns Obtained from Eustachian Tube Sonometry, IEEE Trans. Biomed. Eng. 29:
473-477.
MURTI, K. G., STERN, R. M., Jr., CANTEKIN, E. I. and BLUESTONE, C. D. (1980). Sonometric
Evaluation of Eustachian Tube Function Using Broadband Stimuli, Annals of Otology,
Rhinology, and Laryngology, (Suppl. 68) 89, 178-189.
RUOTOLO, B. R., STERN, R. M., Jr., and COLBURN, H. S. (1979). Discrimination of
Symmetric, Time-Intensity Traded Binaural Stimuli, J. Acoust. Soc. Amer., 66: 1733-1737.
STERN, R. M., Jr. and COLBURN, H. S. (1978). Theory of Binaural Interaction Based on
Auditory-Nerve Data. IV. A Model for Subjective Lateral Position, J. Acoust. Soc. Amer., 64:
127-140.
Critically-Reviewed Books, Book Chapters, and Theses
STERN, R. M. and MORGAN, N. (2012). Features Based on Auditory Physiology and
Perception, Chapter in Noise-Robust Techniques for Automatic Speech Recognition, T. Virtanen, R.
Singh, and B. Raj, Eds. to be published by Wiley Press.
STERN, R. M., WANG, D., and BROWN, G. (2006). Binaural Sound Localization, Chapter in
Computational Auditory Scene Analysis: Principles, Algorithms and Applications, D. Wang and
G. Brown, Eds., Wiley and IEEE Press.
RAJ, B., and STERN, R. M. (2006). Reconstruction of Incomplete Spectrograms for Robust
Speech Recognition, Springer-Verlag, Heidelberg.
STERN, R. M., TRAHIOTIS, C., and RIPEPI, A. M. (2006). Fluctuations in Amplitude and Fre-
quency Enable Interaural Delays to Foster the Identification of Speech-like Stimuli, Chapter in
Dynamics of Speech Production and Perception, P. Divenyi, Ed., IOS Press.
TRAHIOTIS, C., BERNSTEIN, L. R., STERN, R. M., and BUELL, T. N. (2005). Interaural Corre-
lation as the Basis of a Working Model of Binaural Processing: An Introduction, Chapter in
Springer Handbook of Auditory Research: Sound Source Localization, R. Fay and T. Popper,
Eds., Springer-Verlag.
STERN, R. M. (2004). Signal Separation Motivated by Human Auditory Perception:
Applications to Automatic Speech Recognition, Chapter in Speech Separation by Humans and
Machines, P. Divenyi, Ed., Springer-Verlag.
SINGH, R., STERN, R. M., and RAJ, B. (2002). Signal and Feature Compensation Methods for
Robust Speech Recognition, Chapter in CRC Handbook on Noise Reduction in Speech
Applications, Gillian Davis, Ed., Boca Raton: CRC Press.
SINGH, R., RAJ, B., and STERN, R. M. (2002). Model Compensation and Matched Condition
Methods for Robust Speech Recognition, Chapter in CRC Handbook on Noise Reduction in
Speech Applications, Gillian Davis, Ed., Boca Raton: CRC Press.
STERN, R. M., ACERO, A., LIU, F.-H., and OHSHIMA, Y. (1996). Signal Processing for Robust
Speech Recognition, Invited chapter in Speech Recognition, pp. 351-378, C.-H. Lee and F.
Soong, Eds., Boston: Kluwer Academic Publishers.
STERN, R. M., and TRAHIOTIS, C. (1996). Models of Binaural Perception, Invited chapter in
Binaural and Spatial Hearing in Real and Virtual Environments, pp. 499-531, R. Gilkey and T. R.
Anderson, Eds. New York: Lawrence Erlbaum Associates.
STERN, R. M. (1995). Robust Speech Recognition, Invited chapter in Survey on the State of
the Art in Speech and Natural Language Processing, R. A. Cole et al., Ed.
STERN, R. M., and TRAHIOTIS, C. (1995). Models of Binaural Interaction, Invited chapter in
Handbook of Perception and Cognition, Volume 6: Hearing, pp. 347-386, B. C. J. Moore, Ed.
New York: Academic Press.
STERN, R. M., Jr. (1976b). Lateralization, Discrimination, and Detection of Binaural Pure
Tones, Ph.D. Thesis, Electrical Engineering Department, MIT, December, 1976.
Invited Conference Presentations
STERN, R. M. (2011). Applying Physiologically-Motivated Models of Auditory Processing to
Automatic Speech Recognition, invited talk at the Third International Symposium on Auditory
and Audiological Research, August, 2011.
STERN, R. M. (2010). The impact of the distribution of internal delays in binaural models on
predictions for psychoacoustical data, invited talk at the 160th Meeting of the Acoustical
Society of America, Cancun, Mexico, November, 2010.
STERN, R. M. (2009). New Directions in Robust Speech Recognition: What We Can Learn from
Auditory Models, invited keynote address at the Symposium on Frontiers of Research in
Speech and Music, Gwalior, India, December, 2009.
STERN, R. M. (2009). New Directions in Robust Automatic Speech Recognition, invited key-
note address at the Workshop on Image and Speech Processing, Hyderabad, India, December,
2009.
STERN, R. M. (2008). Applying Physiologically-Motivated Models of Auditory Processing to
Automatic Speech Recognition: Promises, Progress, and Problems, Invited keynote address at
the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, Brisbane,
Australia, September, 2008.
STERN, R. M., GOUVEA, E., KIM, C., KUMAR, K., and PARK, H.-M. (2008). Binaural and
Multiple-Microphone Processing for Robust Automatic Speech Recognition, Invited keynote address
at the IEEE Workshop on Hands-free Speech Communication and Microphone Arrays, Trento,
Italy, May, 2008.
STERN, R. M. (2004). Signal Processing for Sound Separation and Robust Representation,
Invited keynote address at AFOSR/NSF Symposium on Speech Separation and Comprehension
in Complex Acoustic Environments, Montreal, Quebec, November 2004.
STERN, R. M. (2003). Signal Separation Motivated by Auditory Processing: Applications to
Speech Recognition, invited review talk at the NSF Symposium on Signal Separation, Mon-
treal, Quebec, November, 2003.
STERN, R. M. (2003). Signal Processing for Robust Recognition, invited talk at the NAIST
International Center of Excellence Symposium, Nara, Japan, March, 2003.
STERN, R. M. (2002). Using Computational Models of Binaural Hearing to Improve Automatic
Speech Recognition Accuracy: Promise, Progress, and Problems, AFOSR Workshop on Com-
putational Audition, Columbus, Ohio, August, 2002.
STERN, R. M. (2000). Robust Signal Representations for Automatic Speech Recognition,
Institute for Mathematics and Its Applications Workshop on the Mathematical Foundations of
Speech Processing and Recognition, Minneapolis, Minnesota, September, 2000.
STERN, R. M. (2000). The Language of Music, invited keynote talk presented at the Third
International Symposium on Text, Speech, and Dialog, Brno, Czech Republic, September, 2000.
STERN, R. M. (2000). Tendencias Actuales en el Procesamiento del Lenguaje Hablado y
Sistemas Conversacionales (Current Trends in Spoken Language Processing and Conversational
Systems), invited keynote talk at the XV Simposium Internacional de Electrónica y
Comunicación, Instituto Tecnológico de Estudios Superiores de Monterrey, Mexico, February, 2000.
STERN, R. M. (1999). Tendencias Actuales en el Procesamiento del Lenguaje Hablado y Siste-
mas Conversacionales (Current Trends in Spoken Language Processing and Conversational
Systems), invited keynote talk at the XXIV Simposium Internacional de Sistemas
Computacionales, Instituto Tecnológico de Estudios Superiores de Monterrey, Monterrey, Mexico, March,
1999.
STERN, R. M., and TRAHIOTIS, C. (1997). Binaural Mechanisms that Emphasize Consistent
Interaural Timing Information over Frequency, invited keynote talk in Psychophysical and
Physiological Advances in Hearing, Proceedings of the XI International Symposium on Hearing,
August, 1997, Grantham, United Kingdom. A. R. Palmer, A. Rees, A. Q. Summerfield, and R.
Meddis, Eds., Whurr Publishers, London, 1998.
STERN, R. M., RAJ, B., and MORENO, P. J. (1997). Compensation for Environmental
Degradation in Automatic Speech Recognition, invited keynote talk presented at the Proc. of the
ESCA Tutorial and Research Workshop on Robust Speech Recognition for Unknown
Communication Channels, April, 1997, Pont-à-Mousson, France, pp. 33-42.
STERN, R. M. (1996). The Current State of the Art in Speech Recognition (Estado-da-Arte
em Reconhecimento de Voz), invited keynote talk presented at VOICETECH 96, the First
Brazilian Workshop in Automatic Speech Recognition, Campinas, São Paulo, Brazil, September,
1996.
STERN, R. M. (1996). New Directions in Spoken Language Processing, invited talk at the Sec-
ond Joint NSF/CONACyT Workshop on Bilateral Collaboration, Jalapa, Mexico, March, 1996.
STERN, R. M. (1996). Tendencias Actuales en el Procesamiento del Lenguaje Hablado (Cur-
rent Trends in Spoken Language Processing), invited talk at the Universidad Veracruzana,
Jalapa, Mexico, March, 1996.
STERN, R. M., and SULLIVAN, T. M. (1996). Robust Speech Recognition Using Signal Pro-
cessing Based On Binaural Perception, invited talk presented at the First Forum Acusticum,
Antwerp, Belgium, April, 1996.
STERN, R. M., MORENO, P. J., and RAJ, B. (1996). Compensation for Speech Recognition in
Degraded Acoustical Environments, invited talk at the 132nd meeting of the Acoustical Society
of America, Honolulu, Hawaii, December, 1996.
STERN, R. M. (1995). Nuevos Enfoques en Procesamiento de Lenguaje Hablado (New Direc-
tions in Spoken Language Processing), invited talk at the Universitat Politecnica de Catalunya,
Barcelona, Spain, September, 1995.
STERN, R. M. (1995). New Directions in Spoken Language Processing, invited talk presented
at the Telefónica Investigación y Desarrollo Laboratory Symposium on Spoken Language
Processing, Madrid, Spain, September 1995.
STERN, R. M. (1995). Automatic Speech Recognition using Signal Processing based on
Auditory Physiology and Perception, invited paper presented at the 129th meeting of the Acoustical
Society of America, Washington, D.C., June, 1995.
MORENO, P. J., RAJ, B., and STERN, R. M. (1995). Approaches to Environment
Compensation in Automatic Speech Recognition, invited paper presented at the 15th International
Conference on Acoustics, Trondheim, Norway, Vol. III, pp. 109-112, June, 1995.
STERN, R. M., and SULLIVAN, T. M. (1994). Robust Speech Recognition Based on Human
Binaural Perception, invited paper presented at the ATR workshop on A Biological Framework
for Speech Perception and Production, Kansai Science City, September, 1994. Reprinted in
ATR technical report TR-H-121: Proceedings of the ATR workshop on A Biological Framework
for Speech Perception and Production, 122 pages, (1995).
STERN, R. M., LIU, F.-H., SULLIVAN, T. M., MORENO, P. J., and ACERO, A. (1994). Multiple
Approaches to Robust Speech Recognition, invited keynote paper at the Fifth Western Pacific
Regional Acoustical Conference, Seoul, Korea, August, 1994.
STERN, R. M. (1993). Models of Binaural Interaction, invited keynote paper at the AFOSR
Conference on Binaural and Spatial Hearing, Wright-Patterson Air Force Base, September,
1993.
STERN, R. M. (1993). Psychoacoustical Basis of Machine Speech Recognition, invited talk at
the Annual Meeting of the American Association for the Advancement of Science, February,
1993.
STERN, R. M. (1989). Recent Progress in Spoken-Language Systems, invited lecture at the
Second International Symposium on Artificial Intelligence, Monterrey, Mexico, October, 1989.
STERN, R. M. (1988). Overview of Models of Binaural Perception, invited review paper at the
1988 National Research Council CHABA Symposium, Washington, D.C., October, 1988.
STERN, R. M. (1988). Estado Actual de la Tecnología de Entradas/Salidas de Canales de Voz
(Overview of Current Voice Input/Output Technologies), invited keynote lecture at the XIII
Simposium Internacional de Sistemas Computacionales, Monterrey, Mexico, March, 1988.
COLE, R. A., STERN, R. M., and LASRY, M. J. (1986). Performing Fine Phonetic Distinctions:
Templates vs. Features, invited talk, reprinted in Invariance and Variability of Features in
Spoken English Letters, J. Perkell et al., eds., Lawrence Erlbaum, New York.
Critically-Reviewed Conference Presentations
HARVILLA, M., and STERN, R. M. (2012). Histogram-based subband power warping and
spectral averaging for robust speech recognition under matched and multistyle training, IEEE
International Conference on Acoustics, Speech, and Signal Processing, March 2012, Kyoto, Japan.
KIM, C. and STERN, R. M. (2012). Power-normalized cepstral coefficients (PNCC) for robust
speech recognition, IEEE International Conference on Acoustics, Speech, and Signal
Processing, March 2012, Kyoto, Japan.
KIM, C., KHAWAND, C., and STERN, R. M. (2012). Two-microphone source separation
algorithm based on statistical modeling of angle distributions, IEEE International Conference on
Acoustics, Speech, and Signal Processing, March 2012, Kyoto, Japan.
KIM, C., KUMAR, K., and STERN, R. M. (2011). Binaural sound source separation motivated
by auditory processing, IEEE International Conference on Acoustics, Speech, and Signal
Processing, May 2011, Prague, Czech Republic.
KUMAR, K., KIM, C., and STERN, R. M. (2011). Delta-spectral cepstral coefficients for robust
speech recognition, IEEE International Conference on Acoustics, Speech, and Signal Process-
ing, May 2011, Prague, Czech Republic.
KUMAR, K., RAJ, B., SINGH, R., and STERN, R. M. (2011). An iterative least-squares technique
for dereverberation, IEEE International Conference on Acoustics, Speech, and Signal
Processing, May 2011, Prague, Czech Republic.
KUMAR, K., SINGH, R., RAJ, B., and STERN, R. M. (2011). Gammatone sub-band
magnitude-domain dereverberation, IEEE International Conference on Acoustics, Speech, and Signal
Processing, May 2011, Prague, Czech Republic.
KIM, C., STERN, R. M., EOM, K., and LEE, J. (2010). Automatic selection of thresholds for signal
separation algorithms based on interaural delay, Interspeech 2010, September 2010, Makuhari,
Japan.
KIM, C., and STERN, R. M. (2010). Nonlinear enhancement of onset for robust speech
recognition, Interspeech 2010, September 2010, Makuhari, Japan.
AL BAWAB, Z., RAJ, B., and STERN, R. M. (2010). A hybrid physical and statistical dynamic
articulatory framework incorporating analysis-by-synthesis for improved phone classification,
IEEE International Conference on Acoustics, Speech, and Signal Processing, March 2010,
Dallas, Texas.
CHIU, Y.-H., RAJ, B., and STERN, R. M. (2010). Learning Based Auditory Encoding For
Robust Speech Recognition, IEEE International Conference on Acoustics, Speech, and Signal
Processing, March 2010, Dallas, Texas.
KIM, C., and STERN, R. M. (2010). Feature Extraction For Robust Speech Recognition Based
On Maximizing The Sharpness Of The Power Distribution And On Power Flooring, IEEE
International Conference on Acoustics, Speech, and Signal Processing, March 2010, Dallas, Texas.
KUMAR, K., and STERN, R. M. (2010). Maximum-Likelihood-Based Cepstral Inverse Filtering
For Blind Speech Dereverberation, IEEE International Conference on Acoustics, Speech, and
Signal Processing, March 2010, Dallas, Texas.
CHIU, Y.-H. B., and STERN, R. M. (2009). Minimum variance modulation filters for robust
speech recognition, IEEE International Conference on Acoustics, Speech, and Signal
Processing, April 2009, Taipei, Taiwan.
AL BAWAB, Z., TURICCHIA, L., STERN, R. M., and RAJ, B. (2009). Deriving vocal tract
shapes from electromagnetic articulograph data via geometric adaptation and matching,
Interspeech 2009, September 2009, Brighton, United Kingdom.
BUERA, L., MIGUEL, A., ORTEGA, E., LLEIDA, E., and STERN, R. (2009). Unsupervised
training scheme with non-stereo data for empirical feature vector compensation, Interspeech 2009,
September 2009, Brighton, United Kingdom.
CHIU, Y.-H. B., RAJ, B., and STERN, R. M. (2009). Learning-based auditory encoding,
Interspeech 2009, September 2009, Brighton, United Kingdom.
GU, L., and STERN, R. M. (2009). Speaker segmentation and clustering for
simultaneously-presented speech, Interspeech 2009, September 2009, Brighton, United Kingdom.
KIM, C., KUMAR, K., RAJ, B., and STERN, R. M. (2009). Signal separation for robust speech
recognition based on phase difference information obtained in the frequency domain,
Interspeech 2009, September 2009, Brighton, United Kingdom.
KIM, C., and STERN, R. M. (2009). Feature extraction for robust speech recognition using a
power-law nonlinearity and power-bias subtraction, Interspeech 2009, September 2009,
Brighton, United Kingdom.
KIM, C., and STERN, R. M. (2009). Power Function-Based Power Distribution Normalization
Algorithm for Robust Speech Recognition, IEEE Automatic Speech Recognition and
Understanding Workshop, December 2009, Merano, Italy.
KIM, C., and STERN, R. M. (2009). Robust Speech Recognition using a Small Power Boosting
Algorithm, IEEE Automatic Speech Recognition and Understanding Workshop, December
2009, Merano, Italy.
CHIU, Y.-H., and STERN, R. M. (2008). Analysis of Physiologically-Motivated Signal
Processing for Robust Speech Recognition, Interspeech 2008, September 2008, Brisbane, Australia.
KIM, C., and STERN, R. M. (2008). Robust Signal-to-Noise Ratio Estimation Based on
Waveform Amplitude Distribution Analysis, Interspeech 2008, September 2008, Brisbane, Australia.
AL BAWAB, Z., RAJ, B., and STERN, R. M. (2008). Analysis-by-synthesis features for speech
recognition, IEEE International Conference on Acoustics, Speech, and Signal Processing, April
2008, Las Vegas, Nevada.
GU, L., and STERN, R. M. (2008). Single-channel speech separation based on modulation frequency,
IEEE International Conference on Acoustics, Speech, and Signal Processing, April 2008, Las
Vegas, Nevada.
KUMAR, K., and STERN, R. M. (2008). Environment-invariant compensation for reverberation
using linear post-filtering for minimum distortion, IEEE International Conference on Acoustics,
Speech, and Signal Processing, April 2008, Las Vegas, Nevada.
STERN, R. M., GOUVEA, E., and THATTAI, G. (2007). Polyaural array processing for auto-
matic speech recognition in degraded environments, Interspeech 2007, August 2007, Antwerp,
Belgium.
PARK, H.-M., and STERN, R. M. (2007). Missing-feature speech recognition using
dereverberation and echo suppression in reverberant environments, IEEE International Conference on
Acoustics, Speech, and Signal Processing, April 2007, Honolulu, Hawaii.
KUMAR, K., CHEN, T., and STERN, R. M. (2007). Profile view lip reading, IEEE International
Conference on Acoustics, Speech, and Signal Processing, April 2007, Honolulu, Hawaii.
KIM, C., CHIU, Y.-H., and STERN, R. M. (2006). Physiologically-motivated synchrony-based
processing for robust automatic speech recognition, Interspeech 2006, September 2006, Pitts-
burgh, Pennsylvania.
NARAYANASWAMY, B., GANGAGHARIAH, R., and STERN, R. M. (2006). Voting for two
speaker segmentation, Interspeech 2006, September 2006, Pittsburgh, Pennsylvania.
PARK, H.-M., and STERN, R. M. (2006). Spatial separation of speech signals using
continuously-variable masks estimated from comparisons of zero crossings, IEEE International
Conference on Acoustics, Speech, and Signal Processing, May 2006, Toulouse, France.
KIM, W., and STERN, R. M. (2006). Band-independent mask estimation for missing-feature
reconstruction, IEEE International Conference on Acoustics, Speech, and Signal Processing,
May 2006, Toulouse, France.
KIM, W., STERN, R. M., and KO, H. (2005). Environment-Independent Mask Estimation for
Missing Feature Reconstruction, Proc. Eurospeech-2005, September, 2005, Lisbon, Portugal.
LI, X., and STERN, R. M. (2004). Parallel Feature Generation Based on Maximum Normalized
Acoustic Likelihood for Improved Combination Performance, Proc. of the International
Conference of Spoken Language Processing, October, 2004, Jeju Island, Korea.
LI, X., and STERN, R. M. (2004). Feature Generation Based on Maximum Normalized Acoustic
Likelihood for Improved Speech Recognition, IEEE International Conference on Acoustics,
Speech, and Signal Processing, May 2004, Montreal.
RAJ, B., SINGH, R., and STERN, R. M. (2004). On Tracking Noise with Linear Dynamical
System Models, IEEE International Conference on Acoustics, Speech, and Signal Processing, May
2004, Montreal.
SELTZER, M. L., and STERN, R. M. (2003). Parameter Sharing in Subband
Likelihood-Maximizing Beamforming for Speech Recognition using Microphone Arrays, IEEE International
Conference on Acoustics, Speech, and Signal Processing, May 2004, Montreal.
LI, X., and STERN, R. M. (2003). Feature Generation Based on Maximum Classification
Probability for Improved Speech Recognition, Proc. Eurospeech-2003, September, 2003, Geneva,
Switzerland.
NEDEL, J. P., and STERN, R. M. (2003). Duration Normalization and Hypothesis Combination
for Improved Spontaneous Speech Recognition, Proc. Eurospeech-2003, September, 2003,
Geneva, Switzerland.
OBUCHI, Y., and STERN, R. M. (2003). Normalization of Time-Derivative Parameters using
Histogram Equalization, Proc. Eurospeech-2003, September, 2003, Geneva, Switzerland.
LI, X., and STERN, R. M. (2003). Training of Stream Weights for the Decoding of Speech using
Parallel Feature Streams, IEEE International Conference on Acoustics, Speech, and Signal
Processing, April 2003, Hong Kong.
SELTZER, M. L., and STERN, R. M. (2003). Subband Parameter Optimization of Microphone
Arrays for Speech Recognition in Reverberant Environments, IEEE International Conference
on Acoustics, Speech, and Signal Processing, April 2003, Hong Kong.
LI, X., SINGH, R., and STERN, R. M. (2002). Lattice Combination for Improved Speech
Recognition, Proc. of the International Conference of Spoken Language Processing, September,
2002, Denver, Colorado.
SELTZER, M. L., RAJ, B., and STERN, R. M. (2002). Speech Recognizer-Based Microphone
Array Processing for Robust Hands-Free Speech Recognition, Proc. IEEE Conf. on Acoustics,
Speech, and Sig. Proc., May, 2002, Orlando, Florida.
RAJ, B., SELTZER, M. L., and STERN, R. M. (2001). Robust Speech Recognition: The Case
for Restoring Missing Features, Proc. of the Workshop on Consistent and Reliable Auditory
Cues, September, 2001, Aalborg, Denmark.
SINGH, R., SELTZER, M. L., RAJ, B., and STERN, R. M. (2001). Speech in Noisy Environ-
ments: Robust Automatic Segmentation, Feature Extraction, and Hypothesis Combination,
Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., May, 2001, Salt Lake City, Utah.
NEDEL, J. P., and STERN, R. M. (2001). Duration Normalization for Improved Recognition of
Spontaneous and Read Speech via Missing Feature Methods, Proc. IEEE Conf. on Acoustics,
Speech, and Sig. Proc., May, 2001, Salt Lake City, Utah.
DOH, S.-J., and STERN, R. M. (2000). Using Class Weighting in Inter-Class MLLR, Proc. of
the International Conference of Spoken Language Processing, October, 2000, Beijing, China.
HUERTA, J. M., and STERN, R. M. (2000). Instantaneous Distortion-Based Weighted Acoustic
Modeling for Robust Recognition of Coded Speech, Proc. of the International Conference of
Spoken Language Processing, October, 2000, Beijing, China.
NEDEL, J. P., SINGH, R., and STERN, R. M. (2000a). Automatic Subword Unit Refinement for
Spontaneous Speech Recognition via Phoneword Splitting, Proc. of the International
Conference of Spoken Language Processing, October, 2000, Beijing, China.
NEDEL, J. P., SINGH, R., and STERN, R. M. (2000b). Phone Transition Acoustic Modeling:
Application to Speaker Independent and Spontaneous Speech Systems, Proc. of the International
Conference of Spoken Language Processing, October, 2000, Beijing, China.
RAJ, B., SELTZER, M. L., and STERN, R. M. (2000). Reconstruction of Damaged Spectrographic
Features for Robust Speech Recognition, Proc. of the International Conference of Spoken
Language Processing, October, 2000, Beijing, China.
SELTZER, M. L., RAJ, B., and STERN, R. M. (2000). Classifier-Based Mask Estimation for
Missing Feature Methods of Robust Speech Recognition, Proc. of the International Conference
of Spoken Language Processing, October, 2000, Beijing, China.
SINGH, R., RAJ, B., and STERN, R. M. (2000). Structured Redefinition of Sound Units by
Merging and Splitting for Improved Speech Recognition, Proc. of the International Conference
of Spoken Language Processing, October, 2000, Beijing, China.
DOH, S.-J., and STERN, R. M. (2000). Inter-Class MLLR for Speaker Adaptation, Proc. IEEE
Conf. on Acoustics, Speech, and Sig. Proc., June, 2000, Istanbul, Turkey.
SINGH, R., RAJ, B., and STERN, R. M. (2000). Automatic Generation of Phone Sets and Lexical
Transcriptions, Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., June, 2000,
Istanbul, Turkey.
DOH, S.-J., and STERN, R. M. (1999). Weighted Principal Component MLLR For Speaker
Adaptation, Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding,
December, 1999, Keystone, Colorado.
SINGH, R., RAJ, B., and STERN, R. M. (1999). Domain Adduced State Tying for Cross-domain
Acoustic Modelling, Proc. Eurospeech-99, September, 1999, Budapest, Hungary.
SINGH, R., RAJ, B., and STERN, R. M. (1999). Automatic Clustering And Generation Of
Contextual Questions For Tied States In Hidden Markov Models, Proc. IEEE Conf. on Acoustics,
Speech, and Sig. Proc., March, 1999, Phoenix, Arizona.
HUERTA, J. M., and STERN, R. M. (1999). Distortion-class weighted acoustic modeling for
Robust Speech Recognition under GSM RPE-LTP coding, Proc. of the Workshop on Robust
Methods for Speech Recognition in Adverse Conditions, Tampere, Finland, 1999.
HUERTA, J. M., and STERN, R. M. (1998). Speech Recognition from GSM CODEC Parameters,
Proc. of the International Conference of Spoken Language Processing, Sydney, Australia,
December, 1998.
RAJ, B., SINGH, R., and STERN, R. M. (1998). Inference of Missing Spectrographic Features
for Robust Speech Recognition, Proc. of the International Conference of Spoken Language
Processing, Sydney, Australia, December, 1998.
RAJ, B., GOUVEA, E., and STERN, R. M. (1997). Cepstral Compensation using Statistical
Linearization, Proc. of the ESCA Tutorial and Research Workshop on Robust Speech Recognition
for Unknown Communication Channels, Pont-à-Mousson, France, April, 1997.
GOUVEA, E. B., and STERN, R. M. (1997). Speaker Normalization through Formant-Based
Warping of the Frequency Scale, Proc. Eurospeech-97, September, 1997, Rhodes, Greece.
HUERTA, J. M., and STERN, R. M. (1997). Compensation for Environmental and Speaker
Variability by Normalization of Pole Locations, Proc. Eurospeech-97, September, 1997, Rhodes,
Greece.
RAJ, B., PARIKH, V., and STERN, R. M. (1997). The Effects of Background Music on Speech
Recognition Accuracy, Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., April, 1997,
Munich, Germany.
RAJ, B., GOUVEA, E., MORENO, P. J., and STERN, R. M. (1996). Cepstral Compensation by
Polynomial Approximation for Environment-Independent Speech Recognition, Proc. of the
International Conference on Spoken Language Processing, Philadelphia, Pennsylvania,
October, 1996.
MORENO, P. J., RAJ, B., and STERN, R. M. (1996). A Vector Taylor Series Approach for
Environment-Independent Speech Recognition, Proc. of the IEEE International Conference on
Acoustics, Speech, and Signal Processing, Atlanta, Georgia, 1996.
MORENO, P. J., RAJ, B., and STERN, R. M. (1995). A Unified Approach to Robust Speech
Recognition, Proc. of Eurospeech-95, Madrid, Spain, September, 1995.
MORENO, P. J., RAJ, B., GOUVEA, E., and STERN, R. M. (1995). Multivariate-Gaussian-Based
Cepstral Normalization for Robust Speech Recognition, Proc. of the IEEE International
Conference on Acoustics, Speech, and Signal Processing, Detroit, Michigan, 1995.
SIEGLER, M. A., and STERN, R. M. (1995). On the Effects of Speech Rate in Large Vocabulary
Speech Recognition Systems, Proc. of the IEEE International Conference on Acoustics,
Speech, and Signal Processing, Detroit, Michigan, May, 1995.
STERN, R. M., LIU, F.-H., MORENO, P. J., and ACERO, A. (1994). Signal Processing for
Robust Speech Recognition, Proc. of the International Conference on Spoken Language
Processing, Yokohama, Japan, September, 1994.
HANAI, N., and STERN, R. M. (1994). Robust Speech Recognition in the Automobile, Proc. of
the International Conference on Spoken Language Processing, Yokohama, Japan, September,
1994.
OHSHIMA, Y., and STERN, R. M. (1994). Environmental Robustness in Automatic Speech
Recognition Using Physiologically-Motivated Signal Processing, Proc. of the International
Conference on Spoken Language Processing, Yokohama, Japan, September, 1994.
LIU, F.-H., STERN, R. M., ACERO, A., and MORENO, P. J. (1994). Environment Normalization
for Robust Speech Recognition using Direct Cepstral Comparison, Proc. of the IEEE
International Conference on Acoustics, Speech, and Signal Processing, Adelaide, Australia,
pp. II-61 - II-64.
MORENO, P. J., and STERN, R. M. (1994). Sources of Degradation of Speech Recognition in
the Telephone Network, Proc. of the IEEE International Conference on Acoustics, Speech, and
Signal Processing, Adelaide, Australia, pp. I-109 - I-112.
LIU, F.-H., MORENO, P. J., STERN, R. M., and ACERO, A. (1994). Signal Processing For
Robust Speech Recognition, Proceedings of the Seventh ARPA Workshop on Human Language
Technology, Princeton, New Jersey, Morgan Kaufmann, C. J. Weinstein, Ed.
LIU, F.-H., MORENO, P. J., STERN, R. M., and ACERO, A. (1994). Signal Processing For
Robust Speech Recognition, Proceedings of the ARPA Workshop on Spoken Language
Technology, Princeton, New Jersey, R. M. Stern, Ed.
LIU, F.-H., STERN, R. M., HUANG, X., and ACERO, A. (1993). Efficient Cepstral Normalization
For Robust Speech Recognition, Proceedings of the Sixth ARPA Workshop on Human Language
Technology, Princeton, New Jersey, Morgan Kaufmann, M. Bates, Ed., pp. 69-74.
SULLIVAN, T. M., and STERN, R. M. (1993). Multi-Microphone Correlation-Based Processing
for Robust Speech Recognition, Proc. of the IEEE International Conference on Acoustics,
Speech, and Signal Processing, Minneapolis, Minnesota, 2: 91-94.
STERN, R. M., LIU, F.-H., OHSHIMA, Y., SULLIVAN, T. M., and ACERO, A. (1992a). Multiple
Approaches to Robust Speech Recognition, Proc. of the Fifth DARPA Speech and Natural
Language Workshop, Harriman, New York, February, 1992.
STERN, R. M., LIU, F.-H., OHSHIMA, Y., SULLIVAN, T. M., and ACERO, A. (1992b). Multiple
Approaches to Robust Speech Recognition, Proc. of the Second International Conference on
Spoken Language Processing, Banff, Alberta, Canada, pp. 695-698, October, 1992.
LIU, F.-H., ACERO, A., and STERN, R. M. (1992). Efficient Joint Compensation of Speech for
the Effects of Additive Noise and Linear Filtering, Proc. of the IEEE International Conference
on Acoustics, Speech, and Signal Processing, San Francisco, California, pp. 865-868.
WARD, W., ISSAR, S., HUANG, X., HON, H.-W., HWANG, M.-Y., YOUNG, S., MATESSA, M.,
LIU, F.-H., and STERN, R. (1992). Speech Understanding in Open Tasks, Proc. of the Fifth
DARPA Speech and Natural Language Workshop, Harriman, New York, February, 1992.
STERN, R. M., XU, X., and TAO, S. (1991). A Coincidence-Based Model that Describes
Straightness Weighting in Binaural Perception, Abstracts of the Fourteenth Midwinter
Research Meeting of the Association for Research in Otolaryngology, St. Petersburg Beach,
Florida, p. 33(A).
STERN, R. M., and TRAHIOTIS, C. (1991). The Role of Consistency of Interaural Timing over
Frequency in Binaural Lateralization, Proc. of the Ninth International Symposium on Auditory
Physiology and Perception, Carcans, France.
ACERO, A. and STERN, R. M. (1991). Robust Speech Recognition by Normalization of the
Acoustic Space, Proc. of the IEEE International Conference on Acoustics, Speech, and Signal
Processing, Toronto, Ontario, pp. 893-896.
ROZZI, W. A. and STERN, R. M. (1991). Fast Estimation of Mean Vectors using Adaptive
Filtering, Proc. of the IEEE International Conference on Acoustics, Speech, and Signal
Processing, Toronto, Ontario, pp. 865-868.
ACERO, A. and STERN, R. M. (1990a). Environmental Robustness in Automatic Speech
Recognition, Proc. of the IEEE International Conference on Acoustics, Speech, and Signal
Processing, Albuquerque, New Mexico, pp. 849-852.
ACERO, A. and STERN, R. M. (1990b). Toward Microphone-Independent Spoken Language
Systems, Proceedings of the DARPA Speech and Natural Language Workshop, Hidden Valley,
PA, R. M. Stern, Ed., Morgan Kaufmann Publishers, Inc., San Mateo, CA,
ACERO, A. and STERN, R. M. (1990c). Acoustical Pre-Processing for Robust Spoken Language
Systems, Proc. First International Conference on Spoken Language Processing, pp.
1121-1124, Kobe, Japan, November, 1990.
STERN, R. M., ZEPPENFELD, T., and SHEAR, G. D. (1990). Lateralization of Rectangularly-
Modulated Noise: An Explanation for Illusory Reversals, Abstracts of the Thirteenth Midwinter
Research Meeting of the Association for Research in Otolaryngology, St. Petersburg Beach,
Florida, pp. 163-164(A).
STERN, R. M. and ACERO, A. (1989). Acoustical Pre-Processing for Robust Speech Recogni-
tion, presented at the October, 1989, DARPA Workshop on Speech and Natural Language.
WARD, W. H., HAUPTMANN, A. G., STERN, R. M., and CHANAK, T. (1988). Parsing Spoken
Phrases Despite Missing Words, Proc. of the IEEE International Conference on Acoustics,
Speech, and Signal Processing, pp. 275-278.
COAST, D. A., STERN, R. M., CANO, G. C., and BRILLER, S. A. (1987). Cardiac Arrhythmia
Analysis Using Hidden Markov Models, presented at the IEEE Engineering in Medicine Society
Conference, November, 1987.
STERN, R. M. (1988). Nuevos Enfoques en Reconocimiento Automatico de Habla (New Direc-
tions in Automatic Speech Recognition), invited lecture at the XIII Simposium Internacional de
Sistemas Computacionales, Monterrey, Mexico.
STERN, R. M., WARD, W. H., HAUPTMANN, A. G., and LEON, J. (1987). Sentence Parsing
with Weak Grammatical Constraints, Proc. of the IEEE International Conference on Acoustics,
Speech, and Signal Processing, pp. 380-383.
LASRY, M. J., and STERN, R. M. (1984). Unsupervised Adaptation to New Speakers in
Feature-Based Letter Recognition, Proc. of the IEEE International Conference on Acoustics,
Speech, and Signal Processing, pp. 17.6.1-17.6.4.
STERN, R. M., and BACHORSKI, S. J. (1983). Dynamic Cues in Binaural Perception, in
Hearing-Physiological Bases and Psychophysics, R. Klinke and R. Hartmann, Eds., Springer Verlag
Press, Heidelberg.
STERN, R. M., and LASRY, M. J. (1983). Dynamic Speaker Adaptation for Isolated Letter
Recognition Using MAP Estimation, Proc. of the IEEE International Conference on Acoustics,
Speech, and Signal Processing, pp. 734-737.
COLE, R. A., STERN, R. M., PHILLIPS, M. S., BRILL, S. M., PILANT, A. P., and SPECKER, P.
(1983). Feature-based Speaker-Independent Recognition of Isolated English Letters, Proc. of
the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 731-734.
STERN, R. M., Jr. and RUBINOV, E. M. (1980). Subjective Laterality of Noise-Masked Binaural
Targets, in Psychological, Physiological, and Behavioural Studies of Hearing, G. v.d. Brink and
F. A. Bilsen, Eds., Delft University Press, Delft.
Non-Reviewed Submitted Conference Papers and Other Presentations
STERN, R. M. (2003). Signal Processing for Robust Recognition, Speech Information Technol-
ogy Research Center Seminar Series, Seoul, Korea, May, 2003.
NEDEL, J. P., and STERN, R. M. (2002). Duration Normalization for Improved Automatic
Speech Recognition, J. Acoust. Soc. Amer., 112: 2321 (A).
HUERTA, J. M., CHEN, S. J., and STERN, R. M. (1999). The 1998 Carnegie Mellon University
Sphinx-3 Spanish Broadcast News Transcription System, Proc. of the DARPA Broadcast News
Transcription and Understanding Workshop, March, 1999, Herndon, Virginia.
HUERTA, J. M., THAYER, E., RAVISHANKAR, M., and STERN, R. M. (1998). The Development
of the 1997 CMU Spanish Broadcast News Transcription System, Proc. of the DARPA
Broadcast News Transcription and Understanding Workshop, February, 1998, Lansdowne,
Virginia.
SEYMORE, K., CHEN, S., DOH, S.-J., ESKENAZI, M., GOUVEA, E., RAJ, B., RAVISHANKAR,
M., ROSENFELD, R., SIEGLER, M. A., STERN, R. M., and THAYER, E. (1998). The 1997 CMU
Sphinx-3 English Broadcast News Transcription System, Proc. of the DARPA Broadcast News
Transcription and Understanding Workshop, February, 1998, Lansdowne, Virginia.
SIEGLER, M. A., JAIN, U., RAJ, B., and STERN, R. M. (1997). Automatic Segmentation,
Classification, and Clustering of Broadcast News Audio, Proc. DARPA Speech Recognition
Workshop, February, 1997, Chantilly, Virginia.
PARIKH, V. N., RAJ, B., and STERN, R. M. (1997). Speaker Adaptation and Environmental
Compensation for the 1996 Broadcast News Task, Proc. DARPA Speech Recognition
Workshop, February, 1997, Chantilly, Virginia.
PLACEWAY, P., CHEN, S., ESKENAZI, M., JAIN, U., PARIKH, V., RAJ, B., RAVISHANKAR, M.,
ROSENFELD, R., SEYMORE, K., SIEGLER, M., STERN, R., and THAYER, E. (1997). The
1996 Hub-4 SPHINX-3 System, Proc. DARPA Speech Recognition Workshop, February, 1997,
Chantilly, Virginia.
STERN, R. M. (1997). Specification of the 1996 Hub 4 Broadcast News Evaluation, Proc.
DARPA Speech Recognition Workshop, February, 1997, Chantilly, Virginia.
GOUVEA, E. B., MORENO, P. J., RAJ, B., SULLIVAN, T. M., and STERN, R. M. (1996).
Adaptation and Compensation: Approaches To Microphone And Speaker Independence In Automatic
Speech Recognition, Proceedings of the ARPA Workshop on Speech Recognition Technology,
Harriman, NY, Morgan Kaufmann, D. Pallett, Ed.
JAIN, U., SIEGLER, M. A., DOH, S.-J., GOUVEA, E., MORENO, P. J., RAJ, B. and STERN, R.
M. (1996). Recognition Of Continuous Broadcast News With Multiple Unknown Speakers And
Environments, Proceedings of the ARPA Workshop on Speech Recognition Technology,
Harriman, NY, Morgan Kaufmann, D. Pallett, Ed.
STERN, R. M. (1996). Specification of the 1995 ARPA Hub 3 Evaluation: Unlimited Vocabulary
NAB News Baseline, Proceedings of the ARPA Workshop on Speech Recognition Technology,
Harriman, NY, Morgan Kaufmann, D. Pallett, Ed.
MORENO, P. J., SIEGLER, M. A., JAIN, U., and STERN, R. M. (1995). Continuous Recognition
of Large-Vocabulary Telephone-Quality Speech, Proceedings of the ARPA Workshop on
Spoken Language Technology, Austin, TX, Morgan Kaufmann, J. Cohen, Ed.
MORENO, P. J., JAIN, U., RAJ, B., and STERN, R. M. (1995). Approaches to Microphone
Independence in Automatic Speech Recognition, Proceedings of the ARPA Workshop on Spoken
Language Technology, Austin, TX, Morgan Kaufmann, J. Cohen, Ed.
LEE, W., and STERN, R. M. (1994). Consistency Over Frequency in High-Frequency Binaural
Lateralization, J. Acoust. Soc. Amer., 95: 2896 (A).
STERN, R. M., and SULLIVAN, T. M. (1993). Microphone-Array Algorithms for Robust Speech
Recognition, talk at the Speech Research Symposium-XIII, Baltimore, Maryland, June, 1993.
TAO, S. H., and STERN, R. M. (1992). Additive versus Multiplicative Combination of
Differences of Interaural Time and Intensity, J. Acoust. Soc. Amer., 91: 2414(A).
STERN, R. M., LIU, F.-H., OHSHIMA, Y., SULLIVAN, T. M., and ACERO, A. (1992c). Alterna-
tive Approaches to Acoustical Pre-Processing for Robust Speech Recognition, presented at the
Speech Research Symposium-XII, June, 1992.
ACERO, A., and STERN, R. M. (1992). Cepstral Normalization for Robust Speech Recognition,
Proc. of the ESCA Workshop on Speech Processing in Adverse Conditions, Cannes-Mandelieu,
France, pp. 89-92.
STERN, R. M., and ACERO, A. (1990a). Acoustic Pre-Processing for Robust Spoken Language
Systems, invited presentation at the Speech Research Symposium-X, Baltimore, MD, October
16, 1990.
STERN, R. M., and ACERO, A. (1990b). Signal Processing for Robust Spoken Language Sys-
tems, invited tutorial lecture at the Satellite Workshop on Statistical Approaches to Spoken
Language Processing, Tokyo, Japan, November 17, 1990.
STERN, R. M., and RUDNICKY, A. I. (1990). Spoken-Language Workstations in the Office
Environment, invited lecture at the SpeechTech90 Conference, New York City.
STERN, R. M., SHEAR, G. D., and ZEPPENFELD, T. (1988). Lateralization Predictions for
High-Frequency Binaural Stimuli, J. Acoust. Soc. Amer. 84: S80 (A).
SHEAR, G. D., and STERN, R. M. (1987). Extending the Position-Variable Model: Dependence
of Lateralization on Various Spectral Parameters, J. Acoust. Soc. Amer. 81: S27 (A).
STERN, R. M., and ZEIBERG, A. S. (1986). A Weighted Image Model for Binaural
Lateralization, J. Acoust. Soc. Amer. 80: S107 (A).
BEECHER, L., and STERN, R. M. (1986). Perception of Modulations in Pitch and
Lateralization, J. Acoust. Soc. Amer. 79: S22 (A).
STERN, R. M., ELSNER, A. E., and SCHIANO, J. L. (1984). Interaural Time Discrimination in
Tonal Maskers, J. Acoust. Soc. Amer. 76: S91(A).
LASRY, M. J., and STERN, R. M. (1984). Unsupervised Speaker Adaptation in Feature-Based
Isolated Letter Recognition, J. Acoust. Soc. Amer. 74: S16(A).
BACHORSKI, S. J., and STERN, R. M. (1983). Dynamic Cues in Binaural Perception,
J. Acoust. Soc. Amer. 73: S42(A).
STERN, R. M. and LASRY, M. J. (1982). Tuning to the Speaker: Dynamic Adaptation of
Statistical Parameters in Isolated Letter Recognition, J. Acoust. Soc. Amer. 72 S31(A).
BRILL, S. M., PHILLIPS, M. S., LASRY, M. J., and STERN, R. M. (1982). Decisions about
Features, J. Acoust. Soc. Amer. 72 S32(A).
KAISER, D. L., and STERN, R. M. (1981). Interaural Time Discrimination in Tonal Maskers,
J. Acoust. Soc. Amer. 70, S88 (A).
SLOCUM, J. E., and STERN, R. M., Jr., (1980). Interaural Time and Amplitude Discrimination
in Noise, J. Acoust. Soc. Amer. 68, S60(A).
STERN, R. M., Jr. and WAIBEL, A. H. (1980). Audibility of Phase Changes in Vowel Sounds
and Complex Tones, J. Acoust. Soc. Amer. 68, S50(A).
DUMOND, G. J. and STERN, R. M., Jr. (1979). A Forced-Choice Paradigm for Pulsation-
Threshold Measurements, J. Acoust. Soc. Amer. 65, S58(A).
RUBINOV, E. M. and STERN, R. M., Jr. (1979). Effects of Binaural Maskers on the Subjective
Laterality of Diotic Targets, J. Acoust. Soc. Amer., 65, S121(A).
STERN, R. M., Jr. (1979). On the Use of Multiple Perceptual Images in Binaural Discrimination
Experiments, J. Acoust. Soc. Amer. 65, S122(A).
RUOTOLO, B. R., STERN, R. M., Jr., and COLBURN, H. S. (1977). Discrimination of
Symmetric, Time-Intensity Traded Stimuli, J. Acoust. Soc. Amer., 61, S60(A).
STERN, R. M., Jr. (1977). Lateralization of the MLD: Detection-Threshold Performance of an
Auditory-Nerve-Based Model for Lateral Position, J. Acoust. Soc. Amer., 61, S60(A).
COLBURN, H. S., DOMNITZ, R. H., STERN, R. M., Jr., and DURLACH, N. I. (1976). Current
Problems in Binaural Hearing Research, J. Acoust. Soc. Amer., 59, S16(A).
STERN, R. M., Jr. (1976a). Lateral Position, Interaural Discrimination, and Binaural Detection:
Model Based on Auditory-Nerve Activity, J. Acoust. Soc. Amer., 59, S23(A).
STERN, R. M., Jr. (1972). Perception of Simultaneously-Presented Musical Timbres, Quart.
Prog. Rep. No. 106, Res. Lab. of Electronics, MIT, Cambridge, MA.
Ph.D. THESES SUPERVISED
D. A. Coast, Cardiac Arrhythmia Analysis Using Hidden Markov Models, September, 1988.
A. Acero, Acoustical and Environmental Robustness for Automatic Speech Recognition,
September, 1990.
W. A. Rozzi, Speaker Adaptation in Automatic Speech Recognition via Estimation of Correlated
Mean Vectors, May, 1991.
Y. Ohshima, Environmental Robustness in Speech Recognition using Physiologically-Motivated
Signal Processing, December, 1993.
F.-H. Liu, Environmental Adaptation for Robust Speech Recognition, June, 1994.
P. J. Moreno, Speech Recognition in Noisy Environments, May, 1996.
T. M. Sullivan, Multi-Microphone Correlation-Based Processing for Robust Automatic Speech
Recognition, June, 1996.
E. Gouvea, Acoustic-Feature-Based Frequency Warping for Speaker Normalization, February,
1999.
M. Siegler, Integration of Continuous Speech Recognition and Information Retrieval for Mutually
Optimal Performance, December, 1999.
B. Raj, Reconstruction of Incomplete Spectrograms for Robust Speech Recognition, April, 2000.
J. Huerta, Robust Speech Recognition in GSM Codec Environments, April, 2000.
S.-J. Doh, Enhancements to Transformation-Based Speaker Adaptation: Principal Component
and Inter-Class Maximum Likelihood Linear Regression, July, 2000.
M. Seltzer, Microphone Arrays for Robust Speech Recognition, July, 2003.
J. Nedel, Duration Normalization for Robust Recognition of Spontaneous Speech via Missing
Feature Methods, April, 2004.
X. Li, Combination and Generation of Parallel Feature Streams for Improved Speech
Recognition, February, 2005.
Z. Al Bawab, An Analysis-by-Synthesis Approach to Vocal Tract Modeling for Robust Speech
Recognition, September, 2009.
Y.-H. Chiu, Learning-Based Auditory Encoding for Robust Speech Recognition, April, 2010.
L. Gu, Single-Channel Speech Separation Based on Instantaneous Frequency, May, 2010.
C. Kim, Signal Processing for Robust Speech Recognition Motivated by Auditory Processing,
September, 2010.
K. Kumar, A Spectro-Temporal Framework for Compensation of Reverberation for Speech
Recognition, January, 2011.
CURRENT Ph.D. STUDENTS
A. Moghimi, joined group in January 2009.
G. Romigh, entered August 2009.
M. Harvilla, entered August 2010.
A. Menon, entered August 2011.
M.S. THESES and PROJECTS SUPERVISED
K. G. Murti, Sonometric Evaluation of Eustachian-Tube Function, May, 1979.
G. J. DuMond, A Forced-Choice Paradigm for Pulsation-Threshold Measurement of Monaural
and Binaural Phenomena, August, 1979.
E. M. Rubinov, Auditory Lateralization in Noise, September, 1979.
J. E. Slocum, Discrimination of Interaural Time and Intensity in No