Hua Yu
Language Technologies Institute 412-***-**** (o)
School of Computer Science 412-***-**** (h)
Carnegie Mellon University abqikw@r.postjobfree.com
Pittsburgh, PA 15213 http://www.cs.cmu.edu/~hyu
Objective
A challenging position in speech recognition, statistical machine
learning, natural language processing and related fields.
Research Interests
* Large vocabulary conversational speech recognition, handwriting
recognition, or sequence modeling in general
* Statistical machine learning, pattern recognition
Education
Doctor of Philosophy in Language Technologies August 2004
Thesis: Recognizing Sloppy Speech
Carnegie Mellon University, Pittsburgh, PA
Recipient of Graduate Research Fellowship, Language Technologies
Institute, 1996-2004
Master of Science in Language Technologies May 1998
Carnegie Mellon University, Pittsburgh, PA
Master of Science and Engineering in Computer Science June 1996
Tsinghua University, Beijing, China
Recipient of Motorola Scholarship, 1994
Bachelor of Science and Engineering in Computer Science June 1994
Tsinghua University, Beijing, China
Recipient of first class prize for excellent students, 1989-1994
Research Experience
* Research Assistant Sept. 1996 - present
School of Computer Science, Carnegie Mellon University Pittsburgh, PA
I have extensive experience in building LVCSR systems, which
involves developing and maintaining a large software system of
~200K lines of C code. I have led the development of:
+ the ISL Switchboard system, which achieves a 23.4% word error
rate on the RT-03 spring evaluation;
+ the Broadcast News transcription system, which is also used
for a live dictation demo;
+ an automatic meeting transcription system.
My thesis focuses on improving the recognition of sloppy speech.
To this end, I have explored several novel approaches,
including single tree clustering, Gaussian transition modeling
and thumbnail features.
I have also been involved in the following research projects:
+ face recognition: We developed a new, direct LDA algorithm
for classification of high dimensional data, such as face
images.
+ automatic segmentation and clustering of broadcast
news/meeting data.
+ grapheme-to-phoneme mapping: We developed a program that
consults DECTalk, MITalk, as well as static dictionaries to
answer pronunciation queries from the network.
+ voice-driven web browser: Sphinx-II is used to recognize
hyperlinks as well as a number of navigation commands.
+ automatic clustering of text documents.
* Research Assistant 1994-1996
Speech Lab, Tsinghua University Beijing, China
I worked on automatic language identification and
speaker-independent Chinese syllable/phrase recognition. I also
volunteered as a system/network administrator in the lab.
* Various Side Projects
+ Designing an intelligent controller for a brushless DC motor
with a single-chip controller;
+ Tracking down a new virus and developing an anti-virus
program;
+ Developing a postal service window system;
+ Many others.
Other Experience
* Taught two lectures on HMMs for CMU CS11-751: Speech Recognition
and Understanding
* Reviewer for Pattern Recognition and ICMI'2003
* Consultant for the Spoken Language Technology group, Sony Inc., May
2000. My job was to assist them in developing an LVCSR system.
* Teaching Assistant for CMU CS15-229: Multimedia Signal Processing,
by Prof. R. Thibadeau, Prof. R. Dannenberg and Prof. R. Reddy,
1999
* LTI admission committee member, 1998
Programming Skills
Proficient in C, Perl, Tcl/Tk, Linux/Unix System Administration, LaTeX, TCP/IP
Experienced in C++, Matlab, Visual Basic, 80x86 Assembly, Pascal, Lisp, etc.
Publications (Speech Related)
1. H. Yu. Phase Space Representation of Speech -- Revisiting the
delta and double-delta features. ICSLP, Jeju Island, Korea, 2004
2. H. Soltau, H. Yu, F. Metze, C. Fuegen, Q. Jin and S. Jou. The ISL
Transcription System for Conversational Telephony Speech. ICASSP,
Montreal, 2004
3. H. Yu and A. Waibel. Integrating Thumbnail Features for Speech
Recognition Using Conditional Exponential Models. ICASSP,
Montreal, 2004
4. H. Yu and T. Schultz. Enhanced Tree Clustering with Single
Pronunciation Dictionary for Conversational Speech Recognition.
Eurospeech, Geneva, 2003
5. H. Yu and T. Schultz. Implicit Trajectory Modeling through
Gaussian Transition Models for Speech Recognition. HLT-NAACL,
Edmonton, 2003
6. H. Yu and A. Waibel. Flexible Parameter Tying for Conversational
Speech Recognition. ISCA & IEEE Workshop on Spontaneous Speech
Processing and Recognition, Tokyo, 2003
7. H. Soltau, H. Yu, F. Metze, C. Fuegen, Q. Jin and S. Jou. The ISL
RT-03 Conversational Telephone Speech Recognition System. Rich
Transcription Workshop, Boston, MA, 2003
8. S. Burger, V. MacLaren and H. Yu. The ISL Meeting Corpus: The
Impact of Meeting Type on Speech Style. ICSLP, Denver, 2002
9. H. Soltau, H. Yu, F. Metze, C. Fuegen, Y. Pan and S. Jou. ISL
Meeting Recognition. Rich Transcription Workshop, Vienna, VA, 2002
10. C. Hori, S. Furui, R. Malkin, H. Yu and A. Waibel. Automatic
Speech Summarization Applied to English Broadcast News Speech.
ICASSP, Orlando, 2002
11. A. Waibel, M. Bett, F. Metze, K. Ries, T. Schaaf, T. Schultz, H.
Soltau, H. Yu and K. Zechner. Advances in Automatic Meeting Record
Creation and Access. ICASSP, Salt Lake City, 2001
12. A. Waibel, H. Yu, M. Westphal, H. Soltau, T. Schultz, T. Schaaf,
Y. Pan, F. Metze and M. Bett. Advances in Meeting Recognition.
HLT, San Diego, 2001
13. H. Yu and A. Waibel. Streamlining the Front-End of a Speech
Recognizer. ICSLP, Beijing, 2000
14. H. Yu, T. Tomokiyo, Z. Wang and A. Waibel. New Developments in
Automatic Meeting Transcription. ICSLP, Beijing, 2000
15. R. Gross, M. Bett, H. Yu, X. Zhu, Y. Pan, J. Yang and A. Waibel.
Towards a Multimodal Meeting Record. ICME, New York, 2000
16. H. Yu, M. Finke and A. Waibel. Progress in Automatic Meeting
Transcription. Eurospeech, 1999
17. H. Yu, C. Clark, R. Malkin and A. Waibel. Experiments in Automatic
Meeting Transcription using JRTk. ICASSP, Seattle, USA, May 1998
18. D. Fang, H. Yu and S. Li. Speech Recognition based on Normal
Distribution Hypothesis. Intl. Conf. on Chinese Computing,
Singapore, 1994
19. H. Yu, S. Li, S. Qing and D. Fang. Speaker-independent Isolated
Word/Phrase Recognition -- a Statistical Approach. National Conf.
on Human-Machine Communication, Chongqing, Oct. 1994
Publications (Non-Speech Areas)
1. H. Yu and J. Yang. A Direct LDA Algorithm for High-Dimensional
Data -- with Application to Face Recognition. Pattern Recognition
34(10), 2001, pp. 2067-2070
2. J. Yang, H. Yu and W. Kunz. An Efficient LDA Algorithm for Face
Recognition. ICARCV, Singapore, 2000
3. H. Yu. Automatically Determining Number of Clusters. Information
Retrieval course (CMU CS11-741) final report, 1998
4. H. Yu and Z. Wang. A Survey on Anonymous Digital Cash Systems.
Security and Cryptography course (CMU CS15-827) final report, 1997
Selected Presentations
* Phase Space Representation of Speech -- Revisiting the delta and
double-delta features. Sphinx Speech Group, Carnegie Mellon
University, Pittsburgh, PA, July, 2003
* Implicit Pronunciation Modeling for Conversational Speech
Recognition. Joint Speech Seminar, Carnegie Mellon University,
Pittsburgh, PA, May 2, 2003
* Development of the Broadcast News System. Interactive Systems
Labs, Carnegie Mellon University, Pittsburgh, PA, May, 1999
* Experiments in Automatic Meeting Transcription using JRTk. ICASSP,
Seattle, USA, May, 1998
Personal
Native speaker of Chinese, fluent in English. Citizenship: China. US
permanent resident.
References
Available upon request