THE DISTANCE MEASURE FOR LINE SPECTRUM PAIRS
APPLIED TO SPEECH RECOGNITION
Fang Zheng, Zhanjiang Song, Ling Li, Wenjian Yu, Fengzhou Zheng, and Wenhu Wu
Speech Laboratory, Department of Computer Science and Technology,
Tsinghua University, Beijing, 100084, P. R. China
******@**.**.********.***.**
ABSTRACT

The Line Spectrum Pair (LSP), which is based on the principle of
linear predictive coding (LPC), plays a very important role in speech
synthesis and has many interesting properties. Several well-known
speech compression / decompression algorithms, including code excited
linear predictive coding (CELP), are based on LSP analysis, where the
information loss or prediction error is often very small owing to the
characteristics of the LSP. Unfortunately, no satisfactory distance
measure has so far been available for LSP, so this kind of feature
could not be used for speech recognition applications. In this paper,
the principle of LSP analysis is studied first, and then several
distance measures for LSP are proposed that describe well the
difference between two sets of LSP parameters. Experimental results
are also given to show the effectiveness of the proposed distance
measures.

1. INTRODUCTION

The Line Spectrum Pair (LSP) was first introduced by Itakura [4][8] as
an alternative LPC spectral representation. It was found that this
representation has the following interesting properties: (1) all zeros
of the LSP polynomials are on the unit circle, (2) the corresponding
zeros of the symmetric and anti-symmetric LSP polynomials are
interlaced, and (3) the reconstructed LPC all-pole filter preserves its
minimum phase property if (1) and (2) are kept intact through a
quantization procedure. Soong proved all these properties via a phase
function [7].

Since their introduction, LSP parameters, together with the vector
quantization (VQ) technique, have played a very important role in
speech coding/decoding and speech synthesis [7]. The well-known code
excited linear predictive (CELP) coding [1][2] is a good example of
using the principles of linear predictive coding (LPC) and LSP to
produce high quality speech at a very low bit rate.

Until now, however, LSP parameters have seldom been applied to speech
recognition based on statistical modeling and the VQ technique.
Although a certain kind of distance measure is adopted in the VQ
technique for speech coding, that measure is far from suitable for
speech recognition.

In speech recognition based on statistical modeling (such as hidden
Markov models (HMM)) and the VQ technique, the distance measure for
features is of critical importance. The most commonly used features
nowadays are the cepstra. The Itakura distance measure has been proved
suitable for LPC coefficients both theoretically and practically [3],
and therefore the Euclidean distance measure is good for the
LPC-derived cepstral vectors [3][10]. The availability of a suitable
cepstral distance measure has made the cepstra and their derived
parameters play a very important role in speech recognition and become
the dominant kind of feature.

However, the cepstra, whether based on the LPC principle or on the
Fourier transform, have a significant disadvantage: they often cannot
distinguish two quite different phones, for example "er" and "ba" in
Chinese. Since LSP is a successful feature in speech coding/decoding,
and its basic principle and behavior are somewhat different from those
of LPC parameters, we have strong reason to investigate distance
measures for LSP. Our motivation is to make LSP another kind of feature
for speech recognition besides the cepstrum.

The paper is organized as follows. First, we introduce the basic
principle of LSPs and their relationship with the LPC parameters and
the transfer function of the all-pole model. Second, we propose some
LSP distance measures based on this principle. Third, we give
experimental results to support our proposal. Finally, we present our
conclusions.

2. PRINCIPLE OF LINE SPECTRUM PAIR (LSP)

2.1. Line Spectrum Pairs (Line Spectrum Frequencies)

Given a specific order P for the vocal tract model of the speech to be
analyzed, LPC analysis results in an all-zero inverse filter

$$A(z) \overset{\text{def}}{=} A_P(z) = 1 + \sum_{p=1}^{P} a_p z^{-p}, \qquad (1)$$

which minimizes the residual energy [5]. In speech compression and in
quantization-based speech recognition, the LPC coefficients
$\{a_1, a_2, \ldots, a_P\}$ are known to be inappropriate for
quantization because of their relatively large dynamic range and
possible filter instability problems. Different sets of parameters
representing the same spectral information, such as reflection
coefficients and log area ratios, were thus proposed for quantization
in order to alleviate these problems. LSP is one such representation of
spectral information. LSP parameters have both a well-behaved dynamic
range and a filter stability preservation property, and can be used to
encode LPC spectral information more efficiently than any of the other
parameter sets.
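
Properties (1) and (2) listed in the Introduction can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions, not the procedure used in this paper: it assumes the standard construction of the symmetric and anti-symmetric (P+1)-th order polynomials, A(z) + z^-(P+1) A(1/z) and A(z) - z^-(P+1) A(1/z), and a hypothetical 4th-order minimum phase A(z) built from two arbitrarily chosen pole pairs. The function names (poly_mul, lsp_polynomials, unit_circle_zeros, is_interlaced) are ours. Zeros are located as sign changes on the upper unit circle, which works because each polynomial reduces to a real cosine or sine sum once its linear-phase factor is removed.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def lsp_polynomials(a):
    """a = [1, a_1, ..., a_P] for A(z) = 1 + sum_p a_p z^-p.
    Returns the coefficient lists of A(z) + z^-(P+1) A(1/z) (symmetric)
    and A(z) - z^-(P+1) A(1/z) (anti-symmetric). Assumed construction."""
    ext = list(a) + [0.0]          # A(z) padded to order P+1
    rev = ext[::-1]                # coefficients of z^-(P+1) A(1/z)
    sym = [x + y for x, y in zip(ext, rev)]
    anti = [x - y for x, y in zip(ext, rev)]
    return sym, anti

def unit_circle_zeros(c, antisymmetric, grid=4096):
    """Angles in (0, pi) at which the polynomial vanishes on the unit circle.
    The polynomial equals a linear-phase factor times a real function of w
    (cosine sum if symmetric, sine sum if anti-symmetric); its zeros are
    found as sign changes on a grid and refined by bisection."""
    n = len(c) - 1
    trig = math.sin if antisymmetric else math.cos

    def f(w):
        return sum(cm * trig(w * (n / 2 - m)) for m, cm in enumerate(c))

    zeros = []
    ws = [math.pi * k / grid for k in range(1, grid)]
    for lo, hi in zip(ws, ws[1:]):
        if f(lo) * f(hi) < 0:
            for _ in range(60):    # bisection refinement
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
    return zeros

def is_interlaced(w_sym, w_anti):
    """True if the sorted zeros alternate between the two polynomials."""
    labeled = sorted([(w, 0) for w in w_sym] + [(w, 1) for w in w_anti])
    labels = [lab for _, lab in labeled]
    return all(x != y for x, y in zip(labels, labels[1:]))

# Hypothetical example: poles at 0.9*exp(+-j0.5) and 0.8*exp(+-j1.5),
# all inside the unit circle, so A(z) is minimum phase with P = 4.
a = poly_mul([1.0, -2 * 0.9 * math.cos(0.5), 0.9 ** 2],
             [1.0, -2 * 0.8 * math.cos(1.5), 0.8 ** 2])
sym, anti = lsp_polynomials(a)
w_sym = unit_circle_zeros(sym, antisymmetric=False)
w_anti = unit_circle_zeros(anti, antisymmetric=True)
print("symmetric zeros (rad):     ", w_sym)
print("anti-symmetric zeros (rad):", w_anti)
print("interlaced:", is_interlaced(w_sym, w_anti))
```

For even P, each (P+1)-th order polynomial has one real zero at z = -1 or z = +1, so finding P/2 zeros for each polynomial on the upper half of the unit circle accounts, together with their complex conjugates, for all of its zeros. Averaging the two coefficient lists returns the original A(z) padded with a trailing zero, which illustrates that the pair carries the same information as the LPC coefficients.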
The LSP representation is rather artificial. For a given $P$-th order
inverse filter as in (1), we can extend the order to $(P+1)$ without
introducing any new information by letting the $(P+1)$-th

symmetric polynomial, are interlaced with each other in the interval
$(0, \pi)$. That is to say
0