

THE DISTANCE MEASURE FOR LINE SPECTRUM PAIRS

APPLIED TO SPEECH RECOGNITION

Fang Zheng, Zhanjiang Song, Ling Li, Wenjian Yu, Fengzhou Zheng, and Wenhu Wu

Speech Laboratory, Department of Computer Science and Technology,

Tsinghua University, Beijing, 100084, P. R. China

******@**.**.********.***.**

ABSTRACT

The Line Spectrum Pair (LSP), based on the principle of linear predictive coding (LPC), plays a very important role in speech synthesis and has many interesting properties. Several well-known speech compression/decompression algorithms, including code excited linear predictive coding (CELP), are based on LSP analysis, where the information loss or prediction errors are often very small owing to the LSP's characteristics. Unfortunately, there is so far no satisfactory distance measure for LSPs that would allow this kind of feature to be used in speech recognition applications. In this paper, the principle of LSP analysis is studied first, and then several distance measures for LSPs are proposed that describe well the difference between two groups of LSP parameters. Experimental results are also given to show the efficiency of the proposed distance measures.

1. INTRODUCTION

The Line Spectrum Pair (LSP) was first introduced by Itakura [4][8] as an alternative LPC spectral representation. It was found that this new representation has such interesting properties as (1) all zeros of the LSP polynomials are on the unit circle, (2) the corresponding zeros of the symmetric and anti-symmetric LSP polynomials are interlaced, and (3) the reconstructed LPC all-pole filter preserves its minimum phase property if (1) and (2) are kept intact through a quantization procedure. Soong proved all these properties via a phase function in his paper [7].

Since their introduction, LSP parameters, together with the vector quantization (VQ) technique, have played a very important role in speech coding/decoding and speech synthesis [7]. The well-known code excited linear predictive (CELP) coding [1][2] is a good example of using the principles of linear predictive coding (LPC) and LSP to produce high-quality speech at a very low bit rate.

So far, however, LSP parameters have seldom been applied to speech recognition based on statistical modeling and the VQ technique. Although a certain kind of distance measure is adopted in the VQ technique for speech coding, that measure is far from suitable for speech recognition.

In speech recognition based on statistical modeling (such as hidden Markov models (HMM)) and the VQ technique, the distance measure for features is of critical importance. The most widely used features nowadays are the cepstra. Because the Itakura distance measure has been proved suitable for LPC coefficients both theoretically and practically [3], the Euclidean distance measure is good for the LPC-derived cepstral vectors [3][10]. The availability of a cepstral distance measure has let the cepstra and their derived parameters play a very important role in speech recognition and become a dominant kind of feature.

However, the cepstra, whether based on the LPC principle or on the Fourier transform, have a great disadvantage: they often cannot distinguish two quite different phones, for example "er" and "ba" in Chinese. Since LSP is a successful feature in speech coding/decoding, and its basic principle and behavior are somewhat different from those of LPC parameters, we have strong reason to do more research on LSP distance measures. Our motivation is to make LSP another kind of feature for speech recognition besides the cepstrum.

The paper is organized as follows. First, we introduce the basic principle of LSPs and their relationship with the LPC parameters and the transfer function of the all-pole model. Second, we propose some LSP distance measures based on this principle. Third, we give experimental results to support our proposal. Finally, we present our conclusions.

2. PRINCIPLE OF LINE SPECTRUM PAIR (LSP)

2.1. Line Spectrum Pairs (Line Spectrum Frequencies)

Given a specific order $P$ for the vocal tract model of the speech to be analyzed, LPC analysis results in an all-zero inverse filter

$$A(z) \overset{\mathrm{def}}{=} A_P(z) = 1 + \sum_{p=1}^{P} a_p z^{-p}, \qquad (1)$$

which minimizes the residual energy [5]. In speech compression and quantization-based speech recognition, the LPC coefficients $\{a_1, a_2, \ldots, a_P\}$ are known to be inappropriate for quantization because of their relatively large dynamic range and possible filter instability problems. Different sets of parameters representing the same spectral information, such as reflection coefficients and log area ratios, were thus proposed for quantization in order to alleviate the above-mentioned problems. LSP is one such representation of spectral information. LSP parameters have both a well-behaved dynamic range and the filter stability preservation property, and can be used to encode LPC spectral information even more efficiently than any other parameters.
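Properties (1) and (2) listed in the Introduction can be checked numerically on a small example. The sketch below is an illustration only, not the authors' procedure. It assumes the standard sum/difference construction of the symmetric and anti-symmetric polynomials, $A(z) + z^{-(P+1)}A(z^{-1})$ and $A(z) - z^{-(P+1)}A(z^{-1})$; the example pole positions and all function names are hypothetical.

```python
import math


def poly_mul(x, y):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def inverse_filter_from_poles(poles):
    """Build A(z) = [1, a_1, ..., a_P] from conjugate pole pairs (radius, angle)."""
    a = [1.0]
    for radius, angle in poles:
        a = poly_mul(a, [1.0, -2.0 * radius * math.cos(angle), radius * radius])
    return a


def lsp_polynomials(a):
    """Return (symmetric, anti-symmetric) coefficient lists of order P+1."""
    ext = list(a) + [0.0]   # A(z) padded to order P+1
    rev = ext[::-1]         # coefficients of z^-(P+1) * A(1/z)
    sym = [u + v for u, v in zip(ext, rev)]
    anti = [u - v for u, v in zip(ext, rev)]
    return sym, anti


def on_circle(coeffs, w, antisymmetric):
    """Real-valued profile of the polynomial at z = exp(jw), linear phase removed."""
    n = len(coeffs) - 1
    trig = math.sin if antisymmetric else math.cos
    return sum(c * trig(w * (n / 2.0 - k)) for k, c in enumerate(coeffs))


def line_spectrum_frequencies(coeffs, antisymmetric, grid=4096):
    """Zeros in the open interval (0, pi): grid sign changes refined by bisection."""
    def f(w):
        return on_circle(coeffs, w, antisymmetric)

    roots = []
    for i in range(1, grid - 1):
        lo, hi = math.pi * i / grid, math.pi * (i + 1) / grid
        if f(lo) * f(hi) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots


def is_interlaced(sym_roots, anti_roots):
    """True if the merged, sorted zeros alternate between the two polynomials."""
    merged = sorted([(w, "s") for w in sym_roots] + [(w, "a") for w in anti_roots])
    labels = [label for _, label in merged]
    return all(x != y for x, y in zip(labels, labels[1:]))


if __name__ == "__main__":
    # Hypothetical minimum-phase 4th-order example: two pole pairs inside the unit circle.
    a = inverse_filter_from_poles([(0.9, 0.25 * math.pi), (0.8, 0.6 * math.pi)])
    sym, anti = lsp_polynomials(a)
    ws = line_spectrum_frequencies(sym, antisymmetric=False)
    wa = line_spectrum_frequencies(anti, antisymmetric=True)
    print("symmetric zeros:     ", ws)
    print("anti-symmetric zeros:", wa)
    print("interlaced:", is_interlaced(ws, wa))
```

For an even order $P$, each polynomial has one trivial zero at $z = \pm 1$, so finding $P/2$ zeros of each on the upper half of the unit circle accounts for all remaining zeros. The grid search is a simple choice for illustration and could miss zeros closer together than one grid cell.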

The LSP representation is rather artificial. For a given $P$th-order inverse filter as in (1), we can extend the order to $(P+1)$ without introducing any new information by letting the $(P+1)$th

symmetric polynomial, are interlaced with each other in the interval $(0, \pi)$. That is to say

0


