
Word Recognition

For the purpose of word recognition, each utterance is represented by three separate waveforms, which describe the movement of the two lips and the mouth elongation over time. A simple time-warping step normalises the length of each waveform to 50 sample points. In this representation the waveforms are treated as patterns (pattern vectors of dimension 50) and passed to the classification method developed in Section 3.6.
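The length normalisation can be illustrated with a minimal sketch. The text only states that a "simple time-warping step" is used, so linear resampling onto a uniform grid is an assumption here, as are the function names and the argument names `upper_lip`, `lower_lip` and `elongation`.

```python
import numpy as np

def resample_waveform(waveform, n_points=50):
    """Resample a 1-D waveform to n_points samples.

    Assumption: plain linear interpolation over normalised time stands in
    for the "simple time-warping step" mentioned in the text.
    """
    waveform = np.asarray(waveform, dtype=float)
    if waveform.size < 2:
        raise ValueError("need at least two samples to resample")
    src = np.linspace(0.0, 1.0, waveform.size)  # original time axis
    dst = np.linspace(0.0, 1.0, n_points)       # normalised time axis
    return np.interp(dst, src, waveform)

def utterance_to_patterns(upper_lip, lower_lip, elongation, n_points=50):
    """Turn the three raw waveforms of one utterance into three pattern
    vectors of equal length (hypothetical helper, not from the source)."""
    return tuple(resample_waveform(w, n_points)
                 for w in (upper_lip, lower_lip, elongation))
```

Each utterance then yields three vectors of dimension 50, regardless of how long the word took to speak.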

During the training phase we calculate, for each class of waveforms, a set of about 10 basis vectors, the covariance matrix describing the distribution of the patterns in the eigenspace, and the variance of the reconstruction error. This is done separately for each word class that is to be recognised.
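As a rough sketch of the training quantities named above, the following computes them for one class of 50-dimensional patterns. It assumes that the basis vectors are the leading principal components of the class and that the reconstruction-error variance is averaged over the discarded dimensions; the exact definitions are those of Section 3.6, which is not reproduced here. The function name and dictionary keys are hypothetical.

```python
import numpy as np

def train_class_model(patterns, n_basis=10):
    """Estimate a per-class model from training patterns.

    patterns: array of shape (n_samples, dim), e.g. dim = 50.
    Assumption: basis vectors are the leading principal components.
    """
    X = np.asarray(patterns, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    basis = vt[:n_basis]                      # shape (k, dim)
    coeffs = Xc @ basis.T                     # coordinates in the eigenspace
    coeff_cov = np.atleast_2d(np.cov(coeffs, rowvar=False))
    residual = Xc - coeffs @ basis            # part not captured by the basis
    k, dim = basis.shape
    # Assumption: mean squared residual per discarded dimension.
    err_var = np.sum(residual ** 2) / (X.shape[0] * max(dim - k, 1))
    return {"mean": mean, "basis": basis,
            "coeff_cov": coeff_cov, "err_var": err_var}
```

For the covariance matrix to be invertible, the number of training patterns per class should comfortably exceed the number of basis vectors.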

In the recognition step, we calculate for each incoming utterance the distance of its three waveforms to each of the corresponding word classes. As the distance measure we apply the expression obtained in Section 3.6.
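The recognition step might look roughly like the sketch below. The distance expression of Section 3.6 is not available in this section, so a stand-in is used: a Mahalanobis term inside the eigenspace plus the squared reconstruction error scaled by its variance. Summing the three waveform distances and picking the smallest total is likewise an assumption, and the model dictionary layout (`mean`, `basis`, `coeff_cov`, `err_var`) is hypothetical.

```python
import numpy as np

def waveform_distance(x, model):
    """Stand-in distance of one pattern vector to one class model.

    Assumption: in-eigenspace Mahalanobis distance plus the squared
    reconstruction error divided by its variance. This is not the
    expression from Section 3.6, which is not reproduced here.
    """
    d = np.asarray(x, dtype=float) - model["mean"]
    c = model["basis"] @ d                    # eigenspace coordinates
    r = d - model["basis"].T @ c              # reconstruction residual
    in_space = c @ np.linalg.solve(model["coeff_cov"], c)
    out_space = (r @ r) / model["err_var"]
    return float(in_space + out_space)

def recognise(utterance, word_models):
    """Pick the word whose three class models are closest.

    utterance:   tuple of three pattern vectors.
    word_models: dict mapping word -> tuple of three class models.
    Assumption: the three distances are combined by summation.
    """
    scores = {
        word: sum(waveform_distance(x, m) for x, m in zip(utterance, models))
        for word, models in word_models.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

Returning the full score table alongside the winning word makes it easy to inspect how close the runner-up classes were.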



Markus Weber
Tue Jan 7 15:44:13 PST 1997