In a second experiment we tested speaker-independent recognition: examples from one speaker served as the training set and examples from a second speaker as the test set. The results are shown in Figure 21 and are clearly not satisfactory. However, no systematic error can be deduced from the data, so a more detailed investigation is needed to determine why the system performs badly in this case. One possible cause is that the normalisation with respect to different talking speeds is insufficient. If so, a standard time-warping algorithm might help to work around the problem, but we have not yet been able to test this.
Figure 21: Confusion matrix for speaker-independent recognition
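To illustrate the kind of time-warping mentioned above, the following is a minimal sketch of classic dynamic time warping (DTW), not the method used in our system, which has not been extended in this way. It assumes, purely for illustration, that each utterance is reduced to a one-dimensional sequence of feature values and that the local cost is the absolute difference; the function name `dtw_distance` and the toy sequences are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences (illustrative assumption:
    scalar features and absolute difference as the local cost)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Toy example: the same contour spoken at half speed and at full speed.
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]
fast = [0, 1, 2, 1, 0]
print(dtw_distance(slow, fast))  # prints 0.0: the warp absorbs the speed difference
```

In this toy case the two sequences differ only in talking speed, so the warped distance is zero, whereas a frame-by-frame comparison would not even be defined for sequences of unequal length. Whether such an alignment would actually improve the speaker-independent results remains untested.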