The first experiment evaluates the performance of speaker-dependent recognition. To this end, we acquired between 40 and 70 samples of each digit, all spoken by the same speaker. The system was then trained with 15 examples per digit and finally tested on the whole set. The results obtained for two different speakers are presented in the form of a confusion matrix in Figure 20. The results of this experiment seem very promising: the system made 10 errors on more than 500 words to be classified, which gives a recognition rate of more than 98%.
Figure 20: Confusion matrix for speaker-dependent recognition
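
The recognition rate quoted above follows directly from the confusion matrix: correct classifications lie on the diagonal, so with 10 errors in at least 500 words the rate is at least 1 - 10/500 = 98%. The following is a minimal sketch of that computation, not the code used in the experiment. The function name `recognition_rate` and the small 3-class matrix are hypothetical and serve only as illustration; they are not the data of Figure 20.

```python
def recognition_rate(confusion):
    """Return (correct, total, rate) for a square confusion matrix.

    confusion[i][j] is assumed to count the utterances of class i
    that the system classified as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Correct classifications are the diagonal entries.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct, total, correct / total


# Hypothetical 3-class example (illustration only, not the measured data).
example = [
    [50, 0, 0],
    [1, 48, 1],
    [0, 0, 50],
]

correct, total, rate = recognition_rate(example)
print(f"{total - correct} errors in {total} words, rate = {rate:.1%}")
```

Applied to a full 10 x 10 digit matrix, the same function would give the overall rate, and the off-diagonal entries would show which digits are confused with each other.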