
The Probabilistic Eigenspace Approach


In the previous section we showed that classification can be based on the class-conditional probability densities. To make use of this fact, we must calculate, or at least estimate, $p(\mathbf{x} \mid C_i)$. While this may not be difficult in low dimensions, obtaining accurate and fast estimates in high-dimensional spaces is far from simple, especially when the number of training examples is limited. In this section we therefore present a method that uses principal components analysis to model probability densities in high dimensions from a reasonably small set of training samples.

Figure 7: The probabilistic eigenspace approach

We assume that we have applied principal components analysis to a set of N-dimensional training examples belonging to one class $C_i$, so that we can now approximately describe these examples by their mean vector $\bar{\mathbf{x}}_i$ and a linear combination of n basis vectors $\mathbf{b}_{i,1}, \ldots, \mathbf{b}_{i,n}$, as in Section 3.4. We arrange these basis vectors as the columns of a matrix $B_i$. From now on we omit the index i and assume that all of the following calculations are applied separately to each class. Furthermore, let $\mathbf{y}$ be the coordinate vector of a point with respect to the basis $B$, and let $B^{\perp}$ be an orthonormal basis for the orthogonal complement of $B$. With these definitions we can write:

$$\mathbf{x} = \bar{\mathbf{x}} + B\mathbf{y} + \boldsymbol{\epsilon}$$

This equation is illustrated in Figure 7. Here $\boldsymbol{\epsilon}$ is the reconstruction error vector, which can be described by the (N-n)-dimensional basis $B^{\perp}$. We now model the reconstruction error as white Gaussian noise with variance $\sigma^2$, so that for a given coordinate vector $\mathbf{y}$ we obtain the following probability density:

$$p(\mathbf{x} \mid \mathbf{y}) = \mathcal{N}(\bar{\mathbf{x}} + B\mathbf{y},\; \sigma^2 I) \tag{42}$$

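Here, $\mathcal{N}(\boldsymbol{\mu}, \Sigma)$ denotes a multivariate Gaussian distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\Sigma$. Note also that, strictly speaking, the noise variance $\sigma^2$ should carry the subscript i, which yields a slightly more general result. The class-conditional density of $\mathbf{x}$ can be obtained by multiplying Equation (42) by $p(\mathbf{y})$ and integrating over $\mathbf{y}$. Thus,

$$
\begin{aligned}
p(\mathbf{x} \mid C_i) &= \int p(\mathbf{x} \mid \mathbf{y})\, p(\mathbf{y})\, d\mathbf{y} \\
&= \frac{1}{(2\pi\sigma^2)^{N/2}} \int \exp\!\left(-\frac{\lVert \mathbf{x} - \bar{\mathbf{x}} - B\mathbf{y} \rVert^2}{2\sigma^2}\right) p(\mathbf{y})\, d\mathbf{y} \\
&= \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(-\frac{\lVert \boldsymbol{\epsilon} \rVert^2}{2\sigma^2}\right) \int \exp\!\left(-\frac{\lVert \hat{\mathbf{y}} - \mathbf{y} \rVert^2}{2\sigma^2}\right) p(\mathbf{y})\, d\mathbf{y},
\end{aligned}
\tag{43}
$$

where $\hat{\mathbf{y}} = B^T(\mathbf{x} - \bar{\mathbf{x}})$ is the projection of $\mathbf{x}$ onto the eigenspace and $\boldsymbol{\epsilon} = \mathbf{x} - \bar{\mathbf{x}} - B\hat{\mathbf{y}}$ is its reconstruction error. In the derivation above we used the fact that

$$\lVert \mathbf{x} - \bar{\mathbf{x}} - B\mathbf{y} \rVert^2 = \lVert \boldsymbol{\epsilon} \rVert^2 + \lVert B(\hat{\mathbf{y}} - \mathbf{y}) \rVert^2 = \lVert \boldsymbol{\epsilon} \rVert^2 + \lVert \hat{\mathbf{y}} - \mathbf{y} \rVert^2,$$

which holds because $\boldsymbol{\epsilon}$, $B(\hat{\mathbf{y}} - \mathbf{y})$ and $\mathbf{x} - \bar{\mathbf{x}} - B\mathbf{y}$ form a right triangle (the last equality uses the orthonormality of the columns of $B$). In order to further simplify our result, we now assume a particular density for $p(\mathbf{y})$, namely $\mathcal{N}(\boldsymbol{\mu}_y, \Sigma_y)$. That is, we assume that the projections of the samples are Gaussian distributed in the eigenspace with mean $\boldsymbol{\mu}_y$ and covariance $\Sigma_y$. With this assumption, the integration in Equation (43) can be performed in closed form (refer to the proof in the appendix for details). Hence, the exact density of $\mathbf{x}$ is given by: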
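
The closed form that these assumptions imply can be sketched as follows. The sketch assumes isotropic noise of variance $\sigma^2$ in all N dimensions and a Gaussian density with mean $\boldsymbol{\mu}_y$ and covariance $\Sigma_y$ for the eigenspace coordinates. The symbols $\hat{\mathbf{y}} = B^T(\mathbf{x} - \bar{\mathbf{x}})$ (projection of $\mathbf{x}$ onto the eigenspace), $\boldsymbol{\epsilon}$ (reconstruction error), $\boldsymbol{\mu}_y$ and $\Sigma_y$ are notation chosen here and may differ from the original's. Convolving the two Gaussians adds their covariances, so the density factors into an in-eigenspace term and a reconstruction-error term:

```latex
p(\mathbf{x} \mid C_i)
  = \frac{\exp\!\left(-\tfrac{1}{2}\,
        (\hat{\mathbf{y}}-\boldsymbol{\mu}_y)^{T}
        (\Sigma_y+\sigma^{2}I_n)^{-1}
        (\hat{\mathbf{y}}-\boldsymbol{\mu}_y)\right)}
         {(2\pi)^{n/2}\,\lvert\Sigma_y+\sigma^{2}I_n\rvert^{1/2}}
    \cdot
    \frac{\exp\!\left(-\dfrac{\lVert\boldsymbol{\epsilon}\rVert^{2}}{2\sigma^{2}}\right)}
         {(2\pi\sigma^{2})^{(N-n)/2}}
```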


For convenience we will now focus on the logarithm of this expression.
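
Taking the logarithm turns the product of the two Gaussian factors into a sum. As a sketch under the same assumptions (isotropic noise of variance $\sigma^2$ in all N dimensions, eigenspace coordinates distributed as $\mathcal{N}(\boldsymbol{\mu}_y, \Sigma_y)$), and writing $\hat{\mathbf{y}} = B^T(\mathbf{x} - \bar{\mathbf{x}})$ for the projection and $\boldsymbol{\epsilon}$ for the reconstruction error (notation assumed here, not necessarily the original's), the log-density would read:

```latex
\ln p(\mathbf{x} \mid C_i)
  = -\tfrac{1}{2}\Big[
      (\hat{\mathbf{y}}-\boldsymbol{\mu}_y)^{T}
      (\Sigma_y+\sigma^{2}I_n)^{-1}
      (\hat{\mathbf{y}}-\boldsymbol{\mu}_y)
      + \frac{\lVert\boldsymbol{\epsilon}\rVert^{2}}{\sigma^{2}}
      + \ln\lvert\Sigma_y+\sigma^{2}I_n\rvert
      + (N-n)\ln\sigma^{2}
      + N\ln 2\pi
    \Big]
```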


This can be simplified to the following:
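
One plausible form of the simplified expression multiplies the log-density by -2 and drops the class-independent constant $N \ln 2\pi$. The name $d_i$ is hypothetical, and the sketch again assumes isotropic noise of variance $\sigma_i^2$ in all N dimensions and eigenspace coordinates distributed as $\mathcal{N}(\boldsymbol{\mu}_{y,i}, \Sigma_{y,i})$, with $\hat{\mathbf{y}}_i$ the projection of $\mathbf{x}$ onto the eigenspace of class i and $\boldsymbol{\epsilon}_i$ the corresponding reconstruction error:

```latex
d_i(\mathbf{x})
  = (\hat{\mathbf{y}}_i-\boldsymbol{\mu}_{y,i})^{T}
    (\Sigma_{y,i}+\sigma_i^{2}I_n)^{-1}
    (\hat{\mathbf{y}}_i-\boldsymbol{\mu}_{y,i})
  + \frac{\lVert\boldsymbol{\epsilon}_i\rVert^{2}}{\sigma_i^{2}}
  + \ln\lvert\Sigma_{y,i}+\sigma_i^{2}I_n\rvert
  + (N-n)\ln\sigma_i^{2}
```

Assigning $\mathbf{x}$ to the class with the smallest $d_i$ is then equivalent to choosing the largest class-conditional density. If the class priors differ, a term $-2 \ln P(C_i)$ would be added.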




We now have at our disposal a distance measure that allows us to assign each object to its most likely class. Our method has two main advantages over the standard eigenspace approach, in which only the reconstruction error is taken into account. First, we make use of the probability distribution of the reconstruction error of each class. Second, we also consider the distribution of the class members within the eigenspace.
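
The classification rule can be sketched in a few lines of plain Python. This is a minimal illustration, not the original implementation. It assumes that the eigenspace coordinates of each class are zero-mean and uncorrelated with variances `eigvals` (as PCA coordinates of the training set would be), that the noise is isotropic in all N dimensions, and that all classes have equal priors. The function names `log_density` and `classify` and the layout of `models` are hypothetical.

```python
import math


def log_density(x, mean, basis, eigvals, noise_var):
    """Log class-conditional density of x under the assumed model.

    x, mean   : sequences of length N
    basis     : n orthonormal basis vectors, each of length N
    eigvals   : n variances of the eigenspace coordinates (assumed zero-mean)
    noise_var : variance sigma^2 of the isotropic Gaussian noise
    """
    N, n = len(x), len(basis)
    centered = [xi - mi for xi, mi in zip(x, mean)]
    # Projection onto the eigenspace: y_hat = B^T (x - mean).
    y_hat = [sum(bk * ck for bk, ck in zip(b, centered)) for b in basis]
    # Squared reconstruction error from the right-triangle identity;
    # clamp at zero to guard against rounding.
    err_sq = max(sum(c * c for c in centered) - sum(y * y for y in y_hat), 0.0)
    # Mahalanobis term inside the eigenspace (covariance plus noise).
    maha = sum(y * y / (lam + noise_var) for y, lam in zip(y_hat, eigvals))
    # Log-determinant: n eigenspace directions plus N - n noise-only directions.
    log_det = sum(math.log(lam + noise_var) for lam in eigvals)
    log_det += (N - n) * math.log(noise_var)
    return -0.5 * (maha + err_sq / noise_var + log_det + N * math.log(2 * math.pi))


def classify(x, models):
    """Return the label whose model assigns x the highest log-density."""
    return max(models, key=lambda label: log_density(x, *models[label]))
```

Here `models` maps each class label to a tuple `(mean, basis, eigvals, noise_var)`, so that `classify(x, models)` evaluates every class model and returns the most likely label. Working with the log-density avoids numerical underflow, which is likely when N is large.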


Markus Weber
Tue Jan 7 15:44:13 PST 1997