In the previous section we showed that classification can be performed based on the class-conditional probability densities $p(\mathbf{x}\,|\,\Omega_i)$. In order to make use of this fact, it is necessary to calculate, or at least estimate, $p(\mathbf{x}\,|\,\Omega_i)$. While this may not be a difficult problem in low dimensions, obtaining accurate and fast estimates in high-dimensional spaces is far from simple, especially when the number of training examples is limited. Therefore we present in this section a method that uses principal components analysis to model probability densities in higher dimensions from a reasonably small set of training samples.

**Figure 7:** The probabilistic eigenspace approach

We assume that we have applied principal components analysis to a set of *N*-dimensional training examples belonging to one class $\Omega_i$, so that we can now approximately describe these examples by their mean vector $\boldsymbol{\mu}_i$ and a linear combination of *n* basis vectors $\mathbf{e}_{i,1}, \ldots, \mathbf{e}_{i,n}$, as in Section 3.4. We arrange these basis vectors as columns of a matrix $\mathbf{E}_i$; we are going to omit the index *i* from now on and assume that all of the following calculations are applied separately to each of the classes. Furthermore, let $\mathbf{y} = \mathbf{E}^T(\mathbf{x} - \boldsymbol{\mu})$ be the coordinate vector of a point $\mathbf{x}$ with respect to the basis $\mathbf{E}$ and let $\bar{\mathbf{E}}$ be an orthonormal basis for the orthogonal complement of the span of $\mathbf{E}$. With these definitions we can write:

$$ \mathbf{x} = \boldsymbol{\mu} + \mathbf{E}\mathbf{y} + \boldsymbol{\epsilon}. \qquad (41) $$
This equation is illustrated in Figure 7.
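As a concrete illustration, the decomposition above can be sketched in NumPy (the data, dimensions, and variable names here are hypothetical, chosen only for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set for one class: 200 samples in N = 10 dimensions.
N, n = 10, 3
X = rng.normal(size=(200, N)) @ rng.normal(size=(N, N)) * 0.5

mu = X.mean(axis=0)                      # mean vector
# Principal components via eigendecomposition of the sample covariance.
lam_all, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(lam_all)[::-1]
E = vecs[:, order[:n]]                   # N x n matrix of basis vectors
lam = lam_all[order[:n]]                 # the n largest eigenvalues

x = X[0]                                 # some point
y = E.T @ (x - mu)                       # coordinates in the eigenspace
eps = x - mu - E @ y                     # reconstruction error vector

# The decomposition x = mu + E y + eps holds exactly, and eps is
# orthogonal to the span of E (it lives in the orthogonal complement).
assert np.allclose(x, mu + E @ y + eps)
assert np.allclose(E.T @ eps, 0.0)
```

Because $\boldsymbol{\epsilon}$ is orthogonal to the span of $\mathbf{E}$, the squared lengths satisfy $\|\mathbf{x}-\boldsymbol{\mu}\|^2 = \|\mathbf{y}\|^2 + \|\boldsymbol{\epsilon}\|^2$, which the variables above verify numerically.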
Here $\boldsymbol{\epsilon}$ is the *reconstruction error vector*, which can be described in the $(N-n)$-dimensional basis $\bar{\mathbf{E}}$.
We are now going to model the reconstruction error as white Gaussian noise with variance $\sigma^2$, so that for a given coordinate vector $\mathbf{y}'$ we obtain the following probability density:

$$ p(\mathbf{x}\,|\,\mathbf{y}') = \mathcal{N}(\boldsymbol{\mu} + \mathbf{E}\mathbf{y}',\, \sigma^2\mathbf{I}) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(-\frac{\|\mathbf{x} - \boldsymbol{\mu} - \mathbf{E}\mathbf{y}'\|^2}{2\sigma^2}\right). \qquad (42) $$
Here, $\mathcal{N}(\mathbf{m}, \boldsymbol{\Sigma})$ shall denote a multivariate Gaussian distribution with mean vector $\mathbf{m}$ and covariance matrix $\boldsymbol{\Sigma}$. Note also that the noise variance $\sigma^2$ should be subscripted by *i* in order to establish a slightly more general result. The class-conditional density of $\mathbf{x}$ can be obtained by multiplying Equation (42) by the density $p(\mathbf{y}')$ of the eigenspace coordinates and integrating over $\mathbf{y}'$. Thus,

$$ p(\mathbf{x}\,|\,\Omega) = \int p(\mathbf{x}\,|\,\mathbf{y}')\, p(\mathbf{y}')\, d\mathbf{y}' = \frac{\exp\!\left(-\frac{\|\boldsymbol{\epsilon}\|^2}{2\sigma^2}\right)}{(2\pi\sigma^2)^{(N-n)/2}} \int \frac{\exp\!\left(-\frac{\|\mathbf{y} - \mathbf{y}'\|^2}{2\sigma^2}\right)}{(2\pi\sigma^2)^{n/2}}\, p(\mathbf{y}')\, d\mathbf{y}'. \qquad (43) $$
In the derivation above we used the fact that

$$ \|\mathbf{x} - \boldsymbol{\mu} - \mathbf{E}\mathbf{y}'\|^2 = \|\mathbf{y} - \mathbf{y}'\|^2 + \|\boldsymbol{\epsilon}\|^2, $$

which holds as a consequence of the fact that $\mathbf{E}(\mathbf{y} - \mathbf{y}')$, $\boldsymbol{\epsilon}$, and $\mathbf{x} - \boldsymbol{\mu} - \mathbf{E}\mathbf{y}'$ form a right triangle. In order to further simplify our result, we will now assume a particular density for $\mathbf{y}'$, namely $\mathcal{N}(\mathbf{0}, \boldsymbol{\Lambda})$ with $\boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, where the $\lambda_j$ are the eigenvalues associated with the basis vectors in $\mathbf{E}$. That is, we assume the projections of the samples are Gaussian distributed (in the eigenspace) with mean $\mathbf{0}$ and covariance $\boldsymbol{\Lambda}$. With this assumption, we can perform the integration in Equation (43) in closed form (refer to the proof in the appendix for details). Hence, the exact density of $\mathbf{x}$ is given by:

$$ p(\mathbf{x}\,|\,\Omega) = \frac{\exp\!\left(-\frac{1}{2}\sum_{j=1}^{n} \frac{y_j^2}{\lambda_j + \sigma^2}\right)}{(2\pi)^{n/2} \prod_{j=1}^{n} (\lambda_j + \sigma^2)^{1/2}} \cdot \frac{\exp\!\left(-\frac{\|\boldsymbol{\epsilon}\|^2}{2\sigma^2}\right)}{(2\pi\sigma^2)^{(N-n)/2}}, \qquad (44) $$

where $y_j$ denotes the *j*-th component of $\mathbf{y}$.
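This closed form can be checked numerically: the marginal of the model is a Gaussian with covariance $\mathbf{E}\boldsymbol{\Lambda}\mathbf{E}^T + \sigma^2\mathbf{I}$, so the factored density must agree with the log-density of that full $N$-dimensional Gaussian. A minimal sketch with hypothetical parameters (none taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, sigma2 = 8, 3, 0.1

# Hypothetical per-class model parameters (illustration only).
mu = rng.normal(size=N)
E = np.linalg.qr(rng.normal(size=(N, N)))[0][:, :n]  # orthonormal basis, N x n
lam = np.array([5.0, 2.0, 1.0])                      # eigenspace eigenvalues

def log_density(x):
    """ln p(x | Omega) via the factored closed-form density."""
    y = E.T @ (x - mu)
    eps2 = np.sum((x - mu - E @ y) ** 2)             # ||eps||^2
    in_space = (-0.5 * np.sum(y**2 / (lam + sigma2))
                - 0.5 * np.sum(np.log(2 * np.pi * (lam + sigma2))))
    residual = (-0.5 * eps2 / sigma2
                - 0.5 * (N - n) * np.log(2 * np.pi * sigma2))
    return in_space + residual

# Sanity check against the full N-dimensional Gaussian density with
# covariance C = E diag(lam) E^T + sigma2 I.
x = rng.normal(size=N)
C = E @ np.diag(lam) @ E.T + sigma2 * np.eye(N)
_, logdet = np.linalg.slogdet(C)
full = -0.5 * ((x - mu) @ np.linalg.solve(C, x - mu)
               + logdet + N * np.log(2 * np.pi))
assert np.isclose(log_density(x), full)
```

The agreement follows because $\mathbf{E}\boldsymbol{\Lambda}\mathbf{E}^T + \sigma^2\mathbf{I}$ has eigenvalues $\lambda_j + \sigma^2$ inside the eigenspace and $\sigma^2$ in its orthogonal complement.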
For convenience we will now focus on the logarithm of this expression,

$$ \ln p(\mathbf{x}\,|\,\Omega) = -\frac{1}{2}\left[\sum_{j=1}^{n} \frac{y_j^2}{\lambda_j + \sigma^2} + \frac{\|\boldsymbol{\epsilon}\|^2}{\sigma^2} + \sum_{j=1}^{n} \ln(\lambda_j + \sigma^2) + (N-n)\ln\sigma^2 + N\ln 2\pi\right], \qquad (45) $$

which can be simplified to

$$ \ln p(\mathbf{x}\,|\,\Omega) = -\frac{1}{2}\, d(\mathbf{x}) + c, \qquad (46) $$

where

$$ d(\mathbf{x}) = \sum_{j=1}^{n} \frac{y_j^2}{\lambda_j + \sigma^2} + \frac{\|\boldsymbol{\epsilon}\|^2}{\sigma^2} + \sum_{j=1}^{n} \ln(\lambda_j + \sigma^2) + (N-n)\ln\sigma^2 $$

and $c = -\frac{N}{2}\ln 2\pi$ does not depend on the class and can therefore be ignored during classification.
We now have at our disposal a distance measure that allows us to assign each object to its most likely class. Our method has two main advantages over the standard eigenspace approach, where only the reconstruction error is taken into account: first, we make use of the probability distribution of the reconstruction error of each class; second, we also consider the distribution of the class members in the eigenspace.
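A minimal classification sketch based on this distance measure (synthetic two-class data; all names and parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_class(X, n, sigma2):
    """Fit a per-class eigenspace model: mean, n principal directions,
    their eigenvalues, plus an assumed noise variance sigma2."""
    mu = X.mean(axis=0)
    lam, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(lam)[::-1][:n]
    return mu, vecs[:, order], lam[order], sigma2

def distance(x, model):
    """The distance measure d(x); the smallest value over all
    classes identifies the most likely class."""
    mu, E, lam, sigma2 = model
    N, n = E.shape
    y = E.T @ (x - mu)
    eps2 = np.sum((x - mu - E @ y) ** 2)
    return (np.sum(y**2 / (lam + sigma2)) + eps2 / sigma2
            + np.sum(np.log(lam + sigma2)) + (N - n) * np.log(sigma2))

# Two synthetic classes, well separated by their means.
X0 = rng.normal(loc=0.0, scale=1.0, size=(300, 6))
X1 = rng.normal(loc=4.0, scale=1.0, size=(300, 6))
models = [fit_class(X0, n=2, sigma2=0.5), fit_class(X1, n=2, sigma2=0.5)]

x = rng.normal(loc=4.0, scale=1.0, size=6)   # a fresh sample from class 1
label = int(np.argmin([distance(x, m) for m in models]))
```

Minimizing $d(\mathbf{x})$ is equivalent to maximizing $\ln p(\mathbf{x}\,|\,\Omega_i)$ here, so the sample drawn near the second class mean is assigned to that class.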

Tue Jan 7 15:44:13 PST 1997