Principle Components Analysis (PCA), also known as Karhunen-Loève expansion or Eigen-XY analysis, has in recent years found a number of applications in the fields of computer vision and pattern recognition. As mentioned in Section 2.1, PCA is based on representing typical images in terms of a compact set of orthogonal basis images. We will now provide a concise summary of the main ideas behind PCA. For more details, refer to Kirby et al. [KWD93] or Turk and Pentland [TP91].
Assume that we have M column vectors representing M different
sampled signals, which we will label
. Furthermore, let
represent the mean of
the
's,
and let
be defined as
For illustrative purposes, we will assume that each
represents
an
image of a face which has been concatenated into an
column vector by placing each column of the image atop
the next.
In the huge
-dimensional space of possible images, it is reasonable
to assume that the set of points corresponding to human faces are somewhat
clustered to a relatively low-dimensional subspace. The goal of PCA,
in the present context, is to
determine from the set of
's a basis for the subspace within
which most faces can be represented with a small amount of error.
Mathematically, we can express this goal as follows: we wish to find
a set of M orthonormal vectors
for which the quantity
is maximised, subject to the orthonormality constraint
In other words, our goal is to find a set of
's which have
the largest possible projection onto each of the
's.
Assuming
, the set of
's offer us an efficient and
compact means of representing faces as linear combinations of precomputed
basis vectors. In [TP91], this set of basis vectors, i.e. the
's, are referred to as ``eigenfaces.''
Using Rayleigh's principle, it can be shown that the
's
and
's are given by the eigenvectors and eigenvalues of
the covariance matrix
where
is a matrix comprised of the column vectors
placed side by side.
Since the matrix
is often enormous (
), it is
not practical to solve for the eigenvectors of
directly. The number of sample points in each vector,
, is generally
much larger than the dimension M of the matrix
, however, and as
a result, all but M-1 of
's eigenvalues are equal to zero.
(The -1 arises from the subtraction of the mean vector
.)
As described in [TP91], the vectors
and scalars
can be obtained indirectly by solving for the eigenvectors
and eigenvalues of the
matrix
.
To summarise, given the eigenvectors
and eigenvalues
of
,
we can premultiply both sides by
,
which tells us that the first M-1 eigenvectors and eigenvalues of
are given by
and
,
respectively