In a probabilistic approach, where all available information is contained in the observation
$x$, decision rules are in general based on the conditional probability $P(C_j \mid x)$ of
class $C_j$ given $x$. A common procedure is to define a cost function that is minimised
by the optimal decision rule. To that end, we define $c_{ij}$ to be the cost of taking
a decision in favour of class $i$ when the correct decision is class $j$.
The overall cost of a decision rule $d$ is then

$$C(d) = \int \sum_i \sum_j c_{ij}\, P(d(x) = i \mid x)\, P(C_j \mid x)\, p(x)\, dx$$

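We can simplify this equation with the assumption that the cost is *zero* for a correct
decision and *unity* for a false decision, i.e. $c_{ij} = 1 - \delta_{ij}$. We obtain

$$C(d) = \int \sum_i \sum_j (1 - \delta_{ij})\, P(d(x) = i \mid x)\, P(C_j \mid x)\, p(x)\, dx = 1 - \int \sum_i P(d(x) = i \mid x)\, P(C_i \mid x)\, p(x)\, dx$$

where $\delta_{ij}$ is the usual Kronecker delta.
Furthermore, since we are assuming a deterministic rule,
the probability that $d$ will choose a given class $i$ for $x$ can only be
0 or 1, i.e. $P(d(x) = i \mid x)$ simplifies to $\delta_{i\,d(x)}$.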
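
To make the zero-one cost concrete, the following sketch evaluates it for every deterministic rule on a small discrete problem. It is only an illustration under stated assumptions: the observation takes three values, there are two classes, and the numbers in `p_x` and `posterior` are made up. The names `p_x`, `posterior`, `zero_one_cost` and `rule` are hypothetical; `rule[x]` plays the role of the class chosen for observation *x*, and `posterior[x][j]` the conditional probability of class *j* given *x*.

```python
from itertools import product

# Hypothetical toy problem: 3 possible observations, 2 classes.
# p_x[x] is p(x); posterior[x][j] is P(C_j | x). All numbers are made up.
p_x = [0.5, 0.3, 0.2]
posterior = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.2, 0.8],
]
n_classes = 2


def zero_one_cost(rule):
    """Expected zero-one cost of a deterministic rule (rule[x] = chosen class)."""
    # For a deterministic rule the probability of choosing class i given x is 1
    # for i = rule[x] and 0 otherwise, so only the chosen class contributes.
    return 1.0 - sum(p_x[x] * posterior[x][rule[x]] for x in range(len(p_x)))


# Enumerate all deterministic rules and keep the cheapest one.
all_rules = list(product(range(n_classes), repeat=len(p_x)))
best_rule = min(all_rules, key=zero_one_cost)

# Rule that picks, for each x, the class with the largest conditional probability.
argmax_rule = tuple(
    max(range(n_classes), key=lambda j: posterior[x][j]) for x in range(len(p_x))
)

print(best_rule, zero_one_cost(best_rule))
print(argmax_rule == best_rule)
```

In this toy setting the exhaustive search and the per-observation argmax pick the same rule, because the cost decomposes into one independent term per observation.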

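It is now easy to see that the cost reduces to $C(d) = 1 - \int P(C_{d(x)} \mid x)\, p(x)\, dx$ and is therefore minimised by the following rule:

$$d(x) = \arg\max_i P(C_i \mid x)$$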

By applying Bayes' Rule

$$P(C_i \mid x) = \frac{p(x \mid C_i)\, P(C_i)}{p(x)}$$

and the assumption that the prior probabilities $P(C_i)$ are equal for all classes, we can obtain the equivalent rule:

$$d(x) = \arg\max_i p(x \mid C_i)$$

which leads to the conclusion that we can construct an optimal classifier that simply maps a pattern $x$ to the class for which the class-conditional probability density $p(x \mid C_i)$ is maximal.
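
A minimal sketch of such a classifier is given below. It assumes, purely for illustration, that each class-conditional density is a one-dimensional Gaussian with known parameters; the text does not prescribe any particular density model, and the names `class_params`, `gaussian_pdf` and `classify` as well as all parameter values are hypothetical.

```python
import math

# Hypothetical class-conditional densities: 1-D Gaussians with made-up
# (mean, standard deviation) parameters, one pair per class.
class_params = [(0.0, 1.0), (3.0, 1.0), (6.0, 2.0)]


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify(x):
    """Return the index of the class whose density is largest at x."""
    densities = [gaussian_pdf(x, mean, std) for mean, std in class_params]
    return max(range(len(densities)), key=densities.__getitem__)


for x in (-1.0, 2.0, 7.5):
    print(x, classify(x))
```

The sketch relies on the equal-prior assumption made above. If the priors differed, each density would have to be weighted by its class prior before taking the maximum.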

Tue Jan 7 15:44:13 PST 1997