In a probabilistic approach, where all available information is contained in the observation $x$, decision rules are in general based on the conditional probability $P(j \mid x)$ that class $j$ is correct given $x$. A common procedure is to define a cost function that is minimised by the optimal decision rule. In that sense we define $C_{ij}$ to be the cost for taking a decision in favour of class $i$ when the correct decision is class $j$. The overall cost for a decision rule $d$ is then
$$C(d) = \int \sum_i \sum_j C_{ij}\, P(d(x) = i \mid x)\, P(j \mid x)\, p(x)\, dx .$$
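We can simplify this equation with the assumption that the cost is zero for a correct decision and unity for a false decision, i.e. $C_{ij} = 1 - \delta_{ij}$. Since both $\sum_i P(d(x) = i \mid x)$ and $\sum_j P(j \mid x)$ equal one, we obtain
$$C(d) = \int \sum_i \sum_j (1 - \delta_{ij})\, P(d(x) = i \mid x)\, P(j \mid x)\, p(x)\, dx = 1 - \int \sum_i P(d(x) = i \mid x)\, P(i \mid x)\, p(x)\, dx ,$$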
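where $\delta_{ij}$ is the usual Kronecker delta. Furthermore, since we are assuming a deterministic rule, the probability that $d$ will choose the correct class $i$ for $x$ can only be 0 or 1, i.e. $P(d(x) = i \mid x)$ simplifies to $\delta_{i\,d(x)}$, and the cost becomes $C(d) = 1 - \int P(d(x) \mid x)\, p(x)\, dx$.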
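The simplification can be checked numerically on a small discrete problem, where the integral over $x$ becomes a sum. The sketch below is only an illustration: the three observations, two classes, the probability tables and the names `p_x`, `posterior`, `rule`, `general_cost` and `zero_one_cost` are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy problem: 3 discrete observations, 2 classes.
p_x = [0.5, 0.3, 0.2]            # p(x), sums to 1
posterior = [[0.9, 0.1],         # posterior[x][j] = P(j | x)
             [0.4, 0.6],
             [0.5, 0.5]]
rule = [0, 1, 0]                 # deterministic rule: rule[x] is the chosen class
n_classes = 2


def general_cost(cost, rule):
    """Double sum over decided class i and correct class j, weighted by p(x)."""
    total = 0.0
    for x, px in enumerate(p_x):
        for i, j in product(range(n_classes), repeat=2):
            # For a deterministic rule, P(d(x) = i | x) is either 0 or 1.
            decides_i = 1.0 if rule[x] == i else 0.0
            total += cost[i][j] * decides_i * posterior[x][j] * px
    return total


def zero_one_cost(rule):
    """Simplified form: one minus the probability of a correct decision."""
    return 1.0 - sum(px * posterior[x][rule[x]] for x, px in enumerate(p_x))


# Zero cost on the diagonal (correct decision), unit cost elsewhere.
zero_one = [[0.0 if i == j else 1.0 for j in range(n_classes)]
            for i in range(n_classes)]

full = general_cost(zero_one, rule)
short = zero_one_cost(rule)
print(full, short)   # the two values agree up to rounding
```

Under the zero-one cost, the overall cost is simply the error probability of the rule, which is why the shorter expression only needs the posterior of the class actually chosen.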
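It is now easy to see that the cost is minimised by the following rule, which picks for every $x$ the class with the largest posterior probability:
$$d(x) = \arg\max_i P(i \mid x) .$$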
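A brute-force check of this minimisation is possible on a toy problem, assuming the rule in question is the usual maximum-posterior choice (for each observation, pick the class with the largest posterior probability). The sketch enumerates every deterministic rule and compares the cheapest one with the maximum-posterior rule. The probability tables and the names `map_rule`, `all_rules` and `best_rule` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy problem: 3 discrete observations, 2 classes, no ties.
p_x = [0.5, 0.3, 0.2]            # p(x), sums to 1
posterior = [[0.9, 0.1],         # posterior[x][i] = P(i | x)
             [0.4, 0.6],
             [0.3, 0.7]]
n_classes = 2


def zero_one_cost(rule):
    """Error probability of a deterministic rule under zero-one cost."""
    return 1.0 - sum(px * posterior[x][rule[x]] for x, px in enumerate(p_x))


# Maximum-posterior rule: for each x, the class index with the largest P(i | x).
map_rule = tuple(max(range(n_classes), key=lambda i: post[i])
                 for post in posterior)

# Every deterministic rule assigns one class to each of the 3 observations.
all_rules = list(product(range(n_classes), repeat=len(p_x)))
best_rule = min(all_rules, key=zero_one_cost)

print(map_rule, best_rule, zero_one_cost(best_rule))
```

Because the cost is a sum of independent contributions, one per observation, choosing the largest posterior separately at each $x$ cannot be beaten by any other deterministic rule; the enumeration merely confirms this for one example.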
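By applying Bayes' rule
$$P(i \mid x) = \frac{p(x \mid i)\, P(i)}{p(x)}$$
and the assumption that the prior probabilities $P(i)$ are equal for all classes, and noting that $p(x)$ does not depend on $i$, we can obtain the equivalent rule:
$$d(x) = \arg\max_i p(x \mid i) ,$$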
which leads to the conclusion that, under these assumptions, we can construct an optimal classifier that simply maps a pattern $x$ to the class $i$ for which the class-conditional probability density $p(x \mid i)$ is maximal.
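As an illustration of this conclusion, the following sketch compares the maximum-likelihood decision with the maximum-posterior decision for two classes with equal priors. The one-dimensional Gaussian class-conditional densities, their parameters and all function names are assumptions introduced for this example only.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical class-conditional densities p(x | i): (mean, std) per class.
class_params = [(-1.0, 1.0), (2.0, 0.5)]
priors = [0.5, 0.5]              # equal priors, as assumed in the text


def likelihoods(x):
    return [gaussian_pdf(x, m, s) for m, s in class_params]


def posteriors(x):
    """Bayes' rule: P(i | x) = p(x | i) P(i) / p(x)."""
    joint = [l * p for l, p in zip(likelihoods(x), priors)]
    evidence = sum(joint)        # p(x)
    return [j / evidence for j in joint]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def classify_ml(x):
    """Pick the class with the largest class-conditional density."""
    return argmax(likelihoods(x))


def classify_map(x):
    """Pick the class with the largest posterior probability."""
    return argmax(posteriors(x))


test_points = [-3.0, -1.0, 0.0, 1.0, 1.5, 2.0, 4.0]
decisions = [(x, classify_ml(x), classify_map(x)) for x in test_points]
for x, ml, mp in decisions:
    print(x, ml, mp)             # both decisions coincide at every point
```

With unequal priors the two decisions can differ near the class boundary, which is why the equal-prior assumption is needed for the equivalence.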