Classification Error = 1 – max{ p_j }

where p_j is the probability of the class value j.
For example, given that

    Prob(Bus)   = 4 / 10 = 0.4   # 4 Bus rows out of 10
    Prob(Car)   = 3 / 10 = 0.3   # 3 Car rows out of 10
    Prob(Train) = 3 / 10 = 0.3   # 3 Train rows out of 10

we can now compute the classification error as

Classification error = 1 – max{0.4, 0.3, 0.3} = 1 – 0.4 = 0.6

Similar to Entropy and the Gini Index, the classification error index of a pure table (one consisting of a single class) is zero, because the probability of that class is 1 and 1 – max{1} = 0.
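To make the computation concrete, here is a minimal Python sketch of the formula. The function name classification_error and the example label list are illustrative choices, not part of the original tutorial:

    from collections import Counter

    def classification_error(labels):
        # Classification error = 1 - max{p_j}, where p_j is the
        # proportion of rows whose class value is j.
        counts = Counter(labels)
        total = len(labels)
        return 1 - max(count / total for count in counts.values())

    # The 10-row transport example above:
    rows = ["Bus"] * 4 + ["Car"] * 3 + ["Train"] * 3
    print(classification_error(rows))           # 1 - max{0.4, 0.3, 0.3} = 0.6

    # A pure table (single class) has zero classification error:
    print(classification_error(["Bus"] * 10))   # 1 - max{1.0} = 0.0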
The value of the classification error index always lies between 0 and 1.
In fact, the maximum Gini index for a given number of classes is always equal to the maximum classification error index. For n classes, both indices are maximized when every class is equally likely, that is, when p = 1/n for each class. The maximum Gini index is then

1 – n×(1/n)² = 1 – 1/n

while the maximum classification error index is likewise

1 – max{1/n} = 1 – 1/n

Knowing how to compute the degree of impurity, we are now ready to proceed with the decision tree algorithms.
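Before moving on, the equality of the two maxima can be checked numerically. In this sketch both helpers operate directly on a probability vector (a plain Python list) rather than on raw rows; gini_index and classification_error are again illustrative names:

    def gini_index(probs):
        # Gini index = 1 - sum of p_j squared
        return 1 - sum(p * p for p in probs)

    def classification_error(probs):
        # Classification error = 1 - max{p_j}
        return 1 - max(probs)

    # At the uniform distribution p_j = 1/n, both indices reach
    # their common maximum value of 1 - 1/n:
    for n in (2, 3, 5, 10):
        uniform = [1 / n] * n
        assert abs(gini_index(uniform) - (1 - 1 / n)) < 1e-12
        assert abs(classification_error(uniform) - (1 - 1 / n)) < 1e-12
        print(n, gini_index(uniform), classification_error(uniform))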