Classification error = 1 – max{p_j}
where p_j is the probability of class value j.
For example, given that
Prob( Bus )   = 4 / 10 = 0.4   # 4 Bus out of 10 rows
Prob( Car )   = 3 / 10 = 0.3   # 3 Car out of 10 rows
Prob( Train ) = 3 / 10 = 0.3   # 3 Train out of 10 rows
we can now compute the classification error as
Classification error
= 1 – max{0.4, 0.3, 0.3}
= 1 – 0.4 = 0.6
Similar to Entropy and the Gini Index, the classification error index of a pure table (consisting of a single class) is zero, because the maximum probability is 1 and 1 – max{1} = 0.
The value of the classification error index is always between 0 and 1.
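To make the computation concrete, here is a minimal Python sketch; the function name classification_error is ours for illustration, not from any particular library:

def classification_error(counts):
    # Classification error = 1 - max(p_j), given raw class counts.
    total = sum(counts)
    return 1 - max(c / total for c in counts)

print(classification_error([4, 3, 3]))  # Bus/Car/Train example: 1 - 0.4 = 0.6
print(classification_error([10]))       # pure table: 1 - 1 = 0.0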
In fact, the maximum Gini index for a given number of classes is always equal to the maximum classification error index, because both maxima occur at the uniform distribution: for n classes, set each probability to p = 1/n. The maximum Gini index is then
1 – n×(1/n)² = 1 – 1/n
while the maximum classification error index is
1 – max{1/n} = 1 – 1/n
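As a quick sanity check of this equality, the sketch below evaluates both indices at the uniform distribution for a few values of n; the helper names gini and classification_error are our own:

def gini(probs):
    # Gini index = 1 - sum of p_j squared.
    return 1 - sum(p * p for p in probs)

def classification_error(probs):
    # Classification error = 1 - max(p_j).
    return 1 - max(probs)

for n in (2, 3, 4, 5):
    uniform = [1 / n] * n
    # All three printed values equal 1 - 1/n.
    print(n, gini(uniform), classification_error(uniform), 1 - 1 / n)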
Now that we know how to compute the degree of impurity, we are ready to proceed with the decision tree algorithms.