The naive Bayesian classification method we will implement in this lesson uses the standard naive Bayes algorithm, also described in Mitchell: Machine Learning, 1997 (pages 177-180). Essentially, if an instance is described with n attribute values a_i (i from 1 to n), then the class the naive Bayes classifier assigns to the instance, chosen from the set of possible classes V, is

v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_{i=1}^{n} P(a_i | v_j)
We will also compute the vector of elements

P(v_j) \prod_{i=1}^{n} P(a_i | v_j), \quad v_j \in V,

which, after normalization so that the sum of its elements equals 1, will (also following Mitchell) denote the class probabilities. The class probabilities and conditional probabilities in the above formulas are estimated from the training data: the class probability P(v_j) is equal to the relative frequency of class v_j, while the conditional probability of an attribute value given the class, P(a_i | v_j), is computed as the proportion of instances whose i-th attribute has the value a_i among the instances from class v_j.
To complicate things just a little bit, the m-estimate (see Mitchell, and Cestnik, IJCAI 1990) will be used instead of relative frequency when estimating the conditional probabilities. Following the example in Mitchell: when assessing P = P(Wind=strong | PlayTennis=no), we find that the total number of training examples with PlayTennis=no is n = 5, and that of these there are n_c = 3 for which Wind=strong. Using relative frequency, the corresponding probability estimate would be

P = \frac{n_c}{n} = \frac{3}{5} = 0.60
Relative frequency becomes problematic when the number of instances is small. To alleviate this, the m-estimate assumes that there are m additional imaginary cases (m is also referred to as the equivalent sample size), distributed according to a prior probability p of the attribute value. The conditional probability using the m-estimate is then computed as

P = \frac{n_c + m p}{n + m}
Often, instead of a uniform prior p, the relative frequency of the attribute value as estimated from the training data is used.
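To make the arithmetic concrete, here is a small stand-alone Python snippet that reproduces the numbers above; the helper functions and the choices m=2 and p=0.5 are ours, purely for illustration, and are not part of bayes.py.

# Worked example: relative frequency vs. m-estimate for
# P(Wind=strong | PlayTennis=no) from Mitchell's play-tennis data.

def relative_frequency(nc, n):
    """Plain relative frequency estimate nc/n."""
    return float(nc) / n

def m_estimate(nc, n, m, p):
    """m-estimate: (nc + m*p) / (n + m), i.e. m imaginary cases
    distributed according to the prior probability p."""
    return (nc + m * p) / (n + m)

nc, n = 3, 5   # 3 of the 5 PlayTennis=no examples have Wind=strong

print(relative_frequency(nc, n))        # 0.6
# illustrative choice: m=2 imaginary cases, uniform prior p=1/2
# (Wind has two possible values, weak and strong)
print(m_estimate(nc, n, m=2.0, p=0.5))  # (3 + 2*0.5) / (5 + 2) = 0.571...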
We will develop a module called bayes.py that will implement our naive Bayes learner and classifier. The structure of the module will be the same as in the previous example.
Again, we will implement two classes, one for learning and the other one for classification. Here is the Learner class:

class Learner_Class from bayes.py
Initialization of Learner_Class saves the two attributes: m and the name of the classifier. Notice that both parameters are optional; the default value for m is 0, which makes naive Bayes with the m-estimate equal to relative frequency unless the user specifies some other value for m. The __call__ function is called with the training data set; it computes the class and conditional probabilities and constructs the classifier, passing it the probabilities along with some other variables required for classification.
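The actual bayes.py listing is only referenced from this page, so below is a minimal sketch of what such a learner could look like, assuming the Orange 1.x API (orange.ExampleTable, domain.classVar, domain.attributes, and discrete values convertible to integer indices). Apart from the names Learner, Learner_Class and Classifier, it is our reconstruction, not the original code.

# bayes.py (sketch) -- the learner half; a reconstruction, not the original listing
import orange

def Learner(examples=None, **kwds):
    # convenience wrapper: if data is given, learn and return a classifier,
    # otherwise return an (uncalled) learner instance
    learner = Learner_Class(**kwds)
    if examples is not None:
        return learner(examples)
    return learner

class Learner_Class:
    def __init__(self, m=0.0, name='naive bayes with m-estimate'):
        self.m = m          # equivalent sample size for the m-estimate
        self.name = name

    def __call__(self, examples, weight=None):
        domain = examples.domain
        n_classes = len(domain.classVar.values)
        n = len(examples)

        # class probabilities: relative frequency of each class value
        class_counts = [0.0] * n_classes
        for e in examples:
            class_counts[int(e.getclass())] += 1
        p_class = [c / n for c in class_counts]

        # conditional probabilities p(ai | vj), one [value][class] table per
        # attribute, estimated with the m-estimate and a uniform prior p
        cond_prob = []
        for attr in domain.attributes:
            n_values = len(attr.values)
            p_uniform = 1.0 / n_values
            counts = [[0.0] * n_classes for _ in range(n_values)]
            for e in examples:
                if not e[attr].isSpecial():       # skip unknown values
                    counts[int(e[attr])][int(e.getclass())] += 1
            cond_prob.append(
                [[(counts[v][c] + self.m * p_uniform) / (class_counts[c] + self.m)
                  for c in range(n_classes)]
                 for v in range(n_values)])

        return Classifier(m=self.m, domain=domain, name=self.name,
                          p_class=p_class, cond_prob=cond_prob)

With this layout, cond_prob is indexed as [attribute][value][class], which is the nesting printed by show() in the interactive session below.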
class Classifier from bayes.py
Upon the first invocation, the classifier will store the values of the parameters it was called with (__init__). When called with a data instance, it will first compute the class probabilities from the class and conditional probabilities passed to it by the learner. The probabilities are normalized so that they sum to 1. The class with the highest probability is then found, and the classifier accordingly predicts this class. Notice that we have also added a method called show, which reports on m, the class probabilities, and the conditional probabilities:
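Since the original listing is again not shown on this page, here is a matching Classifier sketch under the same Orange 1.x API assumptions; orange.GetValue, orange.GetProbabilities, orange.GetBoth and orange.Value are assumed from that API, the rest is our reconstruction.

# bayes.py (sketch, continued) -- the classifier half; a reconstruction
import orange

class Classifier:
    def __init__(self, **kwds):
        # store whatever the learner passed in
        # (m, domain, name, p_class, cond_prob)
        self.__dict__.update(kwds)

    def __call__(self, example, result_type=orange.GetValue):
        # start from the class probabilities and multiply in the conditional
        # probabilities of the instance's attribute values
        p = list(self.p_class)
        for i, attr in enumerate(self.domain.attributes):
            if not example[attr].isSpecial():
                v = int(example[attr])
                for c in range(len(p)):
                    p[c] *= self.cond_prob[i][v][c]

        # normalize so that the scores sum to 1
        total = sum(p)
        if total > 0:
            p = [x / total for x in p]

        # predict the class with the highest probability
        best = p.index(max(p))
        value = orange.Value(self.domain.classVar, best)
        if result_type == orange.GetValue:
            return value
        if result_type == orange.GetProbabilities:
            return p
        return (value, p)     # orange.GetBoth

    def show(self):
        print('m= %s' % self.m)
        print('class prob= %s' % self.p_class)
        print('cond prob= %s' % self.cond_prob)

Returning plain Python lists for the probabilities keeps the sketch simple; the original code may well use Orange's own probability distributions instead.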
uses voting.tab
> python
>>> import orange, bayes
>>> data = orange.ExampleTable("voting")
>>> classifier = bayes.Learner(data)
>>> classifier.show()
m= 0.0
class prob= [0.38620689655172413, 0.61379310344827587]
cond prob= [[[0.79761904761904767, 0.38202247191011235], ...]]
>>>
The following script tests our naive Bayes and compares it to 10-nearest neighbors. Running the script (do try it yourself) reports classification accuracies of just about 90%; somewhat to our chagrin, on this data set kNN does slightly better.
bayes_test.py (uses bayes.py and voting.tab)
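The test script is likewise only referenced from the page; a minimal version could look like the sketch below, assuming the Orange 1.x helper modules orngTest and orngStat and the built-in orange.kNNLearner. The m=2 setting is an arbitrary illustrative choice.

# bayes_test.py (sketch) -- compares our naive Bayes with 10-nearest neighbors
import orange, orngTest, orngStat
import bayes

data = orange.ExampleTable("voting")

nb = bayes.Learner(m=2)
nb.name = 'naive bayes (m=2)'

knn = orange.kNNLearner(k=10)
knn.name = '10-NN'

learners = [nb, knn]

# 10-fold cross validation, report classification accuracy for each learner
results = orngTest.crossValidation(learners, data, folds=10)
for learner, ca in zip(learners, orngStat.CA(results)):
    print('%-20s %5.3f' % (learner.name, ca))

The exact accuracies will vary a little with the random assignment of instances to folds.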