This code implements Naive Bayes spam classification on the SMS Spam Collection Data Set from the UCI Machine Learning Repository.

This version of Naive Bayes is based on the Bernoulli document model, in which each message is represented by the presence or absence of each vocabulary word. The word probabilities were computed by maximum a posteriori (MAP) parameter estimation, with a Beta(2,1) distribution as the prior.
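As a rough illustration of the estimation step, the sketch below computes MAP word probabilities under a Beta(a, b) prior and scores a message with the Bernoulli model. It is not taken from this repository: the function names (`map_word_probs`, `bernoulli_log_likelihood`, `classify`), the toy data layout, and the `eps` clamp are all assumptions made for the example. For a word that appears in k of the n messages of a class, the MAP estimate is (k + a - 1) / (n + a + b - 2), which for Beta(2,1) reduces to (k + 1) / (n + 1).

```python
import math


def map_word_probs(docs, vocab, a=2.0, b=1.0):
    """MAP estimate of P(word present | class) under a Beta(a, b) prior.

    docs  : list of sets of words, one set per message of a single class
    vocab : iterable of all vocabulary words
    """
    n = len(docs)
    probs = {}
    for word in vocab:
        k = sum(1 for doc in docs if word in doc)  # messages containing word
        probs[word] = (k + a - 1) / (n + a + b - 2)
    return probs


def bernoulli_log_likelihood(doc_words, probs, eps=1e-9):
    """Log-likelihood of a message under the Bernoulli document model.

    With Beta(2,1) the estimate equals 1 when a word occurs in every
    message of a class, so probabilities are clamped to (eps, 1 - eps)
    to keep the logarithms finite. The clamp is an assumption of this
    sketch, not something stated in the description above.
    """
    total = 0.0
    for word, p in probs.items():
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p) if word in doc_words else math.log(1.0 - p)
    return total


def classify(doc_words, class_docs, vocab):
    """Return the label with the highest log posterior score."""
    n_total = sum(len(docs) for docs in class_docs.values())
    best_label, best_score = None, -math.inf
    for label, docs in class_docs.items():
        probs = map_word_probs(docs, vocab)
        score = math.log(len(docs) / n_total)  # log class prior
        score += bernoulli_log_likelihood(doc_words, probs)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only.
training = {
    "spam": [{"free", "win"}, {"free", "cash"}],
    "ham": [{"hello", "meeting"}, {"lunch", "hello"}, {"free", "lunch"}],
}
vocabulary = set().union(*(doc for docs in training.values() for doc in docs))
predicted = classify({"free", "win"}, training, vocabulary)
```

Working in log space avoids numerical underflow when the vocabulary is large, since the Bernoulli model multiplies one factor per vocabulary word, including the words that are absent from the message.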

The preprocessing involved the following steps:

1) Removal of trailing spaces

2) Removal of non-words

3) Removal of stop words
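The three steps above could be sketched roughly as follows. This is only an illustration: the `preprocess` function, the small `STOP_WORDS` set, the lowercasing, and the choice to treat any token that is not purely alphabetic as a non-word are assumptions, not details given in this description.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = frozenset(
    {"a", "an", "the", "you", "have", "is", "to", "and", "of", "in", "i", "it"}
)


def preprocess(message):
    """Apply the three preprocessing steps to one SMS message."""
    # 1) Remove trailing spaces.
    text = message.rstrip()
    # 2) Remove non-words: keep alphabetic runs only (assumed definition),
    #    after lowercasing so that matching is case-insensitive.
    tokens = re.findall(r"[a-z]+", text.lower())
    # 3) Remove stop words.
    return [token for token in tokens if token not in STOP_WORDS]


cleaned = preprocess("WINNER!! You have won a 1000 prize   ")
```

For the Bernoulli document model only the presence of a word matters, so the resulting token list can be converted to a `set` before computing word probabilities.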