Naive Bayes

Maths behind Naive Bayes

Introduction

The Naive Bayes algorithm is a simple and efficient probabilistic classification algorithm that is based on Bayes' theorem. It is widely used in various machine learning applications, particularly in natural language processing and text classification.

Maths behind it

The key idea behind Naive Bayes is to use Bayes' theorem to calculate the probability that a given data point belongs to a particular class. Bayes' theorem is expressed as:

P(C|X) = {P(X|C) . P(C)} / P(X)

Here,
  • P(C|X) is the posterior probability that data point X belongs to class C
  • P(X|C) is the likelihood of observing data point X given that it belongs to class C
  • P(C) is the prior probability of class C
  • P(X) is the marginal likelihood of observing data point X

In the context of Naive Bayes, we make a naive assumption that the features used to describe the data are conditionally independent given the class label. This simplifies the calculation of P(X|C) as the product of the probabilities of individual features:

P(X|C) = P(x1|C) . P(x2|C) ... P(xn|C)
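As a quick illustration of this product rule, here is a minimal sketch with made-up per-feature likelihoods (the values are purely illustrative):

```python
from math import prod

# Hypothetical per-feature likelihoods P(x1|C), P(x2|C), P(x3|C) for one class
feature_likelihoods = [0.5, 0.2, 0.8]

# Naive-independence assumption: P(X|C) is the product of the per-feature terms
likelihood = prod(feature_likelihoods)
print(likelihood)  # ~0.08
```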

How the classification works

  1. Calculate the prior probabilities P(C) for each class based on the training data
  2. For each feature xi in the data point X, calculate P(xi|C) for each class C based on the training data
  3. Use Bayes' theorem to calculate the posterior probabilities P(C|X) for each class C
  4. Assign the data point X to the class with the highest posterior probability
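The four steps above can be sketched as a tiny classifier for categorical features. This is a minimal, illustrative implementation (the function names `train` and `predict` are my own); P(X) is dropped because it is the same constant for every class and does not affect which class wins:

```python
from collections import Counter, defaultdict

def train(rows):
    """rows: list of (feature_tuple, label). Step 1 and 2: count priors and likelihoods."""
    priors = Counter(label for _, label in rows)          # class counts for P(C)
    counts = defaultdict(Counter)                         # (feature index, label) -> value counts
    for features, label in rows:
        for i, value in enumerate(features):
            counts[(i, label)][value] += 1
    return priors, counts, len(rows)

def predict(features, priors, counts, total):
    """Steps 3 and 4: score each class by P(C) * product of P(xi|C), pick the max."""
    best_label, best_score = None, -1.0
    for label, label_count in priors.items():
        score = label_count / total                       # prior P(C)
        for i, value in enumerate(features):
            score *= counts[(i, label)][value] / label_count  # likelihood P(xi|C)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# The weather dataset from the example below
data = [(("Sunny",), "Yes"), (("Windy",), "Yes"), (("Sunny",), "Yes"),
        (("Rain",), "No"), (("Windy",), "No"), (("Windy",), "No"),
        (("Windy",), "No"), (("Rain",), "Yes"), (("Sunny",), "No"),
        (("Rain",), "No")]
model = train(data)
print(predict(("Sunny",), *model))  # Yes
```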

Example (Most important)

Let's assume the following dataset records whether someone played a match on a particular day:

Weather   Can Play?
Sunny     Yes
Windy     Yes
Sunny     Yes
Rain      No
Windy     No
Windy     No
Windy     No
Rain      Yes
Sunny     No
Rain      No


Likelihood table

Weather   Yes         No          Probability
Sunny     2           1           3/10 = 0.3
Windy     1           3           4/10 = 0.4
Rain      1           2           3/10 = 0.3
All       4/10 = 0.4  6/10 = 0.6
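These counts can be verified straight from the ten dataset rows (a quick sketch using Python's `collections.Counter`):

```python
from collections import Counter

# The (Weather, Can Play?) rows from the dataset above
data = [("Sunny", "Yes"), ("Windy", "Yes"), ("Sunny", "Yes"),
        ("Rain", "No"), ("Windy", "No"), ("Windy", "No"),
        ("Windy", "No"), ("Rain", "Yes"), ("Sunny", "No"),
        ("Rain", "No")]

weather_counts = Counter(data)                    # joint (weather, label) counts
label_counts = Counter(label for _, label in data)

print(weather_counts[("Sunny", "Yes")])   # 2
print(label_counts["Yes"] / len(data))    # 0.4, the prior P(Yes)
```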


Let's say we want to find the probability that the player plays on a sunny day, P(Yes|Sunny).
Recall that P(C|X) is the posterior probability that data point X belongs to class C; here X is Sunny and C is the class Yes.
P(Yes|Sunny) = {P(Sunny|Yes) . P(Yes)} / P(Sunny)
Here,
P(Yes) = 0.4 # row All
P(Sunny) = 0.3 # row Sunny
P(Sunny|Yes) = Number of times it was Sunny with Yes / Total number of Yes
= 2 / 4 = 0.5

Thus,
P(Yes|Sunny) = 0.5 * 0.4 / 0.3 = 0.67

Similarly, the probability of the player not playing on a sunny day is P(No|Sunny):
P(No|Sunny) = {P(Sunny|No) . P(No)} / P(Sunny)
Here,
P(No) = 0.6 # row All
P(Sunny) = 0.3 # row Sunny
P(Sunny|No) = Number of times it was Sunny with No / Total number of No
= 1 / 6 = 0.17

Thus,
P(No|Sunny) = 0.17 * 0.6 / 0.3 = 0.33
Since P(Yes|Sunny) > P(No|Sunny), i.e. 0.67 > 0.33, the player will play. (As a sanity check, the two posteriors sum to 1.)
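The calculation can be checked in a few lines, reading the priors and likelihoods straight off the dataset counts (2 of the 4 "Yes" days and 1 of the 6 "No" days were Sunny):

```python
# Posterior for each class on a Sunny day, via Bayes' theorem
p_yes, p_no = 4 / 10, 6 / 10   # priors P(Yes), P(No)
p_sunny = 3 / 10               # evidence P(Sunny)
p_sunny_given_yes = 2 / 4      # likelihood P(Sunny|Yes)
p_sunny_given_no = 1 / 6       # likelihood P(Sunny|No)

posterior_yes = p_sunny_given_yes * p_yes / p_sunny
posterior_no = p_sunny_given_no * p_no / p_sunny
print(round(posterior_yes, 2), round(posterior_no, 2))  # 0.67 0.33
```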

Cost function

Unlike many other machine learning algorithms, Naive Bayes does not have a cost function that is optimized during training. Instead, it relies on probability calculations and Bayes' theorem to make classification decisions. The goal of Naive Bayes is to maximize the posterior probability P(C|X) over the classes C to make accurate predictions.
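Because P(X) is identical for every class, it can be dropped from the comparison: the classifier simply picks the class with the largest unnormalized posterior P(X|C) . P(C). A minimal sketch using the dataset counts from the example (2 of 4 "Yes" days and 1 of 6 "No" days were Sunny):

```python
# Decision rule: argmax over unnormalized posteriors; P(Sunny) cancels out
scores = {
    "Yes": (2 / 4) * (4 / 10),  # P(Sunny|Yes) * P(Yes)
    "No":  (1 / 6) * (6 / 10),  # P(Sunny|No)  * P(No)
}
prediction = max(scores, key=scores.get)
print(prediction)  # Yes
```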

Note: Parts of this article were developed using ChatGPT.
