Logistic Regression
Introduction
Logistic Regression is a popular machine learning algorithm used for binary classification tasks, where the goal is to predict one of two possible outcomes (e.g., yes/no, 1/0, spam/not spam). At its core, Logistic Regression models the relationship between a set of independent variables (features) and the probability of a particular outcome. It's called "logistic" because it uses the logistic function (or sigmoid function) to map any real-valued number into a value between 0 and 1. This makes it suitable for estimating probabilities.
Sigmoid Function
The sigmoid (logistic) function converts a linear combination of the inputs into a probability:

p = 1 / (1 + e^(-y)), where y = θ0X0 + θ1X1 + θ2X2 + ... + θnXn

- y is the linear combination of the coefficients and the input features.
- θ0, θ1, θ2, ... θn are the coefficients learned from the training data.
- X0, X1, X2, ... Xn are the input features (X0 is conventionally set to 1 so that θ0 acts as the intercept).
- p is the predicted probability that the outcome is 1.
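As a rough sketch of these two formulas in NumPy (the names sigmoid and predict_proba are illustrative, and X is assumed to carry a leading column of ones for X0):

```python
import numpy as np

def sigmoid(y):
    # Map any real-valued input into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-y))

def predict_proba(theta, X):
    # Linear combination y = θ0*X0 + θ1*X1 + ... + θn*Xn, then squashed by the sigmoid
    y = X @ theta
    return sigmoid(y)

# Example: two features plus the intercept column (X0 = 1)
theta = np.array([-1.0, 2.0, 0.5])      # θ0, θ1, θ2
X = np.array([[1.0, 0.3, 1.2],
              [1.0, 1.5, 0.4]])
print(predict_proba(theta, X))          # probabilities strictly between 0 and 1
```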
Decision Boundary
To convert the predicted probability p into a class label, a threshold (e.g., 0.5) is typically chosen. If p is greater than the threshold, the predicted outcome is 1; otherwise, it's 0.
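A minimal sketch of this rule, reusing the illustrative predict_proba helper from the sigmoid section (not a fixed API):

```python
def predict(theta, X, threshold=0.5):
    # Predict class 1 when the estimated probability exceeds the threshold, else class 0
    p = predict_proba(theta, X)
    return (p > threshold).astype(int)
```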
Cost Function
In logistic regression, the cost function, often referred to as the log loss or cross-entropy loss, measures the error between the predicted probabilities and the actual binary outcomes (0 or 1). The goal is to find the values of the coefficients that minimize this error:

J(θ) = -(1/n) Σ [yi log(pi) + (1 - yi) log(1 - pi)], summed over i = 1, ..., n

Here:
- J(θ) is the cost function to be minimized
- n is the number of training examples
- yi is the actual binary outcome (0 or 1) for the ith training example
- pi is the predicted probability that the ith example belongs to class 1
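Assuming the same illustrative predict_proba helper as above, the log loss can be computed directly from its formula:

```python
def log_loss(theta, X, y_true, eps=1e-12):
    # Average cross-entropy between the actual labels and the predicted probabilities
    p = np.clip(predict_proba(theta, X), eps, 1 - eps)   # clip to avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
```

Lower values indicate that the predicted probabilities agree better with the actual labels, and an optimizer such as gradient descent is typically used to find the coefficients θ that minimize this loss.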