What is regularization in machine learning? |Overfitting |Underfitting | Bias |Variance

Question

asked Apr 2 in Artificial Intelligence(AI) & Machine Learning by Nisha Goeduhub's Expert (3.1k points)
edited Apr 3 by Nisha

This article covers regularization, a basic concept in machine learning: how it works, and two techniques for applying it, namely ridge regression and lasso regression.

1 Answer

answered Apr 3 by Nisha Goeduhub's Expert (3.1k points)
edited Apr 3 by Nisha

Best answer

Before reading this answer, you should know the concepts of underfitting, overfitting, bias, and variance in machine learning.

Regularization:

Regularization is one of the most basic and important concepts in machine learning. We know that an overfitted model tends to have low accuracy and high error on unseen data.

This happens because the model tries too hard to capture the noise and irrelevant detail in the training dataset.

Noise consists of data points that don't represent the true properties of the data but only random variation. Fitting this noise leads to high variance (the part of the prediction error that comes from sensitivity to the particular training set) and low bias.

In overfitting we get high error on the test data and low error on the training data, because the model does not generalize to test or unseen data.

There are several techniques to avoid or deal with overfitting, for example cross-validation, balancing bias and variance, ensemble algorithms, and regularization.

Here we will discuss regularization as a technique for dealing with overfitting.

Regularization means making things regular or acceptable, and that is exactly what we do here. Overfitting occurs mostly when we train a complex model. In simple terms, regularization discourages learning a more complex or flexible model, so as to avoid the risk of overfitting.

By definition regularization is the process of adding information in order to solve an ill-posed problem or to prevent overfitting.

How it works?

To understand regularization, let's take a look at the simple linear regression equation.

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$

where Y represents the predicted value,

β1, β2, ..., βp represent the weights (coefficient estimates) attached to the features,

β0 represents the bias (intercept) of the model, and

X1, X2, ..., Xp are the features.

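The loss function for this fitting procedure is the residual sum of squares (RSS):

$$\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2$$

where y_i is the actual value for the i-th of n training examples and x_ij is the value of feature Xj for that example.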
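The prediction and the RSS described above can be sketched in a few lines of plain Python. This is only an illustration: the function names `predict` and `rss` and the toy data are made up for this example and are not part of the original answer.

```python
def predict(beta0, betas, x):
    """Return beta0 + sum_j betas[j] * x[j] for a single example x."""
    return beta0 + sum(b * xj for b, xj in zip(betas, x))


def rss(beta0, betas, X, y):
    """Residual sum of squares over all training examples."""
    return sum((yi - predict(beta0, betas, xi)) ** 2 for xi, yi in zip(X, y))


# Toy data (made up): y is exactly 1 + 2*x1 + 3*x2
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [3.0, 4.0, 6.0, 8.0]

perfect_fit = rss(1.0, [2.0, 3.0], X, y)  # the true coefficients
worse_fit = rss(0.0, [2.0, 3.0], X, y)    # bias deliberately set wrong by 1
print(perfect_fit, worse_fit)  # 0.0 4.0
```

With the true coefficients every residual is zero, so the RSS is 0. With the bias off by 1, each of the four residuals is 1, so the RSS is 4.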

Here we adjust the coefficients (weights) based on the training data. If the training data contains noise, coefficients fitted this way won't generalize well to test or unseen data.

This is the situation where we use regularization, which shrinks (regularizes) these learned estimates towards zero. The parameters (weights and bias) are still optimized to reduce the RSS (error) so that the predicted Y is close to the actual value, but large coefficients are now penalized.

Regularization Techniques

Two commonly used regularization techniques are:

Ridge Regression
Lasso Regression

Ridge Regression

Ridge regression is a type of linear regression in which the RSS is modified by adding a shrinkage quantity. Consider the ridge regression cost function given below:

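$$\sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 = \mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$$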

The formula is the RSS (residual sum of squares) with one additional term: λ multiplied by the sum of the squared coefficients of the individual features.

This additional term is known as the penalty term, and the amount of bias it adds is known as the ridge regression penalty.

By adding the penalty term we regularize the coefficients of the model. Ridge regression thus reduces the magnitudes of the coefficients, which decreases the complexity of the model.

If λ tends to zero, the cost function becomes the cost function of ordinary linear regression. As λ grows, the coefficients are shrunk more strongly towards zero. Hence the value of λ plays an important role, and selecting a good value of λ is critical.

It is also called L2 regularization and is used when the model has many parameters and there is collinearity among the features.
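The effect of λ can be checked numerically. The sketch below assumes NumPy is available and uses the standard closed-form ridge solution on centered data (a textbook result, not something stated in the original answer), leaving the bias β0 unpenalized. The function name `ridge_fit`, the variable `lam` (standing for λ), and the synthetic, nearly collinear data are all made up for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Minimize RSS + lam * sum(beta_j ** 2); the bias beta0 is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centering takes care of the bias term
    p = X.shape[1]
    betas = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    beta0 = y_mean - x_mean @ betas
    return beta0, betas


# Synthetic data (made up): two nearly collinear features
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)
X = np.column_stack([x1, x2])
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=50)

_, betas_ols = ridge_fit(X, y, lam=0.0)      # lam = 0: ordinary least squares
_, betas_mild = ridge_fit(X, y, lam=1.0)
_, betas_strong = ridge_fit(X, y, lam=100.0)

for name, b in [("lam=0", betas_ols), ("lam=1", betas_mild), ("lam=100", betas_strong)]:
    print(name, b, "norm:", np.linalg.norm(b))
```

With `lam=0.0` the result coincides with ordinary least squares, and the norm of the coefficient vector decreases as `lam` increases. In practice λ is usually chosen by cross-validation rather than by hand.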

Lasso Regression:

Lasso is also a type of regression; the name stands for Least Absolute Shrinkage and Selection Operator. The cost function for lasso regression is:

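$$\sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j| = \mathrm{RSS} + \lambda \sum_{j=1}^{p} |\beta_j|$$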

As you can see, it is similar to ridge regression except that the penalty term contains the absolute values of the weights instead of their squares.

It is also known as L1 regularization. Like ridge, lasso regression is used to reduce overfitting and regularize the coefficients.

The difference between ridge and lasso regression is that lasso tends to shrink some coefficients to exactly zero, which effectively removes those features from the model, whereas ridge shrinks coefficients towards zero but generally does not set them exactly to zero.
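To make the difference between the two penalty terms concrete, here is a small plain-Python sketch. The function names `l1_penalty` and `l2_penalty` and the example coefficients are hypothetical, chosen only for illustration; the bias β0 is not part of either penalty.

```python
def l1_penalty(betas, lam):
    """Lasso penalty: lam * sum of absolute values of the coefficients."""
    return lam * sum(abs(b) for b in betas)


def l2_penalty(betas, lam):
    """Ridge penalty: lam * sum of squared coefficients."""
    return lam * sum(b ** 2 for b in betas)


betas = [0.5, -2.0, 0.0]  # made-up coefficients
lam = 1.0

print(l1_penalty(betas, lam))  # 2.5  = 0.5 + 2.0 + 0.0
print(l2_penalty(betas, lam))  # 4.25 = 0.25 + 4.0 + 0.0
```

Note how the squared penalty punishes the large coefficient (-2.0) much more than the absolute penalty does, while the small coefficient (0.5) is penalized less.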
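This behaviour can be seen in the simplest possible case. The sketch below assumes a single feature and no bias term, where both cost functions have simple closed-form minimizers (the lasso one is the standard soft-thresholding formula). The function names `ridge_coef` and `lasso_coef`, the variable `lam` (standing for λ), and the toy data are made up for this illustration.

```python
def ridge_coef(x, y, lam):
    """Minimizer of sum((y_i - b*x_i)**2) + lam * b**2 (one feature, no bias)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def lasso_coef(x, y, lam):
    """Minimizer of sum((y_i - b*x_i)**2) + lam * abs(b), via soft-thresholding."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)  # clipped at zero
    return (shrunk if sxy >= 0 else -shrunk) / sxx


# Toy data (made up) with a weak relationship: sum(x*y) = 2.0, sum(x*x) = 14.0
x = [1.0, 2.0, 3.0]
y = [0.5, 0.0, 0.5]

for lam in [0.0, 2.0, 4.0, 8.0]:
    print(lam, ridge_coef(x, y, lam), lasso_coef(x, y, lam))
```

For `lam = 0.0` both functions return the ordinary least-squares coefficient. Once `lam` reaches 4.0, the lasso coefficient is exactly 0.0, while the ridge coefficient only becomes smaller and stays positive.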
