**December 24, 2020**

In ridge regression, the loss function adds a penalty term to the RSS (or other error function): lambda (a constant) times the sum of the squared coefficients of the independent variables. Because the loss now grows with the magnitude of the coefficients, minimizing it, for example with gradient descent, yields coefficients that are smaller in overall magnitude than they would be without the penalty. This shrinkage reduces overfitting.
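The effect of the penalty can be illustrated with a minimal sketch, assuming a tiny hand-made dataset, no intercept, and a fixed learning rate. The function name `ridge_gradient_descent` and the values of `lr`, `steps`, and `lam` are illustrative choices, not part of the original post.

```python
def ridge_gradient_descent(X, y, lam, lr=0.005, steps=5000):
    """Minimize RSS + lam * sum(w_j ** 2) by plain gradient descent (no intercept)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(steps):
        # Residual for each row: r_i = y_i - x_i . w
        residuals = [yi - sum(wj * xij for wj, xij in zip(w, xi))
                     for xi, yi in zip(X, y)]
        # Gradient of RSS is -2 * sum(r_i * x_ij); gradient of the penalty is 2 * lam * w_j
        grad = [
            -2.0 * sum(r * xi[j] for r, xi in zip(residuals, X)) + 2.0 * lam * w[j]
            for j in range(n_features)
        ]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy data (assumed for illustration): y = 2 * x1 + 1 * x2 exactly
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [4.0, 5.0, 10.0, 11.0]

w_plain = ridge_gradient_descent(X, y, lam=0.0)   # no penalty: ordinary least squares
w_ridge = ridge_gradient_descent(X, y, lam=10.0)  # with the ridge penalty

size_plain = sum(wj ** 2 for wj in w_plain)
size_ridge = sum(wj ** 2 for wj in w_ridge)
print(w_plain, size_plain)
print(w_ridge, size_ridge)
```

On this toy data the sum of squared coefficients is smaller with the penalty than without it. Note that the shrinkage applies to the coefficients as a whole; an individual coefficient is not guaranteed to decrease.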

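It is generally used when multicollinearity is present among the independent variables, since the penalty stabilizes coefficient estimates that would otherwise vary widely. The value of lambda is a hyperparameter: it is not learned from the training data but chosen by cross-validation.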
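Choosing lambda by cross-validation might look like the sketch below. It assumes a single feature with no intercept, so the ridge slope has a simple closed form, and it uses made-up data, interleaved folds, and a hand-picked candidate grid. The helper names `fit_ridge_1d` and `cv_error` are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge slope for one feature and no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cv_error(xs, ys, lam, k=3):
    """Mean validation MSE over k interleaved folds."""
    fold_errors = []
    for fold in range(k):
        pairs = list(zip(xs, ys))
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        valid = [p for i, p in enumerate(pairs) if i % k == fold]
        # Fit on the training folds only, then score on the held-out fold
        w = fit_ridge_1d([x for x, _ in train], [y for _, y in train], lam)
        fold_errors.append(sum((y - w * x) ** 2 for x, y in valid) / len(valid))
    return sum(fold_errors) / k


# Made-up noisy data, roughly y = 2 * x
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [1.4, 1.7, 3.4, 3.8, 5.3, 5.8]

candidate_lambdas = [0.0, 0.01, 0.1, 1.0, 10.0]
scores = {lam: cv_error(xs, ys, lam) for lam in candidate_lambdas}
best_lambda = min(scores, key=scores.get)
print(scores)
print("selected lambda:", best_lambda)
```

The selected lambda is simply the candidate with the lowest average validation error; in practice the grid, the number of folds, and the error metric are all choices to be made for the problem at hand.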

By: Monis Khan
