Machine Learning

December 24, 2020

What is normalization?

Normalization basically means rescaling the data so that it falls in a smaller range. In machine learning problems normalization is used to change the values of numeric columns in the dataset to use a common scale, without distorting the differences in range values or losing information. Suppose you’re working on a problem where one attribute may be in kilograms, other in meters while yet another is present as hours. Owing to the difference in the units they will have difference in magnitude, for ex- change of 1 unit in hour column may correspond to change of 2000 m in distance column.

Machine learning equations operate on certain basic assumptions and difference in magnitude of variance in input variables could adversely affect their performance. For example distance based algorithms and gradient descent algorithms malfunction when input variables are on significantly different scales. What we mean by malfunction is that the learning is dominated by variable whose magnitude of variation is higher than others.

by : Monis Khan

Quick Summary:

Normalization basically means rescaling the data so that it falls in a smaller range. In machine learning problems normalization is used to change the values of numeric columns in the dataset to use a common scale, without distorting the differences in range values or losing information. Suppose you’re working on a problem where one attribute […]