**January 8, 2021**

A decision tree starts by selecting a root node based on a given condition and splitting the data into non-overlapping subsets. The same process is then repeated on each newly formed branch, and it continues until we reach the desired result, i.e. an answer to our business question. The splitting condition is decided by the value of a cost function: at each stage, the split that leads to the maximum reduction in the cost function is chosen.
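To make the "maximum reduction" rule concrete, here is a minimal sketch of choosing a split on a single numeric feature. It assumes Gini impurity as the cost function (one common choice) and uses made-up data; the names `gini`, `best_split`, `ages`, and `bought` are purely illustrative and not taken from any library.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed cost function)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels, impurity=gini):
    """Return (threshold, reduction) for the split `value <= threshold`
    that most reduces the weighted impurity, or (None, 0.0) if no split helps."""
    n = len(labels)
    parent = impurity(labels)
    best_threshold, best_reduction = None, 0.0
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
        reduction = parent - weighted
        if reduction > best_reduction:
            best_threshold, best_reduction = threshold, reduction
    return best_threshold, best_reduction

# Hypothetical toy data: customer age and whether they bought a product.
ages = [22, 25, 30, 35, 40, 50]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, reduction = best_split(ages, bought)
print(threshold, reduction)
```

On this toy data the split `age <= 30` separates the two classes completely, so it gives the largest reduction. A full tree would apply `best_split` again to each resulting subset, and across every feature rather than just one.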

Here the cost function can be either entropy or Gini impurity. Performance-wise the two are similar, but Gini impurity is usually chosen when dealing with large datasets because it is less computationally intensive: unlike entropy, it does not require computing logarithms.
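As a rough illustration of the two cost functions, the sketch below computes both from a list of class proportions. The function names `entropy` and `gini_impurity` are illustrative, and the example proportions are invented for demonstration.

```python
import math

def entropy(proportions):
    """Shannon entropy in bits: sum of p * log2(1 / p) over classes with p > 0."""
    return sum(p * math.log2(1 / p) for p in proportions if p > 0)

def gini_impurity(proportions):
    """Gini impurity: 1 minus the sum of squared class proportions (no logarithm)."""
    return 1.0 - sum(p * p for p in proportions)

# Hypothetical two-class nodes: pure, mostly one class, evenly mixed.
for proportions in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    print(proportions, round(entropy(proportions), 3), round(gini_impurity(proportions), 3))
```

Both measures are zero for a pure node and largest for an evenly mixed one (entropy 1.0 and Gini 0.5 in the two-class case), which is why they tend to pick similar splits. The Gini version needs only multiplications and additions.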

By: Monis Khan
