Impurity criterion in Decision Tree

AI Maverick

MSE criterion in Regressor Tree

Introduction

The Decision Tree Regressor is an important machine learning model used in well-known Gradient Boosting Machines, including XGBoost, LightGBM, and GBM. Note that the Decision Tree Regressor is also used in most derivatives of the Gradient Boosting Machine.

Gradient Boosting Machines are ensemble models that use Decision Tree Regressors as their weak learners to predict the residuals.
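To make the role of the weak learner concrete, here is a minimal sketch of one boosting loop for squared loss, assuming scikit-learn's DecisionTreeRegressor; the synthetic data, learning rate, and tree depth are illustrative choices, not part of the original method description.

```python
# Minimal sketch of gradient boosting with regression trees as weak learners.
# Assumes scikit-learn's DecisionTreeRegressor; data and hyperparameters are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())   # start from a constant prediction
trees = []

for _ in range(50):
    residual = y - prediction            # pseudo-residuals for squared loss
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    prediction += learning_rate * tree.predict(X)  # each tree corrects the previous ones
    trees.append(tree)
```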

The focus of this article is to review the training approach of the Decision Tree Regressor, and how the tree structure and its leaves are built.

Terminology

  • Decision Tree Regressor: an individual regression model that applies binary rules to return the target value.
  • Root Node: the top node of a branch, containing the selected sample and features before any split.
  • Parent and Child Node: a divided node is the parent, and the resulting sub-nodes are its children.
  • Splitting: the strategy for dividing a node into further nodes.
  • Split criterion: the metric used to decide how a parent node is divided into child nodes.
  • Terminal Node: the last node of a branch, which is not split any further (also called a leaf).
  • Terminal region: the final region of the data covered by a terminal node and its branch.

Overall structure

The Decision Tree is designed to estimate the target by building a tree and applying binary rules in the building process. The binary rules are a collection of specific yes/no questions determined by the tree. The estimated values are stored at the end in the terminal, or leaf, nodes.

Sample of the Decision Tree Regressor
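
As a small illustration (not from the original article), we can fit a shallow regressor tree with scikit-learn and print its binary rules; each internal node asks one yes/no question about a feature, and each leaf stores the mean target of the samples that reach it.

```python
# Fit a shallow regression tree and print its binary rules.
# Illustrative example using scikit-learn's export_text; the data is synthetic.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(100, 1))
y = (X[:, 0] > 5).astype(float) * 3.0 + rng.normal(scale=0.2, size=100)

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["x0"]))  # the learned yes/no questions and leaf values
```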

Splitting

The vital part of a decision tree is the split criterion; depending on the problem (classification or regression), the tree considers a different criterion, which would be one of the following (a short scikit-learn style sketch follows the list):

  • Gini
  • entropy
  • log_loss
  • MSE
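
In scikit-learn, for example (an assumption of this sketch, since the article does not name a library), Gini, entropy, and log_loss belong to the classifier, while the MSE criterion of the regressor is called "squared_error":

```python
# How the criterion is selected in scikit-learn (illustrative; names depend on the library version).
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

clf = DecisionTreeClassifier(criterion="gini")          # or "entropy", "log_loss"
reg = DecisionTreeRegressor(criterion="squared_error")  # the MSE criterion
```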

As most of the Gradient Boosting methods consider the Regressor Tree as the base learner, we only review the MSE criterion for the regressor tree.

During tree building, at each node the algorithm calculates the Mean Squared Error of the candidate splits to decide how to build the following nodes. This single MSE value is the metric that tells the tree the quality of a split.

Mean squared error impurity criterion

The MSE is a regression metric that measures the mean of the squared errors, in simple words, the average of the squared differences between the predicted and the real values: MSE = (1/n) Σ (y_i - ŷ_i)². The ideal value is the lowest one, as close to zero as possible.
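
A direct translation of this definition into code (an illustrative helper, not part of any library) looks like this:

```python
# Mean Squared Error: the average squared difference between real and predicted values.
import numpy as np

def mse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([3.0, 2.5, 4.0], [2.8, 2.7, 3.5]))  # a small value means a good fit
```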

Evaluate the impurity

The impurity of the current node is evaluated with the MSE of the samples in that node, taking the node's mean target as the prediction.

MSE of the current node t: MSE(t) = (1/N_t) Σ_{i in t} (y_i - ȳ_t)², where ȳ_t is the mean target of the node.

The candidate split with the lowest MSE (weighted over the child nodes) is selected as the best split.
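
As a rough sketch of that selection (an illustration of the general CART idea, not scikit-learn's internal implementation), the tree scans candidate thresholds of a feature and keeps the one whose children have the lowest weighted MSE, using the mean target of each child as its prediction:

```python
# Brute-force search for the best split of a single feature by weighted child MSE.
# Illustrative sketch of the idea only; real implementations are more efficient.
import numpy as np

def node_mse(y):
    # Impurity of a node: MSE around the node's mean target.
    return float(np.mean((y - y.mean()) ** 2))

def best_split(x, y):
    best_threshold, best_score = None, np.inf
    for threshold in np.unique(x)[:-1]:          # the largest value cannot produce a right child
        left, right = y[x <= threshold], y[x > threshold]
        # Weighted MSE of the two children; lower means a better split.
        score = (len(left) * node_mse(left) + len(right) * node_mse(right)) / len(y)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

x = np.array([1.0, 2.0, 3.0, 8.0, 9.0, 10.0])
y = np.array([1.1, 0.9, 1.0, 5.2, 4.8, 5.0])
print(best_split(x, y))  # picks the threshold that separates the two groups
```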

Conclusion

In this review, we introduced a well-known ensemble model, the Gradient Boosting Machine, and the related terminology. The focus of this study was the split criterion for regression problems in the ensemble's base learner, the Decision Tree Regressor.
