• Tree Pruning is performed in order to remove anomalies in training data due to noise or outliers.
  • The pruned trees are smaller and less complex.
  • Pruning is a technique in machine learning that reduces the size of decision trees by removing sections of the tree that provide little power to classify instances.
  • The dual goal of pruning is reduced complexity of the final classifier as well as better predictive accuracy by the reduction of overfitting and removal of sections of a classifier that may be based on noisy or erroneous data.
  • One of the questions that arise in a decision tree algorithm is the optimal size of the final tree.
  • A tree that is too large risks overfitting the training data and poorly generalizing to new samples.
  • A small tree might not capture important structural information about the sample space.
  • It is hard to tell when a tree algorithm should stop because it is impossible to tell if the addition of a single extra node will dramatically decrease error.
  • A common strategy is to grow the tree until each node contains a small number of instances then use pruning to remove nodes that do not provide additional information.
  • Pruning should reduce the size of a learning tree without reducing predictive accuracy as measured by a test set or using cross-validation.
  • There are many techniques for tree pruning that differ in the measurement that is used to optimize performance.

Tree pruning approaches

  1. Prepruning - The tree is pruned by halting its construction early.
  2. Postpruning - This approach removes subtree form fully grown tree.

Pre-pruning

  • Based on statistical significance test.
  • Stop growing the tree when there is no statistically significant association between any attribute and the class at a particular node
  • Most popular test: chi-squared test
  • ID3 used chi-squared test in addition to information gain.
  • Only statistically significant attributes were allowed to be selected by information gain procedure.
  • Pre-pruning may stop the growth process prematurely: early stopping
  • Pre-pruning faster than post-pruning

Post-pruning

  • First, build full tree then, prune it.
  • Fully-grown tree shows all attribute interactions 
  • Problem: some subtrees might be due to chance effects
  • Two pruning operations:
  • Subtree replacement
  • Subtree raising

Possible strategies:

    • error estimation
    • significance testing
    • MDL principle
  • Subtree replacement selects a subtree and replaces it with a single leaf.
  • Subtree raising selects a subtree and replaces it with the child one ie, a "sub-subtree" replaces its parent)