Dan Steinberg's Blog

Regression Tree Ensembles

Many have asked if RandomForests (RF) supports regression analysis.

 

The short answer is: not with the current implementation. Salford Systems plans to support RF regression in our next release.

That said, if you have been thinking about RF regression, we urge you to consider using TreeNet regression instead. Some reasons follow:

  1. TreeNet was originally designed for regression rather than classification. Friedman's original name for the TreeNet technology was Multiple Additive Regression Trees.

  2. TreeNet is a superb performer on regression problems; we have used it in a number of demanding real-world applications.

  3. TreeNet develops multiple tree models but the trees are generally quite small and remain small regardless of the size of the training data file. By contrast, RF trees grow with the size of the training data and can become unmanageable, particularly in deployment.

  4. RF was originally designed for the classification problem, and much of the post-processing of the RF trees focuses on the class membership of the records (both in-bag and out-of-bag). None of this elaborate machinery is useful for regression.

  5. Leo Breiman left regression out of the original RF stream of his work. Only after years of focusing on the classification problem did he address regression, and this work was never completed. As a result, we do not know where Leo was going with RF regression, although we do know that he wanted to use a completely new code base for it. His co-author and collaborator, Adele Cutler, has also remained focused on classification; thus, RF regression has languished. (TreeNet regression thrives and is being enhanced on an ongoing basis.)

  6. TreeNet delivers useful "partial dependency plots" that reveal the true conditional relationship between the target Y and any predictor X and can also be used to definitively identify key interactions. We know of no other technology that can offer this kind of insight into the data-generating process. (TreeNet Pro Ex can do this automatically.)
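
To make points 3 and 6 a little more concrete, here is a minimal sketch in plain Python. It is not TreeNet and does not use any Salford Systems API; it is a generic least-squares boosting loop over one-split trees ("stumps"), followed by a hand-rolled partial dependence calculation. The function names, the learning rate, the number of trees, and the synthetic data are all assumptions made for illustration. Note that each tree stays a single split no matter how many rows the training data has.

```python
import random

def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) for the best single split."""
    n = len(X)
    total = sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between two equal values
            right_sum = total - left_sum
            # Maximizing this term is equivalent to minimizing squared error.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best[1:]  # assumes at least one valid split exists

def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean

def fit_boosted_stumps(X, y, n_trees=200, learn_rate=0.1):
    """Additive model: start at the mean, then add many small shrunken trees."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learn_rate * predict_stump(stump, row) for p, row in zip(pred, X)]
    return base, learn_rate, stumps

def predict(model, row):
    base, learn_rate, stumps = model
    return base + learn_rate * sum(predict_stump(s, row) for s in stumps)

def partial_dependence(model, X, feature, grid):
    """Average prediction over the data with one predictor forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)
            modified[feature] = value
            total += predict(model, modified)
        curve.append(total / len(X))
    return curve

# Synthetic data (an assumption for this sketch): y depends on x0 only.
random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * x0 + random.gauss(0, 0.1) for x0, x1 in X]

model = fit_boosted_stumps(X, y)
grid = [-0.8, -0.4, 0.0, 0.4, 0.8]
pd_x0 = partial_dependence(model, X, 0, grid)  # should rise steadily with x0
pd_x1 = partial_dependence(model, X, 1, grid)  # should be close to flat
```

Plotting `pd_x0` and `pd_x1` against `grid` gives the kind of curve described in point 6: the first tracks the dependence of Y on the predictor that matters, and the second stays roughly flat for the irrelevant predictor.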


Tags: Blog, Regression