Dan Steinberg's Blog

Why Use Cross-Validation?

Salford Predictive Modeler™ and its component data mining engines CART®, MARS®, TreeNet®, and RandomForests® contain a variety of tools to help modelers work quickly and efficiently. One of the most effective tools for rapid model development is found in the BATTERY tab of the MODEL Set Up dialog. Because there are so many tools embedded in that dialog, we are going to start a series of posts going through the principal BATTERY choices, one at a time.

Let's start with the idea of the BATTERY. The BATTERY mechanism is an automated system for running experiments and trying out different modeling ideas. Instead of you having to think about how you would like to tweak your model to try to make it better, the BATTERY does it for you. Each BATTERY is a planned experiment in which we take some useful modeling control and run a series of models that systematically vary that control. The best part is the SUMMARY, which provides an executive summary of the results and points you to the best performing model. We recommend that you use the BATTERY often; some modelers don't do anything without setting up pre-packaged or user-customized batteries.
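As a rough analogue outside of SPM, a battery amounts to looping over one modeling control, evaluating each setting the same way, and summarizing which setting performed best. The sketch below uses scikit-learn in Python with a generic decision tree standing in for CART; the data set, the choice of tree depth as the control, and the evaluation setup are all illustrative assumptions, not SPM's own mechanism.

    # A hand-rolled "battery": vary one control (here, tree depth), evaluate
    # each setting the same way, and report the best performer.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import roc_auc_score

    # Synthetic binary-classification data, purely for illustration.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    results = {}
    for depth in (2, 3, 4, 5, 6):            # systematically vary one control
        tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
        tree.fit(X_tr, y_tr)
        prob = tree.predict_proba(X_te)[:, 1]
        results[depth] = roc_auc_score(y_te, prob)

    best = max(results, key=results.get)     # the "summary": best setting found
    print("best max_depth = %d (test AUC %.3f)" % (best, results[best]))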

Today I want to address "Repeated Cross-Validation," or BATTERY CVR. Cross-validation is a method for testing models and arriving at honest assessments of their performance on future unseen data. Cross-validation is typically used when the training data set is small or when the number of events of interest is small. Although cross-validation was not invented by the CART authors Breiman, Friedman, Olshen, and Stone, they did show for the first time how this methodology could be used in the context of decision trees. The parts of the CART monograph devoted to cross-validation have had enormous impact in the field of data mining.
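To make the mechanics concrete, here is a minimal sketch of 10-fold cross-validation written in Python with scikit-learn; the synthetic data set and the generic decision tree standing in for CART are illustrative assumptions, not SPM's own implementation.

    # Minimal sketch of 10-fold cross-validation (scikit-learn decision tree
    # standing in for CART; the data set is synthetic and purely illustrative).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import StratifiedKFold
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import roc_auc_score

    X, y = make_classification(n_samples=215, n_features=10, random_state=0)

    # Divide the data into 10 folds; every record serves as test data exactly once.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    fold_auc = []
    for train_idx, test_idx in cv.split(X, y):
        tree = DecisionTreeClassifier(max_depth=4, random_state=0)
        tree.fit(X[train_idx], y[train_idx])            # train on 9 folds
        prob = tree.predict_proba(X[test_idx])[:, 1]    # score the held-out fold
        fold_auc.append(roc_auc_score(y[test_idx], prob))

    # The cross-validated estimate is the average over the 10 held-out folds.
    print("10-fold CV AUC: %.3f" % np.mean(fold_auc))

Every record is used for training in nine of the ten models and for testing in exactly one, which is how cross-validation lets us both train and test with all of the data.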

Why use cross-validation?

Sometimes you really have no choice. In one of the examples in the CART monograph the authors work with a medical data set containing just 215 patients. Such a small data set will not support a useful division into train and test partitions. So we use cross-validation, which lets us train with all the data and then indirectly test with all the data as well. The magic of cross-validation is explained in others of our posts and videos.

We recently found ourselves using cross-validation on a binary classification data set with over 100,000 records. The principal reason was that the event of interest was relatively rare: there were only about 6,000 "events," and these were what we were trying to study. So don't think that cross-validation is only for researchers with tiny data sets.

So, given the solid scientific foundations of cross-validation, we can just use it and obtain the reliable performance results we want, right? Here is where BATTERY CVR is relevant. It is important to understand that cross-validation involves running a random experiment. The randomness stems from the random division of the data into the separate partitions that define the separate folds of the cross-validation. It is conventional to run cross-validation using 10 "folds." Because we divide the data into 10 partitions and average performance/test results across the partitions, users sometimes think they have done all that is required to obtain reliable results. But the results of any CV run are the results of a single random experiment, and they will change if you run the experiment with a different random number seed. BATTERY CVR is intended to help you understand just how much your results depend on the specific experiment you happen to have run.
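This seed dependence is easy to demonstrate outside of SPM. The sketch below (again Python with scikit-learn and a generic decision tree as an illustrative stand-in) runs the identical 10-fold cross-validation three times, changing only the random seed that assigns records to folds; each run returns a slightly different estimate.

    # Same data, same model: only the random assignment of records to the
    # 10 folds changes, yet each run gives a different CV estimate.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    tree = DecisionTreeClassifier(max_depth=4, random_state=0)

    for seed in (1, 2, 3):
        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
        auc = cross_val_score(tree, X, y, cv=cv, scoring="roc_auc").mean()
        print("seed %d: mean CV AUC = %.3f" % (seed, auc))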

We recommend running BATTERY CVR with at least 10 replications, although 30 would be preferable, and some sample-reuse experts prefer many more. But the idea is always the same: review the stability of the results across multiple replications. If your results are tightly clustered around their average, you can be much more confident of them than if they vary wildly from run to run.
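In scikit-learn terms (a rough analogue of BATTERY CVR, not SPM itself), a replication is one complete cross-validation run under a fresh random partition. The sketch below repeats 10-fold cross-validation 30 times and summarizes how tightly the replicated estimates cluster; the data set and model settings are again illustrative assumptions.

    # Repeat the entire 10-fold cross-validation 30 times with different
    # random partitions, then summarize the spread of the 30 estimates.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    tree = DecisionTreeClassifier(max_depth=4, random_state=0)

    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=30, random_state=0)
    fold_scores = cross_val_score(tree, X, y, cv=cv, scoring="roc_auc")

    # One estimate per replication = the mean over that replication's 10 folds.
    rep_auc = fold_scores.reshape(30, 10).mean(axis=1)
    print("mean %.3f  min %.3f  max %.3f"
          % (rep_auc.mean(), rep_auc.min(), rep_auc.max()))

This mean-plus-range view of performance across replications is the kind of stability summary BATTERY CVR is designed to give you.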

So what to do with the results of BATTERY CVR? First, the results give you an honest assessment of the possible range of performance for your preferred model. A single CV run might tell you that your test area under the ROC curve is .74, while the BATTERY may show you a range from .71 to .76. Second, if the results are too erratic you might consider reworking your model to arrive at more stable results. This could involve model simplification, elimination of problematic predictors, or changing some other model construction controls (e.g., ATOM, MINCHILD, or the splitting rule).

For further instruction, look for our PowerPoint slides and video, which work through a specific example using SPM 6.8.


Tags: SPM, BATTERY, Cross Validation