Dan Steinberg's Blog

SPM Command Line Options

The non-GUI (command line) version of SPM accepts a number of command line options that give the user additional flexibility. To list them, launch SPM with the -h flag:

C:\> spm -h

This displays the following:

Command line syntax is:

spm [options] [commandfile] [options]

Options are

-e Echo results to console

-q Quiet, suppress all output including errors

-o Direct text results to a file

-u Attach to a dataset

-w Identify Stat/Transfer dll path

-t Identify scratch file path

-s Data amount in MB, subject to license threshold

-m Model space in MB, subject to hardware limits

-l Error/warnings to text logfile

-mt Lookup table capacities, 0 to grow without bound

-v Specifies max N variables for the session

E.g.:

spm -e model1.cmd

spm \DataMining\Jobs-1\simulate.cmd -q

spm job1.cmd -o\RESULTS\job1.txt -u\AnalysisData\sample1.sys

spm -u\MyData\joint_data.xls[xls5]

spm -s512 -p64 -m128

Environment variables can be used in lieu of command line switches:

SALFORD_S in lieu of -s

SALFORD_M in lieu of -m

SALFORD_P in lieu of -p

In this article, we will explain and discuss these various options.

Interactive and Batch Modes

If SPM is launched without any arguments, you get something like the following:

Salford Predictive Miner version 6.6.0.067

CART(R), TreeNet(R), MARS(R), RandomForests(R), PathSeeker, LOGIT, PROBIT, TSLS

Copyright, 1991-2010, Salford Systems, San Diego, California, USA

Launched on 9/20/2010 with no expiration.

This session supports up to 32768 variables.

67 MB RAM allocated at launch, partitioned as:

Real : 3866368 cells

Integer : 1114112 cells

Character: 25166336 cells

The license supports up to 9999999 MB of learn sample data.

Processing commands from: e:\salford\SALFORD.CMD

>

You can then type SPM commands interactively and the results will be written back. You can terminate the session with the QUIT command. Interactive mode can also be used to control SPM from an external process, such as a script. The commands are sent to SPM through standard input and the responses can be received through standard output.
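As a rough illustration of driving an interactive session from a script, the sketch below pipes commands to the process on standard input and collects whatever it writes to standard output. It assumes the executable is on the PATH under the name spm; the function name run_interactive and its argv parameter are made up for this example. The returned text would include the startup banner and prompts along with the command results.

```python
import subprocess


def run_interactive(commands, argv=("spm",)):
    """Feed commands to an interactive session on stdin; return its stdout text."""
    # Finish with QUIT so the session ends instead of waiting for more input.
    script = "\n".join([*commands, "QUIT"]) + "\n"
    completed = subprocess.run(
        list(argv),           # assumption: "spm" is found on the PATH
        input=script,         # sent to the child process on standard input
        capture_output=True,  # collect standard output (and standard error)
        text=True,
        check=False,
    )
    return completed.stdout
```

A caller would pass the same lines it would otherwise type at the > prompt, for example run_interactive(["OUTPUT model"]).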

Usually, however, the most convenient way to use non-GUI SPM is in batch mode. To launch SPM in batch mode, specify the name of a command file as an argument, like so:

C:\> spm model.cmd

SPM will then execute the commands contained in the file until it encounters an error, a QUIT command, or the end of the file, whichever comes first. When operating in batch mode, it is usually a good idea to specify a text output file with an OUTPUT command, like the following:

OUTPUT model

This causes text output to be written to MODEL.DAT in the current working directory until another OUTPUT command is given, or the session ends. An OUTPUT command should thus appear close to the top of the command file, before any modeling or analysis commands are given. By default, responses are not written to the screen when SPM is working in batch mode.

To put the output file in a different directory, or to give the name a different extension than the default .DAT, you can put the output file name in quotes. For example:

OUTPUT "e:\analyses\myproj\models\model.dat"
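When command files are generated by a script, the advice above amounts to emitting the OUTPUT line first. The following is a minimal sketch, not a Salford-supplied tool: the helper name command_file_text is hypothetical, it always writes the quoted form with a full file name, and it appends a QUIT line even though SPM would also stop at the end of the file.

```python
def command_file_text(output_path, commands):
    """Return command-file text with the OUTPUT line ahead of all other commands."""
    # Quoted form, so the directory and extension are taken exactly as given.
    lines = [f'OUTPUT "{output_path}"']
    # The caller's modeling or analysis commands follow, unchanged.
    lines.extend(commands)
    # Optional: batch mode also ends when the file runs out.
    lines.append("QUIT")
    return "\n".join(lines) + "\n"
```

The returned string can be written to a .cmd file and passed to spm as the command file argument.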

If you don't want to have to specify the output file in the command file, you can also do it by invoking SPM with the -o flag, like this:

E:\analyses\> spm -o model.dat model.cmd

In this case, the default output file is set to model.dat. The -o flag can be overridden by an OUTPUT command.

What To Display on Screen

When operating in interactive mode, SPM will normally write all text output to the screen, but since this can be voluminous, the writing of text output to the screen is disabled by default in batch mode. Instead, we get a display like the following:

Salford Predictive Miner version 6.6.0.068

CART(R), TreeNet(R), MARS(R), RandomForests(R), PathSeeker, LOGIT, PROBIT, TSLS

Copyright, 1991-2010, Salford Systems, San Diego, California, USA

Launched on 9/20/2010 with no expiration.

This session supports up to 32768 variables.

67 MB RAM allocated at launch, partitioned as:

Real : 3866368 cells

Integer : 1114112 cells

Character: 25166336 cells

The license supports up to 9999999 MB of learn sample data.

When operating SPM from inside of a script, or via the UNIX nohup utility, it is usually a good idea to suppress even the above display. To do so, invoke SPM with the -q (quiet) flag, which disables all screen output. This quiet mode can be canceled by putting the command ECHO ON inside of the command file.

Alternatively, many users, especially impatient ones, will want to see the progress of their jobs as they run. This can be accomplished by invoking SPM with the -e (echo) flag. This echo mode can be canceled with an ECHO OFF command inside of the command file.

Note that the -q and -e flags are mutually exclusive and only work in batch mode. In interactive mode, echo is on at startup, but can be turned off and back on at any time with the ECHO command. When ECHO OFF is in effect and SPM is running interactively, the > prompt will not be displayed.
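A wrapper script that launches batch jobs can enforce the mutual exclusion before SPM is ever started. This is only a sketch: display_flags is a hypothetical helper, and the article does not say what SPM itself does when both flags are given.

```python
def display_flags(quiet=False, echo=False):
    """Return the screen-output flag for a batch run, refusing to combine -q and -e."""
    if quiet and echo:
        # The two flags are mutually exclusive, so reject the combination up front.
        raise ValueError("-q and -e are mutually exclusive")
    if quiet:
        return ["-q"]  # suppress all screen output
    if echo:
        return ["-e"]  # echo results to the console as the job runs
    return []          # batch default: startup banner only
```

The returned list can be spliced into the argument list used to launch spm.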

File Specification Options

While input and output files are normally specified with commands, the default input dataset and the default text output file can be specified with the -u and -o flags, respectively. For example:

E:\Models> spm -u "E:\Datasets\BOSTON.CSV" -o "BOSMOD.DAT" bosmod.cmd

The above command runs bosmod.cmd, setting the default input dataset to E:\Datasets\BOSTON.CSV, and the text output file to BOSMOD.DAT (in the current working directory). These defaults can be changed inside of the command file with the USE and OUTPUT commands, respectively.

The -l flag specifies a log file to which to write error and warning messages. By default, no such file is created.
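For scripted launches, the file flags can be assembled into an argument list. The sketch below mirrors the flag-then-value form of the example above; batch_argv is a hypothetical helper, and writing -l in the same form as -o and -u is an assumption, since the article shows no -l example.

```python
def batch_argv(command_file, dataset=None, output=None, log=None, executable="spm"):
    """Build an argument list for a batch run with optional -u, -o and -l flags."""
    argv = [executable]
    if dataset is not None:
        argv += ["-u", dataset]  # default input dataset
    if output is not None:
        argv += ["-o", output]   # default text output file
    if log is not None:
        argv += ["-l", log]      # assumption: same flag-then-value form as -o
    argv.append(command_file)
    return argv
```

Passing the list directly to subprocess.run avoids shell quoting problems with paths that contain spaces.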

Directory Specification Options

The location of the Stat/Transfer® data translation engine is normally set with the STATTRAN environment variable, but can be set at the command line with the -w flag, like so:

E:\Models> spm -w c:\stattran

This flag is only effective on MS-Windows systems. Elsewhere, the data translation engine is loaded automatically at startup.

The location of the scratch directory is normally specified by an environment variable, but can be specified by the -t flag. For example:

E:\Models> spm -t d:\temp

In the above example, scratch files are written to d:\temp.

Memory Management Options

The maximum number of variables supported in an SPM session is normally set to 32768 (32*1024), but can be changed at startup with the -v flag:

E:\Models> spm -v 33554432

It should be noted that text datasets with very large numbers of fields can take a long time to open, parse, and validate. Binary formats are usually better for this purpose.

By default, 67 MB of static workspace (used by the preprocessor) is allocated at startup. This can be changed with the -m flag (amount given in megabytes). This static workspace was formerly also used by CART, MARS, and Logit, but as the SPM versions of these procedures allocate their workspace dynamically, there is now little reason to use this flag.

Likewise, ordinary users will have little reason to use the -s flag, which reduces the maximum learn sample size below the limit set by the license string. It is primarily for testing, and to allow system administrators to reduce the maximum learn sample size for certain users.

The -mt flag sets a limit on the size of lookup tables.

Environment Variables

There are a number of environment variables which change SPM's startup defaults. These are described below:

SALFORD

Contains the pathname of the directory containing the license file license.txt. This should always be set. If it is not, SPM will only search for the license file in the current working directory.

STATTRAN (MS-Windows only)

Contains the pathname of the directory containing the Stat/Transfer data translation engine. The -w flag overrides this variable. If neither is set, the data translation engine will not be available, and only text and Systat® format datasets will be supported. On UNIX and Linux systems, the data translation engine, which must be present on platforms where it is supported, is automatically located and loaded on startup, and this environment variable is ignored.

TEMP, TMPDIR, etc.

The directory where SPM will write its scratch files must be defined at startup in order for SPM to run. This can be set by one of a number of environment variables, or by the -t flag. The environment variables are given below in order of precedence. The highest ranking variable found will be used to identify the scratch directory.

CARTTEMP

SALFORDTEMP

TMPDIR

TEMP

TMP
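The precedence rule can be mimicked in a few lines, which is handy for checking which directory a job is likely to use before launching it. This is a sketch of the lookup as described here, not SPM's actual code; treating an empty variable as unset and letting a -t value win over every variable are both assumptions.

```python
import os

# Environment variables in the order of precedence given in the article.
SCRATCH_VARIABLES = ("CARTTEMP", "SALFORDTEMP", "TMPDIR", "TEMP", "TMP")


def scratch_directory(environ=None, t_flag=None):
    """Return the scratch directory the precedence rule would pick, or None."""
    if t_flag:
        return t_flag  # assumption: -t takes priority over the environment
    env = os.environ if environ is None else environ
    for name in SCRATCH_VARIABLES:
        value = env.get(name)
        if value:  # assumption: an empty value counts as not set
            return value
    return None  # nothing defined; per the article SPM could not run
```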

SALFORD_M

This sets the default static workspace allocation (in megabytes) and is overridden by the -m flag. In recent versions of SPM, this workspace is used only by the preprocessor and the allocation almost never needs to be changed. Static workspace is, however, used by CART 5.0 and 6.0, so SALFORD_M may need to be set on systems where those versions of CART are installed.

SALFORD_S

This sets the maximum learn sample size to a value lower than the one set by the license string in license.txt. It can be overridden by the -s flag.
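When SPM is started from a script, these variables can be set for the child process alone rather than for the whole machine. The sketch below builds such an environment; spm_environment and its parameter names are hypothetical, and giving SALFORD_S in megabytes is an assumption based on the -s help text.

```python
import os


def spm_environment(workspace_mb=None, sample_mb=None, base=None):
    """Return a copy of the environment with SALFORD_M and/or SALFORD_S set."""
    env = dict(os.environ if base is None else base)
    if workspace_mb is not None:
        env["SALFORD_M"] = str(workspace_mb)  # static workspace, in megabytes
    if sample_mb is not None:
        env["SALFORD_S"] = str(sample_mb)     # assumption: megabytes, as with -s
    return env
```

The result can be handed to subprocess.run through its env argument, leaving the calling process's own environment untouched.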

