This method constructs a decision-tree-based ensemble classifier that maintains the highest accuracy on the training data and improves generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of the components of the feature vector, that is, trees constructed in randomly chosen subspaces.
For more information, see
Tin Kam Ho (1998). The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence. 20(8):832-844. URL http://citeseer.ist.psu.edu/ho98random.html.
BibTeX:
@article{Ho1998,
author = {Tin Kam Ho},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
number = {8},
pages = {832-844},
title = {The Random Subspace Method for Constructing Decision Forests},
volume = {20},
year = {1998},
ISSN = {0162-8828},
URL = {http://citeseer.ist.psu.edu/ho98random.html}
}
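The core idea described above can be sketched as follows. This is an illustrative, self-contained sketch of the subspace-selection step only, not the Weka implementation; the class and method names are hypothetical. Each of the ensemble's iterations pseudorandomly picks a fixed-size subset of attribute indices using a seeded random number generator, and a base tree would then be trained on just those attributes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch (not the Weka implementation): for each of
// numIterations ensemble members, pseudorandomly choose subspaceSize
// attribute indices out of numAttributes, using a seeded RNG so the
// ensemble is reproducible.
public class SubspaceSelector {

    public static List<List<Integer>> selectSubspaces(
            int numAttributes, int subspaceSize, int numIterations, long seed) {
        Random rng = new Random(seed);
        List<List<Integer>> subspaces = new ArrayList<>();
        for (int i = 0; i < numIterations; i++) {
            // Build the full index list 0..numAttributes-1, shuffle it,
            // and keep the first subspaceSize entries.
            List<Integer> indices = new ArrayList<>();
            for (int a = 0; a < numAttributes; a++) {
                indices.add(a);
            }
            Collections.shuffle(indices, rng);
            List<Integer> chosen = new ArrayList<>(indices.subList(0, subspaceSize));
            Collections.sort(chosen);
            subspaces.add(chosen);
        }
        return subspaces;
    }
}
```

A base classifier (REPTree by default, per the options below) would then be built on each chosen subspace, and predictions combined across the ensemble.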
Valid options are:
-P
Size of each subspace:
< 1: percentage of the number of attributes
>=1: absolute number of attributes
-S <num>
Random number seed.
(default 1)
-I <num>
Number of iterations.
(default 10)
-D
If set, classifier is run in debug mode and
may output additional info to the console.
-W
Full name of base classifier.
(default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances>
Set minimum number of instances per leaf (default 2).
-V <minimum variance for split>
Set minimum numeric class variance proportion
of train variance for split (default 1e-3).
-N <number of folds>
Number of folds for reduced error pruning (default 3).
-S <seed>
Seed for random data shuffling (default 1).
-P
No pruning.
-L
Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
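The -P option's dual interpretation (a value below 1 is treated as a fraction of the attribute count, a value of 1 or more as an absolute attribute count) can be sketched as below. The helper name is hypothetical and the clamping to a valid range is an assumption, not taken from the Weka source.

```java
// Hypothetical helper mirroring how a -P value could be interpreted:
// values below 1 act as a percentage of the number of attributes,
// values of 1 or more as an absolute number of attributes.
public class SubspaceSize {

    public static int resolve(double p, int numAttributes) {
        int size;
        if (p < 1.0) {
            // percentage of the number of attributes
            size = (int) Math.round(p * numAttributes);
        } else {
            // absolute number of attributes
            size = (int) p;
        }
        // Assumed clamping: at least 1 attribute, at most all of them.
        return Math.max(1, Math.min(size, numAttributes));
    }
}
```

For example, with 10 attributes, -P 0.5 would yield a 5-attribute subspace, while -P 3 would yield exactly 3 attributes.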
author: Bernhard Pfahringer (bernhard@cs.waikato.ac.nz)
author: Peter Reutemann (fracpete@cs.waikato.ac.nz)
version: $Revision: 1.3 $