| java.lang.Object weka.classifiers.Evaluation
Evaluation | public class Evaluation implements Summarizable(Code) | | Class for evaluating machine learning models.
-------------------------------------------------------------------
General options when evaluating a learning scheme from the command-line:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing a cross-validation
is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-no-cv
No cross validation. If no test file is provided, no evaluation
is done.
-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.
-preserve-order
Preserves the order in the percentage split instead of randomizing
the data first with the seed value ('-s').
-s seed
Random number seed for the cross-validation and percentage split
(default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file. In case the filename ends with ".xml"
the options are loaded from XML.
-d filename
Saves classifier built from the training data into the given file. In case
the filename ends with ".xml" the options are saved as XML, not the model.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances (or the train instances if no test
instances provided), along with the attributes in the specified range
(and nothing else). Use '-p 0' if no attributes are desired.
-distribution
Outputs the distribution instead of only the prediction
in conjunction with the '-p' option (only nominal classes).
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs
the graph representation of the classifier (and nothing
else).
-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.
-threshold-file file
The file to save the threshold data to.
The format is determined by the extensions, e.g., '.arff' for ARFF
format or '.csv' for CSV.
-threshold-label label
The class label to determine the threshold data for
(default is the first label)
-------------------------------------------------------------------
Example usage as the main of a classifier (called FunkyClassifier):
public static void main(String [] args) {
runClassifier(new FunkyClassifier(), args);
}
------------------------------------------------------------------
Example usage from within an application:
Instances trainInstances = ... // instances obtained from somewhere
Instances testInstances = ... // instances obtained from somewhere
Classifier scheme = ... // scheme obtained from somewhere
Evaluation evaluation = new Evaluation(trainInstances);
evaluation.evaluateModel(scheme, testInstances);
System.out.println(evaluation.toSummaryString());
author: Eibe Frank (eibe@cs.waikato.ac.nz) author: Len Trigg (trigg@cs.waikato.ac.nz) version: $Revision: 1.77 $ |
Constructor Summary | |
public | Evaluation(Instances data) Initializes all the counters for the evaluation. | public | Evaluation(Instances data, CostMatrix costMatrix) Initializes all the counters for the evaluation and also takes a
cost matrix as parameter. |
Method Summary | |
final public double | KBInformation() | final public double | KBMeanInformation() Return the Kononenko & Bratko Information score in bits per
instance. | final public double | KBRelativeInformation() | final public double | SFEntropyGain() Returns the total SF, which is the null model entropy minus
the scheme entropy. | final public double | SFMeanEntropyGain() Returns the SF per instance, which is the null model entropy
minus the scheme entropy, per instance. | final public double | SFMeanPriorEntropy() | final public double | SFMeanSchemeEntropy() | final public double | SFPriorEntropy() | final public double | SFSchemeEntropy() | protected void | addNumericTrainClass(double classValue, double weight) Adds a numeric (non-missing) training class value and weight to
the buffer of stored values. | public double | areaUnderROC(int classIndex) Returns the area under ROC for those predictions that have been collected
in the evaluateModel(Classifier, Instances) method. | protected static String | attributeValuesString(Instance instance, Range attRange) Builds a string listing the attribute values in a specified range of indices,
separated by commas and enclosed in brackets. | final public double | avgCost() Gets the average cost, that is, total cost of misclassifications
(incorrect plus unclassified) over the total number of instances.
the average cost. | public double[][] | confusionMatrix() Returns a copy of the confusion matrix. | final public double | correct() Gets the number of instances correctly classified (that is, for
which a correct prediction was made). | final public double | correlationCoefficient() Returns the correlation coefficient if the class is numeric. | public void | crossValidateModel(Classifier classifier, Instances data, int numFolds, Random random) Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances. | public void | crossValidateModel(String classifierString, Instances data, int numFolds, String[] options, Random random) Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances.
Parameters: classifierString - a string naming the class of the classifier Parameters: data - the data on which the cross-validation is to be performed Parameters: numFolds - the number of folds for the cross-validation Parameters: options - the options to the classifier. | public boolean | equals(Object obj) | final public double | errorRate() Returns the estimated error rate or the root mean squared error
(if the class is numeric). | public static String | evaluateModel(String classifierString, String[] options) Evaluates a classifier with the options given in an array of
strings. | public static String | evaluateModel(Classifier classifier, String[] options) Evaluates a classifier with the options given in an array of
strings. | public double[] | evaluateModel(Classifier classifier, Instances data) Evaluates the classifier on a given set of instances. | public double | evaluateModelOnce(Classifier classifier, Instance instance) Evaluates the classifier on a single instance. | public double | evaluateModelOnce(double[] dist, Instance instance) Evaluates the supplied distribution on a single instance. | public void | evaluateModelOnce(double prediction, Instance instance) Evaluates the supplied prediction on a single instance. | public double | evaluateModelOnceAndRecordPrediction(Classifier classifier, Instance instance) Evaluates the classifier on a single instance and records the
prediction (if the class is nominal). | public double | evaluateModelOnceAndRecordPrediction(double[] dist, Instance instance) Evaluates the supplied distribution on a single instance. | public double | fMeasure(int classIndex) Calculate the F-Measure with respect to a particular class. | public double | falseNegativeRate(int classIndex) Calculate the false negative rate with respect to a particular class. | public double | falsePositiveRate(int classIndex) Calculate the false positive rate with respect to a particular class. | public double[] | getClassPriors() | protected static CostMatrix | handleCostOption(String costFileName, int numClasses) Attempts to load a cost matrix.
Parameters: costFileName - the filename of the cost matrix Parameters: numClasses - the number of classes that should be in the cost matrix(only used if the cost file is in old format). | final public double | incorrect() Gets the number of instances incorrectly classified (that is, for
which an incorrect prediction was made). | final public double | kappa() Returns value of kappa statistic if class is nominal. | public static void | main(String[] args) A test method for this class. | protected double[] | makeDistribution(double predictedClass) | protected static String | makeOptionString(Classifier classifier) | final public double | meanAbsoluteError() Returns the mean absolute error. | final public double | meanPriorAbsoluteError() Returns the mean absolute error of the prior. | protected String | num2ShortID(int num, char[] IDChars, int IDWidth) Method for generating indices for the confusion matrix. | public double | numFalseNegatives(int classIndex) Calculate number of false negatives with respect to a particular class. | public double | numFalsePositives(int classIndex) Calculate number of false positives with respect to a particular class. | final public double | numInstances() Gets the number of test instances that had a known class value
(actually the sum of the weights of test instances with known
class value). | public double | numTrueNegatives(int classIndex) Calculate the number of true negatives with respect to a particular class. | public double | numTruePositives(int classIndex) Calculate the number of true positives with respect to a particular class. | final public double | pctCorrect() Gets the percentage of instances correctly classified (that is, for
which a correct prediction was made). | final public double | pctIncorrect() Gets the percentage of instances incorrectly classified (that is, for
which an incorrect prediction was made). | final public double | pctUnclassified() Gets the percentage of instances not classified (that is, for
which no prediction was made by the classifier). | public double | precision(int classIndex) Calculate the precision with respect to a particular class. | protected static String | predictionText(Classifier classifier, Instance inst, int instNum, Range attributesToOutput, boolean printDistribution) | public FastVector | predictions() Returns the predictions that have been collected.
a reference to the FastVector containing the predictions that have been collected. | protected static String | printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput) Prints the predictions for the given dataset into a String variable. | protected static String | printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput, boolean printDistribution) Prints the predictions for the given dataset into a String variable. | final public double | priorEntropy() | public double | recall(int classIndex) Calculate the recall with respect to a particular class. | final public double | relativeAbsoluteError() Returns the relative absolute error. | final public double | rootMeanPriorSquaredError() Returns the root mean prior squared error. | final public double | rootMeanSquaredError() Returns the root mean squared error. | final public double | rootRelativeSquaredError() Returns the root relative squared error if the class is numeric. | protected void | setNumericPriorsFromBuffer() Sets up the priors for numeric class attributes from the
training class values that have been seen so far. | public void | setPriors(Instances train) | public String | toClassDetailsString() Generates a breakdown of the accuracy for each class (with default title),
incorporating various information-retrieval statistics, such as
true/false positive rate, precision/recall/F-Measure. | public String | toClassDetailsString(String title) Generates a breakdown of the accuracy for each class,
incorporating various information-retrieval statistics, such as
true/false positive rate, precision/recall/F-Measure. | public String | toCumulativeMarginDistributionString() Output the cumulative margin distribution as a string suitable
for input for gnuplot or similar package. | public String | toMatrixString() Calls toMatrixString() with a default title. | public String | toMatrixString(String title) Outputs the performance statistics as a classification confusion
matrix. | public String | toSummaryString() | public String | toSummaryString(boolean printComplexityStatistics) Calls toSummaryString() with a default title. | public String | toSummaryString(String title, boolean printComplexityStatistics) Outputs the performance statistics in summary form. | final public double | totalCost() Gets the total cost, that is, the cost of each prediction times the
weight of the instance, summed over all instances. | public double | trueNegativeRate(int classIndex) Calculate the true negative rate with respect to a particular class. | public double | truePositiveRate(int classIndex) Calculate the true positive rate with respect to a particular class. | final public double | unclassified() Gets the number of instances not classified (that is, for
which no prediction was made by the classifier). | protected void | updateMargins(double[] predictedDistribution, int actualClass, double weight) | protected void | updateNumericScores(double[] predicted, double[] actual, double weight) Update the numeric accuracy measures. | public void | updatePriors(Instance instance) | protected void | updateStatsForClassifier(double[] predictedDistribution, Instance instance) Updates all the statistics about a classifier's performance for
the current test instance. | protected void | updateStatsForPredictor(double predictedValue, Instance instance) Updates all the statistics about a predictor's performance for
the current test instance. | public void | useNoPriors() Disables the use of priors, e.g., in case of de-serialized schemes
that have no access to the original training set, but are evaluated
on a test set. | protected static String | wekaStaticWrapper(Sourcable classifier, String className) Wraps a static classifier in enough source to test using the weka
class libraries. |
MIN_SF_PROB | final protected static double MIN_SF_PROB(Code) | | The minimum probability accepted from an estimator to avoid
taking log(0) in SF calculations.
|
k_MarginResolution | protected static int k_MarginResolution(Code) | | Resolution of the margin histogram
|
m_ClassIsNominal | protected boolean m_ClassIsNominal(Code) | | Is the class nominal or numeric?
|
m_ClassNames | protected String[] m_ClassNames(Code) | | The names of the classes.
|
m_ClassPriors | protected double[] m_ClassPriors(Code) | | The prior probabilities of the classes
|
m_ClassPriorsSum | protected double m_ClassPriorsSum(Code) | | The sum of counts for priors
|
m_ConfusionMatrix | protected double[][] m_ConfusionMatrix(Code) | | Array for storing the confusion matrix.
|
m_Correct | protected double m_Correct(Code) | | The weight of all correctly classified instances.
|
m_ErrorEstimator | protected Estimator m_ErrorEstimator(Code) | | Numeric class error estimator for scheme
|
m_Incorrect | protected double m_Incorrect(Code) | | The weight of all incorrectly classified instances.
|
m_MarginCounts | protected double[] m_MarginCounts(Code) | | Cumulative margin distribution
|
m_MissingClass | protected double m_MissingClass(Code) | | The weight of all instances that had no class assigned to them.
|
m_NoPriors | protected boolean m_NoPriors(Code) | | enables/disables the use of priors, e.g., if no training set is
present in case of de-serialized schemes
|
m_NumClasses | protected int m_NumClasses(Code) | | The number of classes.
|
m_NumFolds | protected int m_NumFolds(Code) | | The number of folds for a cross-validation.
|
m_NumTrainClassVals | protected int m_NumTrainClassVals(Code) | | Number of non-missing class training instances seen
|
m_PriorErrorEstimator | protected Estimator m_PriorErrorEstimator(Code) | | Numeric class error estimator for prior
|
m_SumAbsErr | protected double m_SumAbsErr(Code) | | Sum of absolute errors.
|
m_SumClass | protected double m_SumClass(Code) | | Sum of class values.
|
m_SumClassPredicted | protected double m_SumClassPredicted(Code) | | Sum of predicted * class values.
|
m_SumErr | protected double m_SumErr(Code) | | Sum of errors.
|
m_SumKBInfo | protected double m_SumKBInfo(Code) | | Total Kononenko & Bratko Information
|
m_SumPredicted | protected double m_SumPredicted(Code) | | Sum of predicted values.
|
m_SumPriorAbsErr | protected double m_SumPriorAbsErr(Code) | | Sum of absolute errors of the prior
|
m_SumPriorEntropy | protected double m_SumPriorEntropy(Code) | | Total entropy of prior predictions
|
m_SumPriorSqrErr | protected double m_SumPriorSqrErr(Code) | | Sum of squared errors of the prior
|
m_SumSchemeEntropy | protected double m_SumSchemeEntropy(Code) | | Total entropy of scheme predictions
|
m_SumSqrClass | protected double m_SumSqrClass(Code) | | Sum of squared class values.
|
m_SumSqrErr | protected double m_SumSqrErr(Code) | | Sum of squared errors.
|
m_SumSqrPredicted | protected double m_SumSqrPredicted(Code) | | Sum of squared predicted values.
|
m_TotalCost | protected double m_TotalCost(Code) | | The total cost of predictions (includes instance weights)
|
m_TrainClassVals | protected double[] m_TrainClassVals(Code) | | Array containing all numeric training class values seen
|
m_TrainClassWeights | protected double[] m_TrainClassWeights(Code) | | Array containing all numeric training class weights
|
m_Unclassified | protected double m_Unclassified(Code) | | The weight of all unclassified instances.
|
m_WithClass | protected double m_WithClass(Code) | | The weight of all instances that had a class assigned to them.
|
Evaluation | public Evaluation(Instances data) throws Exception(Code) | | Initializes all the counters for the evaluation.
Use useNoPriors() if the dataset is the test set and you
can't initialize with the priors from the training set via
setPriors(Instances) .
Parameters: data - set of training instances, to get some header information and prior class distribution information throws: Exception - if the class is not defined See Also: Evaluation.useNoPriors() See Also: Evaluation.setPriors(Instances) |
Evaluation | public Evaluation(Instances data, CostMatrix costMatrix) throws Exception(Code) | | Initializes all the counters for the evaluation and also takes a
cost matrix as parameter.
Use useNoPriors() if the dataset is the test set and you
can't initialize with the priors from the training set via
setPriors(Instances) .
Parameters: data - set of training instances, to get some header information and prior class distribution information Parameters: costMatrix - the cost matrix---if null, default costs will be used throws: Exception - if cost matrix is not compatible with data, the class is not defined or the class is numeric See Also: Evaluation.useNoPriors() See Also: Evaluation.setPriors(Instances) |
KBInformation | final public double KBInformation() throws Exception(Code) | | Return the total Kononenko & Bratko Information score in bits
the K&B information score throws: Exception - if the class is not nominal |
KBMeanInformation | final public double KBMeanInformation() throws Exception(Code) | | Return the Kononenko & Bratko Information score in bits per
instance.
the K&B information score throws: Exception - if the class is not nominal |
KBRelativeInformation | final public double KBRelativeInformation() throws Exception(Code) | | Return the Kononenko & Bratko Relative Information score
the K&B relative information score throws: Exception - if the class is not nominal |
SFEntropyGain | final public double SFEntropyGain()(Code) | | Returns the total SF, which is the null model entropy minus
the scheme entropy.
the total SF |
SFMeanEntropyGain | final public double SFMeanEntropyGain()(Code) | | Returns the SF per instance, which is the null model entropy
minus the scheme entropy, per instance.
the SF per instance |
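The SF figures above are differences of two entropies. The sketch below is a hedged illustration, not the class's actual code: it assumes each model's entropy is the weighted sum of -log2 of the probability that model assigned to the actual class, with a small floor playing the role of MIN_SF_PROB. The class name `EntropyGainSketch`, its methods, and the `MIN_PROB` constant are hypothetical.

```java
final class EntropyGainSketch {

    // Hypothetical floor that keeps log2 away from zero; stands in for MIN_SF_PROB.
    static final double MIN_PROB = Double.MIN_VALUE;

    private double sumPriorEntropy;   // total entropy of the null (prior) model, in bits
    private double sumSchemeEntropy;  // total entropy of the scheme, in bits
    private double sumWeights;        // total weight of the instances seen

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Records one test instance: the probability each model gave to the actual class.
    void update(double priorProbOfActual, double schemeProbOfActual, double weight) {
        sumPriorEntropy -= weight * log2(Math.max(MIN_PROB, priorProbOfActual));
        sumSchemeEntropy -= weight * log2(Math.max(MIN_PROB, schemeProbOfActual));
        sumWeights += weight;
    }

    // Null model entropy minus scheme entropy (total).
    double entropyGain() {
        return sumPriorEntropy - sumSchemeEntropy;
    }

    // The same difference, per unit of instance weight.
    double meanEntropyGain() {
        return sumWeights == 0 ? 0 : entropyGain() / sumWeights;
    }

    public static void main(String[] args) {
        EntropyGainSketch sketch = new EntropyGainSketch();
        sketch.update(0.5, 1.0, 1.0);
        sketch.update(0.5, 1.0, 1.0);
        System.out.println("total gain: " + sketch.entropyGain());
        System.out.println("mean gain:  " + sketch.meanEntropyGain());
    }
}
```

A positive gain means the scheme assigned more probability to the true classes than the prior did; a negative gain means it did worse than the null model.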
SFMeanPriorEntropy | final public double SFMeanPriorEntropy()(Code) | | Returns the entropy per instance for the null model
the null model entropy per instance |
SFMeanSchemeEntropy | final public double SFMeanSchemeEntropy()(Code) | | Returns the entropy per instance for the scheme
the scheme entropy per instance |
SFPriorEntropy | final public double SFPriorEntropy()(Code) | | Returns the total entropy for the null model
the total null model entropy |
SFSchemeEntropy | final public double SFSchemeEntropy()(Code) | | Returns the total entropy for the scheme
the total scheme entropy |
addNumericTrainClass | protected void addNumericTrainClass(double classValue, double weight)(Code) | | Adds a numeric (non-missing) training class value and weight to
the buffer of stored values.
Parameters: classValue - the class value Parameters: weight - the instance weight |
areaUnderROC | public double areaUnderROC(int classIndex)(Code) | | Returns the area under ROC for those predictions that have been collected
in the evaluateModel(Classifier, Instances) method. Returns
Instance.missingValue() if the area is not available.
Parameters: classIndex - the index of the class to consider as "positive" the area under the ROC curve or not a number |
attributeValuesString | protected static String attributeValuesString(Instance instance, Range attRange)(Code) | | Builds a string listing the attribute values in a specified range of indices,
separated by commas and enclosed in brackets.
Parameters: instance - the instance to print the values from Parameters: attRange - the range of the attributes to list a string listing values of the attributes in the range |
avgCost | final public double avgCost()(Code) | | Gets the average cost, that is, total cost of misclassifications
(incorrect plus unclassified) over the total number of instances.
the average cost. |
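As a rough illustration of the total and average cost, the sketch below multiplies each confusion-matrix cell by the matching cost-matrix cell and divides by the total instance weight. It assumes rows are actual classes and columns are predicted classes, and it ignores unclassified instances for simplicity. `CostSketch` and its method names are hypothetical and are not part of this class.

```java
final class CostSketch {

    // Sums confusion[actual][predicted] * cost[actual][predicted] over all cells.
    static double totalCost(double[][] confusion, double[][] cost) {
        double total = 0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    // Total cost divided by the total (weighted) number of instances.
    // Returning 0 for an empty matrix is a convention chosen for this sketch.
    static double avgCost(double[][] confusion, double[][] cost) {
        double instances = 0;
        for (double[] row : confusion) {
            for (double weight : row) {
                instances += weight;
            }
        }
        return instances == 0 ? 0 : totalCost(confusion, cost) / instances;
    }

    public static void main(String[] args) {
        double[][] confusion = { {8, 2}, {2, 8} };
        double[][] cost = { {0, 1}, {5, 0} };  // misclassifying class 1 is five times as costly
        System.out.println("total cost:   " + totalCost(confusion, cost));
        System.out.println("average cost: " + avgCost(confusion, cost));
    }
}
```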
confusionMatrix | public double[][] confusionMatrix()(Code) | | Returns a copy of the confusion matrix.
a copy of the confusion matrix as a two-dimensional array |
correct | final public double correct()(Code) | | Gets the number of instances correctly classified (that is, for
which a correct prediction was made). (Actually the sum of the weights
of these instances)
the number of correctly classified instances |
correlationCoefficient | final public double correlationCoefficient() throws Exception(Code) | | Returns the correlation coefficient if the class is numeric.
the correlation coefficient throws: Exception - if class is not numeric |
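The running sums listed among the fields (m_SumClass, m_SumPredicted, m_SumSqrClass, m_SumSqrPredicted, m_SumClassPredicted) are enough to compute a Pearson correlation without storing individual predictions. The following is a minimal sketch of that idea under that assumption; it is not the class's actual code, and `CorrelationSketch` is a hypothetical name.

```java
final class CorrelationSketch {

    private double sumClass;           // sum of actual class values
    private double sumPredicted;       // sum of predicted values
    private double sumSqrClass;        // sum of squared actual values
    private double sumSqrPredicted;    // sum of squared predicted values
    private double sumClassPredicted;  // sum of predicted * actual
    private double sumWeights;         // total instance weight

    // Adds one (actual, predicted) pair with the given instance weight.
    void update(double actual, double predicted, double weight) {
        sumClass += weight * actual;
        sumPredicted += weight * predicted;
        sumSqrClass += weight * actual * actual;
        sumSqrPredicted += weight * predicted * predicted;
        sumClassPredicted += weight * actual * predicted;
        sumWeights += weight;
    }

    // Pearson correlation from the running sums.
    // Returning 0 when either variance is zero is a convention chosen for this sketch.
    double correlationCoefficient() {
        if (sumWeights == 0) {
            return 0;
        }
        double varActual = sumSqrClass - sumClass * sumClass / sumWeights;
        double varPredicted = sumSqrPredicted - sumPredicted * sumPredicted / sumWeights;
        double covariance = sumClassPredicted - sumClass * sumPredicted / sumWeights;
        double denominator = varActual * varPredicted;
        return denominator <= 0 ? 0 : covariance / Math.sqrt(denominator);
    }

    public static void main(String[] args) {
        CorrelationSketch sketch = new CorrelationSketch();
        double[] actual = {1, 2, 3, 4};
        double[] predicted = {2, 4, 6, 8};
        for (int i = 0; i < actual.length; i++) {
            sketch.update(actual[i], predicted[i], 1.0);
        }
        System.out.println(sketch.correlationCoefficient());
    }
}
```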
crossValidateModel | public void crossValidateModel(Classifier classifier, Instances data, int numFolds, Random random) throws Exception(Code) | | Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances. Now performs
a deep copy of the classifier before each call to
buildClassifier() (just in case the classifier is not
initialized properly).
Parameters: classifier - the classifier with any options set. Parameters: data - the data on which the cross-validation is to be performed Parameters: numFolds - the number of folds for the cross-validation Parameters: random - random number generator for randomization throws: Exception - if a classifier could not be generated successfully or the class is not defined |
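To make "stratified" concrete: each fold should contain roughly the same proportion of every class as the full dataset. The sketch below shows one simple way to get that, by shuffling the instance indices of each class and dealing them round-robin into folds. It is an illustration under that assumption only and is not necessarily the procedure this method uses; `StratifiedFoldsSketch` and `assignFolds` are hypothetical names, and plain integer labels stand in for Instances.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

final class StratifiedFoldsSketch {

    // Returns numFolds lists of instance indices with balanced class proportions.
    static List<List<Integer>> assignFolds(int[] labels, int numFolds, Random random) {
        // Group instance indices by class label.
        Map<Integer, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], key -> new ArrayList<>()).add(i);
        }

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }

        // Deal each class's shuffled indices round-robin; the counter carries over
        // between classes so that fold sizes stay balanced as well.
        int next = 0;
        for (List<Integer> members : byClass.values()) {
            Collections.shuffle(members, random);
            for (int index : members) {
                folds.get(next % numFolds).add(index);
                next++;
            }
        }
        return folds;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 0, 0, 0, 0, 1, 1, 1, 1};
        List<List<Integer>> folds = assignFolds(labels, 2, new Random(1));
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + ": " + folds.get(f));
        }
    }
}
```

In a cross-validation loop, fold f would serve as the test set while the remaining folds form the training set, with a fresh copy of the classifier built for each fold.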
crossValidateModel | public void crossValidateModel(String classifierString, Instances data, int numFolds, String[] options, Random random) throws Exception(Code) | | Performs a (stratified if class is nominal) cross-validation
for a classifier on a set of instances.
Parameters: classifierString - a string naming the class of the classifier Parameters: data - the data on which the cross-validation is to be performed Parameters: numFolds - the number of folds for the cross-validation Parameters: options - the options to the classifier. Any options accepted by the classifier will be removed from this array. Parameters: random - the random number generator for randomizing the data throws: Exception - if a classifier could not be generated successfully or the class is not defined |
equals | public boolean equals(Object obj)(Code) | | Tests whether the current evaluation object is equal to another
evaluation object
Parameters: obj - the object to compare against true if the two objects are equal |
errorRate | final public double errorRate()(Code) | | Returns the estimated error rate or the root mean squared error
(if the class is numeric). If a cost matrix was given this
error rate gives the average cost.
the estimated error rate (between 0 and 1, or between 0 and maximum cost) |
evaluateModel | public static String evaluateModel(String classifierString, String[] options) throws Exception(Code) | | Evaluates a classifier with the options given in an array of
strings.
Valid options are:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing a cross-validation
is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-no-cv
No cross validation. If no test file is provided, no evaluation
is done.
-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.
-preserve-order
Preserves the order in the percentage split instead of randomizing
the data first with the seed value ('-s').
-s seed
Random number seed for the cross-validation and percentage split
(default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file. In case the filename ends with
".xml" the options are loaded from XML.
-d filename
Saves classifier built from the training data into the given file. In case
the filename ends with ".xml" the options are saved as XML, not the model.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs detailed information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances (or the train instances if no test
instances provided), along with the attributes in the specified range (and
nothing else). Use '-p 0' if no attributes are desired.
-distribution
Outputs the distribution instead of only the prediction
in conjunction with the '-p' option (only nominal classes).
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs
the graph representation of the classifier (and nothing
else).
-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.
-threshold-file file
The file to save the threshold data to.
The format is determined by the extensions, e.g., '.arff' for ARFF
format or '.csv' for CSV.
-threshold-label label
The class label to determine the threshold data for
(default is the first label)
Parameters: classifierString - class of machine learning classifier as a string Parameters: options - the array of string containing the options throws: Exception - if model could not be evaluated successfully a string describing the results |
evaluateModel | public static String evaluateModel(Classifier classifier, String[] options) throws Exception(Code) | | Evaluates a classifier with the options given in an array of
strings.
Valid options are:
-t name of training file
Name of the file with the training data. (required)
-T name of test file
Name of the file with the test data. If missing a cross-validation
is performed.
-c class index
Index of the class attribute (1, 2, ...; default: last).
-x number of folds
The number of folds for the cross-validation (default: 10).
-no-cv
No cross validation. If no test file is provided, no evaluation
is done.
-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.
-preserve-order
Preserves the order in the percentage split instead of randomizing
the data first with the seed value ('-s').
-s seed
Random number seed for the cross-validation and percentage split
(default: 1).
-m file with cost matrix
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file. In case the filename ends with
".xml" the options are loaded from XML.
-d filename
Saves classifier built from the training data into the given file. In case
the filename ends with ".xml" the options are saved as XML, not the model.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs detailed information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances (or the train instances if no test
instances provided), along with the attributes in the specified range
(and nothing else). Use '-p 0' if no attributes are desired.
-distribution
Outputs the distribution instead of only the prediction
in conjunction with the '-p' option (only nominal classes).
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs
the graph representation of the classifier (and nothing
else).
-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.
Parameters: classifier - machine learning classifier Parameters: options - the array of string containing the options throws: Exception - if model could not be evaluated successfully a string describing the results |
evaluateModel | public double[] evaluateModel(Classifier classifier, Instances data) throws Exception(Code) | | Evaluates the classifier on a given set of instances. Note that
the data must have exactly the same format (e.g. order of
attributes) as the data used to train the classifier! Otherwise
the results will generally be meaningless.
Parameters: classifier - machine learning classifier Parameters: data - set of test instances for evaluation the predictions throws: Exception - if model could not be evaluated successfully |
evaluateModelOnce | public double evaluateModelOnce(Classifier classifier, Instance instance) throws Exception(Code) | | Evaluates the classifier on a single instance.
Parameters: classifier - machine learning classifier Parameters: instance - the test instance to be classified the prediction made by the classifier throws: Exception - if model could not be evaluated successfully or the data contains string attributes |
evaluateModelOnce | public double evaluateModelOnce(double[] dist, Instance instance) throws Exception(Code) | | Evaluates the supplied distribution on a single instance.
Parameters: dist - the supplied distribution Parameters: instance - the test instance to be classified the prediction throws: Exception - if model could not be evaluated successfully |
evaluateModelOnce | public void evaluateModelOnce(double prediction, Instance instance) throws Exception(Code) | | Evaluates the supplied prediction on a single instance.
Parameters: prediction - the supplied prediction Parameters: instance - the test instance to be classified throws: Exception - if model could not be evaluated successfully |
evaluateModelOnceAndRecordPrediction | public double evaluateModelOnceAndRecordPrediction(Classifier classifier, Instance instance) throws Exception(Code) | | Evaluates the classifier on a single instance and records the
prediction (if the class is nominal).
Parameters: classifier - machine learning classifier Parameters: instance - the test instance to be classified the prediction made by the classifier throws: Exception - if model could not be evaluated successfully or the data contains string attributes |
evaluateModelOnceAndRecordPrediction | public double evaluateModelOnceAndRecordPrediction(double[] dist, Instance instance) throws Exception(Code) | | Evaluates the supplied distribution on a single instance.
Parameters: dist - the supplied distribution Parameters: instance - the test instance to be classified the prediction throws: Exception - if model could not be evaluated successfully |
fMeasure | public double fMeasure(int classIndex)(Code) | | Calculate the F-Measure with respect to a particular class.
This is defined as
2 * recall * precision
----------------------
recall + precision
Parameters: classIndex - the index of the class to consider as "positive" the F-Measure |
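The F-Measure formula above can be sketched in plain Java without the Weka libraries. The class name `FMeasureSketch`, the `matrix[actual][predicted]` layout, and the choice to return 0 when a denominator is zero are assumptions made for illustration; this is not the Weka implementation.

```java
/** Hypothetical helper, not part of Weka: per-class precision, recall and F-Measure. */
final class FMeasureSketch {
    // Assumed layout: matrix[actual][predicted] holds the (weighted) instance count.

    /** Correctly classified positives divided by total positives (row sum). */
    static double recall(double[][] matrix, int classIndex) {
        double total = 0;
        for (int j = 0; j < matrix[classIndex].length; j++) total += matrix[classIndex][j];
        return total == 0 ? 0 : matrix[classIndex][classIndex] / total;
    }

    /** Correctly classified positives divided by total predicted as positive (column sum). */
    static double precision(double[][] matrix, int classIndex) {
        double total = 0;
        for (int i = 0; i < matrix.length; i++) total += matrix[i][classIndex];
        return total == 0 ? 0 : matrix[classIndex][classIndex] / total;
    }

    /** 2 * recall * precision / (recall + precision); 0 if both are 0 (an assumption). */
    static double fMeasure(double[][] matrix, int classIndex) {
        double p = precision(matrix, classIndex);
        double r = recall(matrix, classIndex);
        return (p + r) == 0 ? 0 : 2 * r * p / (r + p);
    }
}
```

For a two-class matrix `{{8, 2}, {1, 9}}` and class 0, recall is 8/10, precision is 8/9, and the F-Measure works out to 16/19.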
falseNegativeRate | public double falseNegativeRate(int classIndex)(Code) | | Calculate the false negative rate with respect to a particular class.
This is defined as
incorrectly classified positives
--------------------------------
total positives
Parameters: classIndex - the index of the class to consider as "positive" the false negative rate |
falsePositiveRate | public double falsePositiveRate(int classIndex)(Code) | | Calculate the false positive rate with respect to a particular class.
This is defined as
incorrectly classified negatives
--------------------------------
total negatives
Parameters: classIndex - the index of the class to consider as "positive" the false positive rate |
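The false negative and false positive rates both follow from a one-vs-rest reading of the confusion matrix. The sketch below is a minimal illustration, assuming a `matrix[actual][predicted]` layout; `ConfusionCountsSketch` and its field names are hypothetical and do not come from Weka.

```java
/** Hypothetical helper, not Weka code: one-vs-rest counts for a chosen "positive" class. */
final class ConfusionCountsSketch {
    final double tp, fn, fp, tn;

    // Assumed layout: matrix[actual][predicted] holds the (weighted) instance count.
    ConfusionCountsSketch(double[][] matrix, int classIndex) {
        double tp = 0, fn = 0, fp = 0, tn = 0;
        for (int actual = 0; actual < matrix.length; actual++) {
            for (int predicted = 0; predicted < matrix[actual].length; predicted++) {
                double count = matrix[actual][predicted];
                boolean isPositive = actual == classIndex;
                boolean saidPositive = predicted == classIndex;
                if (isPositive && saidPositive) tp += count;   // correctly classified positives
                else if (isPositive) fn += count;              // incorrectly classified positives
                else if (saidPositive) fp += count;            // incorrectly classified negatives
                else tn += count;                              // correctly classified negatives
            }
        }
        this.tp = tp; this.fn = fn; this.fp = fp; this.tn = tn;
    }

    /** Incorrectly classified positives over total positives. */
    double falseNegativeRate() { return (tp + fn) == 0 ? 0 : fn / (tp + fn); }

    /** Incorrectly classified negatives over total negatives. */
    double falsePositiveRate() { return (fp + tn) == 0 ? 0 : fp / (fp + tn); }
}
```

Returning 0 when a class has no positives (or no negatives) is a choice made for this sketch.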
getClassPriors | public double[] getClassPriors()(Code) | | Get the current weighted class counts
the weighted class counts |
handleCostOption | protected static CostMatrix handleCostOption(String costFileName, int numClasses) throws Exception(Code) | | Attempts to load a cost matrix.
Parameters: costFileName - the filename of the cost matrix Parameters: numClasses - the number of classes that should be in the cost matrix (only used if the cost file is in old format). a CostMatrix value, or null if costFileName is empty throws: Exception - if an error occurs. |
incorrect | final public double incorrect()(Code) | | Gets the number of instances incorrectly classified (that is, for
which an incorrect prediction was made). (Actually the sum of the weights
of these instances)
the number of incorrectly classified instances |
kappa | final public double kappa()(Code) | | Returns value of kappa statistic if class is nominal.
the value of the kappa statistic |
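As a rough illustration of what a kappa statistic measures, the sketch below computes Cohen's kappa from a square confusion matrix: observed agreement corrected for the agreement expected by chance. `KappaSketch` is a hypothetical name, and the code assumes a non-empty `matrix[actual][predicted]`; it is not taken from the Weka source.

```java
/** Hypothetical helper, not Weka code: Cohen's kappa from a square confusion matrix. */
final class KappaSketch {
    static double kappa(double[][] matrix) {
        int n = matrix.length;
        double[] rowSum = new double[n];
        double[] colSum = new double[n];
        double total = 0, agreed = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowSum[i] += matrix[i][j];
                colSum[j] += matrix[i][j];
                total += matrix[i][j];
            }
            agreed += matrix[i][i];
        }
        // Chance agreement: sum over classes of P(actual = i) * P(predicted = i).
        double chance = 0;
        for (int i = 0; i < n; i++) chance += rowSum[i] * colSum[i];
        chance /= total * total;
        double observed = agreed / total;
        // Assumption of this sketch: report 1 when chance agreement is already total.
        return chance < 1 ? (observed - chance) / (1 - chance) : 1;
    }
}
```

For `{{8, 2}, {1, 9}}`, observed agreement is 0.85 and chance agreement is 0.5, giving a kappa of 0.7.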
main | public static void main(String[] args)(Code) | | A test method for this class. Just extracts the first command line
argument as a classifier class name and calls evaluateModel.
Parameters: args - an array of command line arguments, the first of which must be the class name of a classifier. |
makeDistribution | protected double[] makeDistribution(double predictedClass)(Code) | | Convert a single prediction into a probability distribution
with all zero probabilities except the predicted value which
has probability 1.0.
Parameters: predictedClass - the index of the predicted class the probability distribution |
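The conversion described above amounts to a one-hot vector. The following sketch shows the idea; the explicit `numClasses` parameter and the treatment of `NaN` as "no prediction" (returning all zeros) are assumptions of this example, since the real method is an instance method that takes only the predicted class.

```java
/** Hypothetical helper, not Weka code: turn a predicted class index into a distribution. */
final class OneHotSketch {
    static double[] makeDistribution(double predictedClass, int numClasses) {
        double[] dist = new double[numClasses];   // all probabilities start at 0.0
        if (Double.isNaN(predictedClass)) {
            return dist;                          // assumption: NaN means no prediction was made
        }
        dist[(int) predictedClass] = 1.0;         // the predicted class gets probability 1.0
        return dist;
    }
}
```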
makeOptionString | protected static String makeOptionString(Classifier classifier)(Code) | | Make up the help string giving all the command line options
Parameters: classifier - the classifier to include options for a string detailing the valid command line options |
meanAbsoluteError | final public double meanAbsoluteError()(Code) | | Returns the mean absolute error. Refers to the error of the
predicted values for numeric classes, and the error of the
predicted probability distribution for nominal classes.
the mean absolute error |
meanPriorAbsoluteError | final public double meanPriorAbsoluteError()(Code) | | Returns the mean absolute error of the prior.
the mean absolute error |
num2ShortID | protected String num2ShortID(int num, char[] IDChars, int IDWidth)(Code) | | Method for generating indices for the confusion matrix.
Parameters: num - integer to format Parameters: IDChars - the characters to use Parameters: IDWidth - the width of the entry the formatted integer as a string |
numFalseNegatives | public double numFalseNegatives(int classIndex)(Code) | | Calculate number of false negatives with respect to a particular class.
This is defined as
incorrectly classified positives
Parameters: classIndex - the index of the class to consider as "positive" the number of false negatives |
numFalsePositives | public double numFalsePositives(int classIndex)(Code) | | Calculate number of false positives with respect to a particular class.
This is defined as
incorrectly classified negatives
Parameters: classIndex - the index of the class to consider as "positive" the number of false positives |
numInstances | final public double numInstances()(Code) | | Gets the number of test instances that had a known class value
(actually the sum of the weights of test instances with known
class value).
the number of test instances with known class |
numTrueNegatives | public double numTrueNegatives(int classIndex)(Code) | | Calculate the number of true negatives with respect to a particular class.
This is defined as
correctly classified negatives
Parameters: classIndex - the index of the class to consider as "positive" the number of true negatives |
numTruePositives | public double numTruePositives(int classIndex)(Code) | | Calculate the number of true positives with respect to a particular class.
This is defined as
correctly classified positives
Parameters: classIndex - the index of the class to consider as "positive" the number of true positives |
pctCorrect | final public double pctCorrect()(Code) | | Gets the percentage of instances correctly classified (that is, for
which a correct prediction was made).
the percent of correctly classified instances (between 0 and 100) |
pctIncorrect | final public double pctIncorrect()(Code) | | Gets the percentage of instances incorrectly classified (that is, for
which an incorrect prediction was made).
the percent of incorrectly classified instances (between 0 and 100) |
pctUnclassified | final public double pctUnclassified()(Code) | | Gets the percentage of instances not classified (that is, for
which no prediction was made by the classifier).
the percent of unclassified instances (between 0 and 100) |
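The three percentages are shares of the same weighted total. A minimal sketch of that bookkeeping follows; `OutcomeTallySketch` is a hypothetical class, and representing "no prediction" as `NaN` is an assumption of the example.

```java
/** Hypothetical helper, not Weka code: weighted tally of classification outcomes. */
final class OutcomeTallySketch {
    private double correct, incorrect, unclassified;

    /** Records one test instance; a NaN prediction is counted as unclassified (assumption). */
    void add(double actualClass, double predictedClass, double weight) {
        if (Double.isNaN(predictedClass)) unclassified += weight;
        else if (predictedClass == actualClass) correct += weight;
        else incorrect += weight;
    }

    private double total() { return correct + incorrect + unclassified; }

    double pctCorrect() { return 100.0 * correct / total(); }
    double pctIncorrect() { return 100.0 * incorrect / total(); }
    double pctUnclassified() { return 100.0 * unclassified / total(); }
}
```

Because instance weights are summed rather than instances counted, one instance with weight 3 contributes as much as three instances with weight 1.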
precision | public double precision(int classIndex)(Code) | | Calculate the precision with respect to a particular class.
This is defined as
correctly classified positives
------------------------------
total predicted as positive
Parameters: classIndex - the index of the class to consider as "positive" the precision |
predictionText | protected static String predictionText(Classifier classifier, Instance inst, int instNum, Range attributesToOutput, boolean printDistribution) throws Exception(Code) | | returns the prediction made by the classifier as a string
Parameters: classifier - the classifier to use Parameters: inst - the instance to generate text from Parameters: instNum - the index in the dataset Parameters: attributesToOutput - the indices of the attributes to output Parameters: printDistribution - prints the complete distribution for nominal classes, not just the predicted value the generated text throws: Exception - if something goes wrong See Also: Evaluation.printClassifications(Classifier,Instances,String,int,Range,boolean) |
predictions | public FastVector predictions()(Code) | | Returns the predictions that have been collected.
a reference to the FastVector containing the predictions that have been collected. This should be null if no predictions have been collected (e.g. if the class is numeric). |
printClassifications | protected static String printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput) throws Exception(Code) | | Prints the predictions for the given dataset into a String variable.
Parameters: classifier - the classifier to use Parameters: train - the training data Parameters: testSource - the test set Parameters: classIndex - the class index (1-based); if -1, it does not override the class index stored in the data file (by using the last attribute) Parameters: attributesToOutput - the indices of the attributes to output the generated predictions for the attribute range throws: Exception - if test file cannot be opened |
printClassifications | protected static String printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput, boolean printDistribution) throws Exception(Code) | | Prints the predictions for the given dataset into a String variable.
Parameters: classifier - the classifier to use Parameters: train - the training data Parameters: testSource - the test set Parameters: classIndex - the class index (1-based); if -1, it does not override the class index stored in the data file (by using the last attribute) Parameters: attributesToOutput - the indices of the attributes to output Parameters: printDistribution - prints the complete distribution for nominal classes, not just the predicted value the generated predictions for the attribute range throws: Exception - if test file cannot be opened |
priorEntropy | final public double priorEntropy() throws Exception(Code) | | Calculate the entropy of the prior distribution
the entropy of the prior distribution throws: Exception - if the class is not nominal |
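For a nominal class, the entropy of the prior distribution can be computed from the weighted class counts (see getClassPriors). The sketch below assumes the result is expressed in bits (log base 2) and skips classes with a zero count; `PriorEntropySketch` is a hypothetical name.

```java
/** Hypothetical helper, not Weka code: entropy (in bits) of a class-count distribution. */
final class PriorEntropySketch {
    static double entropyBits(double[] classCounts) {
        double total = 0;
        for (double count : classCounts) total += count;
        double entropy = 0;
        for (double count : classCounts) {
            if (count > 0) {                                   // 0 * log(0) is treated as 0
                double p = count / total;
                entropy -= p * Math.log(p) / Math.log(2);      // log base 2
            }
        }
        return entropy;
    }
}
```

Two equally frequent classes give 1 bit; a single class that holds all the weight gives 0.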
recall | public double recall(int classIndex)(Code) | | Calculate the recall with respect to a particular class.
This is defined as
correctly classified positives
------------------------------
total positives
(Which is also the same as the truePositiveRate.)
Parameters: classIndex - the index of the class to consider as "positive" the recall |
relativeAbsoluteError | final public double relativeAbsoluteError() throws Exception(Code) | | Returns the relative absolute error.
the relative absolute error throws: Exception - if it can't be computed |
rootMeanPriorSquaredError | final public double rootMeanPriorSquaredError()(Code) | | Returns the root mean prior squared error.
the root mean prior squared error |
rootMeanSquaredError | final public double rootMeanSquaredError()(Code) | | Returns the root mean squared error.
the root mean squared error |
rootRelativeSquaredError | final public double rootRelativeSquaredError()(Code) | | Returns the root relative squared error if the class is numeric.
the root relative squared error |
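The relative measures compare the scheme's error with the error of a trivial predictor based on the priors. The sketch below illustrates this for a numeric class, assuming the prior predictor always outputs the mean of the training class values and that results are reported as percentages; both the class `RelativeErrorSketch` and the `priorMean` parameter are hypothetical.

```java
/** Hypothetical helper, not Weka code: errors relative to a constant "prior" predictor. */
final class RelativeErrorSketch {
    /** 100 * (sum of absolute errors) / (sum of absolute errors of the prior predictor). */
    static double relativeAbsoluteError(double[] actual, double[] predicted, double priorMean) {
        double err = 0, priorErr = 0;
        for (int i = 0; i < actual.length; i++) {
            err += Math.abs(predicted[i] - actual[i]);
            priorErr += Math.abs(priorMean - actual[i]);
        }
        return 100.0 * err / priorErr;
    }

    /** 100 * sqrt((sum of squared errors) / (sum of squared errors of the prior predictor)). */
    static double rootRelativeSquaredError(double[] actual, double[] predicted, double priorMean) {
        double err = 0, priorErr = 0;
        for (int i = 0; i < actual.length; i++) {
            err += (predicted[i] - actual[i]) * (predicted[i] - actual[i]);
            priorErr += (priorMean - actual[i]) * (priorMean - actual[i]);
        }
        return 100.0 * Math.sqrt(err / priorErr);
    }
}
```

A value below 100 means the scheme does better than always predicting the prior mean; the sketch does not guard against a zero prior error.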
setNumericPriorsFromBuffer | protected void setNumericPriorsFromBuffer()(Code) | | Sets up the priors for numeric class attributes from the
training class values that have been seen so far.
|
setPriors | public void setPriors(Instances train) throws Exception(Code) | | Sets the class prior probabilities
Parameters: train - the training instances used to determine the prior probabilities throws: Exception - if the class attribute of the instances is not set |
toClassDetailsString | public String toClassDetailsString() throws Exception(Code) | | Generates a breakdown of the accuracy for each class (with default title),
incorporating various information-retrieval statistics, such as
true/false positive rate, precision/recall/F-Measure. Should be
useful for ROC curves, recall/precision curves.
the statistics presented as a string throws: Exception - if class is not nominal |
toClassDetailsString | public String toClassDetailsString(String title) throws Exception(Code) | | Generates a breakdown of the accuracy for each class,
incorporating various information-retrieval statistics, such as
true/false positive rate, precision/recall/F-Measure. Should be
useful for ROC curves, recall/precision curves.
Parameters: title - the title to prepend the stats string with the statistics presented as a string throws: Exception - if class is not nominal |
toCumulativeMarginDistributionString | public String toCumulativeMarginDistributionString() throws Exception(Code) | | Output the cumulative margin distribution as a string suitable
for input for gnuplot or similar package.
the cumulative margin distribution throws: Exception - if the class attribute is not nominal |
toMatrixString | public String toMatrixString() throws Exception(Code) | | Calls toMatrixString() with a default title.
the confusion matrix as a string throws: Exception - if the class is numeric |
toMatrixString | public String toMatrixString(String title) throws Exception(Code) | | Outputs the performance statistics as a classification confusion
matrix. For each class value, shows the distribution of
predicted class values.
Parameters: title - the title for the confusion matrix the confusion matrix as a String throws: Exception - if the class is numeric |
toSummaryString | public String toSummaryString()(Code) | | Calls toSummaryString() with no title and no complexity stats
a summary description of the classifier evaluation |
toSummaryString | public String toSummaryString(boolean printComplexityStatistics)(Code) | | Calls toSummaryString() with a default title.
Parameters: printComplexityStatistics - if true, complexity statistics are returned as well the summary string |
toSummaryString | public String toSummaryString(String title, boolean printComplexityStatistics)(Code) | | Outputs the performance statistics in summary form. Lists
number (and percentage) of instances classified correctly,
incorrectly and unclassified. Outputs the total number of
instances classified, and the number of instances (if any)
that had no class value provided.
Parameters: title - the title for the statistics Parameters: printComplexityStatistics - if true, complexity statistics are returned as well the summary as a String |
totalCost | final public double totalCost()(Code) | | Gets the total cost, that is, the cost of each prediction times the
weight of the instance, summed over all instances.
the total cost |
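The total cost described above is a weighted sum of cost-matrix entries. A minimal sketch, assuming the cost matrix is indexed as `cost[actual][predicted]` and that class values are given as integer indices; `TotalCostSketch` and its parameters are hypothetical.

```java
/** Hypothetical helper, not Weka code: total misclassification cost over a test set. */
final class TotalCostSketch {
    // Assumed layout: cost[actual][predicted] is the cost of that prediction.
    static double totalCost(double[][] cost, int[] actual, int[] predicted, double[] weight) {
        double total = 0;
        for (int i = 0; i < actual.length; i++) {
            total += cost[actual[i]][predicted[i]] * weight[i];  // cost times instance weight
        }
        return total;
    }
}
```

With costs `{{0, 1}, {5, 0}}`, mistaking class 1 for class 0 is five times as expensive as the opposite error.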
trueNegativeRate | public double trueNegativeRate(int classIndex)(Code) | | Calculate the true negative rate with respect to a particular class.
This is defined as
correctly classified negatives
------------------------------
total negatives
Parameters: classIndex - the index of the class to consider as "positive" the true negative rate |
truePositiveRate | public double truePositiveRate(int classIndex)(Code) | | Calculate the true positive rate with respect to a particular class.
This is defined as
correctly classified positives
------------------------------
total positives
Parameters: classIndex - the index of the class to consider as "positive" the true positive rate |
unclassified | final public double unclassified()(Code) | | Gets the number of instances not classified (that is, for
which no prediction was made by the classifier). (Actually the sum
of the weights of these instances)
the number of unclassified instances |
updateMargins | protected void updateMargins(double[] predictedDistribution, int actualClass, double weight)(Code) | | Update the cumulative record of classification margins
Parameters: predictedDistribution - the probability distribution predicted for the current instance Parameters: actualClass - the index of the actual instance class Parameters: weight - the weight assigned to the instance |
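A classification margin is commonly defined as the probability assigned to the actual class minus the highest probability assigned to any other class, so it ranges from -1 to 1. The sketch below computes that quantity for one instance; it illustrates the common definition under that assumption and leaves out how the margins are accumulated, and `MarginSketch` is a hypothetical name.

```java
/** Hypothetical helper, not Weka code: classification margin for a single prediction. */
final class MarginSketch {
    static double margin(double[] predictedDistribution, int actualClass) {
        double bestOther = 0;
        for (int i = 0; i < predictedDistribution.length; i++) {
            if (i != actualClass && predictedDistribution[i] > bestOther) {
                bestOther = predictedDistribution[i];   // strongest competing class
            }
        }
        return predictedDistribution[actualClass] - bestOther;
    }
}
```

A positive margin means the actual class received the highest probability; a negative margin means some other class was preferred.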
updateNumericScores | protected void updateNumericScores(double[] predicted, double[] actual, double weight)(Code) | | Update the numeric accuracy measures. For numeric classes, the
accuracy is between the actual and predicted class values. For
nominal classes, the accuracy is between the actual and
predicted class probabilities.
Parameters: predicted - the predicted values Parameters: actual - the actual value Parameters: weight - the weight associated with this prediction |
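The update described above can be pictured as a running, weighted accumulation of absolute and squared differences. In the sketch below the arrays hold a single value for a numeric class and one probability per class for a nominal class; averaging the per-class differences by the array length is an assumption of this example, as is the class name `ErrorAccumulatorSketch`.

```java
/** Hypothetical helper, not Weka code: running mean absolute and root mean squared error. */
final class ErrorAccumulatorSketch {
    private double sumAbs, sumSq, sumWeight;

    /** Adds one prediction; arrays have length 1 (numeric) or one entry per class (nominal). */
    void update(double[] predicted, double[] actual, double weight) {
        double abs = 0, sq = 0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            abs += Math.abs(diff);
            sq += diff * diff;
        }
        // Assumption: differences are averaged over the entries before weighting.
        sumAbs += weight * abs / predicted.length;
        sumSq += weight * sq / predicted.length;
        sumWeight += weight;
    }

    double meanAbsoluteError() { return sumAbs / sumWeight; }
    double rootMeanSquaredError() { return Math.sqrt(sumSq / sumWeight); }
}
```

For a nominal class, the `actual` array would be the one-hot distribution of the true class, so a confident wrong prediction is penalized more than a hesitant one.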
updatePriors | public void updatePriors(Instance instance) throws Exception(Code) | | Updates the class prior probabilities (when incrementally
training)
Parameters: instance - the new training instance seen throws: Exception - if the class of the instance is not set |
updateStatsForClassifier | protected void updateStatsForClassifier(double[] predictedDistribution, Instance instance) throws Exception(Code) | | Updates all the statistics about a classifier's performance for
the current test instance.
Parameters: predictedDistribution - the probabilities assigned to each class Parameters: instance - the instance to be classified throws: Exception - if the class of the instance is not set |
updateStatsForPredictor | protected void updateStatsForPredictor(double predictedValue, Instance instance) throws Exception(Code) | | Updates all the statistics about a predictor's performance for
the current test instance.
Parameters: predictedValue - the numeric value the classifier predicts Parameters: instance - the instance to be classified throws: Exception - if the class of the instance is not set |
useNoPriors | public void useNoPriors()(Code) | | Disables the use of priors, e.g., in case of de-serialized schemes
that have no access to the original training set, but are evaluated
on a test set.
|
wekaStaticWrapper | protected static String wekaStaticWrapper(Sourcable classifier, String className) throws Exception(Code) | | Wraps a static classifier in enough source to test using the weka
class libraries.
Parameters: classifier - a Sourcable Classifier Parameters: className - the name to give to the source code class the source for a static classifier that can be tested with weka libraries. throws: Exception - if code-generation fails |
|
|