Java Doc for Evaluation.java in Science » weka » weka.classifiers

java.lang.Object
   weka.classifiers.Evaluation

Evaluation
public class Evaluation implements Summarizable
Class for evaluating machine learning models.

-------------------------------------------------------------------

General options when evaluating a learning scheme from the command-line:

-t filename
Name of the file with the training data. (required)

-T filename
Name of the file with the test data. If missing, a cross-validation is performed.

-c index
Index of the class attribute (1, 2, ...; default: last).

-x number
The number of folds for the cross-validation (default: 10).

-no-cv
No cross validation. If no test file is provided, no evaluation is done.

-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.

-preserve-order
Preserves the order in the percentage split instead of randomizing the data first with the seed value ('-s').

-s seed
Random number seed for the cross-validation and percentage split (default: 1).

-m filename
The name of a file containing a cost matrix.

-l filename
Loads classifier from the given file. In case the filename ends with ".xml" the options are loaded from XML.

-d filename
Saves the classifier built from the training data into the given file. If the filename ends with ".xml", the options are saved as XML, not the model.

-v
Outputs no statistics for the training data.

-o
Outputs statistics only, not the classifier.

-i
Outputs information-retrieval statistics per class.

-k
Outputs information-theoretic statistics.

-p range
Outputs predictions for test instances (or the train instances if no test instances provided), along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.

-distribution
Outputs the distribution instead of only the prediction in conjunction with the '-p' option (only nominal classes).

-r
Outputs cumulative margin distribution (and nothing else).

-g
Only for classifiers that implement "Graphable." Outputs the graph representation of the classifier (and nothing else).

-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.

-threshold-file file
The file to save the threshold data to. The format is determined by the extensions, e.g., '.arff' for ARFF format or '.csv' for CSV.

-threshold-label label
The class label to determine the threshold data for (default is the first label)
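
How '-split-percentage', '-preserve-order' and '-s' interact can be illustrated with a minimal sketch in plain Java. This is not Weka's own code: the class SplitSketch, the method trainIndices and the rounding rule are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka: picks the training rows for a percentage split.
final class SplitSketch {

    // Returns the 0-based indices of the rows that go into the training set.
    // All remaining rows would form the test set.
    static List<Integer> trainIndices(int numInstances, double percentage,
                                      long seed, boolean preserveOrder) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            order.add(i);
        }
        if (!preserveOrder) {
            // Randomize first, using the seed (the role played by '-s').
            Collections.shuffle(order, new Random(seed));
        }
        // Assumed rounding rule: nearest whole number of instances.
        int trainSize = (int) Math.round(numInstances * percentage / 100.0);
        return new ArrayList<>(order.subList(0, trainSize));
    }
}
```

With preserveOrder set, the first rows of the file become the training set; otherwise the same seed always yields the same split.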

-------------------------------------------------------------------

Example usage as the main of a classifier (called FunkyClassifier):

 public static void main(String[] args) {
   runClassifier(new FunkyClassifier(), args);
 }
 

------------------------------------------------------------------

Example usage from within an application:

 Instances trainInstances = ... instances got from somewhere
 Instances testInstances = ... instances got from somewhere
 Classifier scheme = ... scheme got from somewhere
 Evaluation evaluation = new Evaluation(trainInstances);
 evaluation.evaluateModel(scheme, testInstances);
 System.out.println(evaluation.toSummaryString());
 

author:
   Eibe Frank (eibe@cs.waikato.ac.nz)
author:
   Len Trigg (trigg@cs.waikato.ac.nz)
version:
   $Revision: 1.77 $


Field Summary
final protected static double MIN_SF_PROB
     The minimum probability accepted from an estimator to avoid taking log(0) in Sf calculations.
protected static int k_MarginResolution
     Resolution of the margin histogram.
protected boolean m_ClassIsNominal
     Is the class nominal or numeric?
protected String[] m_ClassNames
     The names of the classes.
protected double[] m_ClassPriors
     The prior probabilities of the classes.
protected double m_ClassPriorsSum
     The sum of counts for priors.
protected double[][] m_ConfusionMatrix
     Array for storing the confusion matrix.
protected double m_Correct
     The weight of all correctly classified instances.
protected CostMatrix m_CostMatrix
     The cost matrix (if given).
protected Estimator m_ErrorEstimator
     Numeric class error estimator for scheme.
protected double m_Incorrect
     The weight of all incorrectly classified instances.
protected double m_MarginCounts
     Cumulative margin distribution.
protected double m_MissingClass
     The weight of all instances that had no class assigned to them.
protected boolean m_NoPriors
     Enables/disables the use of priors, e.g., if no training set is present in case of de-serialized schemes.
protected int m_NumClasses
     The number of classes.
protected int m_NumFolds
     The number of folds for a cross-validation.
protected int m_NumTrainClassVals
     Number of non-missing class training instances seen.
protected Estimator m_PriorErrorEstimator
     Numeric class error estimator for prior.
protected double m_SumAbsErr
     Sum of absolute errors.
protected double m_SumClass
     Sum of class values.
protected double m_SumClassPredicted
     Sum of predicted * class values.
protected double m_SumErr
     Sum of errors.
protected double m_SumKBInfo
     Total Kononenko & Bratko Information.
protected double m_SumPredicted
     Sum of predicted values.
protected double m_SumPriorAbsErr
     Sum of absolute errors of the prior.
protected double m_SumPriorEntropy
     Total entropy of prior predictions.
protected double m_SumPriorSqrErr
     Sum of squared errors of the prior.
protected double m_SumSchemeEntropy
     Total entropy of scheme predictions.
protected double m_SumSqrClass
     Sum of squared class values.
protected double m_SumSqrErr
     Sum of squared errors.
protected double m_SumSqrPredicted
     Sum of squared predicted values.
protected double m_TotalCost
     The total cost of predictions (includes instance weights).
protected double[] m_TrainClassVals
     Array containing all numeric training class values seen.
protected double[] m_TrainClassWeights
     Array containing all numeric training class weights.
protected double m_Unclassified
     The weight of all unclassified instances.
protected double m_WithClass
     The weight of all instances that had a class assigned to them.

Constructor Summary
public  Evaluation(Instances data)
     Initializes all the counters for the evaluation.
public  Evaluation(Instances data, CostMatrix costMatrix)
     Initializes all the counters for the evaluation and also takes a cost matrix as parameter.

Method Summary
final public double KBInformation()
     Returns the total Kononenko & Bratko Information score in bits.
final public double KBMeanInformation()
     Returns the Kononenko & Bratko Information score in bits per instance.
final public double KBRelativeInformation()
     Returns the Kononenko & Bratko Relative Information score.
final public double SFEntropyGain()
     Returns the total SF, which is the null model entropy minus the scheme entropy.
final public double SFMeanEntropyGain()
     Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.
final public double SFMeanPriorEntropy()
     Returns the entropy per instance for the null model.
final public double SFMeanSchemeEntropy()
     Returns the entropy per instance for the scheme.
final public double SFPriorEntropy()
     Returns the total entropy for the null model.
final public double SFSchemeEntropy()
     Returns the total entropy for the scheme.
protected void addNumericTrainClass(double classValue, double weight)
     Adds a numeric (non-missing) training class value and weight to the buffer of stored values.
public double areaUnderROC(int classIndex)
     Returns the area under ROC for those predictions that have been collected in the evaluateModel(Classifier, Instances) method.
protected static String attributeValuesString(Instance instance, Range attRange)
     Builds a string listing the attribute values in a specified range of indices, separated by commas and enclosed in brackets.
final public double avgCost()
     Gets the average cost, that is, total cost of misclassifications (incorrect plus unclassified) over the total number of instances.
public double[][] confusionMatrix()
     Returns a copy of the confusion matrix.
final public double correct()
     Gets the number of instances correctly classified (that is, for which a correct prediction was made).
final public double correlationCoefficient()
     Returns the correlation coefficient if the class is numeric.
public void crossValidateModel(Classifier classifier, Instances data, int numFolds, Random random)
     Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
public void crossValidateModel(String classifierString, Instances data, int numFolds, String[] options, Random random)
     Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
public boolean equals(Object obj)
     Tests whether the current evaluation object is equal to another evaluation object.
final public double errorRate()
     Returns the estimated error rate or the root mean squared error (if the class is numeric).
public static String evaluateModel(String classifierString, String[] options)
     Evaluates a classifier with the options given in an array of strings.
public static String evaluateModel(Classifier classifier, String[] options)
     Evaluates a classifier with the options given in an array of strings.
public double[] evaluateModel(Classifier classifier, Instances data)
     Evaluates the classifier on a given set of instances.
public double evaluateModelOnce(Classifier classifier, Instance instance)
     Evaluates the classifier on a single instance.
public double evaluateModelOnce(double[] dist, Instance instance)
     Evaluates the supplied distribution on a single instance.
public void evaluateModelOnce(double prediction, Instance instance)
     Evaluates the supplied prediction on a single instance.
public double evaluateModelOnceAndRecordPrediction(Classifier classifier, Instance instance)
     Evaluates the classifier on a single instance and records the prediction (if the class is nominal).
public double evaluateModelOnceAndRecordPrediction(double[] dist, Instance instance)
     Evaluates the supplied distribution on a single instance.
public double fMeasure(int classIndex)
     Calculate the F-Measure with respect to a particular class.
public double falseNegativeRate(int classIndex)
     Calculate the false negative rate with respect to a particular class.
public double falsePositiveRate(int classIndex)
     Calculate the false positive rate with respect to a particular class.
public double[] getClassPriors()
protected static CostMatrix handleCostOption(String costFileName, int numClasses)
     Attempts to load a cost matrix.
     Parameters:
       costFileName - the filename of the cost matrix
     Parameters:
       numClasses - the number of classes that should be in the cost matrix (only used if the cost file is in old format).
final public double incorrect()
     Gets the number of instances incorrectly classified (that is, for which an incorrect prediction was made).
final public double kappa()
     Returns value of kappa statistic if class is nominal.
public static void main(String[] args)
     A test method for this class.
protected double[] makeDistribution(double predictedClass)
protected static String makeOptionString(Classifier classifier)
final public double meanAbsoluteError()
     Returns the mean absolute error.
final public double meanPriorAbsoluteError()
     Returns the mean absolute error of the prior.
protected String num2ShortID(int num, char[] IDChars, int IDWidth)
     Method for generating indices for the confusion matrix.
public double numFalseNegatives(int classIndex)
     Calculate number of false negatives with respect to a particular class.
public double numFalsePositives(int classIndex)
     Calculate number of false positives with respect to a particular class.
final public double numInstances()
     Gets the number of test instances that had a known class value (actually the sum of the weights of test instances with known class value).
public double numTrueNegatives(int classIndex)
     Calculate the number of true negatives with respect to a particular class.
public double numTruePositives(int classIndex)
     Calculate the number of true positives with respect to a particular class.
final public double pctCorrect()
     Gets the percentage of instances correctly classified (that is, for which a correct prediction was made).
final public double pctIncorrect()
     Gets the percentage of instances incorrectly classified (that is, for which an incorrect prediction was made).
final public double pctUnclassified()
     Gets the percentage of instances not classified (that is, for which no prediction was made by the classifier).
public double precision(int classIndex)
     Calculate the precision with respect to a particular class.
protected static String predictionText(Classifier classifier, Instance inst, int instNum, Range attributesToOutput, boolean printDistribution)
public FastVector predictions()
     Returns the predictions that have been collected, as a reference to the FastVector containing them.
protected static String printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput)
     Prints the predictions for the given dataset into a String variable.
protected static String printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput, boolean printDistribution)
     Prints the predictions for the given dataset into a String variable.
final public double priorEntropy()
public double recall(int classIndex)
     Calculate the recall with respect to a particular class.
final public double relativeAbsoluteError()
     Returns the relative absolute error.
final public double rootMeanPriorSquaredError()
     Returns the root mean prior squared error.
final public double rootMeanSquaredError()
     Returns the root mean squared error.
final public double rootRelativeSquaredError()
     Returns the root relative squared error if the class is numeric.
protected void setNumericPriorsFromBuffer()
     Sets up the priors for numeric class attributes from the training class values that have been seen so far.
public void setPriors(Instances train)
public String toClassDetailsString()
     Generates a breakdown of the accuracy for each class (with default title), incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure.
public String toClassDetailsString(String title)
     Generates a breakdown of the accuracy for each class, incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure.
public String toCumulativeMarginDistributionString()
     Outputs the cumulative margin distribution as a string suitable for input to gnuplot or a similar package.
public String toMatrixString()
     Calls toMatrixString() with a default title.
public String toMatrixString(String title)
     Outputs the performance statistics as a classification confusion matrix.
public String toSummaryString()
public String toSummaryString(boolean printComplexityStatistics)
     Calls toSummaryString() with a default title.
public String toSummaryString(String title, boolean printComplexityStatistics)
     Outputs the performance statistics in summary form.
final public double totalCost()
     Gets the total cost, that is, the cost of each prediction times the weight of the instance, summed over all instances.
public double trueNegativeRate(int classIndex)
     Calculate the true negative rate with respect to a particular class.
public double truePositiveRate(int classIndex)
     Calculate the true positive rate with respect to a particular class.
final public double unclassified()
     Gets the number of instances not classified (that is, for which no prediction was made by the classifier).
protected void updateMargins(double[] predictedDistribution, int actualClass, double weight)
protected void updateNumericScores(double[] predicted, double[] actual, double weight)
     Update the numeric accuracy measures.
public void updatePriors(Instance instance)
protected void updateStatsForClassifier(double[] predictedDistribution, Instance instance)
     Updates all the statistics about a classifier's performance for the current test instance.
protected void updateStatsForPredictor(double predictedValue, Instance instance)
     Updates all the statistics about a predictor's performance for the current test instance.
public void useNoPriors()
     Disables the use of priors, e.g., in case of de-serialized schemes that have no access to the original training set, but are evaluated on a test set.
protected static String wekaStaticWrapper(Sourcable classifier, String className)
     Wraps a static classifier in enough source to test using the weka class libraries.

Field Detail
MIN_SF_PROB
final protected static double MIN_SF_PROB
The minimum probability accepted from an estimator to avoid taking log(0) in Sf calculations.

k_MarginResolution
protected static int k_MarginResolution
Resolution of the margin histogram.

m_ClassIsNominal
protected boolean m_ClassIsNominal
Is the class nominal or numeric?

m_ClassNames
protected String[] m_ClassNames
The names of the classes.

m_ClassPriors
protected double[] m_ClassPriors
The prior probabilities of the classes.

m_ClassPriorsSum
protected double m_ClassPriorsSum
The sum of counts for priors.

m_ConfusionMatrix
protected double[][] m_ConfusionMatrix
Array for storing the confusion matrix.

m_Correct
protected double m_Correct
The weight of all correctly classified instances.

m_CostMatrix
protected CostMatrix m_CostMatrix
The cost matrix (if given).

m_ErrorEstimator
protected Estimator m_ErrorEstimator
Numeric class error estimator for scheme.

m_Incorrect
protected double m_Incorrect
The weight of all incorrectly classified instances.

m_MarginCounts
protected double m_MarginCounts
Cumulative margin distribution.

m_MissingClass
protected double m_MissingClass
The weight of all instances that had no class assigned to them.

m_NoPriors
protected boolean m_NoPriors
Enables/disables the use of priors, e.g., if no training set is present in case of de-serialized schemes.

m_NumClasses
protected int m_NumClasses
The number of classes.

m_NumFolds
protected int m_NumFolds
The number of folds for a cross-validation.

m_NumTrainClassVals
protected int m_NumTrainClassVals
Number of non-missing class training instances seen.

m_PriorErrorEstimator
protected Estimator m_PriorErrorEstimator
Numeric class error estimator for prior.

m_SumAbsErr
protected double m_SumAbsErr
Sum of absolute errors.

m_SumClass
protected double m_SumClass
Sum of class values.

m_SumClassPredicted
protected double m_SumClassPredicted
Sum of predicted * class values.

m_SumErr
protected double m_SumErr
Sum of errors.

m_SumKBInfo
protected double m_SumKBInfo
Total Kononenko & Bratko Information.

m_SumPredicted
protected double m_SumPredicted
Sum of predicted values.

m_SumPriorAbsErr
protected double m_SumPriorAbsErr
Sum of absolute errors of the prior.

m_SumPriorEntropy
protected double m_SumPriorEntropy
Total entropy of prior predictions.

m_SumPriorSqrErr
protected double m_SumPriorSqrErr
Sum of squared errors of the prior.

m_SumSchemeEntropy
protected double m_SumSchemeEntropy
Total entropy of scheme predictions.

m_SumSqrClass
protected double m_SumSqrClass
Sum of squared class values.

m_SumSqrErr
protected double m_SumSqrErr
Sum of squared errors.

m_SumSqrPredicted
protected double m_SumSqrPredicted
Sum of squared predicted values.

m_TotalCost
protected double m_TotalCost
The total cost of predictions (includes instance weights).

m_TrainClassVals
protected double[] m_TrainClassVals
Array containing all numeric training class values seen.

m_TrainClassWeights
protected double[] m_TrainClassWeights
Array containing all numeric training class weights.

m_Unclassified
protected double m_Unclassified
The weight of all unclassified instances.

m_WithClass
protected double m_WithClass
The weight of all instances that had a class assigned to them.




Constructor Detail
Evaluation
public Evaluation(Instances data) throws Exception
Initializes all the counters for the evaluation. Use useNoPriors() if the dataset is the test set and you cannot initialize with the priors from the training set via setPriors(Instances).
Parameters:
  data - set of training instances, to get some header information and prior class distribution information
throws:
  Exception - if the class is not defined
See Also:   Evaluation.useNoPriors()
See Also:   Evaluation.setPriors(Instances)



Evaluation
public Evaluation(Instances data, CostMatrix costMatrix) throws Exception
Initializes all the counters for the evaluation and also takes a cost matrix as parameter. Use useNoPriors() if the dataset is the test set and you cannot initialize with the priors from the training set via setPriors(Instances).
Parameters:
  data - set of training instances, to get some header information and prior class distribution information
Parameters:
  costMatrix - the cost matrix; if null, default costs will be used
throws:
  Exception - if cost matrix is not compatible with data, the class is not defined or the class is numeric
See Also:   Evaluation.useNoPriors()
See Also:   Evaluation.setPriors(Instances)




Method Detail
KBInformation
final public double KBInformation() throws Exception
Returns the total Kononenko & Bratko Information score in bits.
Returns:
  the K&B information score
throws:
  Exception - if the class is not nominal

KBMeanInformation
final public double KBMeanInformation() throws Exception
Returns the Kononenko & Bratko Information score in bits per instance.
Returns:
  the K&B information score
throws:
  Exception - if the class is not nominal

KBRelativeInformation
final public double KBRelativeInformation() throws Exception
Returns the Kononenko & Bratko Relative Information score.
Returns:
  the K&B relative information score
throws:
  Exception - if the class is not nominal
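
As a rough illustration of what a Kononenko & Bratko information score measures, the sketch below scores one instance from the prior and the predicted probability of its actual class, then sums or averages over instances. It follows the commonly cited definition of the score and is not taken from this class's source; KBInfoSketch and its method names are hypothetical.

```java
// Hypothetical helper, not part of Weka: Kononenko & Bratko information score.
final class KBInfoSketch {

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Score in bits for one instance, given the prior probability and the
    // predicted probability of the class the instance actually belongs to.
    static double instanceScore(double priorProb, double predictedProb) {
        if (predictedProb >= priorProb) {
            // The prediction is at least as confident as the prior: information gained.
            return log2(predictedProb) - log2(priorProb);
        }
        // The prediction is less confident than the prior: information lost.
        return -(log2(1.0 - predictedProb) - log2(1.0 - priorProb));
    }

    // Total score in bits: weighted sum over all instances.
    static double total(double[] priorProbs, double[] predictedProbs, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < priorProbs.length; i++) {
            sum += instanceScore(priorProbs[i], predictedProbs[i]) * weights[i];
        }
        return sum;
    }

    // Mean score in bits per instance: total divided by the summed weights.
    static double mean(double[] priorProbs, double[] predictedProbs, double[] weights) {
        double weightSum = 0.0;
        for (double w : weights) {
            weightSum += w;
        }
        return total(priorProbs, predictedProbs, weights) / weightSum;
    }
}
```

A prediction equal to the prior scores zero bits, so a classifier that only reproduces the class distribution gains nothing under this measure.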



SFEntropyGain
final public double SFEntropyGain()
Returns the total SF, which is the null model entropy minus the scheme entropy.
Returns:
  the total SF

SFMeanEntropyGain
final public double SFMeanEntropyGain()
Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.
Returns:
  the SF per instance

SFMeanPriorEntropy
final public double SFMeanPriorEntropy()
Returns the entropy per instance for the null model.
Returns:
  the null model entropy per instance

SFMeanSchemeEntropy
final public double SFMeanSchemeEntropy()
Returns the entropy per instance for the scheme.
Returns:
  the scheme entropy per instance

SFPriorEntropy
final public double SFPriorEntropy()
Returns the total entropy for the null model.
Returns:
  the total null model entropy

SFSchemeEntropy
final public double SFSchemeEntropy()
Returns the total entropy for the scheme.
Returns:
  the total scheme entropy
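
The relationship between the SF methods (gain = null model entropy minus scheme entropy) can be sketched for a nominal class as follows. This is an illustration under assumptions, not the class's implementation: SFEntropySketch is a hypothetical name, and the floor MIN_PROB merely plays the role that MIN_SF_PROB is documented to play (avoiding log(0)); its value here is chosen for the example.

```java
// Hypothetical helper, not part of Weka: entropy bookkeeping for the SF measures.
final class SFEntropySketch {

    // Assumed floor used to avoid log(0); stands in for MIN_SF_PROB.
    static final double MIN_PROB = Double.MIN_VALUE;

    // Cost in bits of coding an outcome that was given probability prob.
    static double bits(double prob) {
        return -Math.log(Math.max(prob, MIN_PROB)) / Math.log(2.0);
    }

    // Total entropy: weighted sum of the coding cost of each instance's actual class.
    static double totalEntropy(double[] probOfActualClass, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < probOfActualClass.length; i++) {
            sum += bits(probOfActualClass[i]) * weights[i];
        }
        return sum;
    }

    // Total gain: null model (prior) entropy minus scheme entropy.
    static double entropyGain(double[] priorProbs, double[] schemeProbs, double[] weights) {
        return totalEntropy(priorProbs, weights) - totalEntropy(schemeProbs, weights);
    }

    // Gain per instance: total gain divided by the summed weights.
    static double meanEntropyGain(double[] priorProbs, double[] schemeProbs, double[] weights) {
        double weightSum = 0.0;
        for (double w : weights) {
            weightSum += w;
        }
        return entropyGain(priorProbs, schemeProbs, weights) / weightSum;
    }
}
```

A positive gain means the scheme assigns more probability to the actual classes than the prior does.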



addNumericTrainClass
protected void addNumericTrainClass(double classValue, double weight)
Adds a numeric (non-missing) training class value and weight to the buffer of stored values.
Parameters:
  classValue - the class value
Parameters:
  weight - the instance weight



areaUnderROC
public double areaUnderROC(int classIndex)
Returns the area under ROC for those predictions that have been collected in the evaluateModel(Classifier, Instances) method. Returns Instance.missingValue() if the area is not available.
Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the area under the ROC curve or not a number
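
For intuition about the returned value, the area under the ROC curve equals the probability that a randomly chosen positive instance is scored above a randomly chosen negative one, with ties counted as one half. The sketch below computes it that way by comparing all pairs. It does not reproduce how this class derives the value from its collected predictions; RocAreaSketch and its parameters are hypothetical.

```java
// Hypothetical helper, not part of Weka: ROC area via pairwise comparison.
final class RocAreaSketch {

    // scores[i]     predicted probability of the "positive" class for instance i
    // isPositive[i] whether instance i actually belongs to the positive class
    // Returns NaN when there is no positive/negative pair to compare.
    static double areaUnderROC(double[] scores, boolean[] isPositive) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!isPositive[i]) {
                continue;
            }
            for (int j = 0; j < scores.length; j++) {
                if (isPositive[j]) {
                    continue;
                }
                pairs++;
                if (scores[i] > scores[j]) {
                    wins += 1.0;   // positive ranked above negative
                } else if (scores[i] == scores[j]) {
                    wins += 0.5;   // tie counts as half
                }
            }
        }
        return pairs == 0 ? Double.NaN : wins / pairs;
    }
}
```

The quadratic loop keeps the sketch short; sorting by score would be the usual choice for large test sets.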



attributeValuesString
protected static String attributeValuesString(Instance instance, Range attRange)
Builds a string listing the attribute values in a specified range of indices, separated by commas and enclosed in brackets.
Parameters:
  instance - the instance to print the values from
Parameters:
  attRange - the range of the attributes to list
Returns:
  a string listing values of the attributes in the range



avgCost
final public double avgCost()
Gets the average cost, that is, total cost of misclassifications (incorrect plus unclassified) over the total number of instances.
Returns:
  the average cost
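
The arithmetic behind a total and an average cost can be sketched from a confusion matrix and a cost matrix. The sketch assumes both matrices are indexed as [actual][predicted] and ignores unclassified instances, so it is a simplification rather than this class's bookkeeping; CostSketch is a hypothetical name.

```java
// Hypothetical helper, not part of Weka: cost totals from a confusion matrix.
// Assumed layout for both matrices: [actual class][predicted class].
final class CostSketch {

    // Sum over all cells of (instance weight in the cell) * (cost of that outcome).
    static double totalCost(double[][] confusion, double[][] cost) {
        double total = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    // Total cost divided by the summed weight of all instances in the matrix.
    static double avgCost(double[][] confusion, double[][] cost) {
        double weightSum = 0.0;
        for (double[] row : confusion) {
            for (double cell : row) {
                weightSum += cell;
            }
        }
        return totalCost(confusion, cost) / weightSum;
    }
}
```

With zero costs on the diagonal, only misclassifications contribute to the total.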



confusionMatrix
public double[][] confusionMatrix()
Returns a copy of the confusion matrix.
Returns:
  a copy of the confusion matrix as a two-dimensional array
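
The per-class measures listed in the method summary (true/false positives, precision, recall, F-Measure) can all be derived from such a two-dimensional array. The sketch below shows the standard definitions, assuming rows are actual classes and columns are predicted classes; returning 0 when a denominator is 0 is a choice made here, and ConfusionSketch is a hypothetical name.

```java
// Hypothetical helper, not part of Weka: per-class statistics from a confusion matrix.
// Assumed layout: matrix[actual][predicted] holds the summed instance weights.
final class ConfusionSketch {

    static double truePositives(double[][] matrix, int classIndex) {
        return matrix[classIndex][classIndex];
    }

    // Instances of the class that were predicted as some other class.
    static double falseNegatives(double[][] matrix, int classIndex) {
        double sum = 0.0;
        for (int j = 0; j < matrix.length; j++) {
            if (j != classIndex) {
                sum += matrix[classIndex][j];
            }
        }
        return sum;
    }

    // Instances of other classes that were predicted as this class.
    static double falsePositives(double[][] matrix, int classIndex) {
        double sum = 0.0;
        for (int i = 0; i < matrix.length; i++) {
            if (i != classIndex) {
                sum += matrix[i][classIndex];
            }
        }
        return sum;
    }

    static double precision(double[][] matrix, int classIndex) {
        double tp = truePositives(matrix, classIndex);
        double denom = tp + falsePositives(matrix, classIndex);
        return denom == 0.0 ? 0.0 : tp / denom;
    }

    static double recall(double[][] matrix, int classIndex) {
        double tp = truePositives(matrix, classIndex);
        double denom = tp + falseNegatives(matrix, classIndex);
        return denom == 0.0 ? 0.0 : tp / denom;
    }

    // Harmonic mean of precision and recall.
    static double fMeasure(double[][] matrix, int classIndex) {
        double p = precision(matrix, classIndex);
        double r = recall(matrix, classIndex);
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }
}
```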



correct
final public double correct()
Gets the number of instances correctly classified (that is, for which a correct prediction was made). This is actually the sum of the weights of these instances.
Returns:
  the number of correctly classified instances



correlationCoefficient
final public double correlationCoefficient() throws Exception
Returns the correlation coefficient if the class is numeric.
Returns:
  the correlation coefficient
throws:
  Exception - if class is not numeric
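
The running sums kept in the fields m_SumClass, m_SumPredicted, m_SumSqrClass, m_SumSqrPredicted and m_SumClassPredicted are exactly what a Pearson correlation needs. The sketch below shows that computation for unweighted instances; it is an illustration, not the method's source, and CorrelationSketch and the zero-variance fallback are assumptions.

```java
// Hypothetical helper, not part of Weka: Pearson correlation from running sums.
final class CorrelationSketch {

    // actual[i] is the true numeric class value, predicted[i] the model's output.
    static double correlation(double[] actual, double[] predicted) {
        double n = actual.length;
        double sumClass = 0.0, sumPredicted = 0.0;
        double sumSqrClass = 0.0, sumSqrPredicted = 0.0, sumClassPredicted = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sumClass += actual[i];
            sumPredicted += predicted[i];
            sumSqrClass += actual[i] * actual[i];
            sumSqrPredicted += predicted[i] * predicted[i];
            sumClassPredicted += actual[i] * predicted[i];
        }
        // Unnormalized variances and covariance (the common factor 1/n cancels).
        double varActual = sumSqrClass - sumClass * sumClass / n;
        double varPredicted = sumSqrPredicted - sumPredicted * sumPredicted / n;
        double varProd = sumClassPredicted - sumClass * sumPredicted / n;
        if (varActual * varPredicted <= 0.0) {
            return 0.0; // assumed fallback when either series is constant
        }
        return varProd / Math.sqrt(varActual * varPredicted);
    }
}
```

Keeping only sums means the coefficient can be updated one instance at a time without storing the predictions.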



crossValidateModel
public void crossValidateModel(Classifier classifier, Instances data, int numFolds, Random random) throws Exception
Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances. Now performs a deep copy of the classifier before each call to buildClassifier() (just in case the classifier is not initialized properly).
Parameters:
  classifier - the classifier with any options set.
Parameters:
  data - the data on which the cross-validation is to be performed
Parameters:
  numFolds - the number of folds for the cross-validation
Parameters:
  random - random number generator for randomization
throws:
  Exception - if a classifier could not be generated successfully or the class is not defined
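
What "stratified" means for a nominal class can be sketched as follows: after shuffling, instances are grouped by class and dealt to the folds in turn, so each fold receives roughly the same class proportions as the whole dataset. This is one simple way to stratify and is not claimed to match the procedure used by this method; StratifiedFoldsSketch and assignFolds are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka: assigns instances to stratified folds.
final class StratifiedFoldsSketch {

    // classOf[i] is the nominal class index of instance i.
    // Returns an array giving the fold number (0-based) of every instance.
    static int[] assignFolds(int[] classOf, int numClasses, int numFolds, Random random) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < classOf.length; i++) {
            order.add(i);
        }
        Collections.shuffle(order, random);

        // Group the shuffled indices by class, keeping the shuffled order within a class.
        List<Integer> grouped = new ArrayList<>();
        for (int c = 0; c < numClasses; c++) {
            for (int index : order) {
                if (classOf[index] == c) {
                    grouped.add(index);
                }
            }
        }

        // Deal the grouped instances to the folds like cards, one fold after another.
        int[] fold = new int[classOf.length];
        for (int pos = 0; pos < grouped.size(); pos++) {
            fold[grouped.get(pos)] = pos % numFolds;
        }
        return fold;
    }
}
```

Each fold then serves once as the test set while the remaining folds are used for training.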



crossValidateModel
public void crossValidateModel(String classifierString, Instances data, int numFolds, String[] options, Random random) throws Exception
Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
Parameters:
  classifierString - a string naming the class of the classifier
Parameters:
  data - the data on which the cross-validation is to be performed
Parameters:
  numFolds - the number of folds for the cross-validation
Parameters:
  options - the options to the classifier. Any options accepted by the classifier will be removed from this array.
Parameters:
  random - the random number generator for randomizing the data
throws:
  Exception - if a classifier could not be generated successfully or the class is not defined



equals
public boolean equals(Object obj)
Tests whether the current evaluation object is equal to another evaluation object.
Parameters:
  obj - the object to compare against
Returns:
  true if the two objects are equal



errorRate
final public double errorRate()
Returns the estimated error rate or the root mean squared error (if the class is numeric). If a cost matrix was given this error rate gives the average cost.
Returns:
  the estimated error rate (between 0 and 1, or between 0 and maximum cost)
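
The two cases described above, a misclassification rate for a nominal class and a root mean squared error for a numeric class, can be sketched with the standard formulas. The cost-matrix case is left out, instances are treated as unweighted in the numeric case, and ErrorRateSketch is a hypothetical name rather than anything from this class.

```java
// Hypothetical helper, not part of Weka: the two flavours of "error rate".
final class ErrorRateSketch {

    // Nominal class: weight of off-diagonal cells over the total weight.
    // Assumed layout: confusion[actual][predicted].
    static double nominalErrorRate(double[][] confusion) {
        double incorrect = 0.0;
        double total = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted];
                if (actual != predicted) {
                    incorrect += confusion[actual][predicted];
                }
            }
        }
        return incorrect / total;
    }

    // Numeric class: square root of the mean squared difference.
    static double rootMeanSquaredError(double[] actual, double[] predicted) {
        double sumSqrErr = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double err = predicted[i] - actual[i];
            sumSqrErr += err * err;
        }
        return Math.sqrt(sumSqrErr / actual.length);
    }
}
```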



evaluateModel
public static String evaluateModel(String classifierString, String[] options) throws Exception
Evaluates a classifier with the options given in an array of strings.

Valid options are:

-t filename
Name of the file with the training data. (required)

-T filename
Name of the file with the test data. If missing, a cross-validation is performed.

-c index
Index of the class attribute (1, 2, ...; default: last).

-x number
The number of folds for the cross-validation (default: 10).

-no-cv
No cross validation. If no test file is provided, no evaluation is done.

-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.

-preserve-order
Preserves the order in the percentage split instead of randomizing the data first with the seed value ('-s').

-s seed
Random number seed for the cross-validation and percentage split (default: 1).

-m filename
The name of a file containing a cost matrix.

-l filename
Loads classifier from the given file. In case the filename ends with ".xml" the options are loaded from XML.

-d filename
Saves the classifier built from the training data into the given file. If the filename ends with ".xml", the options are saved as XML, not the model.

-v
Outputs no statistics for the training data.

-o
Outputs statistics only, not the classifier.

-i
Outputs detailed information-retrieval statistics per class.

-k
Outputs information-theoretic statistics.

-p range
Outputs predictions for test instances (or the train instances if no test instances provided), along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.

-distribution
Outputs the distribution instead of only the prediction in conjunction with the '-p' option (only nominal classes).

-r
Outputs cumulative margin distribution (and nothing else).

-g
Only for classifiers that implement "Graphable." Outputs the graph representation of the classifier (and nothing else).

-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.

-threshold-file file
The file to save the threshold data to. The format is determined by the extensions, e.g., '.arff' for ARFF format or '.csv' for CSV.

-threshold-label label
The class label to determine the threshold data for (default is the first label)


Parameters:
  classifierString - class of machine learning classifier as a string
Parameters:
  options - the array of strings containing the options
Returns:
  a string describing the results
throws:
  Exception - if model could not be evaluated successfully




evaluateModel
public static String evaluateModel(Classifier classifier, String[] options) throws Exception
Evaluates a classifier with the options given in an array of strings.

Valid options are:

-t name of training file
Name of the file with the training data. (required)

-T name of test file
Name of the file with the test data. If missing, a cross-validation is performed.

-c class index
Index of the class attribute (1, 2, ...; default: last).

-x number of folds
The number of folds for the cross-validation (default: 10).

-no-cv
No cross validation. If no test file is provided, no evaluation is done.

-split-percentage percentage
Sets the percentage for the train/test set split, e.g., 66.

-preserve-order
Preserves the order in the percentage split instead of randomizing the data first with the seed value ('-s').

-s seed
Random number seed for the cross-validation and percentage split (default: 1).

-m file with cost matrix
The name of a file containing a cost matrix.

-l filename
Loads classifier from the given file. In case the filename ends with ".xml" the options are loaded from XML.

-d filename
Saves classifier built from the training data into the given file. In case the filename ends with ".xml" the options are saved as XML, not the model.

-v
Outputs no statistics for the training data.

-o
Outputs statistics only, not the classifier.

-i
Outputs detailed information-retrieval statistics per class.

-k
Outputs information-theoretic statistics.

-p range
Outputs predictions for test instances (or the train instances if no test instances provided), along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.

-distribution
Outputs the distribution instead of only the prediction in conjunction with the '-p' option (only nominal classes).

-r
Outputs cumulative margin distribution (and nothing else).

-g
Only for classifiers that implement "Graphable." Outputs the graph representation of the classifier (and nothing else).

-xml filename | xml-string
Retrieves the options from the XML-data instead of the command line.


Parameters:
  classifier - machine learning classifier
Parameters:
  options - the array of strings containing the options
Returns:
  a string describing the results
throws:
  Exception - if model could not be evaluated successfully
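The option list above maps directly onto a String array. The following is a minimal sketch of assembling such an array for a cross-validation run; the class name, helper method and file path are hypothetical, and the call into Weka is left as a comment because it needs the Weka jar on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

final class EvalOptionsSketch {
    /** Builds an options array for a cross-validation run (hypothetical helper). */
    static String[] crossValidationOptions(String trainFile, int folds, int seed) {
        List<String> opts = new ArrayList<>();
        opts.add("-t"); opts.add(trainFile);                // training data (required)
        opts.add("-x"); opts.add(Integer.toString(folds));  // number of folds
        opts.add("-s"); opts.add(Integer.toString(seed));   // random number seed
        opts.add("-i");                                     // per-class IR statistics
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "data/train.arff" is a placeholder path, not a file shipped with Weka.
        String[] options = crossValidationOptions("data/train.arff", 5, 42);
        System.out.println(String.join(" ", options));
        // With Weka available, the array would then be handed over, e.g.:
        // String report = Evaluation.evaluateModel(classifier, options);
    }
}
```

Each flag and its value occupy separate array elements, the same way a shell would split them on the command line.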




evaluateModel
public double[] evaluateModel(Classifier classifier, Instances data) throws Exception(Code)
Evaluates the classifier on a given set of instances. Note that the data must have exactly the same format (e.g. order of attributes) as the data used to train the classifier! Otherwise the results will generally be meaningless.
Parameters:
  classifier - machine learning classifier
Parameters:
  data - set of test instances for evaluation
Returns:
  the predictions
throws:
  Exception - if model could not be evaluated successfully



evaluateModelOnce
public double evaluateModelOnce(Classifier classifier, Instance instance) throws Exception(Code)
Evaluates the classifier on a single instance.
Parameters:
  classifier - machine learning classifier
Parameters:
  instance - the test instance to be classified
Returns:
  the prediction made by the classifier
throws:
  Exception - if model could not be evaluated successfully or the data contains string attributes



evaluateModelOnce
public double evaluateModelOnce(double[] dist, Instance instance) throws Exception(Code)
Evaluates the supplied distribution on a single instance.
Parameters:
  dist - the supplied distribution
Parameters:
  instance - the test instance to be classified
Returns:
  the prediction
throws:
  Exception - if model could not be evaluated successfully



evaluateModelOnce
public void evaluateModelOnce(double prediction, Instance instance) throws Exception(Code)
Evaluates the supplied prediction on a single instance.
Parameters:
  prediction - the supplied prediction
Parameters:
  instance - the test instance to be classified
throws:
  Exception - if model could not be evaluated successfully



evaluateModelOnceAndRecordPrediction
public double evaluateModelOnceAndRecordPrediction(Classifier classifier, Instance instance) throws Exception(Code)
Evaluates the classifier on a single instance and records the prediction (if the class is nominal).
Parameters:
  classifier - machine learning classifier
Parameters:
  instance - the test instance to be classified
Returns:
  the prediction made by the classifier
throws:
  Exception - if model could not be evaluated successfully or the data contains string attributes



evaluateModelOnceAndRecordPrediction
public double evaluateModelOnceAndRecordPrediction(double[] dist, Instance instance) throws Exception(Code)
Evaluates the supplied distribution on a single instance.
Parameters:
  dist - the supplied distribution
Parameters:
  instance - the test instance to be classified
Returns:
  the prediction
throws:
  Exception - if model could not be evaluated successfully



fMeasure
public double fMeasure(int classIndex)(Code)
Calculate the F-Measure with respect to a particular class. This is defined as

 2 * recall * precision
 ----------------------
 recall + precision
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the F-Measure
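The formula above can be sketched in plain Java. This is an illustration and not Weka's source: the class name is hypothetical, the matrix layout cm[actual][predicted] is an assumption, and returning 0 for an empty denominator is a choice made here to avoid dividing by zero.

```java
final class FMeasureSketch {
    /** Correctly classified positives divided by everything predicted as class c. */
    static double precision(double[][] cm, int c) {
        double predictedAsC = 0;
        for (int a = 0; a < cm.length; a++) predictedAsC += cm[a][c];
        return predictedAsC == 0 ? 0 : cm[c][c] / predictedAsC;
    }

    /** Correctly classified positives divided by all instances whose actual class is c. */
    static double recall(double[][] cm, int c) {
        double actualC = 0;
        for (int p = 0; p < cm[c].length; p++) actualC += cm[c][p];
        return actualC == 0 ? 0 : cm[c][c] / actualC;
    }

    /** 2 * recall * precision / (recall + precision). */
    static double fMeasure(double[][] cm, int c) {
        double p = precision(cm, c);
        double r = recall(cm, c);
        return (p + r) == 0 ? 0 : 2 * r * p / (r + p);
    }

    public static void main(String[] args) {
        double[][] cm = { {6, 4}, {1, 9} }; // rows: actual class, columns: predicted class
        System.out.println(fMeasure(cm, 0));
    }
}
```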



falseNegativeRate
public double falseNegativeRate(int classIndex)(Code)
Calculate the false negative rate with respect to a particular class. This is defined as

 incorrectly classified positives
 --------------------------------
 total positives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the false negative rate



falsePositiveRate
public double falsePositiveRate(int classIndex)(Code)
Calculate the false positive rate with respect to a particular class. This is defined as

 incorrectly classified negatives
 --------------------------------
 total negatives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the false positive rate
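The two error rates defined above differ only in which rows of the confusion matrix they read. A minimal sketch, assuming a matrix laid out as cm[actual][predicted] and a hypothetical class name; it illustrates the definitions rather than reproducing Weka's code.

```java
final class ErrorRateSketch {
    /** Incorrectly classified positives divided by total positives. */
    static double falseNegativeRate(double[][] cm, int c) {
        double missed = 0, positives = 0;
        for (int p = 0; p < cm[c].length; p++) {
            positives += cm[c][p];
            if (p != c) missed += cm[c][p];
        }
        return positives == 0 ? 0 : missed / positives;
    }

    /** Incorrectly classified negatives divided by total negatives. */
    static double falsePositiveRate(double[][] cm, int c) {
        double falseAlarms = 0, negatives = 0;
        for (int a = 0; a < cm.length; a++) {
            if (a == c) continue; // only rows whose actual class is not c
            for (int p = 0; p < cm[a].length; p++) {
                negatives += cm[a][p];
                if (p == c) falseAlarms += cm[a][p];
            }
        }
        return negatives == 0 ? 0 : falseAlarms / negatives;
    }
}
```

For a three-class matrix, every class other than classIndex counts as "negative", so the rates are computed one-vs-rest.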



getClassPriors
public double[] getClassPriors()(Code)
Gets the current weighted class counts.
Returns:
  the weighted class counts



handleCostOption
protected static CostMatrix handleCostOption(String costFileName, int numClasses) throws Exception(Code)
Attempts to load a cost matrix.
Parameters:
  costFileName - the filename of the cost matrix
Parameters:
  numClasses - the number of classes that should be in the cost matrix (only used if the cost file is in old format)
Returns:
  a CostMatrix value, or null if costFileName is empty
throws:
  Exception - if an error occurs.



incorrect
final public double incorrect()(Code)
Gets the number of instances incorrectly classified (that is, for which an incorrect prediction was made). This is actually the sum of the weights of these instances.
Returns:
  the number of incorrectly classified instances



kappa
final public double kappa()(Code)
Returns the value of the kappa statistic if the class is nominal.
Returns:
  the value of the kappa statistic
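The page does not spell out the kappa formula. The sketch below assumes the usual Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement), computed from a confusion matrix laid out as cm[actual][predicted]. The class name is hypothetical and the matrix is assumed to be non-empty.

```java
final class KappaSketch {
    static double kappa(double[][] cm) {
        int n = cm.length;
        double[] rowSum = new double[n];
        double[] colSum = new double[n];
        double total = 0, agree = 0;
        for (int a = 0; a < n; a++) {
            for (int p = 0; p < n; p++) {
                rowSum[a] += cm[a][p];
                colSum[p] += cm[a][p];
                total += cm[a][p];
                if (a == p) agree += cm[a][p];
            }
        }
        double observed = agree / total;
        double chance = 0; // agreement expected if predictions were independent of the truth
        for (int i = 0; i < n; i++) {
            chance += (rowSum[i] / total) * (colSum[i] / total);
        }
        // When chance agreement is already 1 the ratio is undefined; 1 is returned here.
        return chance < 1 ? (observed - chance) / (1 - chance) : 1.0;
    }
}
```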



main
public static void main(String[] args)(Code)
A test method for this class. Just extracts the first command line argument as a classifier class name and calls evaluateModel.
Parameters:
  args - an array of command line arguments, the first of which must be the class name of a classifier.



makeDistribution
protected double[] makeDistribution(double predictedClass)(Code)
Convert a single prediction into a probability distribution with all zero probabilities except the predicted value, which has probability 1.0.
Parameters:
  predictedClass - the index of the predicted class
Returns:
  the probability distribution
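As a rough illustration of the conversion described above, the sketch below builds such a distribution. The extra numClasses parameter is an assumption (the documented method takes only the predicted class), and treating NaN as "no prediction" that yields an all-zero distribution is likewise an assumption of this sketch.

```java
final class OneHotSketch {
    static double[] makeDistribution(double predictedClass, int numClasses) {
        double[] dist = new double[numClasses]; // Java initialises every entry to 0.0
        if (Double.isNaN(predictedClass)) {
            return dist; // assumed encoding of a missing prediction
        }
        dist[(int) predictedClass] = 1.0;
        return dist;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(makeDistribution(2.0, 4)));
    }
}
```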



makeOptionString
protected static String makeOptionString(Classifier classifier)(Code)
Makes up the help string giving all the command line options.
Parameters:
  classifier - the classifier to include options for
Returns:
  a string detailing the valid command line options



meanAbsoluteError
final public double meanAbsoluteError()(Code)
Returns the mean absolute error. Refers to the error of the predicted values for numeric classes, and the error of the predicted probability distribution for nominal classes.
Returns:
  the mean absolute error
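For the numeric case, the mean absolute error can be sketched as a weighted average of absolute differences. The class name and the parallel-array representation are assumptions made for illustration; the nominal case, which compares probability distributions, is not covered here.

```java
final class MeanAbsoluteErrorSketch {
    /** Weighted mean of |predicted - actual|; all three arrays must have equal length. */
    static double meanAbsoluteError(double[] predicted, double[] actual, double[] weights) {
        double weightedErrors = 0, totalWeight = 0;
        for (int i = 0; i < predicted.length; i++) {
            weightedErrors += weights[i] * Math.abs(predicted[i] - actual[i]);
            totalWeight += weights[i];
        }
        return weightedErrors / totalWeight;
    }
}
```

With all weights equal to 1 this reduces to the ordinary unweighted mean.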



meanPriorAbsoluteError
final public double meanPriorAbsoluteError()(Code)
Returns the mean absolute error of the prior.
Returns:
  the mean absolute error



num2ShortID
protected String num2ShortID(int num, char[] IDChars, int IDWidth)(Code)
Method for generating indices for the confusion matrix.
Parameters:
  num - integer to format
Parameters:
  IDChars - the characters to use
Parameters:
  IDWidth - the width of the entry
Returns:
  the formatted integer as a string



numFalseNegatives
public double numFalseNegatives(int classIndex)(Code)
Calculate number of false negatives with respect to a particular class. This is defined as

 incorrectly classified positives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the number of false negatives



numFalsePositives
public double numFalsePositives(int classIndex)(Code)
Calculate number of false positives with respect to a particular class. This is defined as

 incorrectly classified negatives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the number of false positives



numInstances
final public double numInstances()(Code)
Gets the number of test instances that had a known class value (actually the sum of the weights of test instances with known class value).
Returns:
  the number of test instances with known class



numTrueNegatives
public double numTrueNegatives(int classIndex)(Code)
Calculate the number of true negatives with respect to a particular class. This is defined as

 correctly classified negatives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the number of true negatives



numTruePositives
public double numTruePositives(int classIndex)(Code)
Calculate the number of true positives with respect to a particular class. This is defined as

 correctly classified positives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the number of true positives
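The four counts (true/false positives and negatives) partition the confusion matrix once a class is chosen as "positive". A minimal one-vs-rest sketch, assuming the layout cm[actual][predicted]; the class and method names are hypothetical.

```java
final class OneVsRestCountsSketch {
    /** Returns {TP, FN, FP, TN} for class c. */
    static double[] counts(double[][] cm, int c) {
        double tp = 0, fn = 0, fp = 0, tn = 0;
        for (int a = 0; a < cm.length; a++) {
            for (int p = 0; p < cm[a].length; p++) {
                if (a == c && p == c) tp += cm[a][p];   // positive, predicted positive
                else if (a == c)      fn += cm[a][p];   // positive, predicted as something else
                else if (p == c)      fp += cm[a][p];   // negative, predicted positive
                else                  tn += cm[a][p];   // negative, predicted negative
            }
        }
        return new double[] {tp, fn, fp, tn};
    }
}
```

The four values always sum to the total weight in the matrix, which makes a convenient sanity check.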



pctCorrect
final public double pctCorrect()(Code)
Gets the percentage of instances correctly classified (that is, for which a correct prediction was made).
Returns:
  the percent of correctly classified instances (between 0 and 100)



pctIncorrect
final public double pctIncorrect()(Code)
Gets the percentage of instances incorrectly classified (that is, for which an incorrect prediction was made).
Returns:
  the percent of incorrectly classified instances (between 0 and 100)



pctUnclassified
final public double pctUnclassified()(Code)
Gets the percentage of instances not classified (that is, for which no prediction was made by the classifier).
Returns:
  the percent of unclassified instances (between 0 and 100)
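The three percentages can be pictured as shares of one total. The sketch below assumes that the denominator is the combined weight of correct, incorrect and unclassified instances, so that the three values add up to 100; the class name is hypothetical and the total is assumed to be positive.

```java
final class PercentSketch {
    /** Returns {pctCorrect, pctIncorrect, pctUnclassified} from weighted tallies. */
    static double[] percentages(double correct, double incorrect, double unclassified) {
        double total = correct + incorrect + unclassified; // assumed denominator
        return new double[] {
            100.0 * correct / total,
            100.0 * incorrect / total,
            100.0 * unclassified / total
        };
    }
}
```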



precision
public double precision(int classIndex)(Code)
Calculate the precision with respect to a particular class. This is defined as

 correctly classified positives
 ------------------------------
 total predicted as positive
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the precision



predictionText
protected static String predictionText(Classifier classifier, Instance inst, int instNum, Range attributesToOutput, boolean printDistribution) throws Exception(Code)
Returns the prediction made by the classifier as a string.
Parameters:
  classifier - the classifier to use
Parameters:
  inst - the instance to generate text from
Parameters:
  instNum - the index in the dataset
Parameters:
  attributesToOutput - the indices of the attributes to output
Parameters:
  printDistribution - prints the complete distribution for nominal classes, not just the predicted value
Returns:
  the generated text
throws:
  Exception - if something goes wrong
See Also:   Evaluation.printClassifications(Classifier,Instances,String,int,Range,boolean)



predictions
public FastVector predictions()(Code)
Returns the predictions that have been collected.
Returns:
  a reference to the FastVector containing the predictions that have been collected. This should be null if no predictions have been collected (e.g. if the class is numeric).



printClassifications
protected static String printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput) throws Exception(Code)
Prints the predictions for the given dataset into a String variable.
Parameters:
  classifier - the classifier to use
Parameters:
  train - the training data
Parameters:
  testSource - the test set
Parameters:
  classIndex - the class index (1-based); if -1, it does not override the class index stored in the data file (by using the last attribute)
Parameters:
  attributesToOutput - the indices of the attributes to output
Returns:
  the generated predictions for the attribute range
throws:
  Exception - if test file cannot be opened



printClassifications
protected static String printClassifications(Classifier classifier, Instances train, DataSource testSource, int classIndex, Range attributesToOutput, boolean printDistribution) throws Exception(Code)
Prints the predictions for the given dataset into a String variable.
Parameters:
  classifier - the classifier to use
Parameters:
  train - the training data
Parameters:
  testSource - the test set
Parameters:
  classIndex - the class index (1-based); if -1, it does not override the class index stored in the data file (by using the last attribute)
Parameters:
  attributesToOutput - the indices of the attributes to output
Parameters:
  printDistribution - prints the complete distribution for nominal classes, not just the predicted value
Returns:
  the generated predictions for the attribute range
throws:
  Exception - if test file cannot be opened



priorEntropy
final public double priorEntropy() throws Exception(Code)
Calculate the entropy of the prior distribution.
Returns:
  the entropy of the prior distribution
throws:
  Exception - if the class is not nominal
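The entropy of a prior class distribution can be sketched from weighted class counts. This sketch assumes Shannon entropy measured in bits (log base 2), which the page does not state explicitly; the class name is hypothetical.

```java
final class PriorEntropySketch {
    /** Shannon entropy, in bits, of the distribution implied by the class counts. */
    static double priorEntropy(double[] classCounts) {
        double total = 0;
        for (double c : classCounts) total += c;
        double entropy = 0;
        for (double c : classCounts) {
            if (c > 0) { // a class that never occurs contributes nothing
                double p = c / total;
                entropy -= p * (Math.log(p) / Math.log(2));
            }
        }
        return entropy;
    }
}
```

Two equally frequent classes give one bit; a single class that covers all instances gives zero.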



recall
public double recall(int classIndex)(Code)
Calculate the recall with respect to a particular class. This is defined as

 correctly classified positives
 ------------------------------
 total positives
 

(Which is also the same as the truePositiveRate.)
Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the recall




relativeAbsoluteError
final public double relativeAbsoluteError() throws Exception(Code)
Returns the relative absolute error.
Returns:
  the relative absolute error
throws:
  Exception - if it can't be computed
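The page gives no formula for this measure. The sketch below assumes the common definition for a numeric class: the mean absolute error of the model divided by the mean absolute error of a baseline that always predicts one prior value (for instance the training mean), expressed as a percentage. The names and the unweighted arrays are assumptions of this illustration.

```java
import java.util.Arrays;

final class RelativeAbsoluteErrorSketch {
    static double meanAbsoluteError(double[] predicted, double[] actual) {
        double sum = 0;
        for (int i = 0; i < predicted.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / predicted.length;
    }

    /** 100 * MAE(model) / MAE(baseline that always predicts priorPrediction). */
    static double relativeAbsoluteError(double[] predicted, double[] actual, double priorPrediction) {
        double[] prior = new double[actual.length];
        Arrays.fill(prior, priorPrediction);
        return 100.0 * meanAbsoluteError(predicted, actual) / meanAbsoluteError(prior, actual);
    }
}
```

Under this definition a value below 100 means the model beats the constant baseline, and the ratio is undefined when the baseline error is zero.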



rootMeanPriorSquaredError
final public double rootMeanPriorSquaredError()(Code)
Returns the root mean prior squared error.
Returns:
  the root mean prior squared error



rootMeanSquaredError
final public double rootMeanSquaredError()(Code)
Returns the root mean squared error.
Returns:
  the root mean squared error



rootRelativeSquaredError
final public double rootRelativeSquaredError()(Code)
Returns the root relative squared error if the class is numeric.
Returns:
  the root relative squared error



setNumericPriorsFromBuffer
protected void setNumericPriorsFromBuffer()(Code)
Sets up the priors for numeric class attributes from the training class values that have been seen so far.



setPriors
public void setPriors(Instances train) throws Exception(Code)
Sets the class prior probabilities.
Parameters:
  train - the training instances used to determine the prior probabilities
throws:
  Exception - if the class attribute of the instances is not set



toClassDetailsString
public String toClassDetailsString() throws Exception(Code)
Generates a breakdown of the accuracy for each class (with default title), incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure. Should be useful for ROC curves, recall/precision curves.
Returns:
  the statistics presented as a string
throws:
  Exception - if class is not nominal



toClassDetailsString
public String toClassDetailsString(String title) throws Exception(Code)
Generates a breakdown of the accuracy for each class, incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure. Should be useful for ROC curves, recall/precision curves.
Parameters:
  title - the title to prepend the stats string with
Returns:
  the statistics presented as a string
throws:
  Exception - if class is not nominal



toCumulativeMarginDistributionString
public String toCumulativeMarginDistributionString() throws Exception(Code)
Outputs the cumulative margin distribution as a string suitable for input to gnuplot or a similar package.
Returns:
  the cumulative margin distribution
throws:
  Exception - if the class attribute is not nominal



toMatrixString
public String toMatrixString() throws Exception(Code)
Calls toMatrixString() with a default title.
Returns:
  the confusion matrix as a string
throws:
  Exception - if the class is numeric



toMatrixString
public String toMatrixString(String title) throws Exception(Code)
Outputs the performance statistics as a classification confusion matrix. For each class value, shows the distribution of predicted class values.
Parameters:
  title - the title for the confusion matrix
Returns:
  the confusion matrix as a String
throws:
  Exception - if the class is numeric



toSummaryString
public String toSummaryString()(Code)
Calls toSummaryString() with no title and no complexity stats.
Returns:
  a summary description of the classifier evaluation



toSummaryString
public String toSummaryString(boolean printComplexityStatistics)(Code)
Calls toSummaryString() with a default title.
Parameters:
  printComplexityStatistics - if true, complexity statistics are returned as well
Returns:
  the summary string



toSummaryString
public String toSummaryString(String title, boolean printComplexityStatistics)(Code)
Outputs the performance statistics in summary form. Lists number (and percentage) of instances classified correctly, incorrectly and unclassified. Outputs the total number of instances classified, and the number of instances (if any) that had no class value provided.
Parameters:
  title - the title for the statistics
Parameters:
  printComplexityStatistics - if true, complexity statistics are returned as well
Returns:
  the summary as a String



totalCost
final public double totalCost()(Code)
Gets the total cost, that is, the cost of each prediction times the weight of the instance, summed over all instances.
Returns:
  the total cost
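The sum described above can be sketched with a plain cost matrix. The layout costMatrix[actual][predicted], the parallel arrays and the class name are assumptions made for illustration; Weka's own CostMatrix class is not used here.

```java
final class TotalCostSketch {
    /** Sums weight * cost over all instances, looking the cost up by (actual, predicted). */
    static double totalCost(double[][] costMatrix, int[] actual, int[] predicted, double[] weights) {
        double total = 0;
        for (int i = 0; i < actual.length; i++) {
            total += weights[i] * costMatrix[actual[i]][predicted[i]];
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] cost = { {0, 1}, {5, 0} }; // missing a class-1 instance is five times as costly
        int[] actual = {0, 1, 1, 0};
        int[] predicted = {0, 0, 1, 1};
        double[] weights = {1, 2, 1, 1};
        System.out.println(totalCost(cost, actual, predicted, weights));
    }
}
```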



trueNegativeRate
public double trueNegativeRate(int classIndex)(Code)
Calculate the true negative rate with respect to a particular class. This is defined as

 correctly classified negatives
 ------------------------------
 total negatives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the true negative rate



truePositiveRate
public double truePositiveRate(int classIndex)(Code)
Calculate the true positive rate with respect to a particular class. This is defined as

 correctly classified positives
 ------------------------------
 total positives
 

Parameters:
  classIndex - the index of the class to consider as "positive"
Returns:
  the true positive rate



unclassified
final public double unclassified()(Code)
Gets the number of instances not classified (that is, for which no prediction was made by the classifier). This is actually the sum of the weights of these instances.
Returns:
  the number of unclassified instances



updateMargins
protected void updateMargins(double[] predictedDistribution, int actualClass, double weight)(Code)
Update the cumulative record of classification margins
Parameters:
  predictedDistribution - the probability distribution predicted for the current instance
Parameters:
  actualClass - the index of the actual instance class
Parameters:
  weight - the weight assigned to the instance
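The page does not define "margin". The sketch below assumes the common definition: the probability assigned to the actual class minus the highest probability assigned to any other class, giving a value in [-1, 1]. The class name is hypothetical, and the accumulation of margins into a cumulative record is not shown.

```java
final class MarginSketch {
    /** Probability of the actual class minus the largest probability among the other classes. */
    static double margin(double[] predictedDistribution, int actualClass) {
        double probActual = predictedDistribution[actualClass];
        double probNext = 0;
        for (int i = 0; i < predictedDistribution.length; i++) {
            if (i != actualClass && predictedDistribution[i] > probNext) {
                probNext = predictedDistribution[i];
            }
        }
        return probActual - probNext;
    }
}
```

A positive margin means the actual class received the highest probability; a negative one means some other class was preferred.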



updateNumericScores
protected void updateNumericScores(double[] predicted, double[] actual, double weight)(Code)
Update the numeric accuracy measures. For numeric classes, the accuracy is between the actual and predicted class values. For nominal classes, the accuracy is between the actual and predicted class probabilities.
Parameters:
  predicted - the predicted values
Parameters:
  actual - the actual value
Parameters:
  weight - the weight associated with this prediction



updatePriors
public void updatePriors(Instance instance) throws Exception(Code)
Updates the class prior probabilities (when incrementally training)
Parameters:
  instance - the new training instance seen
throws:
  Exception - if the class of the instance is not set



updateStatsForClassifier
protected void updateStatsForClassifier(double[] predictedDistribution, Instance instance) throws Exception(Code)
Updates all the statistics about a classifier's performance for the current test instance.
Parameters:
  predictedDistribution - the probabilities assigned to each class
Parameters:
  instance - the instance to be classified
throws:
  Exception - if the class of the instance is not set



updateStatsForPredictor
protected void updateStatsForPredictor(double predictedValue, Instance instance) throws Exception(Code)
Updates all the statistics about a predictor's performance for the current test instance.
Parameters:
  predictedValue - the numeric value the classifier predicts
Parameters:
  instance - the instance to be classified
throws:
  Exception - if the class of the instance is not set



useNoPriors
public void useNoPriors()(Code)
Disables the use of priors, e.g., in case of de-serialized schemes that have no access to the original training set, but are evaluated on a test set.



wekaStaticWrapper
protected static String wekaStaticWrapper(Sourcable classifier, String className) throws Exception(Code)
Wraps a static classifier in enough source to test using the weka class libraries.
Parameters:
  classifier - a Sourcable Classifier
Parameters:
  className - the name to give to the source code class
Returns:
  the source for a static classifier that can be tested with weka libraries.
throws:
  Exception - if code-generation fails



Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException(Code)(Java Doc)
public boolean equals(Object obj)(Code)(Java Doc)
protected void finalize() throws Throwable(Code)(Java Doc)
final native public Class getClass()(Code)(Java Doc)
native public int hashCode()(Code)(Java Doc)
final native public void notify()(Code)(Java Doc)
final native public void notifyAll()(Code)(Java Doc)
public String toString()(Code)(Java Doc)
final native public void wait(long timeout) throws InterruptedException(Code)(Java Doc)
final public void wait(long timeout, int nanos) throws InterruptedException(Code)(Java Doc)
final public void wait() throws InterruptedException(Code)(Java Doc)
