weka.classifiers
Class Evaluation

java.lang.Object
  extended by weka.classifiers.Evaluation
All Implemented Interfaces:
Summarizable

public class Evaluation
extends java.lang.Object
implements Summarizable

Class for evaluating machine learning models.

-------------------------------------------------------------------

General options when evaluating a learning scheme from the command-line:

-t filename
Name of the file with the training data. (required)

-T filename
Name of the file with the test data. If missing, a cross-validation is performed.

-c index
Index of the class attribute (1, 2, ...; default: last).

-x number
The number of folds for the cross-validation (default: 10).

-s seed
Random number seed for the cross-validation (default: 1).

-m filename
The name of a file containing a cost matrix.

-l filename
Loads classifier from the given file.

-d filename
Saves classifier built from the training data into the given file.

-v
Outputs no statistics for the training data.

-o
Outputs statistics only, not the classifier.

-i
Outputs information-retrieval statistics per class.

-k
Outputs information-theoretic statistics.

-p range
Outputs predictions for test instances, along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.

-r
Outputs cumulative margin distribution (and nothing else).

-g
Only for classifiers that implement Graphable. Outputs the graph representation of the classifier (and nothing else).

-------------------------------------------------------------------

Example usage as the main of a classifier (called FunkyClassifier):

 public static void main(String [] args) {
   try {
     Classifier scheme = new FunkyClassifier();
     System.out.println(Evaluation.evaluateModel(scheme, args));
   } catch (Exception e) {
     System.err.println(e.getMessage());
   }
 }
 

------------------------------------------------------------------

Example usage from within an application:

 Instances trainInstances = ...;  // instances obtained from somewhere
 Instances testInstances = ...;   // instances obtained from somewhere
 Classifier scheme = ...;         // scheme obtained from somewhere

 Evaluation evaluation = new Evaluation(trainInstances);
 evaluation.evaluateModel(scheme, testInstances);
 System.out.println(evaluation.toSummaryString());
 

Version:
$Revision: 1.49 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz), Len Trigg (trigg@cs.waikato.ac.nz)

Field Summary
protected static int k_MarginResolution
          Resolution of the margin histogram
protected  boolean m_ClassIsNominal
          Is the class nominal or numeric?
protected  java.lang.String[] m_ClassNames
          The names of the classes.
protected  double[] m_ClassPriors
          The prior probabilities of the classes
protected  double m_ClassPriorsSum
          The sum of counts for priors
protected  double[][] m_ConfusionMatrix
          Array for storing the confusion matrix.
protected  double m_Correct
          The weight of all correctly classified instances.
protected  CostMatrix m_CostMatrix
          The cost matrix (if given).
protected  Estimator m_ErrorEstimator
          Numeric class error estimator for scheme
protected  double m_Incorrect
          The weight of all incorrectly classified instances.
protected  double[] m_MarginCounts
          Cumulative margin distribution
protected  double m_MissingClass
          The weight of all instances that had no class assigned to them.
protected  int m_NumClasses
          The number of classes.
protected  int m_NumFolds
          The number of folds for a cross-validation.
protected  int m_NumTrainClassVals
          Number of non-missing class training instances seen
protected  Estimator m_PriorErrorEstimator
          Numeric class error estimator for prior
protected  double m_SumAbsErr
          Sum of absolute errors.
protected  double m_SumClass
          Sum of class values.
protected  double m_SumClassPredicted
          Sum of predicted * class values.
protected  double m_SumErr
          Sum of errors.
protected  double m_SumKBInfo
          Total Kononenko & Bratko Information
protected  double m_SumPredicted
          Sum of predicted values.
protected  double m_SumPriorAbsErr
          Sum of absolute errors of the prior
protected  double m_SumPriorEntropy
          Total entropy of prior predictions
protected  double m_SumPriorSqrErr
          Sum of squared errors of the prior
protected  double m_SumSchemeEntropy
          Total entropy of scheme predictions
protected  double m_SumSqrClass
          Sum of squared class values.
protected  double m_SumSqrErr
          Sum of squared errors.
protected  double m_SumSqrPredicted
          Sum of squared predicted values.
protected  double m_TotalCost
          The total cost of predictions (includes instance weights)
protected  double[] m_TrainClassVals
          Array containing all numeric training class values seen
protected  double[] m_TrainClassWeights
          Array containing all numeric training class weights
protected  double m_Unclassified
          The weight of all unclassified instances.
protected  double m_WithClass
          The weight of all instances that had a class assigned to them.
protected static double MIN_SF_PROB
          The minimum probability accepted from an estimator to avoid taking log(0) in SF calculations.
 
Constructor Summary
Evaluation(Instances data)
          Initializes all the counters for the evaluation.
Evaluation(Instances data, CostMatrix costMatrix)
          Initializes all the counters for the evaluation and also takes a cost matrix as parameter.
 
Method Summary
protected  void addNumericTrainClass(double classValue, double weight)
          Adds a numeric (non-missing) training class value and weight to the buffer of stored values.
protected static java.lang.String attributeValuesString(Instance instance, Range attRange)
          Builds a string listing the attribute values in a specified range of indices, separated by commas and enclosed in brackets.
 double avgCost()
          Gets the average cost, that is, total cost of misclassifications (incorrect plus unclassified) over the total number of instances.
 double[][] confusionMatrix()
          Returns a copy of the confusion matrix.
 double correct()
          Gets the number of instances correctly classified (that is, for which a correct prediction was made).
 double correlationCoefficient()
          Returns the correlation coefficient if the class is numeric.
 void crossValidateModel(Classifier classifier, Instances data, int numFolds, java.util.Random random)
          Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
 void crossValidateModel(java.lang.String classifierString, Instances data, int numFolds, java.lang.String[] options, java.util.Random random)
          Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
 boolean equals(java.lang.Object obj)
          Tests whether the current evaluation object is equal to another evaluation object
 double errorRate()
          Returns the estimated error rate or the root mean squared error (if the class is numeric).
 void evaluateModel(Classifier classifier, Instances data)
          Evaluates the classifier on a given set of instances.
static java.lang.String evaluateModel(Classifier classifier, java.lang.String[] options)
          Evaluates a classifier with the options given in an array of strings.
static java.lang.String evaluateModel(java.lang.String classifierString, java.lang.String[] options)
          Evaluates a classifier with the options given in an array of strings.
 double evaluateModelOnce(Classifier classifier, Instance instance)
          Evaluates the classifier on a single instance.
 double evaluateModelOnce(double[] dist, Instance instance)
          Evaluates the supplied distribution on a single instance.
 void evaluateModelOnce(double prediction, Instance instance)
          Evaluates the supplied prediction on a single instance.
 double falseNegativeRate(int classIndex)
          Calculate the false negative rate with respect to a particular class.
 double falsePositiveRate(int classIndex)
          Calculate the false positive rate with respect to a particular class.
 double fMeasure(int classIndex)
          Calculate the F-Measure with respect to a particular class.
protected static CostMatrix handleCostOption(java.lang.String costFileName, int numClasses)
          Attempts to load a cost matrix.
 double incorrect()
          Gets the number of instances incorrectly classified (that is, for which an incorrect prediction was made).
 double kappa()
          Returns value of kappa statistic if class is nominal.
 double KBInformation()
          Return the total Kononenko & Bratko Information score in bits
 double KBMeanInformation()
          Return the Kononenko & Bratko Information score in bits per instance.
 double KBRelativeInformation()
          Return the Kononenko & Bratko Relative Information score
static void main(java.lang.String[] args)
          A test method for this class.
protected  double[] makeDistribution(double predictedClass)
          Convert a single prediction into a probability distribution with all zero probabilities except the predicted value, which has probability 1.0.
protected static java.lang.String makeOptionString(Classifier classifier)
          Make up the help string giving all the command line options
 double meanAbsoluteError()
          Returns the mean absolute error.
 double meanPriorAbsoluteError()
          Returns the mean absolute error of the prior.
protected  java.lang.String num2ShortID(int num, char[] IDChars, int IDWidth)
          Method for generating indices for the confusion matrix.
 double numFalseNegatives(int classIndex)
          Calculate number of false negatives with respect to a particular class.
 double numFalsePositives(int classIndex)
          Calculate number of false positives with respect to a particular class.
 double numInstances()
          Gets the number of test instances that had a known class value (actually the sum of the weights of test instances with known class value).
 double numTrueNegatives(int classIndex)
          Calculate the number of true negatives with respect to a particular class.
 double numTruePositives(int classIndex)
          Calculate the number of true positives with respect to a particular class.
 double pctCorrect()
          Gets the percentage of instances correctly classified (that is, for which a correct prediction was made).
 double pctIncorrect()
          Gets the percentage of instances incorrectly classified (that is, for which an incorrect prediction was made).
 double pctUnclassified()
          Gets the percentage of instances not classified (that is, for which no prediction was made by the classifier).
 double precision(int classIndex)
          Calculate the precision with respect to a particular class.
protected static java.lang.String printClassifications(Classifier classifier, Instances train, java.lang.String testFileName, int classIndex, Range attributesToOutput)
          Prints the predictions for the given dataset into a String variable.
 double priorEntropy()
          Calculate the entropy of the prior distribution
 double recall(int classIndex)
          Calculate the recall with respect to a particular class.
 double relativeAbsoluteError()
          Returns the relative absolute error.
 double rootMeanPriorSquaredError()
          Returns the root mean prior squared error.
 double rootMeanSquaredError()
          Returns the root mean squared error.
 double rootRelativeSquaredError()
          Returns the root relative squared error if the class is numeric.
protected  void setNumericPriorsFromBuffer()
          Sets up the priors for numeric class attributes from the training class values that have been seen so far.
 void setPriors(Instances train)
          Sets the class prior probabilities
 double SFEntropyGain()
          Returns the total SF, which is the null model entropy minus the scheme entropy.
 double SFMeanEntropyGain()
          Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.
 double SFMeanPriorEntropy()
          Returns the entropy per instance for the null model
 double SFMeanSchemeEntropy()
          Returns the entropy per instance for the scheme
 double SFPriorEntropy()
          Returns the total entropy for the null model
 double SFSchemeEntropy()
          Returns the total entropy for the scheme
 java.lang.String toClassDetailsString()
          Calls toClassDetailsString() with a default title.
 java.lang.String toClassDetailsString(java.lang.String title)
          Generates a breakdown of the accuracy for each class, incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure.
 java.lang.String toCumulativeMarginDistributionString()
          Output the cumulative margin distribution as a string suitable for input for gnuplot or similar package.
 java.lang.String toMatrixString()
          Calls toMatrixString() with a default title.
 java.lang.String toMatrixString(java.lang.String title)
          Outputs the performance statistics as a classification confusion matrix.
 java.lang.String toSummaryString()
          Calls toSummaryString() with no title and no complexity stats
 java.lang.String toSummaryString(boolean printComplexityStatistics)
          Calls toSummaryString() with a default title.
 java.lang.String toSummaryString(java.lang.String title, boolean printComplexityStatistics)
          Outputs the performance statistics in summary form.
 double totalCost()
          Gets the total cost, that is, the cost of each prediction times the weight of the instance, summed over all instances.
 double trueNegativeRate(int classIndex)
          Calculate the true negative rate with respect to a particular class.
 double truePositiveRate(int classIndex)
          Calculate the true positive rate with respect to a particular class.
 double unclassified()
          Gets the number of instances not classified (that is, for which no prediction was made by the classifier).
protected  void updateMargins(double[] predictedDistribution, int actualClass, double weight)
          Update the cumulative record of classification margins
protected  void updateNumericScores(double[] predicted, double[] actual, double weight)
          Update the numeric accuracy measures.
 void updatePriors(Instance instance)
          Updates the class prior probabilities (when incrementally training)
protected  void updateStatsForClassifier(double[] predictedDistribution, Instance instance)
          Updates all the statistics about a classifier's performance for the current test instance.
protected  void updateStatsForPredictor(double predictedValue, Instance instance)
          Updates all the statistics about a predictor's performance for the current test instance.
protected static java.lang.String wekaStaticWrapper(Sourcable classifier, java.lang.String className)
          Wraps a static classifier in enough source to test using the weka class libraries.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_NumClasses

protected int m_NumClasses
The number of classes.


m_NumFolds

protected int m_NumFolds
The number of folds for a cross-validation.


m_Incorrect

protected double m_Incorrect
The weight of all incorrectly classified instances.


m_Correct

protected double m_Correct
The weight of all correctly classified instances.


m_Unclassified

protected double m_Unclassified
The weight of all unclassified instances.


m_MissingClass

protected double m_MissingClass
The weight of all instances that had no class assigned to them.


m_WithClass

protected double m_WithClass
The weight of all instances that had a class assigned to them.


m_ConfusionMatrix

protected double[][] m_ConfusionMatrix
Array for storing the confusion matrix.


m_ClassNames

protected java.lang.String[] m_ClassNames
The names of the classes.


m_ClassIsNominal

protected boolean m_ClassIsNominal
Is the class nominal or numeric?


m_ClassPriors

protected double[] m_ClassPriors
The prior probabilities of the classes


m_ClassPriorsSum

protected double m_ClassPriorsSum
The sum of counts for priors


m_CostMatrix

protected CostMatrix m_CostMatrix
The cost matrix (if given).


m_TotalCost

protected double m_TotalCost
The total cost of predictions (includes instance weights)


m_SumErr

protected double m_SumErr
Sum of errors.


m_SumAbsErr

protected double m_SumAbsErr
Sum of absolute errors.


m_SumSqrErr

protected double m_SumSqrErr
Sum of squared errors.


m_SumClass

protected double m_SumClass
Sum of class values.


m_SumSqrClass

protected double m_SumSqrClass
Sum of squared class values.


m_SumPredicted

protected double m_SumPredicted
Sum of predicted values.


m_SumSqrPredicted

protected double m_SumSqrPredicted
Sum of squared predicted values.


m_SumClassPredicted

protected double m_SumClassPredicted
Sum of predicted * class values.


m_SumPriorAbsErr

protected double m_SumPriorAbsErr
Sum of absolute errors of the prior


m_SumPriorSqrErr

protected double m_SumPriorSqrErr
Sum of squared errors of the prior


m_SumKBInfo

protected double m_SumKBInfo
Total Kononenko & Bratko Information


k_MarginResolution

protected static int k_MarginResolution
Resolution of the margin histogram


m_MarginCounts

protected double[] m_MarginCounts
Cumulative margin distribution


m_NumTrainClassVals

protected int m_NumTrainClassVals
Number of non-missing class training instances seen


m_TrainClassVals

protected double[] m_TrainClassVals
Array containing all numeric training class values seen


m_TrainClassWeights

protected double[] m_TrainClassWeights
Array containing all numeric training class weights


m_PriorErrorEstimator

protected Estimator m_PriorErrorEstimator
Numeric class error estimator for prior


m_ErrorEstimator

protected Estimator m_ErrorEstimator
Numeric class error estimator for scheme


MIN_SF_PROB

protected static final double MIN_SF_PROB
The minimum probability accepted from an estimator to avoid taking log(0) in SF calculations.

See Also:
Constant Field Values

m_SumPriorEntropy

protected double m_SumPriorEntropy
Total entropy of prior predictions


m_SumSchemeEntropy

protected double m_SumSchemeEntropy
Total entropy of scheme predictions

Constructor Detail

Evaluation

public Evaluation(Instances data)
           throws java.lang.Exception
Initializes all the counters for the evaluation.

Parameters:
data - set of training instances, to get some header information and prior class distribution information
Throws:
java.lang.Exception - if the class is not defined

Evaluation

public Evaluation(Instances data,
                  CostMatrix costMatrix)
           throws java.lang.Exception
Initializes all the counters for the evaluation and also takes a cost matrix as parameter.

Parameters:
data - set of instances, to get some header information
costMatrix - the cost matrix---if null, default costs will be used
Throws:
java.lang.Exception - if cost matrix is not compatible with data, the class is not defined or the class is numeric

Method Detail

confusionMatrix

public double[][] confusionMatrix()
Returns a copy of the confusion matrix.

Returns:
a copy of the confusion matrix as a two-dimensional array
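
As a rough illustration of how such a matrix relates to the per-class statistics listed in the method summary (false positives, false negatives, precision, recall, F-Measure), here is a minimal self-contained sketch. It is not Weka's code: the class name ConfusionMatrixSketch is hypothetical, and the layout (rows are actual classes, columns are predicted classes) is an assumption of this sketch.

```java
/** Illustrative sketch only; this is not Weka's implementation. */
final class ConfusionMatrixSketch {

    /** Builds a weighted matrix; rows are actual classes, columns are predicted classes (assumed layout). */
    static double[][] build(int[] actual, int[] predicted, double[] weights, int numClasses) {
        double[][] matrix = new double[numClasses][numClasses];
        for (int i = 0; i < actual.length; i++) {
            matrix[actual[i]][predicted[i]] += weights[i];
        }
        return matrix;
    }

    /** Weight of instances predicted as class c whose actual class differs. */
    static double falsePositives(double[][] m, int c) {
        double sum = 0;
        for (int row = 0; row < m.length; row++) {
            if (row != c) sum += m[row][c];
        }
        return sum;
    }

    /** Weight of instances of class c that were predicted as some other class. */
    static double falseNegatives(double[][] m, int c) {
        double sum = 0;
        for (int col = 0; col < m[c].length; col++) {
            if (col != c) sum += m[c][col];
        }
        return sum;
    }

    /** True positives over everything predicted as c; 0 when nothing was predicted as c. */
    static double precision(double[][] m, int c) {
        double denominator = m[c][c] + falsePositives(m, c);
        return denominator == 0 ? 0 : m[c][c] / denominator;
    }

    /** True positives over everything that actually is c; 0 when class c never occurs. */
    static double recall(double[][] m, int c) {
        double denominator = m[c][c] + falseNegatives(m, c);
        return denominator == 0 ? 0 : m[c][c] / denominator;
    }

    /** Harmonic mean of precision and recall. */
    static double fMeasure(double[][] m, int c) {
        double p = precision(m, c);
        double r = recall(m, c);
        return (p + r) == 0 ? 0 : 2 * p * r / (p + r);
    }
}
```

The diagonal cell m[c][c] plays the role of the true-positive weight for class c; everything else in column c is a false positive and everything else in row c is a false negative.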

crossValidateModel

public void crossValidateModel(Classifier classifier,
                               Instances data,
                               int numFolds,
                               java.util.Random random)
                        throws java.lang.Exception
Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.

Parameters:
classifier - the classifier with any options set.
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
random - random number generator for randomization
Throws:
java.lang.Exception - if a classifier could not be generated successfully or the class is not defined
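
To make the idea of a stratified split concrete, the following sketch assigns each instance to a test fold so that every fold receives roughly the same share of each class. It is a simplified scheme written for illustration, not the algorithm Weka uses; the class name StratifiedFoldsSketch and the method assignFolds are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch of stratified fold assignment; not Weka's implementation. */
final class StratifiedFoldsSketch {

    /** Returns, for each instance, the index of the fold in which it is held out for testing. */
    static int[] assignFolds(int[] classValues, int numClasses, int numFolds, Random random) {
        // Group instance indices by their class value.
        List<List<Integer>> byClass = new ArrayList<List<Integer>>();
        for (int c = 0; c < numClasses; c++) {
            byClass.add(new ArrayList<Integer>());
        }
        for (int i = 0; i < classValues.length; i++) {
            byClass.get(classValues[i]).add(i);
        }

        // Shuffle within each class, then deal the instances out to the folds in turn.
        int[] fold = new int[classValues.length];
        int next = 0;
        for (List<Integer> members : byClass) {
            Collections.shuffle(members, random);
            for (int index : members) {
                fold[index] = next % numFolds;
                next++;
            }
        }
        return fold;
    }
}
```

Because the dealing counter carries over from one class to the next, the per-class counts and the overall fold sizes each differ by at most one between folds. Passing a seeded Random keeps the split reproducible, which mirrors the purpose of the -s option.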

crossValidateModel

public void crossValidateModel(java.lang.String classifierString,
                               Instances data,
                               int numFolds,
                               java.lang.String[] options,
                               java.util.Random random)
                        throws java.lang.Exception
Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.

Parameters:
classifierString - a string naming the class of the classifier
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
options - the options to the classifier. Any options accepted by the classifier will be removed from this array.
random - the random number generator for randomizing the data
Throws:
java.lang.Exception - if a classifier could not be generated successfully or the class is not defined

evaluateModel

public static java.lang.String evaluateModel(java.lang.String classifierString,
                                             java.lang.String[] options)
                                      throws java.lang.Exception
Evaluates a classifier with the options given in an array of strings.

Valid options are:

-t filename
Name of the file with the training data. (required)

-T filename
Name of the file with the test data. If missing, a cross-validation is performed.

-c index
Index of the class attribute (1, 2, ...; default: last).

-x number
The number of folds for the cross-validation (default: 10).

-s seed
Random number seed for the cross-validation (default: 1).

-m filename
The name of a file containing a cost matrix.

-l filename
Loads classifier from the given file.

-d filename
Saves classifier built from the training data into the given file.

-v
Outputs no statistics for the training data.

-o
Outputs statistics only, not the classifier.

-i
Outputs detailed information-retrieval statistics per class.

-k
Outputs information-theoretic statistics.

-p range
Outputs predictions for test instances, along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.

-r
Outputs cumulative margin distribution (and nothing else).

-g
Only for classifiers that implement Graphable. Outputs the graph representation of the classifier (and nothing else).

Parameters:
classifierString - class of machine learning classifier as a string
options - the array of strings containing the options
Returns:
a string describing the results
Throws:
java.lang.Exception - if model could not be evaluated successfully

main

public static void main(java.lang.String[] args)
A test method for this class. Just extracts the first command line argument as a classifier class name and calls evaluateModel.

Parameters:
args - an array of command line arguments, the first of which must be the class name of a classifier.

evaluateModel

public static java.lang.String evaluateModel(Classifier classifier,
                                             java.lang.String[] options)
                                      throws java.lang.Exception
Evaluates a classifier with the options given in an array of strings.

Valid options are:

-t name of training file
Name of the file with the training data. (required)

-T name of test file
Name of the file with the test data. If missing, a cross-validation is performed.

-c class index
Index of the class attribute (1, 2, ...; default: last).

-x number of folds
The number of folds for the cross-validation (default: 10).

-s random number seed
Random number seed for the cross-validation (default: 1).

-m file with cost matrix
The name of a file containing a cost matrix.

-l name of model input file
Loads classifier from the given file.

-d name of model output file
Saves classifier built from the training data into the given file.

-v
Outputs no statistics for the training data.

-o
Outputs statistics only, not the classifier.

-i
Outputs detailed information-retrieval statistics per class.

-k
Outputs information-theoretic statistics.

-p
Outputs predictions for test instances (and nothing else).

-r
Outputs cumulative margin distribution (and nothing else).

-g
Only for classifiers that implement Graphable. Outputs the graph representation of the classifier (and nothing else).

Parameters:
classifier - machine learning classifier
options - the array of strings containing the options
Returns:
a string describing the results
Throws:
java.lang.Exception - if model could not be evaluated successfully

handleCostOption

protected static CostMatrix handleCostOption(java.lang.String costFileName,
                                             int numClasses)
                                      throws java.lang.Exception
Attempts to load a cost matrix.

Parameters:
costFileName - the filename of the cost matrix
numClasses - the number of classes that should be in the cost matrix (only used if the cost file is in old format).
Returns:
a CostMatrix value, or null if costFileName is empty
Throws:
java.lang.Exception - if an error occurs.

evaluateModel

public void evaluateModel(Classifier classifier,
                          Instances data)
                   throws java.lang.Exception
Evaluates the classifier on a given set of instances.

Parameters:
classifier - machine learning classifier
data - set of test instances for evaluation
Throws:
java.lang.Exception - if model could not be evaluated successfully

evaluateModelOnce

public double evaluateModelOnce(Classifier classifier,
                                Instance instance)
                         throws java.lang.Exception
Evaluates the classifier on a single instance.

Parameters:
classifier - machine learning classifier
instance - the test instance to be classified
Returns:
the prediction made by the classifier
Throws:
java.lang.Exception - if model could not be evaluated successfully or the data contains string attributes

evaluateModelOnce

public double evaluateModelOnce(double[] dist,
                                Instance instance)
                         throws java.lang.Exception
Evaluates the supplied distribution on a single instance.

Parameters:
dist - the supplied distribution
instance - the test instance to be classified
Throws:
java.lang.Exception - if model could not be evaluated successfully

evaluateModelOnce

public void evaluateModelOnce(double prediction,
                              Instance instance)
                       throws java.lang.Exception
Evaluates the supplied prediction on a single instance.

Parameters:
prediction - the supplied prediction
instance - the test instance to be classified
Throws:
java.lang.Exception - if model could not be evaluated successfully

wekaStaticWrapper

protected static java.lang.String wekaStaticWrapper(Sourcable classifier,
                                                    java.lang.String className)
                                             throws java.lang.Exception
Wraps a static classifier in enough source to test using the weka class libraries.

Parameters:
classifier - a Sourcable Classifier
className - the name to give to the source code class
Returns:
the source for a static classifier that can be tested with weka libraries.
Throws:
java.lang.Exception

numInstances

public final double numInstances()
Gets the number of test instances that had a known class value (actually the sum of the weights of test instances with known class value).

Returns:
the number of test instances with known class

incorrect

public final double incorrect()
Gets the number of instances incorrectly classified (that is, for which an incorrect prediction was made). (Actually the sum of the weights of these instances)

Returns:
the number of incorrectly classified instances

pctIncorrect

public final double pctIncorrect()
Gets the percentage of instances incorrectly classified (that is, for which an incorrect prediction was made).

Returns:
the percent of incorrectly classified instances (between 0 and 100)

totalCost

public final double totalCost()
Gets the total cost, that is, the cost of each prediction times the weight of the instance, summed over all instances.

Returns:
the total cost

avgCost

public final double avgCost()
Gets the average cost, that is, total cost of misclassifications (incorrect plus unclassified) over the total number of instances.

Returns:
the average cost.
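
The relationship between the total and the average cost can be sketched as follows. This is a minimal illustration under stated assumptions, not Weka's code: the class CostSketch is hypothetical, the cost matrix is indexed as cost[actual][predicted], and every instance is assumed to receive a prediction (unclassified instances are not modelled).

```java
/** Illustrative sketch of cost-sensitive totals; not Weka's implementation. */
final class CostSketch {

    /** Sums cost[actual][predicted] times the instance weight over all predictions. */
    static double totalCost(double[][] cost, int[] actual, int[] predicted, double[] weights) {
        double total = 0;
        for (int i = 0; i < actual.length; i++) {
            total += cost[actual[i]][predicted[i]] * weights[i];
        }
        return total;
    }

    /** Divides the total cost by the total weight of the evaluated instances. */
    static double avgCost(double[][] cost, int[] actual, int[] predicted, double[] weights) {
        double weightSum = 0;
        for (double w : weights) {
            weightSum += w;
        }
        return totalCost(cost, actual, predicted, weights) / weightSum;
    }
}
```

With a zero diagonal in the cost matrix, correct predictions contribute nothing, so the total reduces to the weighted cost of the misclassifications.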

correct

public final double correct()
Gets the number of instances correctly classified (that is, for which a correct prediction was made). (Actually the sum of the weights of these instances)

Returns:
the number of correctly classified instances

pctCorrect

public final double pctCorrect()
Gets the percentage of instances correctly classified (that is, for which a correct prediction was made).

Returns:
the percent of correctly classified instances (between 0 and 100)

unclassified

public final double unclassified()
Gets the number of instances not classified (that is, for which no prediction was made by the classifier). (Actually the sum of the weights of these instances)

Returns:
the number of unclassified instances

pctUnclassified

public final double pctUnclassified()
Gets the percentage of instances not classified (that is, for which no prediction was made by the classifier).

Returns:
the percent of unclassified instances (between 0 and 100)
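
The counts and percentages described above (correct, incorrect, unclassified) are sums of instance weights rather than plain tallies. A minimal sketch of that bookkeeping follows; it is not Weka's code, the class TallySketch is hypothetical, and a negative predicted value is used here as a stand-in for "no prediction was made".

```java
/** Illustrative sketch of weighted outcome tallies; not Weka's implementation. */
final class TallySketch {
    double correct;
    double incorrect;
    double unclassified;

    /** Records one test instance; predicted < 0 means "no prediction" in this sketch. */
    void record(int actual, int predicted, double weight) {
        if (predicted < 0) {
            unclassified += weight;
        } else if (predicted == actual) {
            correct += weight;
        } else {
            incorrect += weight;
        }
    }

    /** Total weight of all recorded instances. */
    double total() {
        return correct + incorrect + unclassified;
    }

    double pctCorrect() {
        return 100.0 * correct / total();
    }

    double pctIncorrect() {
        return 100.0 * incorrect / total();
    }

    double pctUnclassified() {
        return 100.0 * unclassified / total();
    }
}
```

Since each instance lands in exactly one of the three buckets, the three percentages add up to 100.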

errorRate

public final double errorRate()
Returns the estimated error rate or the root mean squared error (if the class is numeric). If a cost matrix was given, this error rate gives the average cost.

Returns:
the estimated error rate (between 0 and 1, or between 0 and maximum cost)

kappa

public final double kappa()
Returns value of kappa statistic if class is nominal.

Returns:
the value of the kappa statistic
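
For readers unfamiliar with the statistic, the sketch below computes Cohen's kappa from a confusion matrix: observed agreement corrected for the agreement expected by chance from the row and column totals. It is a textbook formulation offered as an illustration, not a copy of this class's code; the class name KappaSketch is hypothetical.

```java
/** Illustrative sketch of Cohen's kappa; not Weka's implementation. */
final class KappaSketch {

    /** Computes kappa from a square confusion matrix (rows: actual, columns: predicted). */
    static double kappa(double[][] m) {
        int n = m.length;
        double[] rowSums = new double[n];
        double[] colSums = new double[n];
        double total = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowSums[i] += m[i][j];
                colSums[j] += m[i][j];
                total += m[i][j];
            }
        }
        double observed = 0; // fraction of weight on the diagonal
        double chance = 0;   // agreement expected if predictions were independent of the truth
        for (int i = 0; i < n; i++) {
            observed += m[i][i];
            chance += rowSums[i] * colSums[i];
        }
        observed /= total;
        chance /= total * total;
        // Guard against division by zero when chance agreement is already perfect.
        return chance < 1 ? (observed - chance) / (1 - chance) : 1.0;
    }
}
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.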

correlationCoefficient

public final double correlationCoefficient()
                                    throws java.lang.Exception
Returns the correlation coefficient if the class is numeric.

Returns:
the correlation coefficient
Throws:
java.lang.Exception - if class is not numeric
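
The running sums documented in the field summary (sum of class values, sum of squared class values, sum of predicted values, sum of squared predicted values, sum of predicted * class values) are sufficient to compute a Pearson correlation in a single pass. The sketch below shows one way to do that; it is unweighted for brevity, the class name CorrelationSketch is hypothetical, and it should be read as an illustration rather than as this class's exact code.

```java
/** Illustrative sketch of a correlation coefficient from running sums; not Weka's implementation. */
final class CorrelationSketch {

    /** Pearson correlation between actual and predicted values; returns 0 if either is constant. */
    static double correlation(double[] actual, double[] predicted) {
        int n = actual.length;
        double sumA = 0, sumSqA = 0, sumP = 0, sumSqP = 0, sumAP = 0;
        for (int i = 0; i < n; i++) {
            sumA += actual[i];
            sumSqA += actual[i] * actual[i];
            sumP += predicted[i];
            sumSqP += predicted[i] * predicted[i];
            sumAP += actual[i] * predicted[i];
        }
        // Unnormalized variances and covariance; the common factor cancels in the ratio.
        double varA = sumSqA - sumA * sumA / n;
        double varP = sumSqP - sumP * sumP / n;
        double covAP = sumAP - sumA * sumP / n;
        if (varA * varP <= 0) {
            return 0;
        }
        return covAP / Math.sqrt(varA * varP);
    }
}
```

Keeping only the five sums means no per-instance values need to be stored, at the price of some numerical sensitivity when values are large and their variance is small.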

meanAbsoluteError

public final double meanAbsoluteError()
Returns the mean absolute error. Refers to the error of the predicted values for numeric classes, and the error of the predicted probability distribution for nominal classes.

Returns:
the mean absolute error
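
The two cases mentioned above can be sketched as follows. For a numeric class the error is the absolute difference between predicted and actual value. For a nominal class, this sketch compares the predicted probability distribution with a distribution that puts probability 1.0 on the actual class, averaging the absolute differences over the classes; that averaging convention, like the class name MeanAbsoluteErrorSketch, is an assumption made for illustration.

```java
/** Illustrative sketch of mean absolute error; not Weka's implementation. */
final class MeanAbsoluteErrorSketch {

    /** Numeric class: average of |predicted - actual|. */
    static double numeric(double[] actual, double[] predicted) {
        double sum = 0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / actual.length;
    }

    /** Nominal class: compares each predicted distribution with a one-hot target distribution. */
    static double nominal(int[] actualClass, double[][] predictedDist) {
        double sum = 0;
        for (int i = 0; i < actualClass.length; i++) {
            double instanceError = 0;
            int numClasses = predictedDist[i].length;
            for (int c = 0; c < numClasses; c++) {
                double target = (c == actualClass[i]) ? 1.0 : 0.0;
                instanceError += Math.abs(predictedDist[i][c] - target);
            }
            // Average over the classes (assumed convention of this sketch).
            sum += instanceError / numClasses;
        }
        return sum / actualClass.length;
    }
}
```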

meanPriorAbsoluteError

public final double meanPriorAbsoluteError()
Returns the mean absolute error of the prior.

Returns:
the mean absolute error

relativeAbsoluteError

public final double relativeAbsoluteError()
                                   throws java.lang.Exception
Returns the relative absolute error.

Returns:
the relative absolute error
Throws:
java.lang.Exception - if it can't be computed

rootMeanSquaredError

public final double rootMeanSquaredError()
Returns the root mean squared error.

Returns:
the root mean squared error

rootMeanPriorSquaredError

public final double rootMeanPriorSquaredError()
Returns the root mean prior squared error.

Returns:
the root mean prior squared error

rootRelativeSquaredError

public final double rootRelativeSquaredError()
Returns the root relative squared error if the class is numeric.

Returns:
the root relative squared error
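
Relative error measures compare the scheme with a trivial baseline. The sketch below assumes the baseline (the "prior") always predicts a single fixed value, such as the mean of the training class values, and expresses the scheme's error as a percentage of the baseline's error. The class RelativeErrorSketch and the parameter priorPrediction are hypothetical names used only for this illustration.

```java
/** Illustrative sketch of relative error measures; not Weka's implementation. */
final class RelativeErrorSketch {

    /** 100 * sqrt(sum of squared scheme errors / sum of squared baseline errors). */
    static double rootRelativeSquaredError(double[] actual, double[] predicted, double priorPrediction) {
        double sqErr = 0;
        double priorSqErr = 0;
        for (int i = 0; i < actual.length; i++) {
            sqErr += (predicted[i] - actual[i]) * (predicted[i] - actual[i]);
            priorSqErr += (priorPrediction - actual[i]) * (priorPrediction - actual[i]);
        }
        return 100.0 * Math.sqrt(sqErr / priorSqErr);
    }

    /** 100 * sum of absolute scheme errors / sum of absolute baseline errors. */
    static double relativeAbsoluteError(double[] actual, double[] predicted, double priorPrediction) {
        double absErr = 0;
        double priorAbsErr = 0;
        for (int i = 0; i < actual.length; i++) {
            absErr += Math.abs(predicted[i] - actual[i]);
            priorAbsErr += Math.abs(priorPrediction - actual[i]);
        }
        return 100.0 * absErr / priorAbsErr;
    }
}
```

Values below 100 mean the scheme does better than the baseline; values above 100 mean it does worse.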

priorEntropy

public final double priorEntropy()
                          throws java.lang.Exception
Calculate the entropy of the prior distribution

Returns:
the entropy of the prior distribution
Throws:
java.lang.Exception - if the class is not nominal
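
As an illustration of the quantity involved, the entropy of a class distribution in bits can be computed from class counts as shown below. This is the standard Shannon entropy, given as a sketch; how this class estimates the priors themselves is not covered here, and the class name PriorEntropySketch is hypothetical.

```java
/** Illustrative sketch of the entropy of a class distribution; not Weka's implementation. */
final class PriorEntropySketch {

    /** Shannon entropy in bits of the distribution obtained by normalizing the counts. */
    static double entropyBits(double[] classCounts) {
        double total = 0;
        for (double count : classCounts) {
            total += count;
        }
        double entropy = 0;
        for (double count : classCounts) {
            if (count > 0) { // classes with zero probability contribute nothing
                double p = count / total;
                entropy -= p * Math.log(p) / Math.log(2);
            }
        }
        return entropy;
    }
}
```

Two equally likely classes give 1 bit, four equally likely classes give 2 bits, and a single certain class gives 0 bits.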

KBInformation

public final double KBInformation()
                           throws java.lang.Exception
Return the total Kononenko & Bratko Information score in bits

Returns:
the K&B information score
Throws:
java.lang.Exception - if the class is not nominal

KBMeanInformation

public final double KBMeanInformation()
                               throws java.lang.Exception
Return the Kononenko & Bratko Information score in bits per instance.

Returns:
the K&B information score per instance
Throws:
java.lang.Exception - if the class is not nominal

KBRelativeInformation

public final double KBRelativeInformation()
                                   throws java.lang.Exception
Return the Kononenko & Bratko Relative Information score

Returns:
the K&B relative information score
Throws:
java.lang.Exception - if the class is not nominal

SFPriorEntropy

public final double SFPriorEntropy()
Returns the total entropy for the null model

Returns:
the total null model entropy

SFMeanPriorEntropy

public final double SFMeanPriorEntropy()
Returns the entropy per instance for the null model

Returns:
the null model entropy per instance

SFSchemeEntropy

public final double SFSchemeEntropy()
Returns the total entropy for the scheme

Returns:
the total scheme entropy

SFMeanSchemeEntropy

public final double SFMeanSchemeEntropy()
Returns the entropy per instance for the scheme

Returns:
the scheme entropy per instance

SFEntropyGain

public final double SFEntropyGain()
Returns the total SF, which is the null model entropy minus the scheme entropy.

Returns:
the total SF

SFMeanEntropyGain

public final double SFMeanEntropyGain()
Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.

Returns:
the SF per instance
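
The relationship between the SF quantities can be sketched as follows. The sketch assumes that each entropy is the total number of bits needed to encode the actual classes, that is, the sum of -log2(p) over the test instances, where p is the probability the model assigned to the instance's actual class. This definition, the class EntropyGainSketch and its method names are assumptions for illustration, and instance weights are ignored.

```java
class EntropyGainSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Total bits needed to encode the actual classes, given the probability
    // that a model assigned to the actual class of each instance.
    static double totalEntropy(double[] probOfActualClass) {
        double bits = 0.0;
        for (double p : probOfActualClass) {
            bits -= log2(p);
        }
        return bits;
    }

    // Null model entropy minus scheme entropy.
    static double entropyGain(double[] priorProbs, double[] schemeProbs) {
        return totalEntropy(priorProbs) - totalEntropy(schemeProbs);
    }

    public static void main(String[] args) {
        // Probability of the actual class under the null (prior) model.
        double[] priorProbs = {0.5, 0.5, 0.5, 0.5};
        // Probability of the actual class under the evaluated scheme.
        double[] schemeProbs = {1.0, 1.0, 0.5, 0.25};

        double gain = entropyGain(priorProbs, schemeProbs);
        System.out.println("null model entropy: " + totalEntropy(priorProbs));
        System.out.println("scheme entropy: " + totalEntropy(schemeProbs));
        System.out.println("total gain: " + gain);
        System.out.println("gain per instance: " + gain / priorProbs.length);
    }
}
```

A positive gain means the scheme's probability estimates encode the actual classes in fewer bits than the null model does.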

toCumulativeMarginDistributionString

public java.lang.String toCumulativeMarginDistributionString()
                                                      throws java.lang.Exception
Outputs the cumulative margin distribution as a string suitable as input for gnuplot or a similar package.

Returns:
the cumulative margin distribution
Throws:
java.lang.Exception - if the class attribute is not nominal

toSummaryString

public java.lang.String toSummaryString()
Calls toSummaryString() with no title and no complexity stats

Specified by:
toSummaryString in interface Summarizable
Returns:
a summary description of the classifier evaluation

toSummaryString

public java.lang.String toSummaryString(boolean printComplexityStatistics)
Calls toSummaryString() with a default title.

Parameters:
printComplexityStatistics - if true, complexity statistics are returned as well

toSummaryString

public java.lang.String toSummaryString(java.lang.String title,
                                        boolean printComplexityStatistics)
Outputs the performance statistics in summary form. Lists number (and percentage) of instances classified correctly, incorrectly and unclassified. Outputs the total number of instances classified, and the number of instances (if any) that had no class value provided.

Parameters:
title - the title for the statistics
printComplexityStatistics - if true, complexity statistics are returned as well
Returns:
the summary as a String

toMatrixString

public java.lang.String toMatrixString()
                                throws java.lang.Exception
Calls toMatrixString() with a default title.

Returns:
the confusion matrix as a string
Throws:
java.lang.Exception - if the class is numeric

toMatrixString

public java.lang.String toMatrixString(java.lang.String title)
                                throws java.lang.Exception
Outputs the performance statistics as a classification confusion matrix. For each class value, shows the distribution of predicted class values.

Parameters:
title - the title for the confusion matrix
Returns:
the confusion matrix as a String
Throws:
java.lang.Exception - if the class is numeric

toClassDetailsString

public java.lang.String toClassDetailsString()
                                      throws java.lang.Exception
Calls toClassDetailsString() with a default title.

Throws:
java.lang.Exception

toClassDetailsString

public java.lang.String toClassDetailsString(java.lang.String title)
                                      throws java.lang.Exception
Generates a breakdown of the accuracy for each class, incorporating various information-retrieval statistics, such as true/false positive rate and precision/recall/F-Measure. This should be useful for ROC curves and recall/precision curves.

Parameters:
title - the title to prepend the stats string with
Returns:
the statistics presented as a string
Throws:
java.lang.Exception

numTruePositives

public double numTruePositives(int classIndex)
Calculate the number of true positives with respect to a particular class. This is defined as

 correctly classified positives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the number of true positives

truePositiveRate

public double truePositiveRate(int classIndex)
Calculate the true positive rate with respect to a particular class. This is defined as

 correctly classified positives
 ------------------------------
       total positives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the true positive rate

numTrueNegatives

public double numTrueNegatives(int classIndex)
Calculate the number of true negatives with respect to a particular class. This is defined as

 correctly classified negatives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the number of true negatives

trueNegativeRate

public double trueNegativeRate(int classIndex)
Calculate the true negative rate with respect to a particular class. This is defined as

 correctly classified negatives
 ------------------------------
       total negatives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the true negative rate

numFalsePositives

public double numFalsePositives(int classIndex)
Calculate the number of false positives with respect to a particular class. This is defined as

 incorrectly classified negatives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the number of false positives

falsePositiveRate

public double falsePositiveRate(int classIndex)
Calculate the false positive rate with respect to a particular class. This is defined as

 incorrectly classified negatives
 --------------------------------
        total negatives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the false positive rate

numFalseNegatives

public double numFalseNegatives(int classIndex)
Calculate the number of false negatives with respect to a particular class. This is defined as

 incorrectly classified positives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the number of false negatives

falseNegativeRate

public double falseNegativeRate(int classIndex)
Calculate the false negative rate with respect to a particular class. This is defined as

 incorrectly classified positives
 --------------------------------
        total positives
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the false negative rate

recall

public double recall(int classIndex)
Calculate the recall with respect to a particular class. This is defined as

 correctly classified positives
 ------------------------------
       total positives
 

(This is the same as the truePositiveRate.)

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the recall

precision

public double precision(int classIndex)
Calculate the precision with respect to a particular class. This is defined as

 correctly classified positives
 ------------------------------
  total predicted as positive
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the precision

fMeasure

public double fMeasure(int classIndex)
Calculate the F-Measure with respect to a particular class. This is defined as

 2 * recall * precision
 ----------------------
   recall + precision
 

Parameters:
classIndex - the index of the class to consider as "positive"
Returns:
the F-Measure
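
The per-class definitions above can be sketched directly on a confusion matrix. The sketch assumes a layout in which matrix[actual][predicted] holds the (possibly weighted) number of instances; that layout, the class ConfusionStatsSketch, and the choice to report 0 when a denominator is 0 are assumptions of this example rather than documented behaviour of Evaluation.

```java
class ConfusionStatsSketch {

    // Assumed layout: m[actual][predicted] is the number of instances of
    // class "actual" that were predicted as class "predicted".

    static double numTruePositives(double[][] m, int c) {
        return m[c][c];
    }

    static double numFalseNegatives(double[][] m, int c) {
        double sum = 0.0;
        for (int j = 0; j < m.length; j++) {
            if (j != c) sum += m[c][j]; // positives predicted as another class
        }
        return sum;
    }

    static double numFalsePositives(double[][] m, int c) {
        double sum = 0.0;
        for (int i = 0; i < m.length; i++) {
            if (i != c) sum += m[i][c]; // negatives predicted as class c
        }
        return sum;
    }

    static double numTrueNegatives(double[][] m, int c) {
        double sum = 0.0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m.length; j++) {
                if (i != c && j != c) sum += m[i][j];
            }
        }
        return sum;
    }

    // Guarded division: this sketch reports 0 when the denominator is 0.
    static double ratio(double numerator, double denominator) {
        return denominator == 0.0 ? 0.0 : numerator / denominator;
    }

    static double recall(double[][] m, int c) {
        double tp = numTruePositives(m, c);
        return ratio(tp, tp + numFalseNegatives(m, c));
    }

    static double precision(double[][] m, int c) {
        double tp = numTruePositives(m, c);
        return ratio(tp, tp + numFalsePositives(m, c));
    }

    static double falsePositiveRate(double[][] m, int c) {
        double fp = numFalsePositives(m, c);
        return ratio(fp, fp + numTrueNegatives(m, c));
    }

    static double fMeasure(double[][] m, int c) {
        double r = recall(m, c);
        double p = precision(m, c);
        return ratio(2.0 * r * p, r + p);
    }

    public static void main(String[] args) {
        double[][] matrix = {{8.0, 2.0}, {1.0, 9.0}};
        System.out.println("recall(0): " + recall(matrix, 0));
        System.out.println("precision(0): " + precision(matrix, 0));
        System.out.println("falsePositiveRate(0): " + falsePositiveRate(matrix, 0));
        System.out.println("fMeasure(0): " + fMeasure(matrix, 0));
    }
}
```

For class 0 of the example matrix there are 8 true positives, 2 false negatives, 1 false positive and 9 true negatives, so recall is 8/10 and precision is 8/9.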

setPriors

public void setPriors(Instances train)
               throws java.lang.Exception
Sets the class prior probabilities

Parameters:
train - the training instances used to determine the prior probabilities
Throws:
java.lang.Exception - if the class attribute of the instances is not set

updatePriors

public void updatePriors(Instance instance)
                  throws java.lang.Exception
Updates the class prior probabilities (when incrementally training)

Parameters:
instance - the new training instance seen
Throws:
java.lang.Exception - if the class of the instance is not set

equals

public boolean equals(java.lang.Object obj)
Tests whether the current evaluation object is equal to another evaluation object

Parameters:
obj - the object to compare against
Returns:
true if the two objects are equal

printClassifications

protected static java.lang.String printClassifications(Classifier classifier,
                                                       Instances train,
                                                       java.lang.String testFileName,
                                                       int classIndex,
                                                       Range attributesToOutput)
                                                throws java.lang.Exception
Prints the predictions for the given dataset into a String variable.

Throws:
java.lang.Exception

attributeValuesString

protected static java.lang.String attributeValuesString(Instance instance,
                                                        Range attRange)
Builds a string listing the attribute values in a specified range of indices, separated by commas and enclosed in brackets.

Parameters:
instance - the instance to print the values from
attRange - the range of attribute indices to list
Returns:
a string listing values of the attributes in the range

makeOptionString

protected static java.lang.String makeOptionString(Classifier classifier)
Make up the help string giving all the command line options

Parameters:
classifier - the classifier to include options for
Returns:
a string detailing the valid command line options

num2ShortID

protected java.lang.String num2ShortID(int num,
                                       char[] IDChars,
                                       int IDWidth)
Method for generating indices for the confusion matrix.

Parameters:
num - integer to format
Returns:
the formatted integer as a string

makeDistribution

protected double[] makeDistribution(double predictedClass)
Converts a single prediction into a probability distribution in which every class has probability zero except the predicted class, which has probability 1.0.

Parameters:
predictedClass - the index of the predicted class
Returns:
the probability distribution
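
A minimal sketch of this conversion is shown below. The class OneHotSketch, the method oneHot and the explicit numClasses parameter are hypothetical; the sketch takes an int index and does not cover how a missing prediction would be handled.

```java
class OneHotSketch {

    // Builds a distribution with probability 1.0 for the predicted class
    // and 0.0 for every other class.
    static double[] oneHot(int predictedClass, int numClasses) {
        double[] distribution = new double[numClasses]; // initialised to 0.0
        distribution[predictedClass] = 1.0;
        return distribution;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(oneHot(1, 3)));
    }
}
```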

updateStatsForClassifier

protected void updateStatsForClassifier(double[] predictedDistribution,
                                        Instance instance)
                                 throws java.lang.Exception
Updates all the statistics about a classifier's performance for the current test instance.

Parameters:
predictedDistribution - the probabilities assigned to each class
instance - the instance to be classified
Throws:
java.lang.Exception - if the class of the instance is not set

updateStatsForPredictor

protected void updateStatsForPredictor(double predictedValue,
                                       Instance instance)
                                throws java.lang.Exception
Updates all the statistics about a predictor's performance for the current test instance.

Parameters:
predictedValue - the numeric value the classifier predicts
instance - the instance to be classified
Throws:
java.lang.Exception - if the class of the instance is not set

updateMargins

protected void updateMargins(double[] predictedDistribution,
                             int actualClass,
                             double weight)
Update the cumulative record of classification margins

Parameters:
predictedDistribution - the probability distribution predicted for the current instance
actualClass - the index of the actual instance class
weight - the weight assigned to the instance
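
For readers unfamiliar with classification margins, the sketch below uses the common definition: the probability assigned to the actual class minus the highest probability assigned to any other class. Whether Evaluation uses exactly this definition is an assumption here, as are the class MarginSketch and the method margin; the weight and the cumulative bookkeeping are left out.

```java
class MarginSketch {

    // Margin of one prediction: probability of the actual class minus the
    // largest probability given to any other class. Ranges from -1 to 1.
    static double margin(double[] predictedDistribution, int actualClass) {
        double probActual = predictedDistribution[actualClass];
        double probNext = 0.0;
        for (int i = 0; i < predictedDistribution.length; i++) {
            if (i != actualClass && predictedDistribution[i] > probNext) {
                probNext = predictedDistribution[i];
            }
        }
        return probActual - probNext;
    }

    public static void main(String[] args) {
        double[] distribution = {0.7, 0.2, 0.1};
        System.out.println("actual class 0: " + margin(distribution, 0));
        System.out.println("actual class 1: " + margin(distribution, 1));
    }
}
```

A positive margin corresponds to a correct prediction made with some confidence, and a negative margin to an incorrect one.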

updateNumericScores

protected void updateNumericScores(double[] predicted,
                                   double[] actual,
                                   double weight)
Updates the numeric accuracy measures. For numeric classes, the error is measured between the actual and predicted class values. For nominal classes, it is measured between the actual and predicted class probabilities.

Parameters:
predicted - the predicted values
actual - the actual values
weight - the weight associated with this prediction

addNumericTrainClass

protected void addNumericTrainClass(double classValue,
                                    double weight)
Adds a numeric (non-missing) training class value and weight to the buffer of stored values.

Parameters:
classValue - the class value
weight - the instance weight

setNumericPriorsFromBuffer

protected void setNumericPriorsFromBuffer()
Sets up the priors for numeric class attributes from the training class values that have been seen so far.