weka.attributeSelection
Class ClassifierSubsetEval

java.lang.Object
  extended by weka.attributeSelection.ASEvaluation
      extended by weka.attributeSelection.SubsetEvaluator
          extended by weka.attributeSelection.HoldOutSubsetEvaluator
              extended by weka.attributeSelection.ClassifierSubsetEval
All Implemented Interfaces:
ErrorBasedMeritEvaluator, OptionHandler, java.io.Serializable

public class ClassifierSubsetEval
extends HoldOutSubsetEvaluator
implements OptionHandler, ErrorBasedMeritEvaluator

Classifier subset evaluator. Uses a classifier to estimate the "merit" of a set of attributes. Valid options are:

-B
Class name of the classifier to use for accuracy estimation. Place any classifier options last on the command line, following a "--". E.g. -B weka.classifiers.bayes.NaiveBayes ... -- -K

-T
Use the training data for accuracy estimation rather than a hold out/test set.

-H
The file containing hold out/test instances to use for accuracy estimation
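The option layout described above can be sketched with plain Java arrays. This is only an illustration of the "--" convention, not the parsing code used by the class itself: the class name OptionLayoutSketch and the helpers evaluatorOptions and classifierOptions are hypothetical names introduced for this example.

```java
import java.util.Arrays;

public class OptionLayoutSketch {

    /** Hypothetical helper: options for the evaluator, i.e. everything before the first "--". */
    public static String[] evaluatorOptions(String[] options) {
        int split = Arrays.asList(options).indexOf("--");
        return split < 0 ? options.clone() : Arrays.copyOfRange(options, 0, split);
    }

    /** Hypothetical helper: options for the classifier, i.e. everything after the first "--". */
    public static String[] classifierOptions(String[] options) {
        int split = Arrays.asList(options).indexOf("--");
        return split < 0 ? new String[0] : Arrays.copyOfRange(options, split + 1, options.length);
    }

    public static void main(String[] args) {
        // -B names the classifier, -T asks for evaluation on the training data,
        // and -K after "--" is meant for the classifier, as in the example above.
        String[] options = {
            "-B", "weka.classifiers.bayes.NaiveBayes",
            "-T",
            "--", "-K"
        };
        System.out.println("evaluator:  " + Arrays.toString(evaluatorOptions(options)));
        System.out.println("classifier: " + Arrays.toString(classifierOptions(options)));
    }
}
```

An array laid out like this is the kind of argument setOptions expects; the -H option would be followed by the hold out file name as its own array element.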

Version:
$Revision: 1.10 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  Classifier m_Classifier
          holds the classifier to use for error estimates
private  int m_classIndex
          class index
private  Evaluation m_Evaluation
          holds the evaluation object to use for evaluating the classifier
private  java.io.File m_holdOutFile
          the file that contains hold out/test instances
private  Instances m_holdOutInstances
          the instances to test on
private  int m_numAttribs
          number of attributes in the training data
private  int m_numInstances
          number of training instances
private  Instances m_trainingInstances
          training instances
private  boolean m_useTraining
          evaluate on training data rather than separate hold out/test set
 
Constructor Summary
ClassifierSubsetEval()
           
 
Method Summary
 void buildEvaluator(Instances data)
          Generates an attribute evaluator.
 java.lang.String classifierTipText()
          Returns the tip text for this property
 double evaluateSubset(java.util.BitSet subset)
          Evaluates a subset of attributes
 double evaluateSubset(java.util.BitSet subset, Instance holdOut, boolean retrain)
          Evaluates a subset of attributes with respect to a single instance.
 double evaluateSubset(java.util.BitSet subset, Instances holdOut)
          Evaluates a subset of attributes with respect to a set of instances.
 Classifier getClassifier()
          Get the classifier used as the base learner.
 java.io.File getHoldOutFile()
          Gets the file that holds hold out/test instances.
 java.lang.String[] getOptions()
          Gets the current settings of ClassifierSubsetEval
 boolean getUseTraining()
          Get if training data is to be used instead of hold out/test data
 java.lang.String globalInfo()
          Returns a string describing this attribute evaluator
 java.lang.String holdOutFileTipText()
          Returns the tip text for this property
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for testing this class.
protected  void resetOptions()
          reset to defaults
 void setClassifier(Classifier newClassifier)
          Set the classifier to use for accuracy estimation
 void setHoldOutFile(java.io.File h)
          Set the file that contains hold out/test instances
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setUseTraining(boolean t)
          Set if training data is to be used instead of hold out/test data
 java.lang.String toString()
          Returns a string describing ClassifierSubsetEval
 java.lang.String useTrainingTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.attributeSelection.ASEvaluation
forName, makeCopies, postProcess
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_trainingInstances

private Instances m_trainingInstances
training instances


m_classIndex

private int m_classIndex
class index


m_numAttribs

private int m_numAttribs
number of attributes in the training data


m_numInstances

private int m_numInstances
number of training instances


m_Classifier

private Classifier m_Classifier
holds the classifier to use for error estimates


m_Evaluation

private Evaluation m_Evaluation
holds the evaluation object to use for evaluating the classifier


m_holdOutFile

private java.io.File m_holdOutFile
the file that contains hold out/test instances


m_holdOutInstances

private Instances m_holdOutInstances
the instances to test on


m_useTraining

private boolean m_useTraining
evaluate on training data rather than separate hold out/test set

Constructor Detail

ClassifierSubsetEval

public ClassifierSubsetEval()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this attribute evaluator

Returns:
a description of the evaluator suitable for displaying in the explorer/experimenter gui

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

-B
Class name of the classifier to use for accuracy estimation. Place any classifier options last on the command line, following a "--". E.g. -B weka.classifiers.bayes.NaiveBayes ... -- -K

-T
Use the training data for accuracy estimation rather than a hold out/test set.

-H
The file containing hold out/test instances to use for accuracy estimation

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-B
Class name of the classifier to use for accuracy estimation. Place any classifier options last on the command line, following a "--". E.g. -B weka.classifiers.bayes.NaiveBayes ... -- -K

-T
Use training data instead of a hold out/test set for accuracy estimation.

-H
Name of the hold out/test set to estimate classifier accuracy on.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

classifierTipText

public java.lang.String classifierTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setClassifier

public void setClassifier(Classifier newClassifier)
Set the classifier to use for accuracy estimation

Parameters:
newClassifier - the Classifier to use.

getClassifier

public Classifier getClassifier()
Get the classifier used as the base learner.

Returns:
the classifier used as the base learner

holdOutFileTipText

public java.lang.String holdOutFileTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getHoldOutFile

public java.io.File getHoldOutFile()
Gets the file that holds hold out/test instances.

Returns:
File that contains hold out instances

setHoldOutFile

public void setHoldOutFile(java.io.File h)
Set the file that contains hold out/test instances

Parameters:
h - the hold out file

useTrainingTipText

public java.lang.String useTrainingTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getUseTraining

public boolean getUseTraining()
Get if training data is to be used instead of hold out/test data

Returns:
true if training data is to be used instead of hold out data

setUseTraining

public void setUseTraining(boolean t)
Set if training data is to be used instead of hold out/test data

Parameters:
t - true if training data is to be used instead of hold out data

getOptions

public java.lang.String[] getOptions()
Gets the current settings of ClassifierSubsetEval

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

buildEvaluator

public void buildEvaluator(Instances data)
                    throws java.lang.Exception
Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options.

Specified by:
buildEvaluator in class ASEvaluation
Parameters:
data - set of instances serving as training data
Throws:
java.lang.Exception - if the evaluator has not been generated successfully

evaluateSubset

public double evaluateSubset(java.util.BitSet subset)
                      throws java.lang.Exception
Evaluates a subset of attributes

Specified by:
evaluateSubset in class SubsetEvaluator
Parameters:
subset - a bitset representing the attribute subset to be evaluated
Returns:
the "merit" of the subset
Throws:
java.lang.Exception - if the subset could not be evaluated
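A subset argument of this kind can be built with java.util.BitSet alone. The sketch below assumes that bit i stands for the attribute at zero-based index i in the training data; that mapping, the class name SubsetBitSetSketch and both helper methods are assumptions made for illustration, not part of this class.

```java
import java.util.BitSet;

public class SubsetBitSetSketch {

    /** Hypothetical helper: builds a BitSet with one bit set per selected attribute index. */
    public static BitSet subsetOf(int... attributeIndices) {
        BitSet subset = new BitSet();
        for (int index : attributeIndices) {
            subset.set(index);
        }
        return subset;
    }

    /** Hypothetical helper: lists the selected attribute indices in ascending order. */
    public static int[] selectedIndices(BitSet subset) {
        return subset.stream().toArray();
    }

    public static void main(String[] args) {
        // Select the attributes at indices 0, 2 and 3 (assumed zero-based).
        BitSet subset = subsetOf(0, 2, 3);
        System.out.println(subset + " selects " + subset.cardinality() + " attributes");
    }
}
```

A BitSet built this way could then be passed as the subset parameter once buildEvaluator has been called on the training data.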

evaluateSubset

public double evaluateSubset(java.util.BitSet subset,
                             Instances holdOut)
                      throws java.lang.Exception
Evaluates a subset of attributes with respect to a set of instances. Calling this function overrides any hold out/test instances set through setHoldOutFile.

Specified by:
evaluateSubset in class HoldOutSubsetEvaluator
Parameters:
subset - a bitset representing the attribute subset to be evaluated
holdOut - a set of instances (possibly separate and distinct from those used to build/train the evaluator) with which to evaluate the merit of the subset
Returns:
the "merit" of the subset on the holdOut data
Throws:
java.lang.Exception - if the subset cannot be evaluated

evaluateSubset

public double evaluateSubset(java.util.BitSet subset,
                             Instance holdOut,
                             boolean retrain)
                      throws java.lang.Exception
Evaluates a subset of attributes with respect to a single instance. Calling this function overrides any hold out/test instances set through setHoldOutFile.

Specified by:
evaluateSubset in class HoldOutSubsetEvaluator
Parameters:
subset - a bitset representing the attribute subset to be evaluated
holdOut - a single instance (possibly not one of those used to build/train the evaluator) with which to evaluate the merit of the subset
retrain - true if the classifier should be retrained with respect to the new subset before testing on the holdOut instance.
Returns:
the "merit" of the subset on the holdOut instance
Throws:
java.lang.Exception - if the subset cannot be evaluated

toString

public java.lang.String toString()
Returns a string describing ClassifierSubsetEval

Returns:
the description as a string

resetOptions

protected void resetOptions()
reset to defaults


main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - the options