weka.classifiers.meta
Class CostSensitiveClassifier

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.meta.CostSensitiveClassifier
All Implemented Interfaces:
java.lang.Cloneable, Drawable, OptionHandler, java.io.Serializable

public class CostSensitiveClassifier
extends Classifier
implements OptionHandler, Drawable

This metaclassifier makes its base classifier cost-sensitive. Cost-sensitivity can be introduced in one of two ways: by reweighting training instances according to the total cost assigned to each class, or by predicting the class with the minimum expected misclassification cost (rather than the most likely class).
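
As a rough illustration of the reweighting method, the sketch below scales each instance weight by the total misclassification cost of its class and then rescales so the total weight is unchanged. It is plain Java and does not use the Weka API. The names ReweightSketch and reweight, the convention that cost[i][j] is the cost of predicting class j when the actual class is i, and the rescaling step are all assumptions made for illustration; they are not taken from this class.

```java
/** Illustrative only; not part of weka.classifiers.meta. */
class ReweightSketch {

    /**
     * Returns new instance weights.
     * Assumption: cost[i][j] is the cost of predicting class j
     * when the actual class is i.
     */
    static double[] reweight(double[] weights, int[] classOf, double[][] cost) {
        int numClasses = cost.length;

        // Total misclassification cost per actual class (off-diagonal row sum).
        double[] classCost = new double[numClasses];
        for (int i = 0; i < numClasses; i++) {
            for (int j = 0; j < numClasses; j++) {
                if (i != j) {
                    classCost[i] += cost[i][j];
                }
            }
        }

        // Scale every instance weight by the total cost of its class.
        double oldTotal = 0.0;
        double newTotal = 0.0;
        double[] result = new double[weights.length];
        for (int k = 0; k < weights.length; k++) {
            result[k] = weights[k] * classCost[classOf[k]];
            oldTotal += weights[k];
            newTotal += result[k];
        }

        // Assumption: keep the sum of weights the same as before.
        if (newTotal > 0.0) {
            for (int k = 0; k < result.length; k++) {
                result[k] *= oldTotal / newTotal;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] cost = { {0, 1}, {5, 0} };
        double[] w = reweight(new double[] {1, 1, 1, 1}, new int[] {0, 0, 0, 1}, cost);
        System.out.println(java.util.Arrays.toString(w));
    }
}
```

In this example the single instance of class 1, whose errors cost five times as much, ends up with five times the weight of each class 0 instance.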

Valid options are:

-M
Minimize expected misclassification cost (default: reweight training instances according to costs per class).

-W classname
Specify the full class name of a classifier (required).

-C cost file
File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -N option.

-N directory
Name of a directory to search for cost files when loading costs on demand (default current directory).

-S seed
Random number seed used when reweighting by resampling (default 1).

Options after -- are passed to the designated classifier.

Version:
$Revision: 1.18 $
Author:
Len Trigg (len@reeltwo.com)
See Also:
Serialized Form

Field Summary
protected  Classifier m_Classifier
          The classifier
protected  java.lang.String m_CostFile
          The name of the cost file, for command line options
protected  CostMatrix m_CostMatrix
          The cost matrix
protected  int m_MatrixSource
          Indicates the current cost matrix source
protected  boolean m_MinimizeExpectedCost
          True if costs are applied by predicting the class with the minimum expected cost; false if the training data is reweighted by the costs.
protected  java.io.File m_OnDemandDirectory
          The directory used when loading cost files on demand; null indicates the current directory.
protected  int m_Seed
          Seed for reweighting using resampling.
static int MATRIX_ON_DEMAND
           
static int MATRIX_SUPPLIED
           
static Tag[] TAGS_MATRIX_SOURCE
           
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Fields inherited from interface weka.core.Drawable
BayesNet, NOT_DRAWABLE, TREE
 
Constructor Summary
CostSensitiveClassifier()
           
 
Method Summary
 void buildClassifier(Instances data)
          Builds the model of the base learner.
 java.lang.String classifierTipText()
           
 java.lang.String costMatrixSourceTipText()
           
 java.lang.String costMatrixTipText()
           
 double[] distributionForInstance(Instance instance)
          Returns class probabilities.
 Classifier getClassifier()
          Gets the classifier used.
protected  java.lang.String getClassifierSpec()
          Gets the classifier specification string, which contains the class name of the classifier and any options passed to it.
 CostMatrix getCostMatrix()
          Gets the misclassification cost matrix.
 SelectedTag getCostMatrixSource()
          Gets the source location method of the cost matrix.
 boolean getMinimizeExpectedCost()
          Gets the value of MinimizeExpectedCost.
 java.io.File getOnDemandDirectory()
          Returns the directory that will be searched for cost files when loading on demand.
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 int getSeed()
          Gets the seed for resampling.
 java.lang.String globalInfo()
           
 java.lang.String graph()
          Returns graph describing the classifier (if possible).
 int graphType()
          Returns the type of graph this classifier represents.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String minimizeExpectedCostTipText()
           
 java.lang.String onDemandDirectoryTipText()
           
 java.lang.String seedTipText()
           
 void setClassifier(Classifier classifier)
          Sets the base classifier.
 void setCostMatrix(CostMatrix newCostMatrix)
          Sets the misclassification cost matrix.
 void setCostMatrixSource(SelectedTag newMethod)
          Sets the source location of the cost matrix.
 void setMinimizeExpectedCost(boolean newMinimizeExpectedCost)
          Sets the value of MinimizeExpectedCost.
 void setOnDemandDirectory(java.io.File newDir)
          Sets the directory that will be searched for cost files when loading on demand.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSeed(int seed)
          Sets the seed for resampling.
 java.lang.String toString()
          Returns a string representation of this classifier.
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

MATRIX_ON_DEMAND

public static final int MATRIX_ON_DEMAND
See Also:
Constant Field Values

MATRIX_SUPPLIED

public static final int MATRIX_SUPPLIED
See Also:
Constant Field Values

TAGS_MATRIX_SOURCE

public static final Tag[] TAGS_MATRIX_SOURCE

m_MatrixSource

protected int m_MatrixSource
Indicates the current cost matrix source


m_OnDemandDirectory

protected java.io.File m_OnDemandDirectory
The directory used when loading cost files on demand; null indicates the current directory.


m_CostFile

protected java.lang.String m_CostFile
The name of the cost file, for command line options


m_CostMatrix

protected CostMatrix m_CostMatrix
The cost matrix


m_Classifier

protected Classifier m_Classifier
The classifier


m_Seed

protected int m_Seed
Seed for reweighting using resampling.


m_MinimizeExpectedCost

protected boolean m_MinimizeExpectedCost
True if costs are applied by predicting the class with the minimum expected cost; false if the training data is reweighted by the costs.

Constructor Detail

CostSensitiveClassifier

public CostSensitiveClassifier()
Method Detail

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-M
Minimize expected misclassification cost (default: reweight training instances according to costs per class).

-W classname
Specify the full class name of a classifier (required).

-C cost file
File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -N option.

-N directory
Name of a directory to search for cost files when loading costs on demand (default current directory).

-S seed
Random number seed used when reweighting by resampling (default 1).

Options after -- are passed to the designated classifier.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
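
The rule that options after -- go to the designated classifier can be pictured as a split of the argument array at the first "--". The following is a minimal plain-Java sketch of that split, not the parsing code used by this class; OptionSplitSketch, split, the class name some.pkg.BaseLearner and the base option -X 3 are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only; shows how an option array divides at "--". */
class OptionSplitSketch {

    /** Returns { optionsBeforeSeparator, optionsAfterSeparator }. */
    static String[][] split(String[] options) {
        // Find the first "--"; if absent, everything belongs to the metaclassifier.
        int sep = options.length;
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                sep = i;
                break;
            }
        }
        String[] meta = Arrays.copyOfRange(options, 0, sep);
        String[] base = (sep < options.length)
            ? Arrays.copyOfRange(options, sep + 1, options.length)
            : new String[0];
        return new String[][] { meta, base };
    }

    public static void main(String[] args) {
        // Hypothetical command line: -M, -W and -S are read by the
        // metaclassifier, "-X 3" is handed to the base classifier.
        String[] argv = { "-M", "-W", "some.pkg.BaseLearner", "-S", "1", "--", "-X", "3" };
        String[][] parts = split(argv);
        System.out.println("meta: " + Arrays.toString(parts[0]));
        System.out.println("base: " + Arrays.toString(parts[1]));
    }
}
```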

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

globalInfo

public java.lang.String globalInfo()
Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

costMatrixSourceTipText

public java.lang.String costMatrixSourceTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getCostMatrixSource

public SelectedTag getCostMatrixSource()
Gets the source location method of the cost matrix. The result is one of MATRIX_ON_DEMAND or MATRIX_SUPPLIED.

Returns:
the cost matrix source.

setCostMatrixSource

public void setCostMatrixSource(SelectedTag newMethod)
Sets the source location of the cost matrix. Values other than MATRIX_ON_DEMAND or MATRIX_SUPPLIED will be ignored.

Parameters:
newMethod - the cost matrix location method.

onDemandDirectoryTipText

public java.lang.String onDemandDirectoryTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getOnDemandDirectory

public java.io.File getOnDemandDirectory()
Returns the directory that will be searched for cost files when loading on demand.

Returns:
The cost file search directory.

setOnDemandDirectory

public void setOnDemandDirectory(java.io.File newDir)
Sets the directory that will be searched for cost files when loading on demand.

Parameters:
newDir - The cost file search directory.
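
For orientation, the on-demand file name described under the -C and -N options (the relation name of the training data plus ".cost", looked up in the on-demand directory) could be assembled as in the sketch below. This is plain Java under stated assumptions, not code from this class: OnDemandCostFileSketch and costFileFor are hypothetical names, and "iris" and "costs" are made-up example values.

```java
import java.io.File;

/** Illustrative only; builds the path of an on-demand cost file. */
class OnDemandCostFileSketch {

    /**
     * @param relationName relation name of the training data
     * @param dir          search directory; null means the current directory
     */
    static File costFileFor(String relationName, File dir) {
        String fileName = relationName + ".cost";
        return (dir == null) ? new File(fileName) : new File(dir, fileName);
    }

    public static void main(String[] args) {
        // Hypothetical relation name and directory.
        System.out.println(costFileFor("iris", new File("costs")).getPath());
        System.out.println(costFileFor("iris", null).getPath());
    }
}
```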

minimizeExpectedCostTipText

public java.lang.String minimizeExpectedCostTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getMinimizeExpectedCost

public boolean getMinimizeExpectedCost()
Gets the value of MinimizeExpectedCost.

Returns:
Value of MinimizeExpectedCost.

setMinimizeExpectedCost

public void setMinimizeExpectedCost(boolean newMinimizeExpectedCost)
Sets the value of MinimizeExpectedCost.

Parameters:
newMinimizeExpectedCost - Value to assign to MinimizeExpectedCost.

classifierTipText

public java.lang.String classifierTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setClassifier

public void setClassifier(Classifier classifier)
Sets the base classifier.

Parameters:
classifier - the classifier with all options set.

getClassifier

public Classifier getClassifier()
Gets the classifier used.

Returns:
the classifier

getClassifierSpec

protected java.lang.String getClassifierSpec()
Gets the classifier specification string, which contains the class name of the classifier and any options passed to it.

Returns:
the classifier string.

costMatrixTipText

public java.lang.String costMatrixTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getCostMatrix

public CostMatrix getCostMatrix()
Gets the misclassification cost matrix.

Returns:
the cost matrix

setCostMatrix

public void setCostMatrix(CostMatrix newCostMatrix)
Sets the misclassification cost matrix.

Parameters:
newCostMatrix - the cost matrix

seedTipText

public java.lang.String seedTipText()
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setSeed

public void setSeed(int seed)
Sets the seed for resampling.

Parameters:
seed - the seed for resampling

getSeed

public int getSeed()
Gets the seed for resampling.

Returns:
the seed for resampling
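
The seed matters because resampling by weight is a random draw: the same seed reproduces the same sample. The sketch below shows one common way to resample with replacement in proportion to instance weights. It is an assumption about what such a step can look like, not the routine used by this class; WeightedResampleSketch and resample are hypothetical names, and at least one weight is assumed to be positive.

```java
import java.util.Random;

/** Illustrative only; seeded resampling in proportion to weights. */
class WeightedResampleSketch {

    /**
     * Draws weights.length indices with replacement, each index chosen
     * with probability proportional to its weight.
     */
    static int[] resample(double[] weights, long seed) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        Random random = new Random(seed);
        int[] sample = new int[weights.length];
        for (int k = 0; k < sample.length; k++) {
            // Pick a point in [0, total) and walk the cumulative weights to it.
            double target = random.nextDouble() * total;
            int index = 0;
            double cumulative = weights[0];
            while (cumulative <= target && index < weights.length - 1) {
                index++;
                cumulative += weights[index];
            }
            sample[k] = index;
        }
        return sample;
    }

    public static void main(String[] args) {
        double[] weights = { 0.5, 0.5, 0.5, 2.5 };
        // Two runs with the same seed give the same indices.
        System.out.println(java.util.Arrays.toString(resample(weights, 1)));
        System.out.println(java.util.Arrays.toString(resample(weights, 1)));
    }
}
```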

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Builds the model of the base learner.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data
Throws:
java.lang.Exception - if the classifier could not be built successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Returns class probabilities. When the minimum expected cost approach is chosen, returns probability one for the class with the minimum expected misclassification cost and zero for all other classes. Otherwise, returns the probability distribution produced by the base classifier.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
an array containing the estimated membership probabilities of the test instance in each class or the numeric prediction
Throws:
java.lang.Exception - if instance could not be classified successfully
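
The minimum expected cost rule can be worked through with a small plain-Java sketch. It assumes cost[i][j] is the cost of predicting class j when the actual class is i, and that ties go to the lowest class index; both are assumptions, as are the names MinExpectedCostSketch and distribution. It does not use the Weka API.

```java
/** Illustrative only; turns class probabilities into a one-hot minimum-cost prediction. */
class MinExpectedCostSketch {

    /**
     * @param probs class probabilities from a base classifier
     * @param cost  cost[i][j]: cost of predicting j when the actual class is i (assumed)
     * @return array with 1.0 at the class of minimum expected cost, 0.0 elsewhere
     */
    static double[] distribution(double[] probs, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int j = 0; j < probs.length; j++) {
            // Expected cost of predicting class j.
            double expected = 0.0;
            for (int i = 0; i < probs.length; i++) {
                expected += probs[i] * cost[i][j];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = j;
            }
        }
        double[] result = new double[probs.length];
        result[best] = 1.0;
        return result;
    }

    public static void main(String[] args) {
        double[] probs = { 0.7, 0.3 };
        double[][] cost = { {0, 1}, {5, 0} };
        // Predicting class 0 costs 0.3 * 5 = 1.5; predicting class 1 costs 0.7 * 1 = 0.7.
        System.out.println(java.util.Arrays.toString(distribution(probs, cost)));
    }
}
```

Here class 0 is the most likely class, yet class 1 receives probability one because misclassifying an actual class 1 instance is five times as costly.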

graphType

public int graphType()
Returns the type of graph this classifier represents.

Specified by:
graphType in interface Drawable
Returns:
the type of graph representing the object

graph

public java.lang.String graph()
                       throws java.lang.Exception
Returns graph describing the classifier (if possible).

Specified by:
graph in interface Drawable
Returns:
the graph of the classifier in dotty format
Throws:
java.lang.Exception - if the classifier cannot be graphed

toString

public java.lang.String toString()
Returns a string representation of this classifier.


main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - should contain the following arguments: -t training file [-T test file] [-c class index]