weka.attributeSelection
Class CfsSubsetEval

java.lang.Object
  extended by weka.attributeSelection.ASEvaluation
      extended by weka.attributeSelection.SubsetEvaluator
          extended by weka.attributeSelection.CfsSubsetEval
All Implemented Interfaces:
OptionHandler, java.io.Serializable

public class CfsSubsetEval
extends SubsetEvaluator
implements OptionHandler

CFS attribute subset evaluator. For more information see:

Hall, M. A. (1998). Correlation-based Feature Subset Selection for Machine Learning. Thesis submitted in partial fulfilment of the requirements of the degree of Doctor of Philosophy at the University of Waikato.

Valid options are:

-M
Treat missing values as a separate value.

-L
Include locally predictive attributes.

Version:
$Revision: 1.19 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  double m_c_Threshold
          Threshold for admitting locally predictive features
private  int m_classIndex
          The class index
private  float[][] m_corr_matrix
          Holds the matrix of attribute correlations
private  Discretize m_disTransform
          Discretise attributes when class is nominal
private  boolean m_isNumeric
          Is the class numeric
private  boolean m_locallyPredictive
          Include locally predictive attributes
private  boolean m_missingSeperate
          Treat missing values as separate values
private  int m_numAttribs
          Number of attributes in the training data
private  int m_numInstances
          Number of instances in the training data
private  double[] m_std_devs
          Standard deviations of attributes (when using Pearson's correlation)
private  Instances m_trainInstances
          The training instances
 
Constructor Summary
CfsSubsetEval()
          Constructor
 
Method Summary
private  void addLocallyPredictive(java.util.BitSet best_group)
           
 void buildEvaluator(Instances data)
          Generates an attribute evaluator.
private  float correlate(int att1, int att2)
           
 double evaluateSubset(java.util.BitSet subset)
          Evaluates a subset of attributes
 boolean getLocallyPredictive()
          Return true if including locally predictive attributes
 boolean getMissingSeperate()
          Returns true if missing is treated as a separate value
 java.lang.String[] getOptions()
          Gets the current settings of CfsSubsetEval
 java.lang.String globalInfo()
          Returns a string describing this attribute evaluator
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 java.lang.String locallyPredictiveTipText()
          Returns the tip text for this property
static void main(java.lang.String[] args)
          Main method for testing this class.
 java.lang.String missingSeperateTipText()
          Returns the tip text for this property
private  double nom_nom(int att1, int att2)
           
private  double num_nom2(int att1, int att2)
           
private  double num_num(int att1, int att2)
           
 int[] postProcess(int[] attributeSet)
          Calls addLocallyPredictive in order to include locally predictive attributes (if requested).
protected  void resetOptions()
           
 void setLocallyPredictive(boolean b)
          Include locally predictive attributes
 void setMissingSeperate(boolean b)
          Treat missing as a separate value
 void setOptions(java.lang.String[] options)
          Parses and sets a given list of options.
private  double symmUncertCorr(int att1, int att2)
           
 java.lang.String toString()
          Returns a string describing CFS
 
Methods inherited from class weka.attributeSelection.ASEvaluation
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_trainInstances

private Instances m_trainInstances
The training instances


m_disTransform

private Discretize m_disTransform
Discretise attributes when class is nominal


m_classIndex

private int m_classIndex
The class index


m_isNumeric

private boolean m_isNumeric
Is the class numeric


m_numAttribs

private int m_numAttribs
Number of attributes in the training data


m_numInstances

private int m_numInstances
Number of instances in the training data


m_missingSeperate

private boolean m_missingSeperate
Treat missing values as separate values


m_locallyPredictive

private boolean m_locallyPredictive
Include locally predictive attributes


m_corr_matrix

private float[][] m_corr_matrix
Holds the matrix of attribute correlations


m_std_devs

private double[] m_std_devs
Standard deviations of attributes (when using Pearson's correlation)


m_c_Threshold

private double m_c_Threshold
Threshold for admitting locally predictive features

Constructor Detail

CfsSubsetEval

public CfsSubsetEval()
Constructor

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this attribute evaluator

Returns:
a description of the evaluator suitable for displaying in the Explorer/Experimenter GUI

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses and sets a given list of options.

Valid options are:

-M
Treat missing values as a separate value.

-L
Include locally predictive attributes.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

locallyPredictiveTipText

public java.lang.String locallyPredictiveTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setLocallyPredictive

public void setLocallyPredictive(boolean b)
Include locally predictive attributes

Parameters:
b - true or false

getLocallyPredictive

public boolean getLocallyPredictive()
Return true if including locally predictive attributes

Returns:
true if locally predictive attributes are to be used

missingSeperateTipText

public java.lang.String missingSeperateTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setMissingSeperate

public void setMissingSeperate(boolean b)
Treat missing as a separate value

Parameters:
b - true or false

getMissingSeperate

public boolean getMissingSeperate()
Returns true if missing is treated as a separate value

Returns:
true if missing is to be treated as a separate value

getOptions

public java.lang.String[] getOptions()
Gets the current settings of CfsSubsetEval

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()
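The relationship between setOptions() and getOptions() can be illustrated with a small stand-alone sketch. It is not the Weka implementation: the class CfsOptionsSketch, its fields, and the choice to reject unknown flags with IllegalArgumentException are assumptions made for illustration. Only the flags -M and -L come from the documentation above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the two boolean settings; not the Weka class. */
class CfsOptionsSketch {
    boolean missingSeparate;    // corresponds to -M
    boolean locallyPredictive;  // corresponds to -L

    /** Parses the flags; rejecting unknown flags is an assumption of this sketch. */
    static CfsOptionsSketch parse(String[] options) {
        CfsOptionsSketch s = new CfsOptionsSketch();
        for (String opt : options) {
            switch (opt) {
                case "-M": s.missingSeparate = true; break;
                case "-L": s.locallyPredictive = true; break;
                default: throw new IllegalArgumentException("Unsupported option: " + opt);
            }
        }
        return s;
    }

    /** Builds an array that parse() accepts and that restores the same settings. */
    String[] toOptions() {
        List<String> out = new ArrayList<>();
        if (missingSeparate) out.add("-M");
        if (locallyPredictive) out.add("-L");
        return out.toArray(new String[0]);
    }
}
```

Passing the result of toOptions() back into parse() yields the same two settings, which is the round-trip property the getOptions() contract describes.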

buildEvaluator

public void buildEvaluator(Instances data)
                    throws java.lang.Exception
Generates an attribute evaluator. Initializes all fields of the evaluator that are not set via options. CFS also discretises attributes (if necessary) and initializes the correlation matrix.

Specified by:
buildEvaluator in class ASEvaluation
Parameters:
data - set of instances serving as training data
Throws:
java.lang.Exception - if the evaluator has not been generated successfully
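One way to picture "initializing the correlation matrix" is a triangular cache that is filled on demand. The sketch below is a guess at such a structure, not the actual field handling in CfsSubsetEval: the class LazyCorrelationMatrixSketch, the Correlator interface, and the sentinel value -999 are all hypothetical.

```java
import java.util.Arrays;

/** Hypothetical lazily filled, lower-triangular correlation cache. */
class LazyCorrelationMatrixSketch {
    /** Illustrative callback that computes one pairwise correlation. */
    interface Correlator {
        double correlate(int att1, int att2);
    }

    private static final float UNSET = -999f; // assumed sentinel for "not yet computed"
    private final float[][] matrix;           // row i holds entries for columns 0..i
    private final Correlator correlator;
    int computations = 0;                     // counts calls to the correlator

    LazyCorrelationMatrixSketch(int numAttribs, Correlator correlator) {
        this.correlator = correlator;
        matrix = new float[numAttribs][];
        for (int i = 0; i < numAttribs; i++) {
            matrix[i] = new float[i + 1];
            Arrays.fill(matrix[i], UNSET);
            matrix[i][i] = 1.0f; // an attribute is perfectly correlated with itself
        }
    }

    /** Returns the cached correlation, computing it on first access. */
    float get(int att1, int att2) {
        int larger = Math.max(att1, att2);
        int smaller = Math.min(att1, att2);
        if (matrix[larger][smaller] == UNSET) {
            matrix[larger][smaller] = (float) correlator.correlate(smaller, larger);
            computations++;
        }
        return matrix[larger][smaller];
    }
}
```

Storing only the lower triangle halves the memory for a symmetric measure, and deferring computation avoids evaluating pairs that the search never visits.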

evaluateSubset

public double evaluateSubset(java.util.BitSet subset)
                      throws java.lang.Exception
Evaluates a subset of attributes

Specified by:
evaluateSubset in class SubsetEvaluator
Parameters:
subset - a bitset representing the attribute subset to be evaluated
Returns:
the "merit" of the subset
Throws:
java.lang.Exception - if the subset could not be evaluated
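The "merit" follows the CFS heuristic from Hall's thesis cited above: for a subset of k attributes with mean attribute-class correlation r_cf and mean attribute-attribute correlation r_ff, merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff). A minimal sketch of that formula in summed form is shown here. The class CfsMeritSketch and its array-based inputs are hypothetical, and the handling of an empty subset (returning 0) is an assumption. The real evaluator works from its internal correlation matrix and may treat numeric classes differently.

```java
import java.util.BitSet;

/** Hypothetical helper, not part of Weka: CFS merit of an attribute subset. */
class CfsMeritSketch {

    /**
     * @param subset    selected attribute indices (the class attribute is not included)
     * @param classCorr classCorr[i] is the correlation between attribute i and the class
     * @param attCorr   attCorr[i][j] is the correlation between attributes i and j
     */
    static double merit(BitSet subset, double[] classCorr, double[][] attCorr) {
        double num = 0.0;   // sum of attribute-class correlations, i.e. k * r_cf
        double denom = 0.0; // k plus twice the sum over pairs, i.e. k + k(k-1) * r_ff
        for (int i = subset.nextSetBit(0); i >= 0; i = subset.nextSetBit(i + 1)) {
            num += classCorr[i];
            denom += 1.0;
            for (int j = subset.nextSetBit(i + 1); j >= 0; j = subset.nextSetBit(j + 1)) {
                denom += 2.0 * attCorr[i][j];
            }
        }
        if (denom <= 0.0) {
            return 0.0; // assumption: empty or degenerate subsets score zero
        }
        return num / Math.sqrt(denom);
    }
}
```

The numerator rewards attributes that predict the class, while the denominator penalizes redundancy among the selected attributes.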

correlate

private float correlate(int att1,
                        int att2)

symmUncertCorr

private double symmUncertCorr(int att1,
                              int att2)
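No description is given for symmUncertCorr. Its name suggests symmetric uncertainty, which for two nominal variables X and Y is SU = 2 * (H(X) + H(Y) - H(X,Y)) / (H(X) + H(Y)), where H is entropy. The sketch below computes that quantity from a contingency table of counts. The class SymmetricUncertaintySketch, the table-based signature, and the choice to return 0 when both variables are constant are assumptions, not the private method's actual behaviour.

```java
/** Hypothetical helper: symmetric uncertainty from a contingency table. */
class SymmetricUncertaintySketch {

    /** Entropy (natural log) of a distribution given as counts. */
    static double entropy(double[] counts, double total) {
        double h = 0.0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total;
                h -= p * Math.log(p);
            }
        }
        return h;
    }

    /** table[i][j] is the number of instances with X = i and Y = j. */
    static double symmetricUncertainty(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowSums = new double[rows];
        double[] colSums = new double[cols];
        double[] cells = new double[rows * cols];
        double total = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double c = table[i][j];
                rowSums[i] += c;
                colSums[j] += c;
                cells[i * cols + j] = c;
                total += c;
            }
        }
        if (total <= 0.0) {
            return 0.0; // assumption: no data means no measurable association
        }
        double hRow = entropy(rowSums, total);
        double hCol = entropy(colSums, total);
        double hJoint = entropy(cells, total);
        double sum = hRow + hCol;
        if (sum == 0.0) {
            return 0.0; // assumption: both variables constant
        }
        return 2.0 * (hRow + hCol - hJoint) / sum;
    }
}
```

The value ranges from 0 for independent variables to 1 when each variable fully determines the other.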

num_num

private double num_num(int att1,
                       int att2)
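The m_std_devs field mentions Pearson's correlation, and num_num is presumably where two numeric attributes are compared. As a point of reference, a plain Pearson correlation over two value arrays might look like the following. The class PearsonSketch and its array-based signature are hypothetical, and returning 0 for a constant input is an assumption. The private method itself takes attribute indices and may post-process the result.

```java
/** Hypothetical helper: Pearson correlation between two numeric arrays. */
class PearsonSketch {

    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("Arrays must be non-empty and of equal length");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of products of deviations
        double sxx = 0.0; // sum of squared deviations of x
        double syy = 0.0; // sum of squared deviations of y
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0.0 || syy == 0.0) {
            return 0.0; // assumption: a constant attribute carries no correlation
        }
        return sxy / Math.sqrt(sxx * syy);
    }
}
```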

num_nom2

private double num_nom2(int att1,
                        int att2)

nom_nom

private double nom_nom(int att1,
                       int att2)

toString

public java.lang.String toString()
Returns a string describing CFS

Returns:
the description as a string

addLocallyPredictive

private void addLocallyPredictive(java.util.BitSet best_group)

postProcess

public int[] postProcess(int[] attributeSet)
                  throws java.lang.Exception
Calls addLocallyPredictive in order to include locally predictive attributes (if requested).

Overrides:
postProcess in class ASEvaluation
Parameters:
attributeSet - the set of attributes found by the search
Returns:
a possibly ranked list of postprocessed attributes
Throws:
java.lang.Exception - if postprocessing fails for some reason
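This page does not describe how locally predictive attributes are chosen. One plausible reading, sketched below, is to visit the unselected attributes in order of decreasing class correlation and admit one only if no already selected attribute is more strongly correlated with it than it is with the class, less a threshold (compare m_c_Threshold). Everything in this block, including the class LocallyPredictiveSketch, the signature, and the exact admission rule, is an assumption for illustration and should be checked against the source.

```java
import java.util.BitSet;

/** Hypothetical sketch of a locally-predictive post-processing step. */
class LocallyPredictiveSketch {

    /**
     * @param bestGroup subset found by the search (not modified)
     * @param classCorr classCorr[i] is the correlation of attribute i with the class
     * @param attCorr   attCorr[i][j] is the correlation between attributes i and j
     * @param threshold slack subtracted from the class correlation before comparing
     */
    static BitSet addLocallyPredictive(BitSet bestGroup, double[] classCorr,
                                       double[][] attCorr, double threshold) {
        int n = classCorr.length;
        BitSet result = (BitSet) bestGroup.clone();
        BitSet considered = (BitSet) bestGroup.clone();
        while (considered.cardinality() < n) {
            // Pick the not-yet-considered attribute with the highest class correlation.
            int best = -1;
            for (int i = considered.nextClearBit(0); i < n; i = considered.nextClearBit(i + 1)) {
                if (best < 0 || classCorr[i] > classCorr[best]) {
                    best = i;
                }
            }
            considered.set(best);
            // Admit it only if no selected attribute is too strongly correlated with it.
            boolean ok = true;
            for (int j = result.nextSetBit(0); j >= 0; j = result.nextSetBit(j + 1)) {
                if (attCorr[best][j] > classCorr[best] - threshold) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                result.set(best);
            }
        }
        return result;
    }
}
```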

resetOptions

protected void resetOptions()

main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - the options