weka.classifiers.rules
Class DecisionTable

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.rules.DecisionTable
All Implemented Interfaces:
AdditionalMeasureProducer, java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class DecisionTable
extends Classifier
implements OptionHandler, WeightedInstancesHandler, AdditionalMeasureProducer

Class for building and using a simple decision table majority classifier. For more information see:

Kohavi, R. (1995). The Power of Decision Tables. In Proc. European Conference on Machine Learning.
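The idea of a decision table majority classifier can be sketched in plain Java as follows. This is a minimal illustration only, not the code of this class: the class name DecisionTableSketch, its methods, and the encoding of instances as arrays of nominal value indices are all assumptions made for the example. Training rows are projected onto the selected features and counted per table cell; a query that matches no cell falls back to the global majority class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal decision table majority sketch (hypothetical names, not Weka code). */
class DecisionTableSketch {
    private final int[] features;      // indices of the selected features
    private final int numClasses;
    private final int[] globalCounts;  // class counts over all training rows
    private final Map<List<Integer>, int[]> table = new HashMap<>();

    DecisionTableSketch(int[] features, int numClasses) {
        this.features = features.clone();
        this.numClasses = numClasses;
        this.globalCounts = new int[numClasses];
    }

    /** Projects a row of nominal value indices onto the selected features. */
    private List<Integer> key(int[] row) {
        Integer[] k = new Integer[features.length];
        for (int i = 0; i < features.length; i++) {
            k[i] = row[features[i]];
        }
        return Arrays.asList(k);
    }

    /** Adds one training row with its class index to the table. */
    void add(int[] row, int classIndex) {
        table.computeIfAbsent(key(row), unused -> new int[numClasses])[classIndex]++;
        globalCounts[classIndex]++;
    }

    /** Returns the majority class of the matching cell, or the global majority if no cell matches. */
    int classify(int[] row) {
        int[] counts = table.get(key(row));
        return argMax(counts != null ? counts : globalCounts);
    }

    private static int argMax(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }
}
```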

Valid options are:

-S num
Number of fully expanded non-improving subsets to consider before terminating a best-first search. (Default = 5)

-X num
Use cross-validation to evaluate features. Use number of folds = 1 for leave-one-out CV. (Default = leave-one-out CV)

-I
Use nearest neighbour instead of global table majority.

-R
Prints the decision table.

Version:
$Revision: 1.27 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Nested Class Summary
 class DecisionTable.hashKey
          Class providing keys to the hash table
 class DecisionTable.Link
          Class for a node in a linked list.
 class DecisionTable.LinkedList
          Class for handling a linked list.
 
Field Summary
private  boolean m_classIsNominal
          Class is nominal
private  int m_CVFolds
          Number of folds for cross validating feature sets
private  boolean m_debug
          Output debug info
private  int[] m_decisionFeatures
          Holds the final feature set
private  Remove m_delTransform
          Filter used to remove columns discarded by feature selection
private  boolean m_displayRules
          Display Rules
private  Filter m_disTransform
          Discretization filter
private  java.util.Hashtable m_entries
          The hashtable used to hold training instances
private  IBk m_ibk
          IB1 used to classify non-matching instances instead of using the majority class
private  double m_majority
          Holds the majority class
private  int m_maxStale
          Maximum number of fully expanded non-improving subsets for a best-first search.
private  int m_numAttributes
          The number of attributes in the dataset
private  int m_numInstances
          The number of instances in the dataset
private  java.util.Random m_rr
          Random numbers for use in cross validation
private  Instances m_theInstances
          Holds the training instances
private  boolean m_useIBk
          Use the IBk classifier rather than majority class
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Constructor Summary
DecisionTable()
          Constructor for a DecisionTable
 
Method Summary
private  void best_first()
          Performs a best-first search over feature subsets
 void buildClassifier(Instances data)
          Generates the classifier.
(package private)  double classifyFoldCV(Instances fold, int[] fs)
          Calculates the accuracy on a test fold for internal cross validation of feature sets
(package private)  double classifyInstanceLeaveOneOut(Instance instance, double[] instA)
          Classifies an instance for internal leave one out cross validation of feature sets
 java.lang.String crossValTipText()
          Returns the tip text for this property
 java.lang.String displayRulesTipText()
          Returns the tip text for this property
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.util.Enumeration enumerateMeasures()
          Returns an enumeration of the additional measure names
private  double estimateAccuracy(java.util.BitSet feature_set, int num_atts)
          Evaluates a feature subset by cross validation
 int getCrossVal()
          Gets the number of folds for cross validation
 boolean getDisplayRules()
          Gets whether rules are being printed
 int getMaxStale()
          Gets the number of non-improving decision tables to consider before abandoning the search
 double getMeasure(java.lang.String additionalMeasureName)
          Returns the value of the named measure
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 boolean getUseIBk()
          Gets whether IBk is being used instead of the majority class
 java.lang.String globalInfo()
          Returns a string describing the classifier
private  void insertIntoTable(Instance inst, double[] instA)
          Inserts an instance into the hash table
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String maxStaleTipText()
          Returns the tip text for this property
 double measureNumRules()
          Returns the number of rules
 java.lang.String printFeatures()
          Returns a string description of the features selected
private  java.lang.String printSub(java.util.BitSet sub)
          Returns a String representation of a feature subset
protected  void resetOptions()
          Resets the options.
 void setCrossVal(int folds)
          Sets the number of folds for cross validation (1 = leave one out)
 void setDisplayRules(boolean rules)
          Sets whether rules are to be printed
 void setMaxStale(int stale)
          Sets the number of non-improving decision tables to consider before abandoning the search.
 void setOptions(java.lang.String[] options)
          Parses the options for this object.
 void setUseIBk(boolean ibk)
          Sets whether IBk should be used instead of the majority class
 java.lang.String toString()
          Returns a description of the classifier.
 java.lang.String useIBkTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_entries

private java.util.Hashtable m_entries
The hashtable used to hold training instances


m_decisionFeatures

private int[] m_decisionFeatures
Holds the final feature set


m_disTransform

private Filter m_disTransform
Discretization filter


m_delTransform

private Remove m_delTransform
Filter used to remove columns discarded by feature selection


m_ibk

private IBk m_ibk
IB1 used to classify non-matching instances instead of using the majority class


m_theInstances

private Instances m_theInstances
Holds the training instances


m_numAttributes

private int m_numAttributes
The number of attributes in the dataset


m_numInstances

private int m_numInstances
The number of instances in the dataset


m_classIsNominal

private boolean m_classIsNominal
Class is nominal


m_debug

private boolean m_debug
Output debug info


m_useIBk

private boolean m_useIBk
Use the IBk classifier rather than majority class


m_displayRules

private boolean m_displayRules
Display Rules


m_maxStale

private int m_maxStale
Maximum number of fully expanded non-improving subsets for a best-first search.


m_CVFolds

private int m_CVFolds
Number of folds for cross validating feature sets


m_rr

private java.util.Random m_rr
Random numbers for use in cross validation


m_majority

private double m_majority
Holds the majority class

Constructor Detail

DecisionTable

public DecisionTable()
Constructor for a DecisionTable

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing the classifier

Returns:
a description suitable for displaying in the explorer/experimenter gui

insertIntoTable

private void insertIntoTable(Instance inst,
                             double[] instA)
                      throws java.lang.Exception
Inserts an instance into the hash table

Parameters:
inst - instance to be inserted
instA - feature values of the selected features for the instance
Throws:
java.lang.Exception - if the instance can't be inserted

classifyInstanceLeaveOneOut

double classifyInstanceLeaveOneOut(Instance instance,
                                   double[] instA)
                             throws java.lang.Exception
Classifies an instance for internal leave one out cross validation of feature sets

Parameters:
instance - instance to be "left out" and classified
instA - feature values of the selected features for the instance
Returns:
the classification of the instance
Throws:
java.lang.Exception
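One common way to classify a "left out" instance without rebuilding the table is to subtract the instance's own contribution from the cell it falls into and then take the majority of what remains. The sketch below illustrates that trick under stated assumptions: the class LeaveOneOutSketch, its method, and the use of integer counts are hypothetical, and adjusting the global majority in the fallback case is a choice made for this example rather than a documented behaviour of this class.

```java
/** Leave-one-out classification from table counts (hypothetical sketch, not Weka code). */
class LeaveOneOutSketch {

    /**
     * @param cellCounts   class counts of the table cell the instance falls into (instance included)
     * @param globalCounts class counts over the whole training set (instance included)
     * @param actualClass  class index of the instance being left out
     * @return the predicted class index with the instance removed
     */
    static int classifyLeftOut(int[] cellCounts, int[] globalCounts, int actualClass) {
        int[] cell = cellCounts.clone();
        cell[actualClass]--;                 // remove the instance's own vote
        int remaining = 0;
        for (int c : cell) {
            remaining += c;
        }
        if (remaining > 0) {
            return argMax(cell);             // other instances still share this cell
        }
        int[] global = globalCounts.clone(); // cell is now empty: fall back to the majority
        global[actualClass]--;
        return argMax(global);
    }

    private static int argMax(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }
}
```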

classifyFoldCV

double classifyFoldCV(Instances fold,
                      int[] fs)
                throws java.lang.Exception
Calculates the accuracy on a test fold for internal cross validation of feature sets

Parameters:
fold - set of instances to be "left out" and classified
fs - currently selected feature set
Returns:
the accuracy for the fold
Throws:
java.lang.Exception

estimateAccuracy

private double estimateAccuracy(java.util.BitSet feature_set,
                                int num_atts)
                         throws java.lang.Exception
Evaluates a feature subset by cross validation

Parameters:
feature_set - the subset to be evaluated
num_atts - the number of attributes in the subset
Returns:
the estimated accuracy
Throws:
java.lang.Exception - if subset can't be evaluated
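As an illustration of evaluating a feature subset by cross validation, the following sketch computes a leave-one-out accuracy for a subset given as a BitSet. It is deliberately naive (quadratic in the number of rows) and is not the implementation of this method: the class SubsetAccuracySketch, the int[][] data encoding, and the numClasses parameter are assumptions made for the example.

```java
import java.util.BitSet;

/** Naive leave-one-out accuracy of a feature subset (hypothetical sketch, not Weka code). */
class SubsetAccuracySketch {

    static double leaveOneOutAccuracy(int[][] rows, int[] labels, int numClasses, BitSet featureSet) {
        int[] features = featureSet.stream().toArray(); // indices of the set bits
        int correct = 0;
        for (int i = 0; i < rows.length; i++) {
            int[] matching = new int[numClasses]; // votes from rows agreeing on the subset
            int[] others = new int[numClasses];   // votes from all other rows (fallback)
            boolean anyMatch = false;
            for (int j = 0; j < rows.length; j++) {
                if (j == i) {
                    continue;                     // leave the test row out
                }
                others[labels[j]]++;
                if (agree(rows[i], rows[j], features)) {
                    matching[labels[j]]++;
                    anyMatch = true;
                }
            }
            if (argMax(anyMatch ? matching : others) == labels[i]) {
                correct++;
            }
        }
        return rows.length == 0 ? 0.0 : (double) correct / rows.length;
    }

    private static boolean agree(int[] a, int[] b, int[] features) {
        for (int f : features) {
            if (a[f] != b[f]) {
                return false;
            }
        }
        return true;
    }

    private static int argMax(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }
}
```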

printSub

private java.lang.String printSub(java.util.BitSet sub)
Returns a String representation of a feature subset

Parameters:
sub - BitSet representation of a subset
Returns:
String containing subset

best_first

private void best_first()
                 throws java.lang.Exception
Performs a best-first search over feature subsets

Throws:
java.lang.Exception
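A best-first search with a "stale" counter can be sketched as below. This is a simplified forward search written for illustration, not the body of this private method: the class BestFirstSketch, the Node helper, and the evaluator parameter (standing in for a subset evaluation such as cross-validated accuracy) are hypothetical. The search expands the most promising subset on the open list and stops once maxStale consecutive expansions fail to improve on the best subset found.

```java
import java.util.BitSet;
import java.util.Comparator;
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.ToDoubleFunction;

/** Simplified forward best-first subset search (hypothetical sketch, not Weka code). */
class BestFirstSketch {

    private static final class Node {
        final BitSet subset;
        final double merit;

        Node(BitSet subset, double merit) {
            this.subset = subset;
            this.merit = merit;
        }
    }

    static BitSet search(int numFeatures, int maxStale, ToDoubleFunction<BitSet> evaluator) {
        // Open list ordered so that the subset with the highest merit is expanded first.
        PriorityQueue<Node> open =
            new PriorityQueue<>(Comparator.comparingDouble((Node n) -> -n.merit));
        Set<BitSet> visited = new HashSet<>();

        BitSet start = new BitSet(numFeatures);   // start from the empty subset
        Node best = new Node(start, evaluator.applyAsDouble(start));
        open.add(best);
        visited.add(start);

        int stale = 0;                            // consecutive non-improving expansions
        while (!open.isEmpty() && stale < maxStale) {
            Node current = open.poll();
            boolean improved = false;
            for (int f = 0; f < numFeatures; f++) {
                if (current.subset.get(f)) {
                    continue;                     // feature already in the subset
                }
                BitSet child = (BitSet) current.subset.clone();
                child.set(f);
                if (!visited.add(child)) {
                    continue;                     // subset was already evaluated
                }
                Node node = new Node(child, evaluator.applyAsDouble(child));
                open.add(node);
                if (node.merit > best.merit) {
                    best = node;
                    improved = true;
                }
            }
            stale = improved ? 0 : stale + 1;
        }
        return (BitSet) best.subset.clone();
    }
}
```

Keeping a visited set avoids evaluating the same subset twice when it is reachable along several paths, which matters when each evaluation is a full cross validation.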

resetOptions

protected void resetOptions()
Resets the options.


listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

crossValTipText

public java.lang.String crossValTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setCrossVal

public void setCrossVal(int folds)
Sets the number of folds for cross validation (1 = leave one out)

Parameters:
folds - the number of folds

getCrossVal

public int getCrossVal()
Gets the number of folds for cross validation

Returns:
the number of cross validation folds

maxStaleTipText

public java.lang.String maxStaleTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setMaxStale

public void setMaxStale(int stale)
Sets the number of non-improving decision tables to consider before abandoning the search.

Parameters:
stale - the number of non-improving decision tables to consider

getMaxStale

public int getMaxStale()
Gets the number of non-improving decision tables to consider before abandoning the search

Returns:
the number of non-improving decision tables

useIBkTipText

public java.lang.String useIBkTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setUseIBk

public void setUseIBk(boolean ibk)
Sets whether IBk should be used instead of the majority class

Parameters:
ibk - true if IBk is to be used

getUseIBk

public boolean getUseIBk()
Gets whether IBk is being used instead of the majority class

Returns:
true if IBk is being used

displayRulesTipText

public java.lang.String displayRulesTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDisplayRules

public void setDisplayRules(boolean rules)
Sets whether rules are to be printed

Parameters:
rules - true if rules are to be printed

getDisplayRules

public boolean getDisplayRules()
Gets whether rules are being printed

Returns:
true if rules are being printed

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses the options for this object. Valid options are:

-S num
Number of fully expanded non-improving subsets to consider before terminating a best-first search. (Default = 5)

-X num
Use cross-validation to evaluate features. Use number of folds = 1 for leave-one-out CV. (Default = leave-one-out CV)

-I
Use nearest neighbour instead of global table majority.

-R
Prints the decision table.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
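The option format above can be illustrated with a small stand-alone parser. This is a sketch under assumptions, not the parsing code of this class: DecisionTableOptionsSketch and its fields are hypothetical, the defaults are taken from the option descriptions above, and rejecting unknown flags with an IllegalArgumentException is a simplification chosen for the example. The toOptions method shows the round trip that getOptions is described as supporting.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone parser for the -S, -X, -I and -R options (hypothetical sketch, not Weka code). */
class DecisionTableOptionsSketch {
    int maxStale = 5;             // -S num (default 5)
    int crossVal = 1;             // -X num (1 = leave-one-out, the default)
    boolean useIBk = false;       // -I
    boolean displayRules = false; // -R

    static DecisionTableOptionsSketch parse(String[] options) {
        DecisionTableOptionsSketch o = new DecisionTableOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-S":
                    o.maxStale = Integer.parseInt(valueAt(options, ++i));
                    break;
                case "-X":
                    o.crossVal = Integer.parseInt(valueAt(options, ++i));
                    break;
                case "-I":
                    o.useIBk = true;
                    break;
                case "-R":
                    o.displayRules = true;
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return o;
    }

    private static String valueAt(String[] options, int i) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value after " + options[i - 1]);
        }
        return options[i];
    }

    /** Returns the current settings in a form that parse accepts again. */
    String[] toOptions() {
        List<String> out = new ArrayList<>();
        out.add("-S");
        out.add(Integer.toString(maxStale));
        out.add("-X");
        out.add(Integer.toString(crossVal));
        if (useIBk) {
            out.add("-I");
        }
        if (displayRules) {
            out.add("-R");
        }
        return out.toArray(new String[0]);
    }
}
```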

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
data - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if distribution can't be computed
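For a nominal class, a membership distribution can be derived from the class weights stored in the matching table cell. The sketch below shows one plausible way to do this and is not taken from this class: DistributionSketch, its parameters, and the choice to put all probability mass on the majority class when no cell matches are assumptions made for illustration.

```java
/** Turns table cell weights into a class distribution (hypothetical sketch, not Weka code). */
class DistributionSketch {

    /**
     * @param cellWeights   per-class weights of the matching cell, or null if no cell matches
     * @param majorityClass index of the global majority class, used as the fallback
     * @param numClasses    number of class values
     */
    static double[] distribution(double[] cellWeights, int majorityClass, int numClasses) {
        double[] dist = new double[numClasses];
        double sum = 0.0;
        if (cellWeights != null) {
            for (double w : cellWeights) {
                sum += w;
            }
        }
        if (sum <= 0.0) {
            dist[majorityClass] = 1.0;   // no usable cell: predict the majority class
            return dist;
        }
        for (int i = 0; i < numClasses; i++) {
            dist[i] = cellWeights[i] / sum; // normalize so the entries sum to 1
        }
        return dist;
    }
}
```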

printFeatures

public java.lang.String printFeatures()
Returns a string description of the features selected

Returns:
a string of features

measureNumRules

public double measureNumRules()
Returns the number of rules

Returns:
the number of rules

enumerateMeasures

public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the additional measure names

Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
Returns:
an enumeration of the measure names

getMeasure

public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure

Specified by:
getMeasure in interface AdditionalMeasureProducer
Parameters:
additionalMeasureName - the name of the measure to query for its value
Returns:
the value of the named measure
Throws:
java.lang.IllegalArgumentException - if the named measure is not supported
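The pairing of enumerateMeasures and getMeasure can be sketched without any Weka dependency. The class MeasureSketch below is hypothetical, and the measure name "measureNumRules" and the case-insensitive comparison are assumptions based on the measureNumRules method listed above rather than confirmed details of this class. The point illustrated is the contract: every enumerated name can be queried, and any other name is rejected with an IllegalArgumentException.

```java
import java.util.Collections;
import java.util.Enumeration;

/** Name-based measure lookup (hypothetical sketch, not Weka code). */
class MeasureSketch {
    private final int numRules;

    MeasureSketch(int numRules) {
        this.numRules = numRules;
    }

    double measureNumRules() {
        return numRules;
    }

    /** Lists the names that getMeasure accepts. */
    Enumeration<String> enumerateMeasures() {
        return Collections.enumeration(Collections.singletonList("measureNumRules"));
    }

    /** Returns the value of the named measure or rejects unknown names. */
    double getMeasure(String additionalMeasureName) {
        if ("measureNumRules".equalsIgnoreCase(additionalMeasureName)) {
            return measureNumRules();
        }
        throw new IllegalArgumentException(additionalMeasureName + " not supported");
    }
}
```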

toString

public java.lang.String toString()
Returns a description of the classifier.

Returns:
a description of the classifier as a string.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the command-line options