weka.classifiers.misc
Class VFI

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.misc.VFI
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class VFI
extends Classifier
implements OptionHandler, WeightedInstancesHandler

Class implementing the voting feature interval classifier. For numeric attributes, upper and lower boundaries (intervals) are constructed around each class. Discrete attributes have point intervals. Class counts are recorded for each interval on each feature. Classification is by voting. Missing values are ignored. Does not handle numeric class.
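
The description above can be illustrated with a deliberately simplified sketch. It is not the Weka implementation: the class name VfiSketch, the half-open interval layout, and the use of NaN to mark missing values are all assumptions made for illustration, and only numeric attributes are handled. Per-class minimum and maximum values become interval boundaries, class counts are recorded per interval, and each attribute casts a normalized vote.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical, simplified illustration of voting feature intervals.
// Numeric attributes only; NaN stands in for a missing value (assumption).
final class VfiSketch {
    private final int numClasses;
    private double[][] bounds;     // [attribute][boundary], sorted ascending
    private double[][][] counts;   // [attribute][interval][class]
    private double[] classTotals;  // [class]

    VfiSketch(int numClasses) { this.numClasses = numClasses; }

    // Index of the interval containing v: the number of boundaries <= v.
    private static int intervalOf(double[] b, double v) {
        int i = 0;
        while (i < b.length && b[i] <= v) i++;
        return i;
    }

    void train(double[][] x, int[] y) {
        int numAttributes = x[0].length;
        bounds = new double[numAttributes][];
        counts = new double[numAttributes][][];
        classTotals = new double[numClasses];
        for (int label : y) classTotals[label]++;
        for (int f = 0; f < numAttributes; f++) {
            double[] min = new double[numClasses];
            double[] max = new double[numClasses];
            Arrays.fill(min, Double.POSITIVE_INFINITY);
            Arrays.fill(max, Double.NEGATIVE_INFINITY);
            for (int i = 0; i < x.length; i++) {
                if (Double.isNaN(x[i][f])) continue;  // missing values are ignored
                min[y[i]] = Math.min(min[y[i]], x[i][f]);
                max[y[i]] = Math.max(max[y[i]], x[i][f]);
            }
            // The lower and upper value seen for each class become boundaries.
            TreeSet<Double> ends = new TreeSet<>();
            for (int c = 0; c < numClasses; c++) {
                if (min[c] <= max[c]) { ends.add(min[c]); ends.add(max[c]); }
            }
            bounds[f] = ends.stream().mapToDouble(Double::doubleValue).toArray();
            counts[f] = new double[bounds[f].length + 1][numClasses];
            for (int i = 0; i < x.length; i++) {
                if (Double.isNaN(x[i][f])) continue;
                counts[f][intervalOf(bounds[f], x[i][f])][y[i]]++;
            }
        }
    }

    // Each attribute votes with its interval's class counts, scaled by class size
    // and normalized to sum to one; the summed votes are normalized at the end.
    double[] distribution(double[] instance) {
        double[] dist = new double[numClasses];
        for (int f = 0; f < instance.length; f++) {
            if (Double.isNaN(instance[f])) continue;
            double[] row = counts[f][intervalOf(bounds[f], instance[f])];
            double[] vote = new double[numClasses];
            double sum = 0.0;
            for (int c = 0; c < numClasses; c++) {
                if (classTotals[c] > 0) vote[c] = row[c] / classTotals[c];
                sum += vote[c];
            }
            if (sum > 0) for (int c = 0; c < numClasses; c++) dist[c] += vote[c] / sum;
        }
        double total = 0.0;
        for (double d : dist) total += d;
        if (total > 0) for (int c = 0; c < numClasses; c++) dist[c] /= total;
        return dist;
    }
}
```

In this sketch every attribute has equal say; confidence weighting and the point intervals used for discrete attributes are left out to keep it short.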

A simple attribute weighting scheme is also included. Higher weight is assigned to more confident intervals, where confidence is a function of entropy: weight(att_i) = (entropy of class distribution for att_i / max uncertainty)^-bias.
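
As a worked illustration of the weight formula, the following minimal sketch computes the weight from a vector of class counts. It assumes that "max uncertainty" is the entropy of a uniform distribution over the classes (log2 of the number of classes) and that the ratio is clamped to a small positive value to avoid division by zero. The class name ConfidenceWeight, the method names, and the clamp value are hypothetical.

```java
// Hypothetical helper illustrating weight = (entropy / max uncertainty)^-bias.
final class ConfidenceWeight {
    /** Shannon entropy (base 2) of a vector of non-negative class counts. */
    static double entropy(double[] counts) {
        double total = 0.0;
        for (double c : counts) total += c;
        if (total <= 0.0) return 0.0;
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) {
                double p = c / total;
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    /** Weight for class counts over at least two classes. */
    static double weight(double[] counts, double bias) {
        // Assumption: max uncertainty is the entropy of a uniform class distribution.
        double maxEntropy = Math.log(counts.length) / Math.log(2.0);
        // Assumption: clamp so that a pure interval (entropy 0) gives a finite weight.
        double ratio = Math.max(entropy(counts) / maxEntropy, 1e-10);
        return Math.pow(ratio, -bias);
    }

    public static void main(String[] args) {
        System.out.println(weight(new double[] {5, 5}, 1.0));  // evenly split counts
        System.out.println(weight(new double[] {9, 1}, 1.0));  // skewed counts
    }
}
```

Evenly split counts give a ratio of 1 and therefore a weight of 1, while skewed counts have lower entropy and receive a larger weight. A larger bias widens that gap.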

Faster than NaiveBayes but slower than HyperPipes.

  Confidence: 0.01 (two tailed)

 Dataset                   (1) VFI '-B  | (2) Hyper (3) Naive
                         ------------------------------------
 anneal.ORIG               (10)   74.56 |   97.88 v   74.77
 anneal                    (10)   71.83 |   97.88 v   86.51 v
 audiology                 (10)   51.69 |   66.26 v   72.25 v
 autos                     (10)   57.63 |   62.79 v   57.76
 balance-scale             (10)   68.72 |   46.08 *   90.5  v
 breast-cancer             (10)   67.25 |   69.84 v   73.12 v
 wisconsin-breast-cancer   (10)   95.72 |   88.31 *   96.05 v
 horse-colic.ORIG          (10)   66.13 |   70.41 v   66.12
 horse-colic               (10)   78.36 |   62.07 *   78.28
 credit-rating             (10)   85.17 |   44.58 *   77.84 *
 german_credit             (10)   70.81 |   69.89 *   74.98 v
 pima_diabetes             (10)   62.13 |   65.47 v   75.73 v
 Glass                     (10)   56.82 |   50.19 *   47.43 *
 cleveland-14-heart-diseas (10)   80.01 |   55.18 *   83.83 v
 hungarian-14-heart-diseas (10)   82.8  |   65.55 *   84.37 v
 heart-statlog             (10)   79.37 |   55.56 *   84.37 v
 hepatitis                 (10)   83.78 |   63.73 *   83.87
 hypothyroid               (10)   92.64 |   93.33 v   95.29 v
 ionosphere                (10)   94.16 |   35.9  *   82.6  *
 iris                      (10)   96.2  |   91.47 *   95.27 *
 kr-vs-kp                  (10)   88.22 |   54.1  *   87.84 *
 labor                     (10)   86.73 |   87.67     93.93 v
 lymphography              (10)   78.48 |   58.18 *   83.24 v
 mushroom                  (10)   99.85 |   99.77 *   95.77 *
 primary-tumor             (10)   29    |   24.78 *   49.35 v
 segment                   (10)   77.42 |   75.15 *   80.1  v
 sick                      (10)   65.92 |   93.85 v   92.71 v
 sonar                     (10)   58.02 |   57.17     67.97 v
 soybean                   (10)   86.81 |   86.12 *   92.9  v
 splice                    (10)   88.61 |   41.97 *   95.41 v
 vehicle                   (10)   52.94 |   32.77 *   44.8  *
 vote                      (10)   91.5  |   61.38 *   90.19 *
 vowel                     (10)   57.56 |   36.34 *   62.81 v
 waveform                  (10)   56.33 |   46.11 *   80.02 v
 zoo                       (10)   94.05 |   94.26     95.04 v
                          ------------------------------------
                                (v| |*) |  (9|3|23)  (22|5|8) 
 

For more information, see

Demiroz, G. and Guvenir, A. (1997) "Classification by voting feature intervals", ECML-97.

Valid options are:

-C
Don't weight voting intervals by confidence.

-B
Set exponential bias towards confident intervals. default = 1.0

Version:
$Revision: 1.9 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  double m_bias
          Bias towards more confident intervals
protected  int m_ClassIndex
          The index of the class attribute
protected  double[][][] m_counts
          The class counts for each interval of each attribute
protected  double[] m_globalCounts
          The global class counts
protected  Instances m_Instances
          The training data
protected  double[][] m_intervalBounds
          The lower bounds for each attribute
protected  double m_maxEntrop
          The maximum entropy for the class
protected  int m_NumClasses
          The number of classes
protected  boolean m_weightByConfidence
          Exponentially bias more confident intervals
private  double TINY
           
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Constructor Summary
VFI()
           
 
Method Summary
 java.lang.String biasTipText()
          Returns the tip text for this property
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Classifies the given test instance.
 double getBias()
          Get the value of the bias parameter
 java.lang.String[] getOptions()
          Gets the current settings of VFI
 boolean getWeightByConfidence()
          Get whether feature intervals are being weighted by confidence
 java.lang.String globalInfo()
          Returns a string describing this classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for testing this class.
 void setBias(double b)
          Set the value of the exponential bias towards more confident intervals
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setWeightByConfidence(boolean c)
          Set weighting by confidence
 java.lang.String toString()
          Returns a description of this classifier.
 java.lang.String weightByConfidenceTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_ClassIndex

protected int m_ClassIndex
The index of the class attribute


m_NumClasses

protected int m_NumClasses
The number of classes


m_Instances

protected Instances m_Instances
The training data


m_counts

protected double[][][] m_counts
The class counts for each interval of each attribute


m_globalCounts

protected double[] m_globalCounts
The global class counts


m_intervalBounds

protected double[][] m_intervalBounds
The lower bounds for each attribute


m_maxEntrop

protected double m_maxEntrop
The maximum entropy for the class


m_weightByConfidence

protected boolean m_weightByConfidence
Exponentially bias more confident intervals


m_bias

protected double m_bias
Bias towards more confident intervals


TINY

private double TINY

Constructor Detail

VFI

public VFI()

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-C
Don't weight voting intervals by confidence.

-B
Set exponential bias towards confident intervals. default = 1.0

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
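
To make the mapping from option strings to properties concrete, here is a small stand-alone sketch. It does not reproduce Weka's own parsing. The class VfiOptionsSketch and its fields are hypothetical stand-ins for the weightByConfidence and bias properties, with the bias default taken from the option text above.

```java
// Hypothetical stand-in showing how -C and -B could map onto the two properties.
final class VfiOptionsSketch {
    boolean weightByConfidence = true;  // -C switches confidence weighting off
    double bias = 1.0;                  // default as stated in the option text

    static VfiOptionsSketch parse(String[] options) {
        VfiOptionsSketch s = new VfiOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-C")) {
                s.weightByConfidence = false;
            } else if (options[i].equals("-B")) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("-B requires a value");
                }
                s.bias = Double.parseDouble(options[++i]);
            } else {
                // Mirrors the documented behaviour of rejecting unsupported options.
                throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return s;
    }
}
```

With this sketch, parse(new String[] {"-B", "0.6"}) keeps confidence weighting on and sets the bias to 0.6, while parse(new String[] {"-C"}) turns weighting off and leaves the bias at its default.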

weightByConfidenceTipText

public java.lang.String weightByConfidenceTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setWeightByConfidence

public void setWeightByConfidence(boolean c)
Set weighting by confidence

Parameters:
c - true if feature intervals are to be weighted by confidence

getWeightByConfidence

public boolean getWeightByConfidence()
Get whether feature intervals are being weighted by confidence

Returns:
true if weighting by confidence is selected

biasTipText

public java.lang.String biasTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setBias

public void setBias(double b)
Set the value of the exponential bias towards more confident intervals

Parameters:
b - the value of the bias parameter

getBias

public double getBias()
Get the value of the bias parameter

Returns:
the bias parameter

getOptions

public java.lang.String[] getOptions()
Gets the current settings of VFI

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions()

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

toString

public java.lang.String toString()
Returns a description of this classifier.

Returns:
a description of this classifier as a string.

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Classifies the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
the predicted class distribution for the instance
Throws:
java.lang.Exception - if the instance can't be classified
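
Because the method returns a double[] rather than a single label, a caller who wants one predicted class has to reduce the array. A minimal sketch, assuming the array holds one score per class and that the highest score wins (the helper name PredictedClass.argMax is hypothetical):

```java
// Hypothetical helper: pick the class index with the largest score.
final class PredictedClass {
    static int argMax(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            // Strict comparison keeps the lowest index when scores tie.
            if (distribution[i] > distribution[best]) best = i;
        }
        return best;
    }
}
```

For example, argMax(new double[] {0.1, 0.7, 0.2}) returns 1.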

main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - should contain command line arguments for evaluation (see Evaluation).