weka.classifiers.trees.j48
Class BinC45Split

java.lang.Object
  extended by weka.classifiers.trees.j48.ClassifierSplitModel
      extended by weka.classifiers.trees.j48.BinC45Split
All Implemented Interfaces:
java.lang.Cloneable, java.io.Serializable

public class BinC45Split
extends ClassifierSplitModel

Class implementing a binary C4.5-like split on an attribute.

Version:
$Revision: 1.8 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  int m_attIndex
          Attribute to split on.
private  double m_gainRatio
          GainRatio of split.
private static GainRatioSplitCrit m_gainRatioCrit
          Static reference to splitting criterion.
private  double m_infoGain
          InfoGain of split.
private static InfoGainSplitCrit m_infoGainCrit
          Static reference to splitting criterion.
private  int m_minNoObj
          Minimum number of objects in a split.
private  double m_splitPoint
          Value of split point.
private  double m_sumOfWeights
          The sum of the weights of the instances.
 
Fields inherited from class weka.classifiers.trees.j48.ClassifierSplitModel
m_distribution, m_numSubsets
 
Constructor Summary
BinC45Split(int attIndex, int minNoObj, double sumOfWeights)
          Initializes the split model.
 
Method Summary
 int attIndex()
          Returns index of attribute for which split was generated.
 void buildClassifier(Instances trainInstances)
          Creates a C4.5-type split on the given data.
 double classProb(int classIndex, Instance instance, int theSubset)
          Gets class probability for instance.
 double gainRatio()
          Returns (C4.5-type) gain ratio for the generated split.
private  void handleEnumeratedAttribute(Instances trainInstances)
          Creates split on enumerated attribute.
private  void handleNumericAttribute(Instances trainInstances)
          Creates split on numeric attribute.
 double infoGain()
          Returns (C4.5-type) information gain for the generated split.
 java.lang.String leftSide(Instances data)
          Prints the left side of the condition.
 void resetDistribution(Instances data)
          Sets distribution associated with model.
 java.lang.String rightSide(int index, Instances data)
          Prints the condition satisfied by instances in a subset.
 void setSplitPoint(Instances allInstances)
          Sets the split point to the greatest value in the given data that is smaller than or equal to the old split point.
 java.lang.String sourceExpression(int index, Instances data)
          Returns a string containing Java source code equivalent to the test made at this node.
 double[] weights(Instance instance)
          Returns weights if instance is assigned to more than one subset.
 int whichSubset(Instance instance)
          Returns index of subset instance is assigned to.
 
Methods inherited from class weka.classifiers.trees.j48.ClassifierSplitModel
checkModel, classifyInstance, classProbLaplace, clone, codingCost, distribution, dumpLabel, dumpModel, numSubsets, sourceClass, split
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_attIndex

private int m_attIndex
Attribute to split on.


m_minNoObj

private int m_minNoObj
Minimum number of objects in a split.


m_splitPoint

private double m_splitPoint
Value of split point.


m_infoGain

private double m_infoGain
InfoGain of split.


m_gainRatio

private double m_gainRatio
GainRatio of split.


m_sumOfWeights

private double m_sumOfWeights
The sum of the weights of the instances.


m_infoGainCrit

private static InfoGainSplitCrit m_infoGainCrit
Static reference to splitting criterion.


m_gainRatioCrit

private static GainRatioSplitCrit m_gainRatioCrit
Static reference to splitting criterion.

Constructor Detail

BinC45Split

public BinC45Split(int attIndex,
                   int minNoObj,
                   double sumOfWeights)
Initializes the split model.

Method Detail

buildClassifier

public void buildClassifier(Instances trainInstances)
                     throws java.lang.Exception
Creates a C4.5-type split on the given data.

Specified by:
buildClassifier in class ClassifierSplitModel
Throws:
java.lang.Exception - if something goes wrong

attIndex

public final int attIndex()
Returns index of attribute for which split was generated.


gainRatio

public final double gainRatio()
Returns (C4.5-type) gain ratio for the generated split.
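
As an illustration only, the sketch below computes information gain and gain ratio for a binary split from per-subset class counts, using the standard C4.5 definitions. The class BinarySplitGainSketch and its methods are hypothetical and not part of Weka; the actual GainRatioSplitCrit may apply further corrections that are not reproduced here.

```java
// Hypothetical helper, NOT part of Weka: information gain and gain ratio
// for a binary split, computed from per-subset class counts.
final class BinarySplitGainSketch {

    static double sum(double[] xs) {
        double total = 0;
        for (double x : xs) total += x;
        return total;
    }

    // Entropy (in bits) of a vector of class counts.
    static double entropy(double[] counts) {
        double total = sum(counts);
        if (total == 0) return 0;
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Entropy before the split minus the size-weighted entropy of the
    // two subsets.
    static double infoGain(double[] left, double[] right) {
        double nLeft = sum(left), nRight = sum(right), n = nLeft + nRight;
        if (n == 0) return 0;
        double[] all = new double[left.length];
        for (int k = 0; k < left.length; k++) all[k] = left[k] + right[k];
        return entropy(all)
            - (nLeft / n) * entropy(left)
            - (nRight / n) * entropy(right);
    }

    // Information gain divided by the split information, i.e. the entropy
    // of the subset sizes themselves.
    static double gainRatio(double[] left, double[] right) {
        double splitInfo = entropy(new double[] {sum(left), sum(right)});
        return splitInfo == 0 ? 0 : infoGain(left, right) / splitInfo;
    }

    public static void main(String[] args) {
        double[] left = {4, 0};   // subset 0: four instances of class 0
        double[] right = {0, 4};  // subset 1: four instances of class 1
        System.out.println("infoGain  = " + infoGain(left, right));
        System.out.println("gainRatio = " + gainRatio(left, right));
    }
}
```

Dividing by the split information penalizes splits that produce very uneven subsets, which is the usual motivation for preferring gain ratio over raw information gain.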


classProb

public final double classProb(int classIndex,
                              Instance instance,
                              int theSubset)
                       throws java.lang.Exception
Gets class probability for instance.

Overrides:
classProb in class ClassifierSplitModel
Throws:
java.lang.Exception - if something goes wrong

handleEnumeratedAttribute

private void handleEnumeratedAttribute(Instances trainInstances)
                                throws java.lang.Exception
Creates split on enumerated attribute.

Throws:
java.lang.Exception - if something goes wrong

handleNumericAttribute

private void handleNumericAttribute(Instances trainInstances)
                             throws java.lang.Exception
Creates split on numeric attribute.

Throws:
java.lang.Exception - if something goes wrong
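
This page does not describe how the numeric split is found. A common C4.5-style approach, sketched here under that assumption, sorts the instances by attribute value and evaluates a threshold between each pair of adjacent distinct values, skipping thresholds that would leave fewer than a minimum number of instances on either side. The class NumericSplitSearchSketch, the unweighted instances, and the use of plain information gain as the score are simplifications for illustration, not Weka's implementation.

```java
// Hypothetical sketch, NOT Weka's code: exhaustive search for a binary
// threshold on one numeric attribute, scored by information gain.
final class NumericSplitSearchSketch {

    // Entropy (in bits) of class counts that add up to `total`.
    static double entropy(double[] counts, double total) {
        double h = 0;
        for (double c : counts) {
            if (c > 0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Returns the best threshold, or NaN if no candidate leaves at least
    // minNoObj instances on each side or none has positive gain.
    static double bestSplitPoint(double[] values, int[] classes,
                                 int numClasses, int minNoObj) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int k = 0; k < n; k++) order[k] = k;
        // Visit instances in ascending order of attribute value.
        java.util.Arrays.sort(order,
            java.util.Comparator.comparingDouble((Integer a) -> values[a]));

        double[] left = new double[numClasses];
        double[] right = new double[numClasses];
        for (int c : classes) right[c]++;
        double before = entropy(right, n);

        double bestGain = 0;
        double bestPoint = Double.NaN;
        for (int pos = 0; pos < n - 1; pos++) {
            int idx = order[pos];
            left[classes[idx]]++;   // move one instance to the left subset
            right[classes[idx]]--;
            int nLeft = pos + 1, nRight = n - nLeft;
            double lo = values[idx], hi = values[order[pos + 1]];
            if (lo == hi) continue;  // no threshold between equal values
            if (nLeft < minNoObj || nRight < minNoObj) continue;
            double gain = before
                - ((double) nLeft / n) * entropy(left, nLeft)
                - ((double) nRight / n) * entropy(right, nRight);
            if (gain > bestGain) {
                bestGain = gain;
                bestPoint = (lo + hi) / 2;  // midpoint as candidate threshold
            }
        }
        return bestPoint;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4};
        int[] classes = {0, 0, 1, 1};
        System.out.println(bestSplitPoint(values, classes, 2, 1));
    }
}
```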

infoGain

public final double infoGain()
Returns (C4.5-type) information gain for the generated split.


leftSide

public final java.lang.String leftSide(Instances data)
Prints the left side of the condition.

Specified by:
leftSide in class ClassifierSplitModel
Parameters:
data - the data.

rightSide

public final java.lang.String rightSide(int index,
                                        Instances data)
Prints the condition satisfied by instances in a subset.

Specified by:
rightSide in class ClassifierSplitModel
Parameters:
index - index of the subset
data - the training set

sourceExpression

public final java.lang.String sourceExpression(int index,
                                               Instances data)
Returns a string containing Java source code equivalent to the test made at this node. The instance being tested is called "i".

Specified by:
sourceExpression in class ClassifierSplitModel
Parameters:
index - index of the nominal value tested
data - the data containing instance structure info
Returns:
a string containing the Java source expression for the test
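
To make the idea concrete, here is a rough sketch of how such an expression string could be assembled. It assumes, without confirmation from this page, that the instance "i" is an Object[] whose nominal entries are String and whose numeric entries are Double, and that subset 0 holds the instances that pass the test. SourceExpressionSketch and its methods are hypothetical names.

```java
// Hypothetical sketch, NOT Weka's code: builds a Java boolean expression
// (as text) for a binary test on the instance named "i".
final class SourceExpressionSketch {

    // Nominal attribute: subset 0 matches the value, subset 1 is the negation.
    static String nominalTest(int attIndex, String value, int subset) {
        String test = "i[" + attIndex + "].equals(\"" + value + "\")";
        return subset == 0 ? test : "!" + test;
    }

    // Numeric attribute: subset 0 is "<= splitPoint", subset 1 is "> splitPoint".
    static String numericTest(int attIndex, double splitPoint, int subset) {
        String op = subset == 0 ? " <= " : " > ";
        return "((Double) i[" + attIndex + "]).doubleValue()" + op + splitPoint;
    }

    public static void main(String[] args) {
        System.out.println(nominalTest(2, "sunny", 0));
        System.out.println(numericTest(1, 2.5, 1));
    }
}
```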

setSplitPoint

public final void setSplitPoint(Instances allInstances)
Sets the split point to the greatest value in the given data that is smaller than or equal to the old split point. (C4.5 does this for some strange reason.)
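
The adjustment described above can be sketched on a plain array of attribute values. SplitPointSnapSketch is a hypothetical name; skipping NaN entries as missing values and leaving the split point unchanged when no value qualifies are assumptions made for this illustration.

```java
// Hypothetical sketch, NOT Weka's code: move a split point down to the
// greatest observed value that does not exceed it.
final class SplitPointSnapSketch {

    static double snapSplitPoint(double[] values, double oldSplitPoint) {
        double best = oldSplitPoint;
        boolean found = false;
        for (double v : values) {
            if (Double.isNaN(v)) continue;  // assumption: NaN marks a missing value
            if (v <= oldSplitPoint && (!found || v > best)) {
                best = v;
                found = true;
            }
        }
        // Assumption: keep the old split point if no value is <= it.
        return best;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.5, 4.0};
        System.out.println(snapSplitPoint(values, 3.25));
    }
}
```

After this step the threshold coincides with a value that actually occurs in the data, so the condition reported for the split refers to an observed value rather than to a point between two observations.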


resetDistribution

public void resetDistribution(Instances data)
                       throws java.lang.Exception
Sets distribution associated with model.

Overrides:
resetDistribution in class ClassifierSplitModel
Throws:
java.lang.Exception

weights

public final double[] weights(Instance instance)
Returns weights if instance is assigned to more than one subset. Returns null if instance is only assigned to one subset.

Specified by:
weights in class ClassifierSplitModel
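
One plausible reading of this contract is sketched below: when an instance cannot be assigned to a single subset, it is spread over all subsets in proportion to each subset's share of the total training weight. That proportional rule, the class SubsetWeightsSketch, and its parameters are assumptions for illustration; this page does not state how the weights are derived.

```java
// Hypothetical sketch, NOT Weka's code: fractional subset weights for an
// instance that belongs to more than one subset.
final class SubsetWeightsSketch {

    // subsetTotals[k] is the total training weight that fell into subset k.
    static double[] weights(boolean inMultipleSubsets, double[] subsetTotals) {
        if (!inMultipleSubsets) {
            return null;  // instance belongs to exactly one subset
        }
        double total = 0;
        for (double t : subsetTotals) total += t;
        double[] w = new double[subsetTotals.length];
        for (int k = 0; k < w.length; k++) {
            w[k] = subsetTotals[k] / total;
        }
        return w;
    }

    public static void main(String[] args) {
        double[] w = weights(true, new double[] {30, 10});
        System.out.println(java.util.Arrays.toString(w));
    }
}
```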

whichSubset

public final int whichSubset(Instance instance)
                      throws java.lang.Exception
Returns index of subset instance is assigned to. Returns -1 if instance is assigned to more than one subset.

Specified by:
whichSubset in class ClassifierSplitModel
Throws:
java.lang.Exception - if something goes wrong
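
A minimal sketch of the subset assignment for a binary split follows. It assumes, since this page does not say so, that an instance counts as belonging to more than one subset exactly when its value for the split attribute is missing (represented here as NaN), that numeric values less than or equal to the split point go to subset 0, and that for a nominal attribute the split point holds the index of the value sent to subset 0. WhichSubsetSketch is a hypothetical name.

```java
// Hypothetical sketch, NOT Weka's code: subset index for a binary split.
final class WhichSubsetSketch {

    static int whichSubset(double value, boolean nominal, double splitPoint) {
        if (Double.isNaN(value)) {
            return -1;  // assumption: missing value, so no single subset applies
        }
        if (nominal) {
            // Nominal values are compared by their integer index.
            return ((int) value == (int) splitPoint) ? 0 : 1;
        }
        return (value <= splitPoint) ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(whichSubset(2.0, false, 2.5));
        System.out.println(whichSubset(3.0, false, 2.5));
        System.out.println(whichSubset(Double.NaN, false, 2.5));
    }
}
```

A caller that receives -1 would then typically consult weights(Instance) to distribute the instance over the subsets.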