weka.classifiers.lazy.kstar
Class KStarNominalAttribute

java.lang.Object
  extended by weka.classifiers.lazy.kstar.KStarNominalAttribute
All Implemented Interfaces:
KStarConstants

public class KStarNominalAttribute
extends java.lang.Object
implements KStarConstants

A custom class which provides the environment for computing the transformation probability of a specified test instance nominal attribute to a specified train instance nominal attribute.

Version:
$Revision 1.0 $
Author:
Len Trigg (len@reeltwo.com), Abdelaziz Mahoui (am14@cs.waikato.ac.nz)

Field Summary
protected  int m_AttrIndex
          The index of the nominal attribute in the test and train instances
protected  double m_AverageProb
          Average probability of test attribute transforming into train attribute
protected  int m_BlendFactor
          Default sphere of influence blend setting
protected  int m_BlendMethod
          B_SPHERE = use specified blend, B_ENTROPY = entropic blend setting
protected  KStarCache m_Cache
          A cache for storing attribute values and their corresponding stop parameters
protected  int m_ClassType
          The class attribute type
protected  int[] m_Distribution
          Distribution of the attribute value in the train dataset
protected  int m_MissingMode
          Missing value treatment
protected  double m_MissingProb
          Probability of test attribute transforming into train attribute with missing value
protected  int m_NumAttributes
          The number of attributes
protected  int m_NumClasses
          The number of class values
protected  int m_NumInstances
          The number of instances in the dataset
protected  int[][] m_RandClassCols
          Set of columns: each column represents a randomised version of the train dataset class column
protected  double m_SmallestProb
          Smallest probability of test attribute transforming into train attribute
protected  double m_Stop
          The stop parameter
protected  Instance m_Test
          The test instance
protected  int m_TotalCount
          Number of train instances with no missing attribute values
protected  Instance m_Train
          The train instance
protected  Instances m_TrainSet
          The training instances used for classification.
 
Fields inherited from interface weka.classifiers.lazy.kstar.KStarConstants
B_ENTROPY, B_SPHERE, EPSILON, FLOOR, FLOOR1, INITIAL_STEP, LOG2, M_AVERAGE, M_DELETE, M_MAXDIFF, M_NORMAL, NUM_RAND_COLS, OFF, ON, ROOT_FINDER_ACCURACY, ROOT_FINDER_MAX_ITER
 
Constructor Summary
KStarNominalAttribute(Instance test, Instance train, int attrIndex, Instances trainSet, int[][] randClassCol, KStarCache cache)
          Constructor
 
Method Summary
private  void calculateEntropy(double stop, KStarWrapper params)
          Calculates the entropy of the actual class prediction and the entropy for random class prediction.
private  void calculateSphereSize(int testvalue, double stop, KStarWrapper params)
          Calculates the size of the "sphere of influence" defined as: sphere = sum(P^2)/sum(P)^2, where P(i|j) = (1-tstop)*P(i) + ((i==j)?tstop:0).
private  void generateAttrDistribution()
          Calculates the distribution, in the dataset, of the indexed nominal attribute values.
private  void init()
          Initializes the member fields (m_*) of the class.
private  double PStar(Instance test, Instance train, int col, double stop)
          Calculates the nominal probability function defined as: P(i|j) = (1-stop) * P(i) + ((i==j) ? stop : 0).
 void setOptions(int missingmode, int blendmethod, int blendfactor)
          Sets the options.
private  double stopProbUsingBlend()
          Calculates the "stop parameter" for this attribute using the blend method: the value is computed using a root finder algorithm.
private  double stopProbUsingEntropy()
          Calculates the "stop parameter" for this attribute using the entropy method: the value is computed using a root finder algorithm.
 double transProb()
          Calculates the probability of the indexed nominal attribute of the test instance transforming into the indexed nominal attribute of the training instance.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_TrainSet

protected Instances m_TrainSet
The training instances used for classification.


m_Test

protected Instance m_Test
The test instance


m_Train

protected Instance m_Train
The train instance


m_AttrIndex

protected int m_AttrIndex
The index of the nominal attribute in the test and train instances


m_Stop

protected double m_Stop
The stop parameter


m_MissingProb

protected double m_MissingProb
Probability of test attribute transforming into train attribute with missing value


m_AverageProb

protected double m_AverageProb
Average probability of test attribute transforming into train attribute


m_SmallestProb

protected double m_SmallestProb
Smallest probability of test attribute transforming into train attribute


m_TotalCount

protected int m_TotalCount
Number of train instances with no missing attribute values


m_Distribution

protected int[] m_Distribution
Distribution of the attribute value in the train dataset


m_RandClassCols

protected int[][] m_RandClassCols
Set of columns: each column represents a randomised version of the train dataset class column


m_Cache

protected KStarCache m_Cache
A cache for storing attribute values and their corresponding stop parameters


m_NumInstances

protected int m_NumInstances
The number of instances in the dataset


m_NumClasses

protected int m_NumClasses
The number of class values


m_NumAttributes

protected int m_NumAttributes
The number of attributes


m_ClassType

protected int m_ClassType
The class attribute type


m_MissingMode

protected int m_MissingMode
Missing value treatment


m_BlendMethod

protected int m_BlendMethod
B_SPHERE = use specified blend, B_ENTROPY = entropic blend setting


m_BlendFactor

protected int m_BlendFactor
Default sphere of influence blend setting

Constructor Detail

KStarNominalAttribute

public KStarNominalAttribute(Instance test,
                             Instance train,
                             int attrIndex,
                             Instances trainSet,
                             int[][] randClassCol,
                             KStarCache cache)
Constructor

Method Detail

init

private void init()
Initializes the member fields (m_*) of the class.


transProb

public double transProb()
Calculates the probability of the indexed nominal attribute of the test instance transforming into the indexed nominal attribute of the training instance.

Returns:
the value of the transformation probability.

stopProbUsingEntropy

private double stopProbUsingEntropy()
Calculates the "stop parameter" for this attribute using the entropy method: the value is computed using a root finder algorithm. Once the stop parameter is obtained, the method reuses the calculation to compute the smallest and average transformation probabilities. It also sets the probability of transforming into an attribute with a missing value.

Returns:
the value of the stop parameter.

calculateEntropy

private void calculateEntropy(double stop,
                              KStarWrapper params)
Calculates the entropy of the actual class prediction and the entropy for random class prediction. It also calculates the smallest and average transformation probabilities.

Parameters:
stop - the stop parameter
params - the object wrapper for the parameters: actual entropy, random entropy, average probability and smallest probability.
Returns:
the values are returned in the object "params".
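
The entropy mentioned above can be illustrated with a minimal sketch of Shannon entropy in bits over a class probability distribution. This is not the Weka implementation: the class name EntropySketch and the method entropyBits are hypothetical, and how K* builds the "actual" and "random" class distributions is not described on this page.

```java
// Illustrative sketch only; EntropySketch and entropyBits are hypothetical names.
final class EntropySketch {
    private EntropySketch() {}

    /**
     * Shannon entropy in bits: H = -sum(p * log2(p)).
     * Zero-probability entries contribute nothing to the sum.
     *
     * @param probs class probabilities, assumed to be non-negative and to sum to 1
     */
    static double entropyBits(double[] probs) {
        double h = 0.0;
        for (double p : probs) {
            if (p > 0.0) {
                // Math.log is the natural log, so divide by ln(2) to get bits.
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }
}
```

A uniform distribution over two classes gives one bit, and a distribution that puts all mass on one class gives zero.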

stopProbUsingBlend

private double stopProbUsingBlend()
Calculates the "stop parameter" for this attribute using the blend method: the value is computed using a root finder algorithm. Once the stop parameter is obtained, the method reuses the calculation to compute the smallest and average transformation probabilities. It also sets the probability of transforming into an attribute with a missing value.

Returns:
the value of the stop parameter.
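
This page says only that the stop parameter is found with "a root finder algorithm" and does not name it. As one possible illustration, the sketch below uses plain bisection, which is a swapped-in technique and not necessarily what Weka does. The names RootFinderSketch and bisect are hypothetical; the accuracy and maxIter parameters are assumptions playing the role suggested by the inherited constants ROOT_FINDER_ACCURACY and ROOT_FINDER_MAX_ITER, whose values are not given here.

```java
// Illustrative sketch only; bisection is an assumed stand-in for the unnamed root finder.
final class RootFinderSketch {
    private RootFinderSketch() {}

    /**
     * Finds x in [lo, hi] with f(x) close to 0, given that f(lo) and f(hi)
     * have opposite signs (or one of them is exactly 0).
     *
     * @param accuracy half-width of the bracket at which the search stops (hypothetical)
     * @param maxIter  upper bound on the number of halving steps (hypothetical)
     */
    static double bisect(java.util.function.DoubleUnaryOperator f,
                         double lo, double hi, double accuracy, int maxIter) {
        double fLo = f.applyAsDouble(lo);
        double fHi = f.applyAsDouble(hi);
        if (fLo == 0.0) return lo;
        if (fHi == 0.0) return hi;
        if ((fLo < 0.0) == (fHi < 0.0)) {
            throw new IllegalArgumentException("root is not bracketed by [lo, hi]");
        }
        double mid = lo;
        for (int iter = 0; iter < maxIter; iter++) {
            mid = 0.5 * (lo + hi);
            double fMid = f.applyAsDouble(mid);
            if (fMid == 0.0 || (hi - lo) / 2.0 < accuracy) {
                return mid;
            }
            // Keep the half of the bracket in which the sign change occurs.
            if ((fMid < 0.0) == (fLo < 0.0)) {
                lo = mid;
                fLo = fMid;
            } else {
                hi = mid;
            }
        }
        return mid; // best estimate once the iteration budget is spent
    }
}
```

In a blend-style search, f would be the difference between a quantity computed for a candidate stop value (for example the sphere size) and its target, with the stop value bracketed in [0, 1].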

calculateSphereSize

private void calculateSphereSize(int testvalue,
                                 double stop,
                                 KStarWrapper params)
Calculates the size of the "sphere of influence", defined as sphere = sum(P^2)/sum(P)^2, where P(i|j) = (1-tstop)*P(i) + ((i==j)?tstop:0). This method takes advantage of the calculation to compute the values of the "smallest" and "average" transformation probabilities when using the specified stop parameter.

Parameters:
stop - the stop parameter
params - a wrapper of the parameters to be computed: "sphere", the sphere size; "avgprob", the average transformation probability; "minProb", the smallest transformation probability
Returns:
the values are returned in "params" object.
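
The sphere formula can be made concrete with a minimal sketch that evaluates it exactly as written on this page. It is a simplification and not the Weka implementation: it sums over attribute values rather than over training instances, takes the prior P(i) as a caller-supplied array, and returns only the sphere value instead of filling a KStarWrapper. The names SphereSizeSketch and sphereSize are hypothetical.

```java
// Illustrative sketch only; evaluates sphere = sum(P^2) / sum(P)^2 as stated above.
final class SphereSizeSketch {
    private SphereSizeSketch() {}

    /**
     * @param testValue index j of the test attribute value
     * @param prior     P(i) for each attribute value i (assumed input, summing to 1)
     * @param stop      the stop parameter (tstop in the formula), expected in [0, 1]
     */
    static double sphereSize(int testValue, double[] prior, double stop) {
        double sum = 0.0;
        double sumSq = 0.0;
        for (int i = 0; i < prior.length; i++) {
            // P(i|j) = (1 - tstop) * P(i) + ((i == j) ? tstop : 0)
            double p = (1.0 - stop) * prior[i] + ((i == testValue) ? stop : 0.0);
            sum += p;
            sumSq += p * p;
        }
        return (sum == 0.0) ? 0.0 : sumSq / (sum * sum);
    }
}
```

With a uniform prior over n values, the expression as written equals 1/n at stop = 0 and rises to 1 at stop = 1, so its reciprocal can be read as an effective number of reachable values.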

PStar

private double PStar(Instance test,
                     Instance train,
                     int col,
                     double stop)
Calculates the nominal probability function defined as: P(i|j) = (1-stop) * P(i) + ((i==j) ? stop : 0). In this case, it calculates the transformation probability of the indexed test attribute to the indexed train attribute.

Parameters:
test - the test instance
train - the train instance
col - the attribute index
stop - the stop parameter
Returns:
the value of the transformation probability.
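
The probability function above translates almost directly into code. The following is a minimal sketch, not the Weka implementation: it works on value indices instead of Instance objects, and it takes P(i) as a parameter because this page does not say how the prior is obtained. The names NominalTransformSketch, pStar and priorI are hypothetical.

```java
// Illustrative sketch only; NominalTransformSketch and pStar are hypothetical names.
final class NominalTransformSketch {
    private NominalTransformSketch() {}

    /**
     * P(i|j) = (1 - stop) * P(i) + ((i == j) ? stop : 0)
     *
     * @param i      index of the test attribute value
     * @param j      index of the train attribute value
     * @param priorI P(i), assumed to be supplied by the caller
     * @param stop   the stop parameter, expected in [0, 1]
     */
    static double pStar(int i, int j, double priorI, double stop) {
        if (stop < 0.0 || stop > 1.0) {
            throw new IllegalArgumentException("stop must lie in [0, 1]");
        }
        double stay = (i == j) ? stop : 0.0;
        return (1.0 - stop) * priorI + stay;
    }
}
```

With priorI = 0.25 and stop = 0.4, matching values give 0.6 * 0.25 + 0.4 = 0.55 and differing values give 0.15, so a larger stop parameter concentrates probability on exact matches.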

generateAttrDistribution

private void generateAttrDistribution()
Calculates the distribution, in the dataset, of the indexed nominal attribute values. It also counts the actual number of training instances that contributed (those with non-missing values) to calculate the distribution.
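
The counting step described above can be sketched as follows. This is an illustration under stated assumptions, not the Weka code: the attribute column is modelled as an int array of value indices, the MISSING marker of -1 is hypothetical, and the two results stand in for the m_Distribution and m_TotalCount fields. The name AttrDistributionSketch is also hypothetical.

```java
// Illustrative sketch only; the int[] column and the MISSING marker are assumptions.
final class AttrDistributionSketch {
    static final int MISSING = -1; // hypothetical marker for a missing value

    final int[] distribution; // count of training instances per attribute value
    final int totalCount;     // number of instances with a non-missing value

    private AttrDistributionSketch(int[] distribution, int totalCount) {
        this.distribution = distribution;
        this.totalCount = totalCount;
    }

    /**
     * @param column    value index of the attribute for each training instance
     * @param numValues number of distinct nominal values; indices must lie in [0, numValues)
     */
    static AttrDistributionSketch of(int[] column, int numValues) {
        int[] counts = new int[numValues];
        int total = 0;
        for (int value : column) {
            if (value == MISSING) {
                continue; // missing values do not contribute to the distribution
            }
            counts[value]++;
            total++;
        }
        return new AttrDistributionSketch(counts, total);
    }
}
```

For the column {0, 2, MISSING, 2, 1, MISSING} with three nominal values, the distribution is {1, 1, 2} and the total count is 4.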


setOptions

public void setOptions(int missingmode,
                       int blendmethod,
                       int blendfactor)
Sets the options.