weka.classifiers.lazy
Class KStar

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.lazy.KStar
All Implemented Interfaces:
java.lang.Cloneable, KStarConstants, OptionHandler, java.io.Serializable, UpdateableClassifier

public class KStar
extends Classifier
implements KStarConstants, UpdateableClassifier

K* is an instance-based classifier: the class of a test instance is determined by the classes of the training instances similar to it, as measured by some similarity function. The underlying assumption of instance-based classifiers such as K*, IB1, and PEBLS is that similar instances will have similar classes. For more information on K*, see:

John G. Cleary and Leonard E. Trigg (1995). "K*: An Instance-based Learner Using an Entropic Distance Measure", Proceedings of the 12th International Conference on Machine Learning, pp. 108-114.
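The instance-based idea described above can be illustrated with a minimal sketch. It is not Weka's KStar code: the class `InstanceBasedSketch`, its method names, and the `exp(-distance)` similarity are assumptions chosen for brevity, whereas K* itself uses the entropic measure from the cited paper.

```java
import java.util.Arrays;

/** Illustrative only: a toy instance-based classifier, not Weka's KStar. */
class InstanceBasedSketch {

    /** Stand-in similarity: exp(-sum |a_i - b_i|). K* uses an entropic measure instead. */
    static double similarity(double[] a, double[] b) {
        double dist = 0.0;
        for (int i = 0; i < a.length; i++) {
            dist += Math.abs(a[i] - b[i]);
        }
        return Math.exp(-dist);
    }

    /** Sums similarity per class, then normalizes so the entries sum to 1. */
    static double[] classDistribution(double[][] train, int[] labels,
                                      int numClasses, double[] test) {
        double[] dist = new double[numClasses];
        double total = 0.0;
        for (int i = 0; i < train.length; i++) {
            double s = similarity(train[i], test);
            dist[labels[i]] += s;
            total += s;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                dist[c] /= total;
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        double[][] train = {{0.0, 0.0}, {0.1, 0.2}, {5.0, 5.0}};
        int[] labels = {0, 0, 1};
        double[] p = classDistribution(train, labels, 2, new double[] {0.2, 0.1});
        System.out.println(Arrays.toString(p));
    }
}
```

Training instances close to the test instance contribute most of the probability mass, so the test point near the first two rows is assigned mostly to class 0.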

Version:
$Revision 1.0 $
Author:
Len Trigg (len@reeltwo.com), Abdelaziz Mahoui (am14@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  int m_BlendMethod
          0 = use specified blend, 1 = entropic blend setting
protected  KStarCache[] m_Cache
          A custom data structure for caching distinct attribute values and their scale factor or stop parameter.
protected  int m_ClassType
          The class attribute type
protected  int m_ComputeRandomCols
          Flag turning on and off the computation of random class columns
protected  int m_GlobalBlend
          default sphere of influence blend setting
protected  int m_InitFlag
          Flag turning on and off the initialisation of config variables
protected  int m_MissingMode
          missing value treatment
protected  int m_NumAttributes
          The number of attributes
protected  int m_NumClasses
          The number of class values
protected  int m_NumInstances
          The number of instances in the dataset
protected  int[][] m_RandClassCols
          Table of random class value columns
protected  Instances m_Train
          The training instances used for classification.
static Tag[] TAGS_MISSING
          Define possible missing value handling methods
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Fields inherited from interface weka.classifiers.lazy.kstar.KStarConstants
B_ENTROPY, B_SPHERE, EPSILON, FLOOR, FLOOR1, INITIAL_STEP, LOG2, M_AVERAGE, M_DELETE, M_MAXDIFF, M_NORMAL, NUM_RAND_COLS, OFF, ON, ROOT_FINDER_ACCURACY, ROOT_FINDER_MAX_ITER
 
Constructor Summary
KStar()
           
 
Method Summary
private  double attrTransProb(Instance first, Instance second, int col)
          Calculates the transformation probability of the indexed test attribute to the indexed train attribute.
 void buildClassifier(Instances instances)
          Generates the classifier.
private  int[] classValues()
          Note: for Nominal Class Only!
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.lang.String entropicAutoBlendTipText()
          Returns the tip text for this property
private  void generateRandomClassColomns()
          Note: for Nominal Class Only!
 boolean getEntropicAutoBlend()
          Gets whether entropic blending is being used
 int getGlobalBlend()
          Get the value of the global blend parameter
 SelectedTag getMissingMode()
          Gets the method to use for handling missing values.
 java.lang.String[] getOptions()
          Gets the current settings of K*.
 java.lang.String globalBlendTipText()
          Returns the tip text for this property
 java.lang.String globalInfo()
          Returns a string describing classifier
private  void init_m_Attributes()
          Initializes the m_Attributes of the class.
private  double instanceTransformationProbability(Instance first, Instance second)
          Calculates the probability of the first instance transforming into the second instance: the probability is the product of the transformation probabilities of the attributes, normalized over the number of instances used.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String missingModeTipText()
          Returns the tip text for this property
private  int[] randomize(int[] array, java.util.Random generator)
          Returns a copy of the array with its elements randomly redistributed.
 void setEntropicAutoBlend(boolean e)
          Set whether entropic blending is to be used.
 void setGlobalBlend(int b)
          Set the global blend parameter
 void setMissingMode(SelectedTag newMode)
          Sets the method to use for handling missing values.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 java.lang.String toString()
          Returns a description of this classifier.
private  void update_m_Attributes()
          Updates the m_Attributes of the class.
 void updateClassifier(Instance instance)
          Adds the supplied instance to the training set
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Train

protected Instances m_Train
The training instances used for classification.


m_NumInstances

protected int m_NumInstances
The number of instances in the dataset


m_NumClasses

protected int m_NumClasses
The number of class values


m_NumAttributes

protected int m_NumAttributes
The number of attributes


m_ClassType

protected int m_ClassType
The class attribute type


m_RandClassCols

protected int[][] m_RandClassCols
Table of random class value columns


m_ComputeRandomCols

protected int m_ComputeRandomCols
Flag turning on and off the computation of random class columns


m_InitFlag

protected int m_InitFlag
Flag turning on and off the initialisation of config variables


m_Cache

protected KStarCache[] m_Cache
A custom data structure for caching distinct attribute values and their scale factor or stop parameter.


m_MissingMode

protected int m_MissingMode
missing value treatment


m_BlendMethod

protected int m_BlendMethod
0 = use specified blend, 1 = entropic blend setting


m_GlobalBlend

protected int m_GlobalBlend
default sphere of influence blend setting


TAGS_MISSING

public static final Tag[] TAGS_MISSING
Define possible missing value handling methods

Constructor Detail

KStar

public KStar()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing classifier

Returns:
a description suitable for displaying in the explorer/experimenter gui

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Adds the supplied instance to the training set

Specified by:
updateClassifier in interface UpdateableClassifier
Parameters:
instance - the instance to add
Throws:
java.lang.Exception - if instance could not be incorporated successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if an error occurred during the prediction

instanceTransformationProbability

private double instanceTransformationProbability(Instance first,
                                                 Instance second)
Calculates the probability of the first instance transforming into the second instance: the probability is the product of the transformation probabilities of the attributes, normalized over the number of instances used.

Parameters:
first - the test instance
second - the train instance
Returns:
transformation probability value
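
One possible reading of this computation is sketched below. The class name, the `double[]` of per-attribute probabilities, and dividing by the instance count are assumptions made for illustration; the actual private method works on `Instance` objects and may normalize differently.

```java
/** Illustrative only: one reading of the documented formula, not Weka's code. */
class TransformationProbabilitySketch {

    /**
     * Multiplies the per-attribute transformation probabilities and
     * divides the product by the number of instances used.
     */
    static double instanceTransformationProbability(double[] attrProbs,
                                                    int numInstances) {
        if (numInstances <= 0) {
            throw new IllegalArgumentException("numInstances must be positive");
        }
        double product = 1.0;
        for (double p : attrProbs) {
            product *= p;
        }
        return product / numInstances;
    }

    public static void main(String[] args) {
        double[] attrProbs = {0.5, 0.4};
        System.out.println(instanceTransformationProbability(attrProbs, 4));
    }
}
```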

attrTransProb

private double attrTransProb(Instance first,
                             Instance second,
                             int col)
Calculates the transformation probability of the indexed test attribute to the indexed train attribute.

Parameters:
first - the test instance.
second - the train instance.
col - the index of the attribute in the instance.
Returns:
the value of the transformation probability.

missingModeTipText

public java.lang.String missingModeTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getMissingMode

public SelectedTag getMissingMode()
Gets the method to use for handling missing values. Will be one of M_NORMAL, M_AVERAGE, M_MAXDIFF or M_DELETE.

Returns:
the method used for handling missing values.

setMissingMode

public void setMissingMode(SelectedTag newMode)
Sets the method to use for handling missing values. Values other than M_NORMAL, M_AVERAGE, M_MAXDIFF and M_DELETE will be ignored.

Parameters:
newMode - the method to use for handling missing values.
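
The "ignore unknown values" behavior can be sketched as follows. This is a simplified stand-in that uses plain `int` modes instead of `SelectedTag`; the class name, the numeric constant values, and the default mode are placeholders, and the real constants are defined in `KStarConstants`.

```java
/** Illustrative only: a setter that silently ignores unrecognized modes. */
class MissingModeSketch {

    // Placeholder values; the real constants are defined in KStarConstants.
    static final int M_DELETE = 1;
    static final int M_NORMAL = 2;
    static final int M_MAXDIFF = 3;
    static final int M_AVERAGE = 4;

    // Arbitrary default chosen for this sketch.
    private int missingMode = M_AVERAGE;

    /** Stores newMode only if it is one of the four recognized modes. */
    void setMissingMode(int newMode) {
        switch (newMode) {
            case M_DELETE:
            case M_NORMAL:
            case M_MAXDIFF:
            case M_AVERAGE:
                missingMode = newMode;
                break;
            default:
                // Unrecognized value: keep the current mode.
                break;
        }
    }

    int getMissingMode() {
        return missingMode;
    }

    public static void main(String[] args) {
        MissingModeSketch s = new MissingModeSketch();
        s.setMissingMode(M_NORMAL);
        s.setMissingMode(99); // ignored
        System.out.println(s.getMissingMode() == M_NORMAL);
    }
}
```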

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

globalBlendTipText

public java.lang.String globalBlendTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setGlobalBlend

public void setGlobalBlend(int b)
Set the global blend parameter

Parameters:
b - the value for global blending

getGlobalBlend

public int getGlobalBlend()
Get the value of the global blend parameter

Returns:
the value of the global blend parameter

entropicAutoBlendTipText

public java.lang.String entropicAutoBlendTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setEntropicAutoBlend

public void setEntropicAutoBlend(boolean e)
Set whether entropic blending is to be used.

Parameters:
e - true if entropic blending is to be used

getEntropicAutoBlend

public boolean getEntropicAutoBlend()
Gets whether entropic blending is being used

Returns:
true if entropic blending is used

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are: ...

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of K*.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions()

toString

public java.lang.String toString()
Returns a description of this classifier.

Returns:
a description of this classifier as a string.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - should contain command line options (see setOptions)

init_m_Attributes

private void init_m_Attributes()
Initializes the m_Attributes of the class.


update_m_Attributes

private void update_m_Attributes()
Updates the m_Attributes of the class.


generateRandomClassColomns

private void generateRandomClassColomns()
Note: for Nominal Class Only! Generates a set of random versions of the class column.


classValues

private int[] classValues()
Note: for Nominal Class Only! Returns an array of the class values

Returns:
an array of class values

randomize

private int[] randomize(int[] array,
                        java.util.Random generator)
Returns a copy of the array with its elements randomly redistributed.

Parameters:
array - the array to randomize.
generator - the random number generator to use.
Returns:
a copy of the array with its elements randomly redistributed.
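
A method with this contract could be written as a Fisher-Yates shuffle on a copy, as in the sketch below. The class name is hypothetical, and the choice of Fisher-Yates is an assumption: the source does not state which shuffling algorithm KStar actually uses.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: shuffles a copy and leaves the input array untouched. */
class RandomizeSketch {

    /** Returns a shuffled copy of array using the Fisher-Yates algorithm. */
    static int[] randomize(int[] array, Random generator) {
        int[] copy = Arrays.copyOf(array, array.length);
        for (int i = copy.length - 1; i > 0; i--) {
            // Pick a position in [0, i] and swap it into slot i.
            int j = generator.nextInt(i + 1);
            int tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return copy;
    }

    public static void main(String[] args) {
        int[] classValues = {0, 0, 1, 1, 2};
        int[] shuffled = randomize(classValues, new Random(42));
        System.out.println(Arrays.toString(shuffled));
    }
}
```

Passing a seeded `java.util.Random` makes the shuffle reproducible, which is convenient when comparing runs.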