weka.classifiers.bayes
Class AODE

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.bayes.AODE
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class AODE
extends Classifier
implements OptionHandler, WeightedInstancesHandler

AODE achieves highly accurate classification by averaging over all models in a small space of alternative naive-Bayes-like models that have weaker (and hence less detrimental) independence assumptions than naive Bayes. The resulting algorithm is computationally efficient and accurate on many learning tasks.
For more information, see

G. Webb, J. Boughton & Z. Wang (2003). Not So Naive Bayes. Submitted for publication.
G. Webb, J. Boughton & Z. Wang (2002). Averaged One-Dependence Estimators: Preliminary Results. AI2002 Data Mining Workshop, Canberra.

Valid options are:

-D
Debugging information is printed if this flag is specified.

-F
Specify the frequency limit for parent attributes.

Version:
$Revision: 1.7 $
Author:
Janice Boughton (jrbought@csse.monash.edu.au) & Zhihai Wang (zhw@csse.monash.edu.au)
See Also:
Serialized Form

Field Summary
private  double[] m_ClassCounts
          The number of times each class value occurs in the dataset
private  int m_ClassIndex
          The index of the class attribute
private  double[][][] m_CondiCounts
          3D array (m_NumClasses * m_TotalAttValues * m_TotalAttValues) of attribute counts
private  boolean m_Debug
          If true, outputs debugging info
private  int[] m_Frequencies
          The frequency of each attribute value for the dataset
private  Instances m_Instances
          The dataset
private  int m_Limit
          An att's frequency must be this value or more to be a superParent
private  int m_NumAttributes
          The number of attributes in dataset, including class
private  int[] m_NumAttValues
          The number of values for each attribute
private  int m_NumClasses
          The number of classes
private  int m_NumInstances
          The number of instances in the dataset
private  int[] m_StartAttIndex
          The starting index (in the m_CondiCounts matrix) of each attribute
private  int[][] m_SumForCounts
          The sums of attribute-class counts -- if there are no missing values for att, then m_SumForCounts[classVal][att] will be the same as m_ClassCounts[classVal]
private  double m_SumInstances
          The number of valid class values observed in dataset -- with no missing classes, this number is the same as m_NumInstances.
private  int m_TotalAttValues
          The total number of values for all attributes (not including class).
 
Constructor Summary
AODE()
           
 
Method Summary
private  void addToCounts(Instance instance)
          Puts an instance's values into m_CondiCounts, m_ClassCounts and m_SumInstances.
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.lang.String frequencyLimitForParentAttributesTipText()
          Returns the tip text for this property
 int getFrequencyLimitForParentAttributes()
          Return the frequency limit for parent attributes
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 java.lang.String globalInfo()
          Returns a string describing this classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
static void main(java.lang.String[] argv)
          Main method for testing this class.
 double NBconditionalProb(Instance instance, int classVal)
          Calculates the probability of the specified class for the given test instance, using naive Bayes.
 void setFrequencyLimitForParentAttributes(int fl)
          Set the frequency limit for parent attributes
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 java.lang.String toString()
          Returns a description of the classifier.
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_CondiCounts

private double[][][] m_CondiCounts
3D array (m_NumClasses * m_TotalAttValues * m_TotalAttValues) of attribute counts


m_ClassCounts

private double[] m_ClassCounts
The number of times each class value occurs in the dataset


m_SumForCounts

private int[][] m_SumForCounts
The sums of attribute-class counts -- if there are no missing values for att, then m_SumForCounts[classVal][att] will be the same as m_ClassCounts[classVal]


m_NumClasses

private int m_NumClasses
The number of classes


m_NumAttributes

private int m_NumAttributes
The number of attributes in dataset, including class


m_NumInstances

private int m_NumInstances
The number of instances in the dataset


m_ClassIndex

private int m_ClassIndex
The index of the class attribute


m_Instances

private Instances m_Instances
The dataset


m_TotalAttValues

private int m_TotalAttValues
The total number of values for all attributes (not including class). E.g., for three attributes each with two possible values, m_TotalAttValues would be 6. This variable is used when allocating space for the m_CondiCounts matrix.
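
The relationship between the per-attribute value counts, the start offsets and the total can be sketched as follows. This is a standalone illustration, not the Weka source: the class name AttIndexLayout, both helper methods, and the choice of -1 as the class attribute's offset are assumptions made for the example.

```java
public class AttIndexLayout {

    /** Start offset of each attribute in the flat value range; the class gets -1. */
    static int[] startIndices(int[] numAttValues, int classIndex) {
        int[] start = new int[numAttValues.length];
        int next = 0;
        for (int att = 0; att < numAttValues.length; att++) {
            if (att == classIndex) {
                start[att] = -1; // assumption: class values are not part of the flat range
            } else {
                start[att] = next;
                next += numAttValues[att];
            }
        }
        return start;
    }

    /** Total number of values over all non-class attributes. */
    static int totalAttValues(int[] numAttValues, int classIndex) {
        int total = 0;
        for (int att = 0; att < numAttValues.length; att++) {
            if (att != classIndex) {
                total += numAttValues[att];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Three binary attributes followed by a three-valued class attribute.
        int[] numAttValues = {2, 2, 2, 3};
        int classIndex = 3;
        System.out.println(totalAttValues(numAttValues, classIndex)); // prints 6
        System.out.println(java.util.Arrays.toString(startIndices(numAttValues, classIndex)));
    }
}
```

With this layout, value v of attribute a maps to the flat index start[a] + v, which is how a single square matrix can hold counts for every pair of attribute values.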


m_StartAttIndex

private int[] m_StartAttIndex
The starting index (in the m_CondiCounts matrix) of each attribute


m_NumAttValues

private int[] m_NumAttValues
The number of values for each attribute


m_Frequencies

private int[] m_Frequencies
The frequency of each attribute value for the dataset


m_SumInstances

private double m_SumInstances
The number of valid class values observed in dataset -- with no missing classes, this number is the same as m_NumInstances.


m_Limit

private int m_Limit
An att's frequency must be this value or more to be a superParent


m_Debug

private boolean m_Debug
If true, outputs debugging info

Constructor Detail

AODE

public AODE()

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

addToCounts

private void addToCounts(Instance instance)
Puts an instance's values into m_CondiCounts, m_ClassCounts and m_SumInstances.

Parameters:
instance - the instance whose values are to be put into the counts variables
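
A minimal sketch of how one instance could be folded into the three count structures is given below. It is not the Weka implementation: the class CountsSketch, the use of an int[] of value indices in place of Instance, the convention that -1 marks a missing value, and the decision to skip instances with a missing class are all assumptions for illustration.

```java
public class CountsSketch {
    final int classIndex;
    final int[] start;            // flat offset of each non-class attribute
    final double[] classCounts;   // weighted count per class value
    final double[][][] condiCounts; // [class][parent value][child value]
    double sumInstances;          // total weight of instances with a known class

    CountsSketch(int[] numAttValues, int classIndex) {
        this.classIndex = classIndex;
        this.start = new int[numAttValues.length];
        int total = 0;
        for (int a = 0; a < numAttValues.length; a++) {
            if (a == classIndex) continue;
            start[a] = total;
            total += numAttValues[a];
        }
        int numClasses = numAttValues[classIndex];
        this.classCounts = new double[numClasses];
        this.condiCounts = new double[numClasses][total][total];
    }

    /** values[a] is the value index of attribute a, or -1 if missing (assumed convention). */
    void addToCounts(int[] values, double weight) {
        int classVal = values[classIndex];
        if (classVal < 0) return; // assumption: instances without a class are ignored
        classCounts[classVal] += weight;
        sumInstances += weight;
        for (int a = 0; a < values.length; a++) {
            if (a == classIndex || values[a] < 0) continue;
            int parent = start[a] + values[a];
            for (int b = 0; b < values.length; b++) {
                if (b == classIndex || values[b] < 0) continue;
                // The diagonal (a == b) holds the plain class/value count.
                condiCounts[classVal][parent][start[b] + values[b]] += weight;
            }
        }
    }
}
```

Passing a weight rather than adding 1 reflects the fact that the class implements WeightedInstancesHandler; with unit weights the arrays hold ordinary counts.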

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if there is a problem generating the prediction
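
The averaging idea from the class description can be sketched as follows. For each class y, every attribute whose observed value is frequent enough acts in turn as a parent p, contributing P(y, x_p) multiplied by P(x_c | y, x_p) over the other attributes c; the contributions are averaged and then normalised over the classes. This is an illustrative sketch only: the class AodeSketch, the int[][] data representation (class in the last column), the Laplace-style smoothing and the exception thrown when no parent qualifies are assumptions, and counts are recomputed by scanning the data rather than read from precomputed tables.

```java
public class AodeSketch {

    /** Counts rows with class y (any class if y < 0), row[a] == va and, if b >= 0, row[b] == vb. */
    static int count(int[][] data, int y, int a, int va, int b, int vb) {
        int classCol = data[0].length - 1;
        int c = 0;
        for (int[] row : data) {
            if (y >= 0 && row[classCol] != y) continue;
            if (row[a] != va) continue;
            if (b >= 0 && row[b] != vb) continue;
            c++;
        }
        return c;
    }

    /** Returns a normalised class distribution for the attribute values x. */
    static double[] distribution(int[][] data, int[] numValues, int numClasses,
                                 int[] x, int limit) {
        int numAtts = numValues.length;
        double[] probs = new double[numClasses];
        double total = 0;
        for (int y = 0; y < numClasses; y++) {
            double sum = 0;
            int parents = 0;
            for (int p = 0; p < numAtts; p++) {
                // Only sufficiently frequent attribute values may act as parents.
                if (count(data, -1, p, x[p], -1, 0) < limit) continue;
                parents++;
                int parentCount = count(data, y, p, x[p], -1, 0);
                // Smoothed estimate of P(y, x_p); the smoothing is an assumption.
                double term = (parentCount + 1.0)
                        / (data.length + (double) numClasses * numValues[p]);
                for (int c = 0; c < numAtts; c++) {
                    if (c == p) continue;
                    // Smoothed estimate of P(x_c | y, x_p).
                    term *= (count(data, y, p, x[p], c, x[c]) + 1.0)
                            / (parentCount + numValues[c]);
                }
                sum += term;
            }
            if (parents == 0) {
                throw new IllegalStateException("no attribute value meets the frequency limit");
            }
            probs[y] = sum / parents;
            total += probs[y];
        }
        for (int y = 0; y < numClasses; y++) {
            probs[y] /= total;
        }
        return probs;
    }
}
```

For example, with the four training rows {0,0,0}, {0,1,0}, {1,0,1}, {1,1,1}, two binary attributes, two classes and a limit of 1, calling distribution for x = {0, 0} favours class 0, because the first attribute agrees with class 0 in every training row.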

NBconditionalProb

public double NBconditionalProb(Instance instance,
                                int classVal)
Calculates the probability of the specified class for the given test instance, using naive Bayes.

Parameters:
instance - the instance to be classified
classVal - the class for which to calculate the probability
Returns:
predicted class probability
Throws:
java.lang.Exception - if there is a problem generating the prediction
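
For comparison with the averaged estimate, a plain naive Bayes class probability can be sketched as below: P(y) multiplied by P(x_a | y) for every attribute a, normalised over the classes. The class NaiveBayesSketch, the int[][] data layout (class in the last column) and the Laplace-style smoothing are assumptions for illustration; the page does not state how the Weka method smooths or normalises its result.

```java
public class NaiveBayesSketch {

    /** Smoothed estimate of P(y) * prod_a P(x_a | y), before normalisation. */
    static double joint(int[][] data, int[] numValues, int numClasses, int[] x, int y) {
        int classCol = numValues.length;
        int classCount = 0;
        int[] match = new int[numValues.length];
        for (int[] row : data) {
            if (row[classCol] != y) continue;
            classCount++;
            for (int a = 0; a < numValues.length; a++) {
                if (row[a] == x[a]) match[a]++;
            }
        }
        double p = (classCount + 1.0) / (data.length + numClasses);
        for (int a = 0; a < numValues.length; a++) {
            p *= (match[a] + 1.0) / (classCount + numValues[a]);
        }
        return p;
    }

    /** Probability of classVal for x, normalised over all classes. */
    static double classProb(int[][] data, int[] numValues, int numClasses,
                            int[] x, int classVal) {
        double total = 0;
        for (int y = 0; y < numClasses; y++) {
            total += joint(data, numValues, numClasses, x, y);
        }
        return joint(data, numValues, numClasses, x, classVal) / total;
    }

    public static void main(String[] args) {
        int[][] data = {{0, 0, 0}, {0, 1, 0}, {1, 0, 1}, {1, 1, 1}};
        System.out.println(classProb(data, new int[]{2, 2}, 2, new int[]{0, 0}, 0)); // prints 0.75
    }
}
```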

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-D
Debugging information is printed.

-F
Specify the frequency limit for parent attributes.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
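
The shape of the option list can be illustrated with a small standalone parser. This sketch does not use or reproduce Weka's option-handling code: the class AodeOptionsSketch, its fields, the default limit of 1 and the assumption that -F is followed by an integer argument (suggested by the int parameter of setFrequencyLimitForParentAttributes) are all hypothetical.

```java
public class AodeOptionsSketch {
    boolean debug = false;
    int frequencyLimit = 1; // assumed default, not taken from this page

    /** Parses an option array such as {"-D", "-F", "30"}. */
    static AodeOptionsSketch parse(String[] options) {
        AodeOptionsSketch parsed = new AodeOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-D")) {
                parsed.debug = true;
            } else if (options[i].equals("-F")) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("-F requires an integer argument");
                }
                parsed.frequencyLimit = Integer.parseInt(options[++i]);
            } else {
                // Mirrors the documented behaviour of rejecting unsupported options.
                throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        AodeOptionsSketch o = parse(new String[]{"-D", "-F", "30"});
        System.out.println(o.debug + " " + o.frequencyLimit); // prints "true 30"
    }
}
```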

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

frequencyLimitForParentAttributesTipText

public java.lang.String frequencyLimitForParentAttributesTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setFrequencyLimitForParentAttributes

public void setFrequencyLimitForParentAttributes(int fl)
Set the frequency limit for parent attributes

Parameters:
fl - an int value

getFrequencyLimitForParentAttributes

public int getFrequencyLimitForParentAttributes()
Return the frequency limit for parent attributes

Returns:
an int value

toString

public java.lang.String toString()
Returns a description of the classifier.

Returns:
a description of the classifier as a string.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options