weka.classifiers.bayes
Class NaiveBayes

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.bayes.NaiveBayes
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler
Direct Known Subclasses:
NaiveBayesUpdateable

public class NaiveBayes
extends Classifier
implements OptionHandler, WeightedInstancesHandler

Class for a Naive Bayes classifier using estimator classes. Numeric estimator precision values are chosen based on analysis of the training data. For this reason, the classifier is not an UpdateableClassifier (which, in typical usage, is initialized with zero training instances). If you need the UpdateableClassifier functionality, use the NaiveBayesUpdateable classifier, which uses a default precision of 0.1 for numeric attributes when buildClassifier is called with zero training instances.
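
The description above says that precision values are derived from the training data but does not say how. The sketch below shows one plausible heuristic: take the mean gap between adjacent distinct values of an attribute and fall back to a caller-supplied default when there are too few values. It is an illustration under that assumption, not this class's documented algorithm; PrecisionSketch and estimatePrecision are hypothetical names.

```java
import java.util.Arrays;

class PrecisionSketch {

    /**
     * Estimates a precision for one numeric attribute as the mean gap
     * between adjacent distinct values. Returns the fallback when fewer
     * than two distinct values are available (assumed heuristic).
     */
    static double estimatePrecision(double[] values, double fallback) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double gapSum = 0.0;
        int gaps = 0;
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] != sorted[i - 1]) {   // skip repeated values
                gapSum += sorted[i] - sorted[i - 1];
                gaps++;
            }
        }
        return gaps > 0 ? gapSum / gaps : fallback;
    }

    public static void main(String[] args) {
        double[] temperatures = {20.0, 20.5, 20.5, 21.0, 22.0};
        System.out.println(estimatePrecision(temperatures, 0.1));
        // With no training values, only the fallback is available.
        System.out.println(estimatePrecision(new double[0], 0.1));
    }
}
```

A data-driven precision of this kind cannot be computed from an empty training set, which is consistent with the description's point that an updateable variant needs a fixed default instead.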

For more information on Naive Bayes classifiers, see

George H. John and Pat Langley (1995). Estimating Continuous Distributions in Bayesian Classifiers. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. pp. 338-345. Morgan Kaufmann, San Mateo.

Valid options are:

-K
Use kernel estimation for modelling numeric attributes rather than a single normal distribution.

-D
Use supervised discretization to process numeric attributes.
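
The difference between a single normal distribution and the kernel estimation selected by -K can be illustrated with a small, library-independent sketch. It fits one Gaussian to all training values and, alternatively, averages one Gaussian per training value. The class name, the method names, and the fixed bandwidth parameter are assumptions made for this example; the estimator classes used by NaiveBayes may choose their parameters differently.

```java
class DensitySketch {

    /** Density of a normal distribution with the given mean and standard deviation. */
    static double normalDensity(double x, double mean, double stdDev) {
        double z = (x - mean) / stdDev;
        return Math.exp(-0.5 * z * z) / (stdDev * Math.sqrt(2.0 * Math.PI));
    }

    /** Single-normal estimate: one Gaussian fitted to all training values.
     *  Assumes the values are not all identical (non-zero variance). */
    static double singleNormal(double x, double[] train) {
        double mean = 0.0;
        for (double v : train) mean += v;
        mean /= train.length;
        double variance = 0.0;
        for (double v : train) variance += (v - mean) * (v - mean);
        variance /= train.length;
        return normalDensity(x, mean, Math.sqrt(variance));
    }

    /** Kernel estimate: the average of one Gaussian centred on each training value. */
    static double kernel(double x, double[] train, double bandwidth) {
        double sum = 0.0;
        for (double v : train) sum += normalDensity(x, v, bandwidth);
        return sum / train.length;
    }

    public static void main(String[] args) {
        // Two well-separated clusters: a single Gaussian puts its peak
        // between them, where there is no data at all.
        double[] train = {1.0, 1.2, 0.8, 9.0, 9.2, 8.8};
        System.out.println("single normal at 5.0: " + singleNormal(5.0, train));
        System.out.println("kernel at 5.0:        " + kernel(5.0, train, 0.5));
    }
}
```

On bimodal data like this, the kernel estimate assigns low density to the empty region between the clusters, which is the situation in which -K is most likely to help.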

Version:
$Revision: 1.15 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected static double DEFAULT_NUM_PRECISION
          The precision parameter used for numeric attributes
protected  Estimator m_ClassDistribution
          The class estimator.
protected  Discretize m_Disc
          The discretization filter.
protected  Estimator[][] m_Distributions
          The attribute estimators.
protected  Instances m_Instances
          The dataset header for the purposes of printing out a semi-intelligible model
protected  int m_NumClasses
          The number of classes (or 1 for numeric class)
protected  boolean m_UseDiscretization
          Whether to use discretization rather than normal distribution for numeric attributes
protected  boolean m_UseKernelEstimator
          Whether to use kernel density estimator rather than normal distribution for numeric attributes
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Constructor Summary
NaiveBayes()
           
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 boolean getUseKernelEstimator()
          Gets whether the kernel estimator is being used.
 boolean getUseSupervisedDiscretization()
          Get whether supervised discretization is to be used.
 java.lang.String globalInfo()
          Returns a string describing this classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setUseKernelEstimator(boolean v)
          Sets whether the kernel estimator is to be used.
 void setUseSupervisedDiscretization(boolean newblah)
          Set whether supervised discretization is to be used.
 java.lang.String toString()
          Returns a description of the classifier.
 void updateClassifier(Instance instance)
          Updates the classifier with the given instance.
 java.lang.String useKernelEstimatorTipText()
          Returns the tip text for this property
 java.lang.String useSupervisedDiscretizationTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Distributions

protected Estimator[][] m_Distributions
The attribute estimators.


m_ClassDistribution

protected Estimator m_ClassDistribution
The class estimator.


m_UseKernelEstimator

protected boolean m_UseKernelEstimator
Whether to use kernel density estimator rather than normal distribution for numeric attributes


m_UseDiscretization

protected boolean m_UseDiscretization
Whether to use discretization rather than normal distribution for numeric attributes


m_NumClasses

protected int m_NumClasses
The number of classes (or 1 for numeric class)


m_Instances

protected Instances m_Instances
The dataset header for the purposes of printing out a semi-intelligible model


DEFAULT_NUM_PRECISION

protected static final double DEFAULT_NUM_PRECISION
The precision parameter used for numeric attributes

See Also:
Constant Field Values

m_Disc

protected Discretize m_Disc
The discretization filter.

Constructor Detail

NaiveBayes

public NaiveBayes()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Updates the classifier with the given instance.

Parameters:
instance - the new training instance to include in the model
Throws:
java.lang.Exception - if the instance could not be incorporated in the model.
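
A per-instance update is cheap when an estimator keeps only running sums. The sketch below shows this for a weighted normal estimator: each new value adjusts three totals, from which the mean and variance can be recomputed at any time. RunningNormalSketch and its members are hypothetical names used for illustration; this is not the estimator code that updateClassifier calls.

```java
class RunningNormalSketch {
    private double sumOfWeights;
    private double sumOfValues;
    private double sumOfSquares;

    /** Folds one weighted observation into the running totals. */
    void addValue(double value, double weight) {
        sumOfWeights += weight;
        sumOfValues += value * weight;
        sumOfSquares += value * value * weight;
    }

    /** Weighted mean of everything seen so far (requires at least one value). */
    double mean() {
        return sumOfValues / sumOfWeights;
    }

    /** Weighted population variance, clamped at zero against rounding error. */
    double variance() {
        double m = mean();
        return Math.max(0.0, sumOfSquares / sumOfWeights - m * m);
    }

    public static void main(String[] args) {
        RunningNormalSketch estimator = new RunningNormalSketch();
        for (double v : new double[] {1.0, 2.0, 3.0}) {
            estimator.addValue(v, 1.0);
        }
        System.out.println(estimator.mean() + " " + estimator.variance());
        estimator.addValue(4.0, 1.0);   // incremental update with one more value
        System.out.println(estimator.mean() + " " + estimator.variance());
    }
}
```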

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if there is a problem generating the prediction
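
The class membership probabilities of a Naive Bayes model combine a class prior with one likelihood per attribute and are then normalized to sum to 1. The following sketch shows that calculation on plain arrays. Working in log space is a choice made for this example to avoid underflow, and PosteriorSketch and its parameters are hypothetical; the sketch does not reproduce how distributionForInstance obtains its estimates.

```java
class PosteriorSketch {

    /**
     * Combines class priors with per-attribute likelihoods under the naive
     * independence assumption and normalizes the result.
     * Assumes at least one class has a non-zero prior and likelihoods.
     *
     * @param priors      priors[c] = P(class c)
     * @param likelihoods likelihoods[a][c] = P(value of attribute a | class c)
     * @return class probabilities that sum to 1
     */
    static double[] posterior(double[] priors, double[][] likelihoods) {
        int numClasses = priors.length;
        double[] logScores = new double[numClasses];
        double max = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < numClasses; c++) {
            logScores[c] = Math.log(priors[c]);
            for (double[] attribute : likelihoods) {
                logScores[c] += Math.log(attribute[c]);
            }
            max = Math.max(max, logScores[c]);
        }
        // Shift by the largest log score before exponentiating so that
        // the best class maps to exp(0) = 1 and nothing overflows.
        double[] probs = new double[numClasses];
        double total = 0.0;
        for (int c = 0; c < numClasses; c++) {
            probs[c] = Math.exp(logScores[c] - max);
            total += probs[c];
        }
        for (int c = 0; c < numClasses; c++) {
            probs[c] /= total;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] priors = {0.5, 0.5};
        double[][] likelihoods = {{0.8, 0.2}, {0.5, 0.5}};
        double[] result = posterior(priors, likelihoods);
        System.out.println(result[0] + " " + result[1]);
    }
}
```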

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-K
Use kernel estimation for modelling numeric attributes rather than a single normal distribution.

-D
Use supervised discretization to process numeric attributes.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions
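
The -K and -D flags, and the requirement that getOptions return an array suitable for passing to setOptions, can be modelled with a small stand-alone class. OptionsSketch is hypothetical and only mirrors the two flags listed on this page. Unlike the documented method, this sketch silently ignores unknown options instead of throwing an exception.

```java
import java.util.ArrayList;
import java.util.List;

class OptionsSketch {
    boolean useKernelEstimator;
    boolean useSupervisedDiscretization;

    /** Sets each flag to true if it appears in the array and to false otherwise. */
    void setOptions(String[] options) {
        useKernelEstimator = false;
        useSupervisedDiscretization = false;
        for (String option : options) {
            if (option.equals("-K")) {
                useKernelEstimator = true;
            } else if (option.equals("-D")) {
                useSupervisedDiscretization = true;
            }
            // Anything else is ignored in this sketch.
        }
    }

    /** Returns the active flags in a form that setOptions accepts. */
    String[] getOptions() {
        List<String> options = new ArrayList<>();
        if (useKernelEstimator) options.add("-K");
        if (useSupervisedDiscretization) options.add("-D");
        return options.toArray(new String[0]);
    }

    public static void main(String[] args) {
        OptionsSketch first = new OptionsSketch();
        first.setOptions(new String[] {"-K"});

        // Round trip: configure a second object from the first one's settings.
        OptionsSketch second = new OptionsSketch();
        second.setOptions(first.getOptions());
        System.out.println(String.join(" ", second.getOptions()));
    }
}
```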

toString

public java.lang.String toString()
Returns a description of the classifier.

Returns:
a description of the classifier as a string.

useKernelEstimatorTipText

public java.lang.String useKernelEstimatorTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getUseKernelEstimator

public boolean getUseKernelEstimator()
Gets whether the kernel estimator is being used.

Returns:
Value of m_UseKernelEstimator.

setUseKernelEstimator

public void setUseKernelEstimator(boolean v)
Sets whether the kernel estimator is to be used.

Parameters:
v - Value to assign to m_UseKernelEstimator.

useSupervisedDiscretizationTipText

public java.lang.String useSupervisedDiscretizationTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getUseSupervisedDiscretization

public boolean getUseSupervisedDiscretization()
Get whether supervised discretization is to be used.

Returns:
true if supervised discretization is to be used.

setUseSupervisedDiscretization

public void setUseSupervisedDiscretization(boolean newblah)
Set whether supervised discretization is to be used.

Parameters:
newblah - true if supervised discretization is to be used.
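
Supervised discretization uses the class labels to choose cut points for each numeric attribute; after that, every numeric value is replaced by the index of the interval it falls into and can be handled like a nominal value. The sketch below covers only that second step. The cut points are made up for the example, and DiscretizeSketch and binIndex are hypothetical names; how the Discretize filter selects its cut points is not described on this page.

```java
class DiscretizeSketch {

    /**
     * Returns the index of the interval that contains the value, given cut
     * points in ascending order. A value equal to a cut point is placed in
     * the lower interval, so n cut points yield indices 0..n.
     */
    static int binIndex(double value, double[] cutPoints) {
        int bin = 0;
        while (bin < cutPoints.length && value > cutPoints[bin]) {
            bin++;
        }
        return bin;
    }

    public static void main(String[] args) {
        double[] cutPoints = {2.0, 5.0};   // illustrative values only
        for (double v : new double[] {1.0, 2.0, 3.0, 7.0}) {
            System.out.println(v + " -> bin " + binIndex(v, cutPoints));
        }
    }
}
```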

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options