ComplementNaiveBayes (Documentation for extended WEKA including Ensembles of Hierarchically Nested Dichotomies)

weka.classifiers.bayes
Class ComplementNaiveBayes

java.lang.Object
  weka.classifiers.Classifier
      weka.classifiers.bayes.ComplementNaiveBayes

All Implemented Interfaces: java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class ComplementNaiveBayes
extends Classifier
implements OptionHandler, WeightedInstancesHandler

Class for building and using a Complement class Naive Bayes classifier. For more information, see:

Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger. Tackling the Poor Assumptions of Naive Bayes Text Classifiers. ICML 2003.

Note: the TF, IDF and length normalization transforms described in the paper can be performed through weka.filters.unsupervised.attribute.StringToWordVector.
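As a rough illustration of what the three transforms do to raw word counts, here is a minimal sketch that follows the formulas in the cited paper: log(1 + f) for TF, log(N / df) for IDF, and division by the Euclidean length of the document vector. The class and method names are hypothetical, and the filter's actual behavior may differ in detail.

```java
/** Illustrative sketch only; not the WEKA filter. All names are hypothetical. */
final class TextTransformsSketch {

    /**
     * counts[d][i] is the raw frequency of word i in document d.
     * Returns a new matrix with the TF, IDF and length normalization
     * transforms applied in that order.
     */
    static double[][] transform(double[][] counts) {
        int numDocs = counts.length;
        int numWords = counts[0].length;

        // Document frequency: in how many documents each word occurs.
        int[] docFreq = new int[numWords];
        for (double[] doc : counts) {
            for (int i = 0; i < numWords; i++) {
                if (doc[i] > 0) {
                    docFreq[i]++;
                }
            }
        }

        double[][] out = new double[numDocs][numWords];
        for (int d = 0; d < numDocs; d++) {
            double sumSquares = 0.0;
            for (int i = 0; i < numWords; i++) {
                double tf = Math.log(1.0 + counts[d][i]);
                double idf = docFreq[i] > 0
                        ? Math.log((double) numDocs / docFreq[i])
                        : 0.0;
                out[d][i] = tf * idf;
                sumSquares += out[d][i] * out[d][i];
            }
            // Length normalization: scale the document to unit Euclidean length.
            double norm = Math.sqrt(sumSquares);
            if (norm > 0) {
                for (int i = 0; i < numWords; i++) {
                    out[d][i] /= norm;
                }
            }
        }
        return out;
    }
}
```

A word that occurs in every document gets an IDF factor of log(1) = 0 and therefore drops out of the transformed vectors.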

Valid options for the classifier are:

-N
Normalizes word weights for each class.

-S val
The smoothing value to use to avoid zero WordGivenClass probabilities (default 1.0).

Version: $Revision: 1.2 $
Author: Ashraf M. Kibriya (amk14@cs.waikato.ac.nz)
See Also: Serialized Form

Field Summary
`private Instances`	`header` The instances header that'll be used in toString
`private boolean`	`m_normalizeWordWeights` True if the word weights are to be normalized
`private int`	`numClasses` Holds the number of Class values present in the set of specified instances
`private double`	`smoothingParameter` Holds the smoothing value to avoid word probabilities of zero.
`private double[][]`	`wordWeights` Weight of words for each class.

Fields inherited from class weka.classifiers.Classifier

m_Debug

Constructor Summary
`ComplementNaiveBayes()`

Method Summary
`void`	`buildClassifier(Instances instances)` Generates the classifier.
`double`	`classifyInstance(Instance instance)` Classifies a given instance.
`boolean`	`getNormalizeWordWeights()` Returns true if the word weights for each class are to be normalized
`java.lang.String[]`	`getOptions()` Gets the current settings of the classifier.
`double`	`getSmoothingParameter()` Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.
`java.lang.String`	`globalInfo()` Returns a string describing this classifier
`java.util.Enumeration`	`listOptions()` Returns an enumeration describing the available options.
`static void`	`main(java.lang.String[] argv)` Main method for testing this class.
`java.lang.String`	`normalizeWordWeightsTipText()` Returns the tip text for this property
`void`	`setNormalizeWordWeights(boolean doNormalize)` Sets whether the word weights for each class should be normalized
`void`	`setOptions(java.lang.String[] options)` Parses a given list of options.
`void`	`setSmoothingParameter(double val)` Sets the smoothing value used to avoid zero WordGivenClass probabilities
`java.lang.String`	`smoothingParameterTipText()` Returns the tip text for this property
`java.lang.String`	`toString()` Prints out the internal model built by the classifier.

Methods inherited from class weka.classifiers.Classifier

debugTipText, distributionForInstance, forName, getDebug, makeCopies, setDebug

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Detail

wordWeights

private double[][] wordWeights

Weight of words for each class. The weight is actually the log of the probability of a word (w) given a class (c) (i.e. log(Pr[w|c])). The format of the matrix is: wordWeights[class][wordAttribute]
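In the cited paper, the weight for class c is estimated from the complement of c, that is, from all training documents that do not belong to c. The following is a minimal sketch under that assumption, with a per-word smoothing value alpha. The class name, method name and input layout are hypothetical and are not taken from the WEKA source.

```java
/** Illustrative sketch only; names and data layout are assumptions. */
final class ComplementWeightsSketch {

    /**
     * counts[c][i] is the total frequency of word i over the training
     * documents of class c. Returns weights[c][i], the log of the smoothed
     * relative frequency of word i in all classes other than c.
     */
    static double[][] complementWeights(double[][] counts, double alpha) {
        int numClasses = counts.length;
        int numWords = counts[0].length;

        double[] wordTotals = new double[numWords];     // per word, over all classes
        double[] classTotals = new double[numClasses];  // per class, over all words
        double grandTotal = 0.0;
        for (int c = 0; c < numClasses; c++) {
            for (int i = 0; i < numWords; i++) {
                wordTotals[i] += counts[c][i];
                classTotals[c] += counts[c][i];
                grandTotal += counts[c][i];
            }
        }

        double[][] weights = new double[numClasses][numWords];
        for (int c = 0; c < numClasses; c++) {
            // Total word occurrences outside class c, plus smoothing mass.
            double denominator = (grandTotal - classTotals[c]) + alpha * numWords;
            for (int i = 0; i < numWords; i++) {
                // Occurrences of word i outside class c, plus smoothing.
                double numerator = (wordTotals[i] - counts[c][i]) + alpha;
                weights[c][i] = Math.log(numerator / denominator);
            }
        }
        return weights;
    }
}
```

The result uses the same wordWeights[class][wordAttribute] layout described above. With alpha greater than zero, no numerator is zero, so no weight is negative infinity.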

smoothingParameter

private double smoothingParameter

Holds the smoothing value to avoid word probabilities of zero.
Note: this corresponds to the alpha_i parameter in the paper.

m_normalizeWordWeights

private boolean m_normalizeWordWeights

True if the word weights are to be normalized
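The cited paper normalizes each class's weight vector by dividing every weight by the sum of the absolute values of that class's weights. The following is a minimal sketch of that step. It assumes this class does the same when the flag is set, and the names are hypothetical.

```java
/** Illustrative sketch only; assumes the normalization described in the paper. */
final class WeightNormalizationSketch {

    /** Divides each row weights[c] by the sum of the absolute values of its entries. */
    static void normalizeInPlace(double[][] weights) {
        for (double[] row : weights) {
            double sumAbs = 0.0;
            for (double w : row) {
                sumAbs += Math.abs(w);
            }
            if (sumAbs == 0.0) {
                continue; // nothing to scale for an all-zero row
            }
            for (int i = 0; i < row.length; i++) {
                row[i] /= sumAbs;
            }
        }
    }
}
```

After this step the absolute values of each class's weights sum to 1, so no single class dominates the comparison merely because its weights are larger in magnitude.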

numClasses

private int numClasses

Holds the number of Class values present in the set of specified instances

header

private Instances header

The instances header that'll be used in toString

Constructor Detail

ComplementNaiveBayes

public ComplementNaiveBayes()

Method Detail

listOptions

public java.util.Enumeration listOptions()

Returns an enumeration describing the available options.

Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class Classifier

Returns: an enumeration of all the available options.

getOptions

public java.lang.String[] getOptions()

Gets the current settings of the classifier.

Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class Classifier

Returns: an array of strings suitable for passing to setOptions

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception

Parses a given list of options. Valid options are:

-N
Normalizes word weights for each class.

-S val
The smoothing value to use to avoid zero WordGivenClass probabilities (default 1.0).

Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class Classifier

Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
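To make the option format concrete, here is a minimal, self-contained sketch of parsing the -N and -S options and turning the settings back into an option array. It is not the WEKA implementation. The class name, field names and the choice of IllegalArgumentException are assumptions made for illustration.

```java
/** Illustrative sketch only; a hypothetical stand-in for the option handling. */
final class CnbOptionsSketch {
    boolean normalizeWordWeights = false; // set by -N
    double smoothingParameter = 1.0;      // set by -S val, default 1.0

    /** Parses an option array such as {"-N", "-S", "0.5"}. */
    static CnbOptionsSketch parse(String[] options) {
        CnbOptionsSketch parsed = new CnbOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-N":
                    parsed.normalizeWordWeights = true;
                    break;
                case "-S":
                    if (i + 1 >= options.length) {
                        throw new IllegalArgumentException("-S requires a value");
                    }
                    parsed.smoothingParameter = Double.parseDouble(options[++i]);
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    /** Returns the settings as an array that parse() accepts again. */
    String[] toOptions() {
        String s = Double.toString(smoothingParameter);
        return normalizeWordWeights
                ? new String[] {"-N", "-S", s}
                : new String[] {"-S", s};
    }
}
```

Passing the result of toOptions() back to parse() reproduces the same settings, which mirrors the documented relationship between getOptions and setOptions.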

getNormalizeWordWeights

public boolean getNormalizeWordWeights()

Returns true if the word weights for each class are to be normalized

setNormalizeWordWeights

public void setNormalizeWordWeights(boolean doNormalize)

Sets whether the word weights for each class should be normalized

normalizeWordWeightsTipText

public java.lang.String normalizeWordWeightsTipText()

Returns the tip text for this property

Returns: tip text for this property suitable for displaying in the explorer/experimenter gui

getSmoothingParameter

public double getSmoothingParameter()

Gets the smoothing value to be used to avoid zero WordGivenClass probabilities.

setSmoothingParameter

public void setSmoothingParameter(double val)

Sets the smoothing value used to avoid zero WordGivenClass probabilities

smoothingParameterTipText

public java.lang.String smoothingParameterTipText()

Returns the tip text for this property

Returns: tip text for this property suitable for displaying in the explorer/experimenter gui

globalInfo

public java.lang.String globalInfo()

Returns a string describing this classifier

Returns: a description of the classifier suitable for displaying in the explorer/experimenter gui

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception

Generates the classifier.

Specified by: buildClassifier in class Classifier

Parameters: instances - set of instances serving as training data
Throws: java.lang.Exception - if the classifier has not been built successfully

classifyInstance

public double classifyInstance(Instance instance)
                        throws java.lang.Exception

Classifies a given instance.

The classification rule is:
argmin_c( sum_i( t_i * W_ci ) )
where
t_i is the frequency of word i in the given instance, and
W_ci is the weight of word i in class c.

For more information, see section 4.4 of the paper cited in the class description above.

Overrides: classifyInstance in class Classifier

Parameters: instance - the instance to be classified
Returns: the index of the class to which the instance most likely belongs.
Throws: java.lang.Exception - if the classifier has not been built yet or an error occurred during the prediction
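The rule can be sketched in a few lines: for each class, sum the word frequencies of the instance multiplied by that class's word weights, then return the class with the smallest sum. This is a minimal sketch, assuming a dense frequency array and a weights[class][word] matrix. The names are hypothetical and this is not the WEKA source.

```java
/** Illustrative sketch only; names and data layout are assumptions. */
final class ComplementRuleSketch {

    /**
     * Returns the index of the class c that minimizes
     * sum_i termFrequencies[i] * weights[c][i].
     * Ties are resolved in favor of the lowest class index.
     */
    static int classify(double[] termFrequencies, double[][] weights) {
        int best = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            double score = 0.0;
            for (int i = 0; i < termFrequencies.length; i++) {
                if (termFrequencies[i] != 0.0) { // absent words contribute nothing
                    score += termFrequencies[i] * weights[c][i];
                }
            }
            if (score < bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }
}
```

The minimum is taken rather than the maximum because, under the complement formulation of the cited paper, a low weighted sum means the instance's words are poorly explained by the classes other than c.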

toString

public java.lang.String toString()

Prints out the internal model built by the classifier. In this case it prints out the word weights calculated when building the classifier.

main

public static void main(java.lang.String[] argv)

Main method for testing this class.

Parameters: argv - the options
