weka.classifiers.bayes
Class BayesNet

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.bayes.BayesNet
All Implemented Interfaces:
java.lang.Cloneable, Drawable, OptionHandler, java.io.Serializable, WeightedInstancesHandler
Direct Known Subclasses:
BayesNetB, BayesNetK2

public class BayesNet
extends Classifier
implements OptionHandler, WeightedInstancesHandler, Drawable

Base class for a Bayes Network classifier. Provides data structures (network structure, conditional probability distributions, etc.) and facilities common to Bayes Network learning algorithms such as K2 and B. Works only with nominal variables and with data that contains no missing values.

Version:
$Revision: 1.10 $
Author:
Remco Bouckaert (rrb@xm.co.nz)
See Also:
Serialized Form

Field Summary
(package private)  ADNode m_ADTree
           
(package private)  boolean m_bInitAsNaiveBayes
          determines whether initial structure is an empty graph or a Naive Bayes network
(package private)  boolean m_bUseADTree
          Use the experimental ADTree data structure for calculating contingency tables
protected  Estimator[][] m_Distributions
          The attribute estimators containing CPTs.
(package private)  double m_fAlpha
          Holds prior on count
 Instances m_Instances
          The dataset header for the purposes of printing out a semi-intelligible model
(package private)  int m_nMaxNrOfParents
          Holds upper bound on number of parents
protected  int[] m_nOrder
          topological ordering of the network
(package private)  int m_nScoreType
          Holds the score type used to measure quality of network
protected  int m_NumClasses
          The number of classes
protected  ParentSet[] m_ParentSets
          The parent sets.
static Tag[] TAGS_SCORE_TYPE
           
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Fields inherited from interface weka.core.Drawable
BayesNet, NOT_DRAWABLE, TREE
 
Constructor Summary
BayesNet()
           
 
Method Summary
 java.lang.String alphaTipText()
           
 void buildClassifier(Instances instances)
          Generates the classifier.
 void buildStructure()
          buildStructure determines the network structure/graph of the network.
protected  double CalcNodeScore(int nNode)
          Calculates the score of a node for its given parent set
private  double CalcNodeScore(int nNode, Instances instances)
           
private  double CalcNodeScoreADTree(int nNode, Instances instances)
          helper function for CalcNodeScore above using the ADTree data structure
protected  double CalcScoreOfCounts(int[] nCounts, int nCardinality, int numValues, Instances instances)
          utility function used by CalcScore and CalcNodeScore to determine the score based on observed frequencies.
protected  double CalcScoreOfCounts2(int[][] nCounts, int nCardinality, int numValues, Instances instances)
           
protected  double CalcScoreWithExtraParent(int nNode, int nCandidateParent)
          Calculates the score of a node with a candidate parent added to its parent set
 double[] countsForInstance(Instance instance)
          Calculates the counts for Dirichlet distribution for the class membership probabilities for the given test instance.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 void estimateCPTs()
          estimateCPTs estimates the conditional probability tables for the Bayes Net using the network structure.
 double getAlpha()
          Gets the prior on counts (alpha).
 boolean getInitAsNaiveBayes()
          Gets whether the initial structure is a Naive Bayes network rather than an empty graph.
 int getMaxNrOfParents()
          Gets the upper bound on the number of parents of a node.
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 SelectedTag getScoreType()
          Gets the score type used to measure the quality of the network.
 boolean getUseADTree()
          Gets whether the experimental ADTree data structure is used for calculating contingency tables.
 java.lang.String graph()
          Returns a BayesNet graph in XMLBIF ver 0.3 format.
 int graphType()
          Returns the type of graph this classifier represents.
 java.lang.String initAsNaiveBayesTipText()
           
 void initStructure()
          Init structure initializes the structure to an empty graph or a Naive Bayes graph (depending on the -N flag).
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
 double logScore(int nType)
          logScore returns the log of the quality of a network (e.g. the posterior probability of the network, or the MDL value).
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String maxNrOfParentsTipText()
           
 java.lang.String scoreTypeTipText()
           
 void setAlpha(double fAlpha)
          Sets the prior on counts (alpha).
 void setInitAsNaiveBayes(boolean bInitAsNaiveBayes)
          Sets whether the initial structure is a Naive Bayes network rather than an empty graph.
 void setMaxNrOfParents(int nMaxNrOfParents)
          Sets the upper bound on the number of parents of a node.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setScoreType(SelectedTag newScoreType)
          Sets the score type used to measure the quality of the network.
 void setUseADTree(boolean bUseADTree)
          Sets whether the experimental ADTree data structure is used for calculating contingency tables.
 java.lang.String toString()
          Returns a description of the classifier.
 java.lang.String toXMLBIF03()
          Returns a description of the classifier in XML BIF 0.3 format.
 void updateClassifier(Instance instance)
          Updates the classifier with the given instance.
 java.lang.String useADTreeTipText()
           
(package private)  java.lang.String XMLNormalize(java.lang.String sStr)
          XMLNormalize converts the five standard XML entities in a string, e.g. the string V&D's is returned as V&amp;D&apos;s
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_nOrder

protected int[] m_nOrder
topological ordering of the network
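
As an illustration of what a topological ordering means here, the following minimal sketch derives one from parent sets so that every node appears after all of its parents. It is not the Weka implementation: the class TopologicalOrderSketch and the plain int[][] representation of parent sets are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of Weka.
class TopologicalOrderSketch {

    /**
     * parents[i] holds the indices of the parents of node i.
     * Returns node indices ordered so that parents precede their children.
     */
    static int[] order(int[][] parents) {
        int n = parents.length;
        int[] remaining = new int[n];            // unplaced parents per node
        List<List<Integer>> children = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            children.add(new ArrayList<>());
        }
        for (int node = 0; node < n; node++) {
            remaining[node] = parents[node].length;
            for (int p : parents[node]) {
                children.get(p).add(node);
            }
        }
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0) {
                ready.add(i);                    // nodes without parents come first
            }
        }
        int[] result = new int[n];
        int filled = 0;
        while (!ready.isEmpty()) {
            int node = ready.remove();
            result[filled++] = node;
            for (int child : children.get(node)) {
                if (--remaining[child] == 0) {
                    ready.add(child);
                }
            }
        }
        if (filled != n) {
            throw new IllegalArgumentException("parent sets contain a cycle");
        }
        return result;
    }
}
```

An ordering of this kind lets each node be processed only after its parents have been processed.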


m_ParentSets

protected ParentSet[] m_ParentSets
The parent sets.


m_Distributions

protected Estimator[][] m_Distributions
The attribute estimators containing CPTs.


m_NumClasses

protected int m_NumClasses
The number of classes


m_Instances

public Instances m_Instances
The dataset header for the purposes of printing out a semi-intelligible model


m_ADTree

ADNode m_ADTree

TAGS_SCORE_TYPE

public static final Tag[] TAGS_SCORE_TYPE

m_nScoreType

int m_nScoreType
Holds the score type used to measure quality of network


m_fAlpha

double m_fAlpha
Holds prior on count


m_nMaxNrOfParents

int m_nMaxNrOfParents
Holds upper bound on number of parents


m_bInitAsNaiveBayes

boolean m_bInitAsNaiveBayes
determines whether initial structure is an empty graph or a Naive Bayes network


m_bUseADTree

boolean m_bUseADTree
Use the experimental ADTree data structure for calculating contingency tables

Constructor Detail

BayesNet

public BayesNet()
Method Detail

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

initStructure

public void initStructure()
                   throws java.lang.Exception
Init structure initializes the structure to an empty graph or a Naive Bayes graph (depending on the -N flag).

Throws:
java.lang.Exception
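
The two starting structures described above can be sketched as follows. This is an illustration only, not the Weka code: InitStructureSketch, initParents and the list-based parent sets are hypothetical names and types, and the class attribute index is passed in explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka.
class InitStructureSketch {

    /**
     * Builds initial parent sets for numNodes nodes.
     * With naiveBayes == false every node has no parents (empty graph).
     * With naiveBayes == true every non-class node gets the class node
     * as its single parent.
     */
    static List<List<Integer>> initParents(int numNodes, int classIndex, boolean naiveBayes) {
        List<List<Integer>> parents = new ArrayList<>();
        for (int node = 0; node < numNodes; node++) {
            List<Integer> set = new ArrayList<>();
            if (naiveBayes && node != classIndex) {
                set.add(classIndex);
            }
            parents.add(set);
        }
        return parents;
    }
}
```

A structure learner can then add further parents to these sets, up to the configured maximum number of parents.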

buildStructure

public void buildStructure()
                    throws java.lang.Exception
buildStructure determines the network structure/graph of the network. The default behavior is to create a network in which all nodes have the first node as their parent (i.e., a BayesNet that behaves like a naive Bayes classifier). Derived classes can override this method to restrict the class of network structures that are acceptable.

Throws:
java.lang.Exception

estimateCPTs

public void estimateCPTs()
                  throws java.lang.Exception
estimateCPTs estimates the conditional probability tables for the Bayes Net using the network structure.

Throws:
java.lang.Exception
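
One common way to turn counts into a conditional probability table is to add a fixed prior count to every cell before normalizing. The sketch below shows that calculation under stated assumptions: it treats alpha as a symmetric prior count per cell and assumes alpha > 0, which may differ from how the Estimator classes used by BayesNet apply the prior. CptEstimateSketch and its array layout are hypothetical.

```java
// Hypothetical helper, not part of Weka.
class CptEstimateSketch {

    /**
     * counts[config][value] is the number of training instances in which the
     * parents take configuration 'config' and the node takes 'value'.
     * Returns P(value | config) with alpha added to every cell (alpha > 0).
     */
    static double[][] estimate(int[][] counts, double alpha) {
        double[][] cpt = new double[counts.length][];
        for (int config = 0; config < counts.length; config++) {
            int numValues = counts[config].length;
            double total = alpha * numValues;    // prior mass for this row
            for (int n : counts[config]) {
                total += n;
            }
            cpt[config] = new double[numValues];
            for (int value = 0; value < numValues; value++) {
                cpt[config][value] = (counts[config][value] + alpha) / total;
            }
        }
        return cpt;
    }
}
```

With a positive alpha, a parent configuration that never occurs in the training data still yields a valid (uniform) distribution instead of a division by zero.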

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Updates the classifier with the given instance.

Parameters:
instance - the new training instance to include in the model
Throws:
java.lang.Exception - if the instance could not be incorporated in the model.

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if there is a problem generating the prediction
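
For intuition, the following sketch computes class membership probabilities for the simplest case, a network shaped like naive Bayes, by multiplying the relevant table entries for each class value and normalizing. It is a simplified illustration, not the Weka implementation: ClassPosteriorSketch and the array layout are assumptions, and all probabilities are assumed to be strictly positive.

```java
// Hypothetical helper, not part of Weka.
class ClassPosteriorSketch {

    /**
     * classPrior[c]   = P(class = c)
     * cpt[a][c][v]    = P(attribute a = v | class = c)
     * values[a]       = observed value index of attribute a
     * Returns P(class = c | observed values) for every c.
     */
    static double[] posterior(double[] classPrior, double[][][] cpt, int[] values) {
        int numClasses = classPrior.length;
        double[] logJoint = new double[numClasses];
        double max = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < numClasses; c++) {
            // Work in log space to avoid underflow with many attributes.
            logJoint[c] = Math.log(classPrior[c]);
            for (int a = 0; a < values.length; a++) {
                logJoint[c] += Math.log(cpt[a][c][values[a]]);
            }
            max = Math.max(max, logJoint[c]);
        }
        double[] result = new double[numClasses];
        double sum = 0.0;
        for (int c = 0; c < numClasses; c++) {
            result[c] = Math.exp(logJoint[c] - max);
            sum += result[c];
        }
        for (int c = 0; c < numClasses; c++) {
            result[c] /= sum;                    // normalize to sum to 1
        }
        return result;
    }
}
```

In a general network the attributes may have parents other than the class, so each factor would be looked up by the full parent configuration rather than by the class value alone.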

countsForInstance

public double[] countsForInstance(Instance instance)
                           throws java.lang.Exception
Calculates the counts for Dirichlet distribution for the class membership probabilities for the given test instance.

Parameters:
instance - the instance to be classified
Returns:
counts for Dirichlet distribution for class probability
Throws:
java.lang.Exception - if there is a problem generating the prediction

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. The valid options are those described by listOptions().

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

setScoreType

public void setScoreType(SelectedTag newScoreType)
Sets the score type used to measure the quality of the network.


getScoreType

public SelectedTag getScoreType()
Gets the score type used to measure the quality of the network.

Returns:
the score type currently in use

setAlpha

public void setAlpha(double fAlpha)
Sets the prior on counts (alpha).

Parameters:
fAlpha - the prior on counts

getAlpha

public double getAlpha()
Gets the prior on counts (alpha).

Returns:
the prior on counts

setInitAsNaiveBayes

public void setInitAsNaiveBayes(boolean bInitAsNaiveBayes)
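Sets whether the initial structure is a Naive Bayes network rather than an empty graph.

Parameters:
bInitAsNaiveBayes - true to start from a Naive Bayes network, false to start from an empty graph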

getInitAsNaiveBayes

public boolean getInitAsNaiveBayes()
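Gets whether the initial structure is a Naive Bayes network rather than an empty graph.

Returns:
true if the initial structure is a Naive Bayes network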

setUseADTree

public void setUseADTree(boolean bUseADTree)
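Sets whether the experimental ADTree data structure is used for calculating contingency tables.

Parameters:
bUseADTree - true to use the ADTree data structure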

getUseADTree

public boolean getUseADTree()
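Gets whether the experimental ADTree data structure is used for calculating contingency tables.

Returns:
true if the ADTree data structure is used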

setMaxNrOfParents

public void setMaxNrOfParents(int nMaxNrOfParents)
Sets the upper bound on the number of parents of a node.

Parameters:
nMaxNrOfParents - the upper bound on the number of parents

getMaxNrOfParents

public int getMaxNrOfParents()
Gets the upper bound on the number of parents of a node.

Returns:
the upper bound on the number of parents

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

logScore

public double logScore(int nType)
logScore returns the log of the quality of a network (e.g. the posterior probability of the network, or the MDL value).

Parameters:
nType - score type (Bayes, MDL, etc) to calculate score with
Returns:
log score.

toString

public java.lang.String toString()
Returns a description of the classifier.

Returns:
a description of the classifier as a string.

graphType

public int graphType()
Returns the type of graph this classifier represents.

Specified by:
graphType in interface Drawable
Returns:
Drawable.TREE

graph

public java.lang.String graph()
                       throws java.lang.Exception
Returns a BayesNet graph in XMLBIF ver 0.3 format.

Specified by:
graph in interface Drawable
Returns:
String representing this BayesNet in XMLBIF ver 0.3
Throws:
java.lang.Exception - if the graph can't be computed

toXMLBIF03

public java.lang.String toXMLBIF03()
Returns a description of the classifier in XML BIF 0.3 format. See http://www-2.cs.cmu.edu/~fgcozman/Research/InterchangeFormat/ for details on XML BIF.

Returns:
an XML BIF 0.3 description of the classifier as a string.
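
To give a feel for the output, the sketch below emits the two main building blocks of an XML BIF 0.3 document: a VARIABLE element listing a node's outcomes and a DEFINITION element holding its probability table. It is a hand-written illustration of the interchange format, not the code behind toXMLBIF03; XmlBifSketch and its methods are hypothetical, names are written without escaping, and the exact whitespace and table ordering produced by Weka may differ.

```java
// Hypothetical helper, not part of Weka.
class XmlBifSketch {

    /** Emits a VARIABLE element with one OUTCOME per nominal value. */
    static String variable(String name, String[] outcomes) {
        StringBuilder sb = new StringBuilder();
        sb.append("<VARIABLE TYPE=\"nature\">\n");
        sb.append("<NAME>").append(name).append("</NAME>\n");
        for (String outcome : outcomes) {
            sb.append("<OUTCOME>").append(outcome).append("</OUTCOME>\n");
        }
        sb.append("</VARIABLE>\n");
        return sb.toString();
    }

    /** Emits a DEFINITION element: the node, its parents, and its table. */
    static String definition(String node, String[] parents, double[] table) {
        StringBuilder sb = new StringBuilder();
        sb.append("<DEFINITION>\n");
        sb.append("<FOR>").append(node).append("</FOR>\n");
        for (String parent : parents) {
            sb.append("<GIVEN>").append(parent).append("</GIVEN>\n");
        }
        sb.append("<TABLE>");
        for (int i = 0; i < table.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(table[i]);
        }
        sb.append("</TABLE>\n");
        sb.append("</DEFINITION>\n");
        return sb.toString();
    }
}
```

In a complete document these elements sit inside a NETWORK element, which in turn sits inside a BIF root element carrying the version attribute.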

XMLNormalize

java.lang.String XMLNormalize(java.lang.String sStr)
XMLNormalize converts the five standard XML entities in a string, e.g. the string V&D's is returned as V&amp;D&apos;s

Parameters:
sStr - string to normalize
Returns:
normalized string
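
A minimal sketch of this kind of normalization is shown below. It replaces the five characters that have predefined XML entities and leaves everything else untouched. The class XmlNormalizeSketch is a stand-in written for this example and is not the package-private Weka method itself.

```java
// Hypothetical helper, not part of Weka.
class XmlNormalizeSketch {

    /** Replaces &, ', ", < and > with their predefined XML entities. */
    static String normalize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '&':
                    sb.append("&amp;");
                    break;
                case '\'':
                    sb.append("&apos;");
                    break;
                case '"':
                    sb.append("&quot;");
                    break;
                case '<':
                    sb.append("&lt;");
                    break;
                case '>':
                    sb.append("&gt;");
                    break;
                default:
                    sb.append(ch);
            }
        }
        return sb.toString();
    }
}
```

Processing the input one character at a time avoids escaping the ampersands that the replacements themselves introduce.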

CalcScoreWithExtraParent

protected double CalcScoreWithExtraParent(int nNode,
                                          int nCandidateParent)
Calculates the score of a node with a candidate parent added to its parent set.

Parameters:
nNode - node for which the score is calculated
nCandidateParent - candidate parent to add to the existing parent set
Returns:
log score
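
Scoring a node "with an extra parent" is typically done by adding the candidate temporarily, scoring, and then restoring the original parent set. The sketch below shows that pattern only. ExtraParentSketch and the pluggable scorer are assumptions for illustration, and the scoring function itself is left to the caller.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of Weka.
class ExtraParentSketch {

    /**
     * Scores 'parents' as if 'candidate' were a member, then restores the
     * list to its original contents even if the scorer throws.
     * The list must be modifiable.
     */
    static double scoreWithExtraParent(List<Integer> parents,
                                       int candidate,
                                       ToDoubleFunction<List<Integer>> scorer) {
        parents.add(candidate);
        try {
            return scorer.applyAsDouble(parents);
        } finally {
            // size() - 1 is an int, so this removes by index, not by value.
            parents.remove(parents.size() - 1);
        }
    }
}
```

A greedy structure learner can call such a routine for each candidate parent and keep the candidate that gives the best score.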

CalcNodeScore

protected double CalcNodeScore(int nNode)
Calculates the score of a node for its given parent set.

Parameters:
nNode - node for which the score is calculated
Returns:
log score

CalcNodeScoreADTree

private double CalcNodeScoreADTree(int nNode,
                                   Instances instances)
helper function for CalcNodeScore above using the ADTree data structure

Parameters:
nNode - node for which the score is calculated
instances - used to calculate score with
Returns:
log score

CalcNodeScore

private double CalcNodeScore(int nNode,
                             Instances instances)

CalcScoreOfCounts

protected double CalcScoreOfCounts(int[] nCounts,
                                   int nCardinality,
                                   int numValues,
                                   Instances instances)
utility function used by CalcScore and CalcNodeScore to determine the score based on observed frequencies.

Parameters:
nCounts - array with observed frequencies
nCardinality - cardinality of parent set
numValues - number of values a node can take
instances - to calc score with
Returns:
log score
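
As a rough illustration of scoring from observed frequencies, the sketch below computes a log-likelihood from a flat count array and a textbook MDL score that subtracts a complexity penalty. It is not the Weka implementation: the class CountScoreSketch, the layout of the count array (one block of numValues counts per parent configuration) and the exact form of the penalty are assumptions.

```java
// Hypothetical helper, not part of Weka.
class CountScoreSketch {

    /**
     * counts holds cardinality * numValues entries; entry
     * [config * numValues + value] is the observed frequency of 'value'
     * under parent configuration 'config'.
     */
    static double logLikelihood(int[] counts, int cardinality, int numValues) {
        double ll = 0.0;
        for (int config = 0; config < cardinality; config++) {
            int total = 0;
            for (int value = 0; value < numValues; value++) {
                total += counts[config * numValues + value];
            }
            for (int value = 0; value < numValues; value++) {
                int n = counts[config * numValues + value];
                if (n > 0) {                     // 0 * log(0) is taken as 0
                    ll += n * Math.log((double) n / total);
                }
            }
        }
        return ll;
    }

    /** Log-likelihood minus 0.5 * (free parameters) * log(numInstances). */
    static double mdl(int[] counts, int cardinality, int numValues, int numInstances) {
        double freeParameters = cardinality * (numValues - 1);
        return logLikelihood(counts, cardinality, numValues)
                - 0.5 * freeParameters * Math.log(numInstances);
    }
}
```

The penalty grows with the cardinality of the parent set, so adding a parent only improves the score when the gain in likelihood outweighs the extra parameters.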

CalcScoreOfCounts2

protected double CalcScoreOfCounts2(int[][] nCounts,
                                    int nCardinality,
                                    int numValues,
                                    Instances instances)

scoreTypeTipText

public java.lang.String scoreTypeTipText()
Returns:
a string to describe the ScoreType option.

alphaTipText

public java.lang.String alphaTipText()
Returns:
a string to describe the Alpha option.

initAsNaiveBayesTipText

public java.lang.String initAsNaiveBayesTipText()
Returns:
a string to describe the InitAsNaiveBayes option.

useADTreeTipText

public java.lang.String useADTreeTipText()
Returns:
a string to describe the UseADTree option.

maxNrOfParentsTipText

public java.lang.String maxNrOfParentsTipText()
Returns:
a string to describe the MaxNrOfParents option.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options