weka.clusterers
Class MakeDensityBasedClusterer

java.lang.Object
  extended by weka.clusterers.Clusterer
      extended by weka.clusterers.DensityBasedClusterer
          extended by weka.clusterers.MakeDensityBasedClusterer
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class MakeDensityBasedClusterer
extends DensityBasedClusterer
implements OptionHandler, WeightedInstancesHandler

Class for wrapping a Clusterer to make it return a distribution and density. Fits normal distributions and discrete distributions within each cluster produced by the wrapped clusterer.
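
The fitting step described above can be illustrated with a minimal, self-contained sketch. It is not the Weka implementation: the class name PerClusterNormalSketch and its methods are hypothetical, instances are assumed unweighted, the population variance is used, and priors are plain cluster frequencies without smoothing. Given hard cluster assignments from some wrapped clusterer, it fits one normal distribution per cluster to a single numeric attribute and estimates the cluster priors.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of Weka. */
class PerClusterNormalSketch {

    /** Returns {mean, stdDev} for each cluster, for one numeric attribute. */
    static double[][] fitNormals(double[] values, int[] assignments,
                                 int numClusters, double minStdDev) {
        double[] sum = new double[numClusters];
        double[] sumSq = new double[numClusters];
        double[] count = new double[numClusters];
        for (int i = 0; i < values.length; i++) {
            int c = assignments[i];
            sum[c] += values[i];
            sumSq[c] += values[i] * values[i];
            count[c] += 1.0;
        }
        double[][] model = new double[numClusters][2];
        for (int c = 0; c < numClusters; c++) {
            if (count[c] > 0) {
                double mean = sum[c] / count[c];
                // Population variance, clamped at zero against rounding error.
                double variance = Math.max(0.0, sumSq[c] / count[c] - mean * mean);
                model[c][0] = mean;
                // Assumption: the standard deviation is floored at minStdDev.
                model[c][1] = Math.max(Math.sqrt(variance), minStdDev);
            } else {
                model[c][0] = 0.0;
                model[c][1] = minStdDev;
            }
        }
        return model;
    }

    /** Prior of a cluster = fraction of instances assigned to it. */
    static double[] priors(int[] assignments, int numClusters) {
        double[] p = new double[numClusters];
        for (int c : assignments) {
            p[c] += 1.0;
        }
        for (int c = 0; c < numClusters; c++) {
            p[c] /= assignments.length;
        }
        return p;
    }

    public static void main(String[] args) {
        double[] values = {1, 3, 10, 14};
        int[] assignments = {0, 0, 1, 1};
        System.out.println(Arrays.deepToString(fitNormals(values, assignments, 2, 1e-6)));
        System.out.println(Arrays.toString(priors(assignments, 2)));
    }
}
```

Discrete attributes would be handled analogously with per-cluster frequency counts instead of a mean and standard deviation.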

Version:
$Revision: 1.1 $
Author:
Richard Kirkby (rkirkby@cs.waikato.ac.nz), Mark Hall (mhall@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  double m_minStdDev
          default minimum standard deviation
private  DiscreteEstimator[][] m_model
          discrete distributions fitted to each discrete attribute in each cluster
private  double[][][] m_modelNormal
          normal distributions fitted to each numeric attribute in each cluster
private static double m_normConst
          Constant for normal distribution.
private  double[] m_priors
          prior probabilities for the fitted clusters
private  Instances m_theInstances
          holds training instances header information
private  Clusterer m_wrappedClusterer
          The clusterer being wrapped
 
Constructor Summary
MakeDensityBasedClusterer()
          Default constructor.
MakeDensityBasedClusterer(Clusterer toWrap)
          Constructs a MakeDensityBasedClusterer wrapping a given Clusterer.
 
Method Summary
 void buildClusterer(Instances data)
          Builds a clusterer for a set of instances.
 double[] clusterPriors()
          Returns the cluster priors.
 Clusterer getClusterer()
          Gets the clusterer being wrapped.
 double getMinStdDev()
          Get the minimum allowable standard deviation.
 java.lang.String[] getOptions()
          Gets the current settings of the clusterer.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 double[] logDensityPerClusterForInstance(Instance inst)
          Computes the log of the conditional density (per cluster) for a given instance.
private  double logNormalDens(double x, double mean, double stdDev)
          Log of the density function of the normal distribution.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String minStdDevTipText()
          Returns the tip text for this property.
 int numberOfClusters()
          Returns the number of clusters.
 void setClusterer(Clusterer toWrap)
          Sets the clusterer to wrap.
 void setMinStdDev(double m)
          Set the minimum value for standard deviation when calculating normal density.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 java.lang.String toString()
          Returns a description of the clusterer.
 
Methods inherited from class weka.clusterers.DensityBasedClusterer
distributionForInstance, logDensityForInstance, logJointDensitiesForInstance
 
Methods inherited from class weka.clusterers.Clusterer
clusterInstance, forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_theInstances

private Instances m_theInstances
holds training instances header information


m_priors

private double[] m_priors
prior probabilities for the fitted clusters


m_modelNormal

private double[][][] m_modelNormal
normal distributions fitted to each numeric attribute in each cluster


m_model

private DiscreteEstimator[][] m_model
discrete distributions fitted to each discrete attribute in each cluster


m_minStdDev

private double m_minStdDev
default minimum standard deviation


m_wrappedClusterer

private Clusterer m_wrappedClusterer
The clusterer being wrapped


m_normConst

private static double m_normConst
Constant for normal distribution.

Constructor Detail

MakeDensityBasedClusterer

public MakeDensityBasedClusterer()
Default constructor.


MakeDensityBasedClusterer

public MakeDensityBasedClusterer(Clusterer toWrap)
Constructs a MakeDensityBasedClusterer wrapping a given Clusterer.

Parameters:
toWrap - the clusterer to wrap around
Method Detail

buildClusterer

public void buildClusterer(Instances data)
                    throws java.lang.Exception
Builds a clusterer for a set of instances.

Specified by:
buildClusterer in class Clusterer
Parameters:
data - set of instances serving as training data
Throws:
java.lang.Exception - if the clusterer hasn't been set or something goes wrong

clusterPriors

public double[] clusterPriors()
Returns the cluster priors.

Specified by:
clusterPriors in class DensityBasedClusterer
Returns:
the prior probability for each cluster

logDensityPerClusterForInstance

public double[] logDensityPerClusterForInstance(Instance inst)
                                         throws java.lang.Exception
Computes the log of the conditional density (per cluster) for a given instance.

Specified by:
logDensityPerClusterForInstance in class DensityBasedClusterer
Parameters:
inst - the instance to compute the density for
Returns:
an array holding the log of the conditional density for each cluster
Throws:
java.lang.Exception - if the density could not be computed successfully
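
To show how per-cluster log densities can be turned into a cluster membership distribution, here is a hedged sketch. It assumes the usual approach of adding the log prior to each per-cluster log density and normalizing with the log-sum-exp trick; the class ClusterPosteriorSketch and its method are hypothetical and are not the code of the inherited distributionForInstance.

```java
/** Hypothetical illustration only; not part of Weka. */
class ClusterPosteriorSketch {

    /**
     * Combines per-cluster log densities with priors and normalizes them
     * into probabilities that sum to one.
     */
    static double[] posterior(double[] logDensityPerCluster, double[] priors) {
        int k = priors.length;
        double[] logJoint = new double[k];
        double max = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < k; c++) {
            // log p(x, c) = log p(x | c) + log p(c)
            logJoint[c] = logDensityPerCluster[c] + Math.log(priors[c]);
            max = Math.max(max, logJoint[c]);
        }
        double[] probs = new double[k];
        double sum = 0.0;
        for (int c = 0; c < k; c++) {
            // Subtracting the maximum keeps exp() from underflowing to zero
            // for every cluster when the log densities are very negative.
            probs[c] = Math.exp(logJoint[c] - max);
            sum += probs[c];
        }
        for (int c = 0; c < k; c++) {
            probs[c] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logDens = {Math.log(0.2), Math.log(0.6)};
        double[] priors = {0.5, 0.5};
        System.out.println(java.util.Arrays.toString(posterior(logDens, priors)));
    }
}
```

The sketch does not handle the degenerate case in which every prior is zero.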

logNormalDens

private double logNormalDens(double x,
                             double mean,
                             double stdDev)
Log of the density function of the normal distribution.

Parameters:
x - input value
mean - mean of distribution
stdDev - standard deviation of distribution
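
As an illustration of what a method with this signature would compute, the following sketch evaluates the log of the normal density, log N(x; mean, stdDev) = -(x - mean)^2 / (2 stdDev^2) - log(sqrt(2 pi)) - log(stdDev). The class name LogNormalDensitySketch and the constant LOG_SQRT_TWO_PI are hypothetical; the actual value stored in m_normConst is not stated in this documentation.

```java
/** Hypothetical illustration only; not part of Weka. */
class LogNormalDensitySketch {

    /** log(sqrt(2 * pi)), the normalizing constant in log space. */
    static final double LOG_SQRT_TWO_PI = 0.5 * Math.log(2.0 * Math.PI);

    /** Log of the normal density at x for the given mean and standard deviation. */
    static double logNormalDens(double x, double mean, double stdDev) {
        double diff = x - mean;
        return -(diff * diff) / (2.0 * stdDev * stdDev)
                - LOG_SQRT_TWO_PI
                - Math.log(stdDev);
    }

    public static void main(String[] args) {
        System.out.println(logNormalDens(0.0, 0.0, 1.0));
    }
}
```

Working in log space lets the contributions of many attributes be added rather than multiplied, which avoids underflow and overflow in the product.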

numberOfClusters

public int numberOfClusters()
                     throws java.lang.Exception
Returns the number of clusters.

Specified by:
numberOfClusters in class Clusterer
Returns:
the number of clusters generated for a training dataset.
Throws:
java.lang.Exception - if number of clusters could not be returned successfully

toString

public java.lang.String toString()
Returns a description of the clusterer.

Returns:
a string containing a description of the clusterer

setClusterer

public void setClusterer(Clusterer toWrap)
Sets the clusterer to wrap.

Parameters:
toWrap - the clusterer

getClusterer

public Clusterer getClusterer()
Gets the clusterer being wrapped.

Returns:
the clusterer

minStdDevTipText

public java.lang.String minStdDevTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setMinStdDev

public void setMinStdDev(double m)
Set the minimum value for standard deviation when calculating normal density. Increasing this value can help prevent arithmetic overflow resulting from multiplying large densities (arising from small standard deviations) when there are many singleton or near-singleton values.

Parameters:
m - minimum value for standard deviation
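
The effect of a floor on the standard deviation can be seen in a small sketch. The peak of a normal density is 1 / (sqrt(2 pi) * stdDev), so a tiny standard deviation yields a huge density, and multiplying several such densities can overflow a double. The class MinStdDevSketch and its methods are hypothetical and only illustrate the arithmetic; they do not reproduce Weka's code.

```java
/** Hypothetical illustration only; not part of Weka. */
class MinStdDevSketch {

    /** Peak value of a normal density after flooring the standard deviation. */
    static double normalDensityAtMean(double stdDev, double minStdDev) {
        double s = Math.max(stdDev, minStdDev);
        return 1.0 / (Math.sqrt(2.0 * Math.PI) * s);
    }

    /** Product of identical peak densities over several attributes. */
    static double productOfDensities(double stdDev, double minStdDev, int numAttributes) {
        double product = 1.0;
        for (int i = 0; i < numAttributes; i++) {
            product *= normalDensityAtMean(stdDev, minStdDev);
        }
        return product;
    }

    public static void main(String[] args) {
        // Without a floor the product overflows; with a floor it stays finite.
        System.out.println(Double.isInfinite(productOfDensities(1e-100, 0.0, 4)));
        System.out.println(Double.isInfinite(productOfDensities(1e-100, 1e-6, 4)));
    }
}
```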

getMinStdDev

public double getMinStdDev()
Get the minimum allowable standard deviation.

Returns:
the minimum allowable standard deviation

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-W clusterer name
Clusterer to wrap. (required)

-M
Set the minimum allowable standard deviation for normal density calculation.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
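
For illustration, an options array for this class might look like {"-W", "weka.clusterers.SimpleKMeans", "-M", "0.01"}. The sketch below shows one way such an array could be read. It is not Weka's parser: the class OptionParsingSketch and its methods are hypothetical, and it assumes that -M is followed by a numeric value, which the option listing above does not spell out.

```java
/** Hypothetical illustration only; not part of Weka. */
class OptionParsingSketch {

    /** Returns the value that follows the given flag, or null if the flag is absent. */
    static String valueOf(String flag, String[] options) {
        for (int i = 0; i < options.length - 1; i++) {
            if (options[i].equals(flag)) {
                return options[i + 1];
            }
        }
        return null;
    }

    /** Reads the required -W option (name of the clusterer to wrap). */
    static String wrappedClustererName(String[] options) {
        String name = valueOf("-W", options);
        if (name == null) {
            throw new IllegalArgumentException("A clusterer must be specified with the -W option.");
        }
        return name;
    }

    /** Reads -M if present (assumed numeric), else returns the default. */
    static double minStdDev(String[] options, double defaultValue) {
        String value = valueOf("-M", options);
        return value == null ? defaultValue : Double.parseDouble(value);
    }

    public static void main(String[] args) {
        String[] options = {"-W", "weka.clusterers.SimpleKMeans", "-M", "0.01"};
        System.out.println(wrappedClustererName(options));
        System.out.println(minStdDev(options, 1e-6));
    }
}
```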

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the clusterer.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options