weka.classifiers.meta
Class LogitBoost

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.SingleClassifierEnhancer
          extended by weka.classifiers.IteratedSingleClassifierEnhancer
              extended by weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
                  extended by weka.classifiers.meta.LogitBoost
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, Randomizable, java.io.Serializable, Sourcable, WeightedInstancesHandler

public class LogitBoost
extends RandomizableIteratedSingleClassifierEnhancer
implements Sourcable, WeightedInstancesHandler

Class for performing additive logistic regression. This class performs classification using a regression scheme as the base learner, and can handle multi-class problems. For more information, see

Friedman, J., T. Hastie and R. Tibshirani (1998). Additive Logistic Regression: a Statistical View of Boosting.

Valid options are:

-D
Turn on debugging output.

-W classname
Specify the full class name of a weak learner as the basis for boosting (required).

-I num
Set the number of boost iterations (default 10).

-Q
Use resampling instead of reweighting.

-S seed
Random number seed for resampling (default 1).

-P num
Set the percentage of weight mass used to build classifiers (default 100).

-F num
Set number of folds for the internal cross-validation (default 0 -- no cross-validation).

-R num
Set number of runs for the internal cross-validation (default 1).

-L num
Set the threshold for the improvement of the average loglikelihood (default -Double.MAX_VALUE).

-H num
Set the value of the shrinkage parameter (default 1).

Options after -- are passed to the designated learner.
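
The `--` convention can be illustrated with a small, self-contained sketch. It is not Weka's own option parser; the class and method names (`OptionSplitSketch`, `boosterOptions`, `learnerOptions`) are hypothetical and only show how an option array might be divided at the first `--`. The option values in `main` are illustrative.

```java
import java.util.Arrays;

class OptionSplitSketch {

    /** Returns the index of the first "--" token, or options.length if none is present. */
    static int separatorIndex(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if ("--".equals(options[i])) {
                return i;
            }
        }
        return options.length;
    }

    /** Options before "--": these would configure the booster itself. */
    static String[] boosterOptions(String[] options) {
        return Arrays.copyOfRange(options, 0, separatorIndex(options));
    }

    /** Options after "--": these would be handed to the base learner. */
    static String[] learnerOptions(String[] options) {
        int sep = separatorIndex(options);
        if (sep == options.length) {
            return new String[0];
        }
        return Arrays.copyOfRange(options, sep + 1, options.length);
    }

    public static void main(String[] args) {
        // Illustrative values only: 20 iterations, shrinkage 0.5, debugging on for the learner.
        String[] options = {"-I", "20", "-H", "0.5",
                            "-W", "weka.classifiers.trees.DecisionStump", "--", "-D"};
        System.out.println("booster: " + Arrays.toString(boosterOptions(options)));
        System.out.println("learner: " + Arrays.toString(learnerOptions(options)));
    }
}
```

In this sketch an array without `--` yields an empty learner option list, so the base learner would keep its defaults.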

Version:
$Revision: 1.31 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  Attribute m_ClassAttribute
          The actual class attribute (for getting class names)
protected  Classifier[][] m_Classifiers
          Array for storing the generated base classifiers.
protected  int m_NumClasses
          The number of classes
protected  Instances m_NumericClassData
          Dummy dataset with a numeric class
protected  int m_NumFolds
          The number of folds for the internal cross-validation.
protected  int m_NumGenerated
          The number of successfully generated base classifiers.
protected  int m_NumRuns
          The number of runs for the internal cross-validation.
protected  double m_Offset
          The value by which the actual target value for the true class is offset.
protected  double m_Precision
          The threshold on the improvement of the likelihood
protected  java.util.Random m_RandomInstance
          The random number generator used
protected  double m_Shrinkage
          The value of the shrinkage parameter
protected  boolean m_UseResampling
          Whether to use resampling instead of reweighting.
protected  int m_WeightThreshold
          Weight thresholding.
protected static double Z_MAX
          A threshold for responses (Friedman suggests between 2 and 4)
 
Fields inherited from class weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
m_Seed
 
Fields inherited from class weka.classifiers.IteratedSingleClassifierEnhancer
m_NumIterations
 
Fields inherited from class weka.classifiers.SingleClassifierEnhancer
m_Classifier
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Constructor Summary
LogitBoost()
          Constructor.
 
Method Summary
 void buildClassifier(Instances data)
          Builds the boosted classifier
 Classifier[][] classifiers()
          Returns the array of classifiers that have been built.
protected  java.lang.String defaultClassifierString()
          String describing default classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 double getLikelihoodThreshold()
          Get the value of Precision.
 int getNumFolds()
          Get the value of NumFolds.
 int getNumRuns()
          Get the value of NumRuns.
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 double getShrinkage()
          Get the value of Shrinkage.
 boolean getUseResampling()
          Get whether resampling is turned on
 int getWeightThreshold()
          Get the degree of weight thresholding
 java.lang.String globalInfo()
          Returns a string describing classifier
private  double[][] initialProbs(int numInstances)
          Gets the initial class probabilities.
 java.lang.String likelihoodThresholdTipText()
          Returns the tip text for this property
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
private  double logLikelihood(double[][] trainYs, double[][] probs)
          Computes loglikelihood given class values and estimated probabilities.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String numFoldsTipText()
          Returns the tip text for this property
 java.lang.String numRunsTipText()
          Returns the tip text for this property
private  void performIteration(double[][] trainYs, double[][] trainFs, double[][] probs, Instances data, double origSumOfWeights)
          Performs one boosting iteration.
private  double[] probs(double[] Fs)
          Computes probabilities from F scores
protected  Instances selectWeightQuantile(Instances data, double quantile)
          Select only instances with weights that contribute to the specified quantile of the weight distribution
 void setLikelihoodThreshold(double newPrecision)
          Set the value of Precision.
 void setNumFolds(int newNumFolds)
          Set the value of NumFolds.
 void setNumRuns(int newNumRuns)
          Set the value of NumRuns.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setShrinkage(double newShrinkage)
          Set the value of Shrinkage.
 void setUseResampling(boolean r)
          Set resampling mode
 void setWeightThreshold(int threshold)
          Set weight thresholding
 java.lang.String shrinkageTipText()
          Returns the tip text for this property
 java.lang.String toSource(java.lang.String className)
          Returns the boosted model as Java source code.
 java.lang.String toString()
          Returns description of the boosted classifier.
 java.lang.String useResamplingTipText()
          Returns the tip text for this property
 java.lang.String weightThresholdTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
getSeed, seedTipText, setSeed
 
Methods inherited from class weka.classifiers.IteratedSingleClassifierEnhancer
getNumIterations, numIterationsTipText, setNumIterations
 
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getClassifier, getClassifierSpec, setClassifier
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Classifiers

protected Classifier[][] m_Classifiers
Array for storing the generated base classifiers. Note: we are hiding the variable from IteratedSingleClassifierEnhancer


m_NumClasses

protected int m_NumClasses
The number of classes


m_NumGenerated

protected int m_NumGenerated
The number of successfully generated base classifiers.


m_NumFolds

protected int m_NumFolds
The number of folds for the internal cross-validation.


m_NumRuns

protected int m_NumRuns
The number of runs for the internal cross-validation.


m_WeightThreshold

protected int m_WeightThreshold
Weight thresholding. The percentage of weight mass used in training


Z_MAX

protected static final double Z_MAX
A threshold for responses (Friedman suggests between 2 and 4)

See Also:
Constant Field Values

m_NumericClassData

protected Instances m_NumericClassData
Dummy dataset with a numeric class


m_ClassAttribute

protected Attribute m_ClassAttribute
The actual class attribute (for getting class names)


m_UseResampling

protected boolean m_UseResampling
Whether to use resampling instead of reweighting.


m_Precision

protected double m_Precision
The threshold on the improvement of the likelihood


m_Shrinkage

protected double m_Shrinkage
The value of the shrinkage parameter


m_RandomInstance

protected java.util.Random m_RandomInstance
The random number generator used


m_Offset

protected double m_Offset
The value by which the actual target value for the true class is offset.

Constructor Detail

LogitBoost

public LogitBoost()
Constructor.

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing classifier

Returns:
a description suitable for displaying in the explorer/experimenter gui

defaultClassifierString

protected java.lang.String defaultClassifierString()
String describing default classifier.

Overrides:
defaultClassifierString in class SingleClassifierEnhancer

selectWeightQuantile

protected Instances selectWeightQuantile(Instances data,
                                         double quantile)
Select only instances with weights that contribute to the specified quantile of the weight distribution

Parameters:
data - the input instances
quantile - the specified quantile eg 0.9 to select 90% of the weight mass
Returns:
the selected instances
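
The idea of selecting a quantile of the weight mass can be sketched on a plain array of weights. This is a simplified illustration, not the Weka implementation: the class `WeightQuantileSketch` and method `selectByWeightMass` are hypothetical, instances are represented only by their indices, and the handling of ties is an assumption.

```java
import java.util.Arrays;

class WeightQuantileSketch {

    /**
     * Returns the indices of the heaviest instances whose weights together
     * reach at least quantile * (sum of all weights), heaviest first.
     */
    static int[] selectByWeightMass(double[] weights, double quantile) {
        Integer[] order = new Integer[weights.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            order[i] = i;
            total += weights[i];
        }
        // Sort indices by descending weight (stable, so equal weights keep their order).
        Arrays.sort(order, (a, b) -> Double.compare(weights[b], weights[a]));

        double target = total * quantile;
        double mass = 0.0;
        int count = 0;
        // Take instances until the accumulated weight reaches the target mass.
        while (count < order.length && mass < target) {
            mass += weights[order[count]];
            count++;
        }
        int[] selected = new int[count];
        for (int i = 0; i < count; i++) {
            selected[i] = order[i];
        }
        return selected;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 4.0, 2.0, 1.0};
        System.out.println(Arrays.toString(selectByWeightMass(weights, 0.75)));
    }
}
```

With a quantile of 1.0 every instance is kept, which corresponds to the default `-P 100` setting described above.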

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class RandomizableIteratedSingleClassifierEnhancer
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-D
Turn on debugging output.

-W classname
Specify the full class name of a weak learner as the basis for boosting (required).

-I num
Set the number of boost iterations (default 10).

-Q
Use resampling instead of reweighting.

-S seed
Random number seed for resampling (default 1).

-P num
Set the percentage of weight mass used to build classifiers (default 100).

-F num
Set number of folds for the internal cross-validation (default 0 -- no cross-validation).

-R num
Set number of runs for the internal cross-validation (default 1).

-L num
Set the threshold for the improvement of the average loglikelihood (default -Double.MAX_VALUE).

-H num
Set the value of the shrinkage parameter (default 1).

Options after -- are passed to the designated learner.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class RandomizableIteratedSingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class RandomizableIteratedSingleClassifierEnhancer
Returns:
an array of strings suitable for passing to setOptions

shrinkageTipText

public java.lang.String shrinkageTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getShrinkage

public double getShrinkage()
Get the value of Shrinkage.

Returns:
Value of Shrinkage.

setShrinkage

public void setShrinkage(double newShrinkage)
Set the value of Shrinkage.

Parameters:
newShrinkage - Value to assign to Shrinkage.
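
As a rough illustration of what a shrinkage parameter does in additive logistic regression, the sketch below damps each round's contribution to the per-class scores F. It follows the multi-class update described by Friedman et al. (centre the predictions across classes, scale by (J-1)/J) and multiplies the result by the shrinkage value. The class and method names are hypothetical, and whether this class applies shrinkage in exactly this way is an assumption.

```java
class ShrinkageUpdateSketch {

    /**
     * Adds one round of base-learner predictions to the additive scores.
     * Assumes scores and predictions have one entry per class.
     */
    static void updateScores(double[] scores, double[] predictions, double shrinkage) {
        int numClasses = scores.length;
        double mean = 0.0;
        for (double p : predictions) {
            mean += p;
        }
        mean /= numClasses;
        for (int j = 0; j < numClasses; j++) {
            // Centred prediction, scaled by (J-1)/J, damped by the shrinkage value.
            scores[j] += shrinkage * (predictions[j] - mean) * (numClasses - 1) / numClasses;
        }
    }

    public static void main(String[] args) {
        double[] scores = {0.0, 0.0};
        updateScores(scores, new double[] {2.0, -2.0}, 0.5);
        System.out.println(scores[0] + " " + scores[1]);
    }
}
```

A shrinkage of 1 (the documented default) applies each round at full strength; smaller values make each round contribute less, which usually calls for more iterations.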

likelihoodThresholdTipText

public java.lang.String likelihoodThresholdTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getLikelihoodThreshold

public double getLikelihoodThreshold()
Get the value of Precision.

Returns:
Value of Precision.

setLikelihoodThreshold

public void setLikelihoodThreshold(double newPrecision)
Set the value of Precision.

Parameters:
newPrecision - Value to assign to Precision.

numRunsTipText

public java.lang.String numRunsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getNumRuns

public int getNumRuns()
Get the value of NumRuns.

Returns:
Value of NumRuns.

setNumRuns

public void setNumRuns(int newNumRuns)
Set the value of NumRuns.

Parameters:
newNumRuns - Value to assign to NumRuns.

numFoldsTipText

public java.lang.String numFoldsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getNumFolds

public int getNumFolds()
Get the value of NumFolds.

Returns:
Value of NumFolds.

setNumFolds

public void setNumFolds(int newNumFolds)
Set the value of NumFolds.

Parameters:
newNumFolds - Value to assign to NumFolds.

useResamplingTipText

public java.lang.String useResamplingTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setUseResampling

public void setUseResampling(boolean r)
Set resampling mode


getUseResampling

public boolean getUseResampling()
Get whether resampling is turned on

Returns:
true if resampling is on

weightThresholdTipText

public java.lang.String weightThresholdTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setWeightThreshold

public void setWeightThreshold(int threshold)
Set weight thresholding


getWeightThreshold

public int getWeightThreshold()
Get the degree of weight thresholding

Returns:
the percentage of weight mass used for training

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Builds the boosted classifier

Overrides:
buildClassifier in class IteratedSingleClassifierEnhancer
Parameters:
data - the training data to be used for generating the boosted classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully

initialProbs

private double[][] initialProbs(int numInstances)
Gets the initial class probabilities.


logLikelihood

private double logLikelihood(double[][] trainYs,
                             double[][] probs)
Computes loglikelihood given class values and estimated probabilities.
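
Since this method is private, the following is only a sketch of how an average log-likelihood could be computed from the same two arguments. It assumes that `trainYs[i][j]` is larger than 0.5 exactly for the true class of instance `i`, and it returns the mean of log p(true class); the sign and averaging convention used internally may differ. The class name is hypothetical.

```java
class LogLikelihoodSketch {

    /** Mean of log(probability assigned to the true class) over all instances. */
    static double averageLogLikelihood(double[][] trainYs, double[][] probs) {
        double sum = 0.0;
        for (int i = 0; i < trainYs.length; i++) {
            for (int j = 0; j < trainYs[i].length; j++) {
                // Only the entry marked as the true class contributes.
                if (trainYs[i][j] > 0.5) {
                    sum += Math.log(probs[i][j]);
                }
            }
        }
        return sum / trainYs.length;
    }

    public static void main(String[] args) {
        double[][] ys = {{1.0, 0.0}, {0.0, 1.0}};
        double[][] ps = {{0.5, 0.5}, {0.25, 0.75}};
        System.out.println(averageLogLikelihood(ys, ps));
    }
}
```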


performIteration

private void performIteration(double[][] trainYs,
                              double[][] trainFs,
                              double[][] probs,
                              Instances data,
                              double origSumOfWeights)
                       throws java.lang.Exception
Performs one boosting iteration.

Throws:
java.lang.Exception
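
The central quantities of one boosting iteration in Friedman et al.'s LogitBoost are the working response z = (y - p) / (p(1 - p)) and the weight w = p(1 - p), computed per instance and class before a regression learner is fitted to z. The sketch below shows these two quantities with the response clipped to a threshold. The class name is hypothetical, and the value 3.0 for the threshold is an assumption chosen from the range of 2 to 4 mentioned for `Z_MAX`; this is not the private method itself.

```java
class WorkingResponseSketch {

    /** Assumed clipping threshold for responses (the documentation suggests 2 to 4). */
    static final double Z_MAX = 3.0;

    /**
     * Working response for one instance and class, given the current
     * probability p of that class. For the true class (y = 1) the formula
     * (y - p) / (p (1 - p)) simplifies to 1 / p, otherwise to -1 / (1 - p).
     */
    static double workingResponse(boolean isTrueClass, double p) {
        if (isTrueClass) {
            return Math.min(1.0 / p, Z_MAX);
        }
        return Math.max(-1.0 / (1.0 - p), -Z_MAX);
    }

    /** Working weight p (1 - p) used when fitting the regression learner. */
    static double workingWeight(double p) {
        return p * (1.0 - p);
    }

    public static void main(String[] args) {
        System.out.println(workingResponse(true, 0.5) + " " + workingWeight(0.5));
        System.out.println(workingResponse(true, 0.01));
    }
}
```

Using the simplified forms avoids a 0/0 division when p reaches exactly 0 or 1; in that case the response is simply clipped to the threshold.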

classifiers

public Classifier[][] classifiers()
Returns the array of classifiers that have been built.


probs

private double[] probs(double[] Fs)
Computes probabilities from F scores
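
In additive logistic regression the per-class scores F are turned into probabilities with the symmetric multiple logistic transform, p_j = exp(F_j) / sum_k exp(F_k). The sketch below is one common, numerically stable way to compute it (subtracting the largest score before exponentiating). The class name is hypothetical and this is an illustration of the transform, not the private method's source.

```java
class ProbsFromScoresSketch {

    /** Converts per-class scores into probabilities that sum to 1. */
    static double[] probs(double[] scores) {
        // Subtracting the maximum keeps Math.exp from overflowing on large scores.
        double max = Double.NEGATIVE_INFINITY;
        for (double f : scores) {
            max = Math.max(max, f);
        }
        double[] result = new double[scores.length];
        double sum = 0.0;
        for (int j = 0; j < scores.length; j++) {
            result[j] = Math.exp(scores[j] - max);
            sum += result[j];
        }
        for (int j = 0; j < result.length; j++) {
            result[j] /= sum;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] p = probs(new double[] {0.0, Math.log(3.0)});
        System.out.println(p[0] + " " + p[1]);
    }
}
```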


distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if instance could not be classified successfully

toSource

public java.lang.String toSource(java.lang.String className)
                          throws java.lang.Exception
Returns the boosted model as Java source code.

Specified by:
toSource in interface Sourcable
Parameters:
className - the name that should be given to the source class.
Returns:
the boosted model as Java source code
Throws:
java.lang.Exception - if something goes wrong

toString

public java.lang.String toString()
Returns description of the boosted classifier.

Returns:
description of the boosted classifier as a string

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - the options