weka.classifiers.functions
Class PaceRegression

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.functions.PaceRegression
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class PaceRegression
extends Classifier
implements OptionHandler, WeightedInstancesHandler

Class for building pace regression linear models and using them for prediction.

Under regularity conditions, pace regression is provably optimal when the number of coefficients tends to infinity. It consists of a group of estimators that are either overall optimal or optimal under certain conditions.

Pace regression theory as currently developed, and therefore this implementation, does not handle:

- missing values
- non-binary nominal attributes
- cases where n - k is small, where n is the number of instances and k is the number of coefficients (the threshold used in this implementation is 20)
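
The last limitation can be read as a simple size guard. The sketch below is illustrative only: PaceSizeGuard and hasEnoughInstances are hypothetical names that are not part of Weka, and the sketch assumes that "small" means n - k falls strictly below the quoted threshold of 20. The check actually performed by PaceRegression may differ.

```java
/** Hypothetical helper, not part of Weka: illustrates the n - k guard. */
final class PaceSizeGuard {
    /** Threshold quoted in the class description. */
    static final int MIN_DIFFERENCE = 20;

    /**
     * Returns true when n - k reaches the threshold.
     * Assumption: "small" means strictly below MIN_DIFFERENCE.
     */
    static boolean hasEnoughInstances(int numInstances, int numCoefficients) {
        return numInstances - numCoefficients >= MIN_DIFFERENCE;
    }

    public static void main(String[] args) {
        System.out.println(hasEnoughInstances(100, 10)); // 90 >= 20
        System.out.println(hasEnoughInstances(25, 10));  // 15 < 20
    }
}
```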

Valid options are:

-D
Produce debugging output.

-E estimator
The estimator can be one of the following: eb, nested, subset, pace2, pace4, pace6, ols, olsc, aic, bic, ric.

-S threshold
Threshold for the olsc estimator.
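
To make the option syntax concrete, here is a minimal sketch of how an array of -D, -E and -S options could be interpreted. It is a hypothetical stand-in and does not use Weka's own option utilities. The estimator names are assumed from the estimator constants declared by this class (olsEstimator, ebEstimator and so on), and the default values are placeholders chosen for the sketch only.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical option parser, not Weka code. */
final class PaceOptionsSketch {
    /** Names assumed from the estimator constants of the class. */
    static final List<String> ESTIMATORS = Arrays.asList(
        "eb", "nested", "subset", "pace2", "pace4", "pace6",
        "ols", "olsc", "aic", "bic", "ric");

    boolean debug = false;
    String estimator = "eb"; // placeholder default for this sketch
    double threshold = 2.0;  // placeholder default for this sketch

    /** Parses the options; throws if an option is unknown or incomplete. */
    static PaceOptionsSketch parse(String[] options) {
        PaceOptionsSketch parsed = new PaceOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (flag.equals("-D")) {
                parsed.debug = true;
            } else if (flag.equals("-E") || flag.equals("-S")) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException(flag + " needs a value");
                }
                String value = options[++i];
                if (flag.equals("-E")) {
                    if (!ESTIMATORS.contains(value)) {
                        throw new IllegalArgumentException("Unknown estimator: " + value);
                    }
                    parsed.estimator = value;
                } else {
                    parsed.threshold = Double.parseDouble(value);
                }
            } else {
                throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        PaceOptionsSketch p = parse(new String[] {"-D", "-E", "olsc", "-S", "3.5"});
        System.out.println(p.debug + " " + p.estimator + " " + p.threshold);
    }
}
```

The -S value only matters when the olsc estimator is selected, so a real parser might ignore or reject it otherwise; this sketch simply stores it.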

REFERENCES

Wang, Y. (2000). "A new approach to fitting linear models in high dimensional spaces." PhD Thesis. Department of Computer Science, University of Waikato, New Zealand.

Wang, Y. and Witten, I. H. (2002). "Modeling for optimal probability prediction." Proceedings of ICML'2002. Sydney.

Version:
$Revision: 1.1 $
Author:
Yong Wang (yongwang@cs.waikato.ac.nz), Gabi Schmidberger (gabi@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private static int aicEstimator
           
private static int bicEstimator
           
private static int ebEstimator
           
private  int m_ClassIndex
          The index of the class attribute
private  double[] m_Coefficients
          Array for storing coefficients of linear regression.
private  boolean m_Debug
          True if debug output will be printed
(package private)  Instances m_Model
          The model used
private static int nestedEstimator
           
private static int olscEstimator
           
private  double olscThreshold
           
private static int olsEstimator
           
private static int pace2Estimator
           
private static int pace4Estimator
           
private static int pace6Estimator
           
private  int paceEstimator
           
private static int ricEstimator
           
private static int subsetEstimator
           
static Tag[] TAGS_ESTIMATOR
           
 
Constructor Summary
PaceRegression()
           
 
Method Summary
 void buildClassifier(Instances data)
          Builds a pace regression model for the given data.
 boolean checkForMissing(Instance instance, Instances model)
          Checks if an instance has a missing value.
 boolean checkForMissing(Instances data)
          Checks if instances have a missing value.
 boolean checkForNonBinary(Instances data)
          Checks if any of the nominal attributes is non-binary.
 double classifyInstance(Instance instance)
          Classifies the given instance using the linear regression function.
 double[] coefficients()
          Returns the coefficients for this linear model.
 java.lang.String debugTipText()
          Returns the tip text for this property
 java.lang.String estimatorTipText()
          Returns the tip text for this property
 boolean getDebug()
          Returns whether debugging output will be printed
 SelectedTag getEstimator()
          Gets the estimator
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 double getThreshold()
          Gets the threshold for olsc estimator
private  double[][] getTransformedDataMatrix(Instances data, int classIndex)
          Transforms dataset into a two-dimensional array.
 java.lang.String globalInfo()
          Returns a string describing this classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Generates a linear regression function predictor.
 int numParameters()
          Get the number of coefficients used in the model
private  double[] pace(double[][] matrix_X, double[] vector_Y)
          Performs pace regression.
private  double regressionPrediction(Instance transformedInstance, double[] coefficients)
          Calculate the dependent value for a given instance for a given regression model.
 void setDebug(boolean debug)
          Controls whether debugging output will be printed
 void setEstimator(SelectedTag estimator)
          Sets the estimator.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setThreshold(double newThreshold)
          Set threshold for the olsc estimator
 java.lang.String thresholdTipText()
          Returns the tip text for this property
 java.lang.String toString()
          Outputs the linear regression model as a string.
 
Methods inherited from class weka.classifiers.Classifier
distributionForInstance, forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Model

Instances m_Model
The model used


m_Coefficients

private double[] m_Coefficients
Array for storing coefficients of linear regression.


m_ClassIndex

private int m_ClassIndex
The index of the class attribute


m_Debug

private boolean m_Debug
True if debug output will be printed


olsEstimator

private static final int olsEstimator
See Also:
Constant Field Values

ebEstimator

private static final int ebEstimator
See Also:
Constant Field Values

nestedEstimator

private static final int nestedEstimator
See Also:
Constant Field Values

subsetEstimator

private static final int subsetEstimator
See Also:
Constant Field Values

pace2Estimator

private static final int pace2Estimator
See Also:
Constant Field Values

pace4Estimator

private static final int pace4Estimator
See Also:
Constant Field Values

pace6Estimator

private static final int pace6Estimator
See Also:
Constant Field Values

olscEstimator

private static final int olscEstimator
See Also:
Constant Field Values

aicEstimator

private static final int aicEstimator
See Also:
Constant Field Values

bicEstimator

private static final int bicEstimator
See Also:
Constant Field Values

ricEstimator

private static final int ricEstimator
See Also:
Constant Field Values

TAGS_ESTIMATOR

public static final Tag[] TAGS_ESTIMATOR

paceEstimator

private int paceEstimator

olscThreshold

private double olscThreshold
Constructor Detail

PaceRegression

public PaceRegression()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Builds a pace regression model for the given data.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data to be used for generating the linear regression function
Throws:
java.lang.Exception - if the classifier could not be built successfully

pace

private double[] pace(double[][] matrix_X,
                      double[] vector_Y)
Performs pace regression.

Parameters:
matrix_X - matrix with observations
vector_Y - vektor with class values
Returns:
vector with coefficients
Throws:
java.lang.Exception - if pace regression cannot be done successfully
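
The method maps an observation matrix and a vector of class values to a vector of coefficients. The pace estimators themselves do not fit in a short example, so the sketch below substitutes plain ordinary least squares, solved through the normal equations, purely to illustrate the input and output shapes. It is not the algorithm used by this method, and OlsSketch and fit are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical illustration: ordinary least squares, NOT pace regression. */
final class OlsSketch {
    /** Solves (X^T X) b = X^T y by Gaussian elimination with partial pivoting. */
    static double[] fit(double[][] x, double[] y) {
        int n = x.length;
        int k = x[0].length;
        // Build the augmented system [X^T X | X^T y].
        double[][] a = new double[k][k + 1];
        for (int r = 0; r < k; r++) {
            for (int c = 0; c < k; c++) {
                for (int i = 0; i < n; i++) a[r][c] += x[i][r] * x[i][c];
            }
            for (int i = 0; i < n; i++) a[r][k] += x[i][r] * y[i];
        }
        // Forward elimination with partial pivoting.
        for (int col = 0; col < k; col++) {
            int pivot = col;
            for (int r = col + 1; r < k; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new IllegalArgumentException("X^T X is singular");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            for (int r = col + 1; r < k; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= k; c++) a[r][c] -= factor * a[col][c];
            }
        }
        // Back substitution.
        double[] b = new double[k];
        for (int r = k - 1; r >= 0; r--) {
            double sum = a[r][k];
            for (int c = r + 1; c < k; c++) sum -= a[r][c] * b[c];
            b[r] = sum / a[r][r];
        }
        return b;
    }

    public static void main(String[] args) {
        // First column is the intercept term; the data follow y = 1 + 2x exactly.
        double[][] x = {{1, 0}, {1, 1}, {1, 2}, {1, 3}};
        double[] y = {1, 3, 5, 7};
        System.out.println(Arrays.toString(fit(x, y)));
    }
}
```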

checkForMissing

public boolean checkForMissing(Instances data)
Checks if instances have a missing value.

Parameters:
data - the data set
Returns:
true if a missing value is present in the data set

checkForMissing

public boolean checkForMissing(Instance instance,
                               Instances model)
Checks if an instance has a missing value.

Parameters:
instance - the instance
Returns:
true if missing value is present
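
Both checkForMissing overloads amount to a scan for missing entries. A minimal sketch over plain arrays follows, assuming that a missing value is encoded as NaN. MissingValueSketch and hasMissing are hypothetical names, and the sketch does not use the Instance or Instances types.

```java
/** Hypothetical illustration over plain arrays, not Weka code. */
final class MissingValueSketch {
    /** Returns true if any value in the row is missing (assumed: NaN). */
    static boolean hasMissing(double[] row) {
        for (double value : row) {
            if (Double.isNaN(value)) return true;
        }
        return false;
    }

    /** Returns true if any row of the data set has a missing value. */
    static boolean hasMissing(double[][] rows) {
        for (double[] row : rows) {
            if (hasMissing(row)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        double[][] data = {{1.0, 2.0}, {3.0, Double.NaN}};
        System.out.println(hasMissing(data[0]));
        System.out.println(hasMissing(data));
    }
}
```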

checkForNonBinary

public boolean checkForNonBinary(Instances data)
Checks if any of the nominal attributes is non-binary.

Parameters:
data - the data set
Returns:
true if a non-binary nominal attribute is present
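
The check can be pictured as a pass over the attribute declarations. In the sketch below the data set header is reduced to an array holding the number of labels of each attribute, with 0 standing for a numeric attribute. That representation, and the names NonBinarySketch and hasNonBinaryNominal, are assumptions made for illustration only.

```java
/** Hypothetical illustration, not Weka code. */
final class NonBinarySketch {
    /**
     * labelCounts[i] is the number of labels of attribute i,
     * or 0 if attribute i is numeric (convention assumed here).
     * Returns true if any nominal attribute does not have exactly two labels.
     */
    static boolean hasNonBinaryNominal(int[] labelCounts) {
        for (int count : labelCounts) {
            boolean nominal = count != 0;
            if (nominal && count != 2) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasNonBinaryNominal(new int[] {0, 2, 0})); // numeric, binary, numeric
        System.out.println(hasNonBinaryNominal(new int[] {0, 3}));    // numeric, three labels
    }
}
```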

getTransformedDataMatrix

private double[][] getTransformedDataMatrix(Instances data,
                                            int classIndex)
Transforms dataset into a two-dimensional array.

Parameters:
data - dataset
classIndex - index of the class attribute
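
One plausible shape for such a transformation is sketched below. It assumes that the result carries a leading column of ones for the intercept and omits the class column. That layout is an assumption, and TransformSketch and toDesignMatrix are hypothetical names working on plain arrays rather than on Instances.

```java
import java.util.Arrays;

/** Hypothetical illustration over plain arrays, not Weka code. */
final class TransformSketch {
    /**
     * Copies each row into a new row that starts with 1.0 (intercept)
     * and skips the value at classIndex. All rows must have equal length.
     */
    static double[][] toDesignMatrix(double[][] rows, int classIndex) {
        double[][] result = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            int numAttributes = rows[i].length;
            if (classIndex < 0 || classIndex >= numAttributes) {
                throw new IllegalArgumentException("classIndex out of range");
            }
            // The intercept slot takes the place of the dropped class slot.
            double[] out = new double[numAttributes];
            out[0] = 1.0;
            int column = 1;
            for (int j = 0; j < numAttributes; j++) {
                if (j != classIndex) out[column++] = rows[i][j];
            }
            result[i] = out;
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] rows = {{2, 3, 10}, {4, 5, 20}};
        System.out.println(Arrays.deepToString(toDesignMatrix(rows, 2)));
    }
}
```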

classifyInstance

public double classifyInstance(Instance instance)
                        throws java.lang.Exception
Classifies the given instance using the linear regression function.

Overrides:
classifyInstance in class Classifier
Parameters:
instance - the test instance
Returns:
the classification
Throws:
java.lang.Exception - if classification can't be done successfully

toString

public java.lang.String toString()
Outputs the linear regression model as a string.


listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

coefficients

public double[] coefficients()
Returns the coefficients for this linear model.


getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

numParameters

public int numParameters()
Get the number of coefficients used in the model

Returns:
the number of coefficients

debugTipText

public java.lang.String debugTipText()
Returns the tip text for this property

Overrides:
debugTipText in class Classifier
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDebug

public void setDebug(boolean debug)
Controls whether debugging output will be printed

Overrides:
setDebug in class Classifier
Parameters:
debug - true if debugging output should be printed

getDebug

public boolean getDebug()
Returns whether debugging output will be printed

Overrides:
getDebug in class Classifier
Returns:
true if debugging output is on

estimatorTipText

public java.lang.String estimatorTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getEstimator

public SelectedTag getEstimator()
Gets the estimator

Returns:
the estimator

setEstimator

public void setEstimator(SelectedTag estimator)
Sets the estimator.

Parameters:
estimator - the new estimator

thresholdTipText

public java.lang.String thresholdTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setThreshold

public void setThreshold(double newThreshold)
Set threshold for the olsc estimator


getThreshold

public double getThreshold()
Gets the threshold for olsc estimator

Returns:
the threshold

regressionPrediction

private double regressionPrediction(Instance transformedInstance,
                                    double[] coefficients)
                             throws java.lang.Exception
Calculate the dependent value for a given instance for a given regression model.

Parameters:
transformedInstance - the input instance
coefficients - an array of coefficients for the regression model
Returns:
the regression value for the instance.
Throws:
java.lang.Exception - if the class attribute of the input instance is not assigned
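
For a linear model the dependent value is the intercept plus a weighted sum of the attribute values. The sketch below shows that computation on a plain array, assuming that coefficients[0] is the intercept and that the remaining coefficients follow the attribute order with the class attribute skipped. PredictionSketch and predict are hypothetical names, and the coefficient layout is an assumption.

```java
/** Hypothetical illustration over plain arrays, not Weka code. */
final class PredictionSketch {
    /**
     * Computes coefficients[0] + sum of coefficient * value over all
     * attributes except the one at classIndex.
     */
    static double predict(double[] values, int classIndex, double[] coefficients) {
        double result = coefficients[0]; // intercept (assumed position)
        int column = 0;
        for (int j = 0; j < values.length; j++) {
            if (j != classIndex) {
                column++;
                result += coefficients[column] * values[j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] values = {2.0, 3.0, 0.0};      // last entry is the class slot
        double[] coefficients = {1.0, 0.5, 2.0};
        // 1.0 + 0.5 * 2.0 + 2.0 * 3.0
        System.out.println(predict(values, 2, coefficients)); // prints 8.0
    }
}
```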

main

public static void main(java.lang.String[] argv)
Generates a linear regression function predictor.