weka.classifiers.functions
Class LinearRegression

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.functions.LinearRegression
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class LinearRegression
extends Classifier
implements OptionHandler, WeightedInstancesHandler

Class for using linear regression for prediction. Uses the Akaike criterion for model selection, and is able to deal with weighted instances.

Valid options are:

-D
Produce debugging output.

-S num
Set the attribute selection method to use: 0 = M5' method (default), 1 = None, 2 = Greedy.

-C
Do not try to eliminate colinear attributes

-R num
The ridge parameter (default 1.0e-8)

Version:
$Revision: 1.19 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz), Len Trigg (trigg@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  boolean b_Debug
          True if debug output will be printed
private  int m_AttributeSelection
          The current attribute selection method
private  boolean m_checksTurnedOff
          Turn off all checks and conversions?
private  int m_ClassIndex
          The index of the class attribute
private  double m_ClassMean
          The mean of the class attribute
private  double m_ClassStdDev
          The standard deviation of the class attribute
private  double[] m_Coefficients
          Array for storing coefficients of linear regression.
private  boolean m_EliminateColinearAttributes
          Try to eliminate correlated attributes?
private  double[] m_Means
          The attribute means
private  ReplaceMissingValues m_MissingFilter
          The filter for replacing missing values.
private  double m_Ridge
          The ridge parameter
private  boolean[] m_SelectedAttributes
          Which attributes are relevant?
private  double[] m_StdDevs
          The attribute standard deviations
private  Instances m_TransformedData
          Variable for storing transformed training data.
private  NominalToBinary m_TransformFilter
          The filter storing the transformation from nominal to binary attributes.
static int SELECTION_GREEDY
           
static int SELECTION_M5
           
static int SELECTION_NONE
           
static Tag[] TAGS_SELECTION
           
 
Fields inherited from class weka.classifiers.Classifier
m_Debug
 
Constructor Summary
LinearRegression()
           
 
Method Summary
 java.lang.String attributeSelectionMethodTipText()
          Returns the tip text for this property
 void buildClassifier(Instances data)
          Builds a regression model for the given data.
private  double calculateSE(boolean[] selectedAttributes, double[] coefficients)
          Calculate the squared error of a regression model on the training data
 double classifyInstance(Instance instance)
          Classifies the given instance using the linear regression function.
 double[] coefficients()
          Returns the coefficients for this linear model.
 java.lang.String debugTipText()
          Returns the tip text for this property
private  boolean deselectColinearAttributes(boolean[] selectedAttributes, double[] coefficients)
          Removes the attribute with the highest standardised coefficient greater than 1.5 from the selected attributes.
private  double[] doRegression(boolean[] selectedAttributes)
          Calculate a linear regression using the selected attributes
 java.lang.String eliminateColinearAttributesTipText()
          Returns the tip text for this property
private  void findBestModel()
          Performs a greedy search for the best regression model using Akaike's criterion.
 SelectedTag getAttributeSelectionMethod()
          Gets the method used to select attributes for use in the linear regression.
 boolean getDebug()
          Returns whether debugging output will be printed
 boolean getEliminateColinearAttributes()
          Get the value of EliminateColinearAttributes.
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 double getRidge()
          Get the value of Ridge.
 java.lang.String globalInfo()
          Returns a string describing this classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Generates a linear regression function predictor.
 int numParameters()
          Get the number of coefficients used in the model
private  double regressionPrediction(Instance transformedInstance, boolean[] selectedAttributes, double[] coefficients)
          Calculate the dependent value for a given instance for a given regression model.
 java.lang.String ridgeTipText()
          Returns the tip text for this property
 void setAttributeSelectionMethod(SelectedTag method)
          Sets the method used to select attributes for use in the linear regression.
 void setDebug(boolean debug)
          Controls whether debugging output will be printed
 void setEliminateColinearAttributes(boolean newEliminateColinearAttributes)
          Set the value of EliminateColinearAttributes.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setRidge(double newRidge)
          Set the value of Ridge.
 java.lang.String toString()
          Outputs the linear regression model as a string.
 void turnChecksOff()
          Turns off checks for missing values, etc.
 void turnChecksOn()
          Turns on checks for missing values, etc.
 
Methods inherited from class weka.classifiers.Classifier
distributionForInstance, forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Coefficients

private double[] m_Coefficients
Array for storing coefficients of linear regression.


m_SelectedAttributes

private boolean[] m_SelectedAttributes
Which attributes are relevant?


m_TransformedData

private Instances m_TransformedData
Variable for storing transformed training data.


m_MissingFilter

private ReplaceMissingValues m_MissingFilter
The filter for replacing missing values.


m_TransformFilter

private NominalToBinary m_TransformFilter
The filter storing the transformation from nominal to binary attributes.


m_ClassStdDev

private double m_ClassStdDev
The standard deviation of the class attribute


m_ClassMean

private double m_ClassMean
The mean of the class attribute


m_ClassIndex

private int m_ClassIndex
The index of the class attribute


m_Means

private double[] m_Means
The attribute means


m_StdDevs

private double[] m_StdDevs
The attribute standard deviations


b_Debug

private boolean b_Debug
True if debug output will be printed


m_AttributeSelection

private int m_AttributeSelection
The current attribute selection method


SELECTION_M5

public static final int SELECTION_M5
See Also:
Constant Field Values

SELECTION_NONE

public static final int SELECTION_NONE
See Also:
Constant Field Values

SELECTION_GREEDY

public static final int SELECTION_GREEDY
See Also:
Constant Field Values

TAGS_SELECTION

public static final Tag[] TAGS_SELECTION

m_EliminateColinearAttributes

private boolean m_EliminateColinearAttributes
Try to eliminate correlated attributes?


m_checksTurnedOff

private boolean m_checksTurnedOff
Turn off all checks and conversions?


m_Ridge

private double m_Ridge
The ridge parameter

Constructor Detail

LinearRegression

public LinearRegression()
Method Detail

turnChecksOff

public void turnChecksOff()
Turns off checks for missing values, etc. Use with caution. Also turns off scaling.


turnChecksOn

public void turnChecksOn()
Turns on checks for missing values, etc. Also turns on scaling.


globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Builds a regression model for the given data.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data to be used for generating the linear regression function
Throws:
java.lang.Exception - if the classifier could not be built successfully

classifyInstance

public double classifyInstance(Instance instance)
                        throws java.lang.Exception
Classifies the given instance using the linear regression function.

Overrides:
classifyInstance in class Classifier
Parameters:
instance - the test instance
Returns:
the classification
Throws:
java.lang.Exception - if classification can't be done successfully

toString

public java.lang.String toString()
Outputs the linear regression model as a string.


listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-D
Produce debugging output.

-S num
Set the attribute selection method to use: 0 = M5' method (default), 1 = None, 2 = Greedy.

-C
Do not try to eliminate colinear attributes

-R num
The ridge parameter (default 1.0e-8)

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

coefficients

public double[] coefficients()
Returns the coefficients for this linear model.


getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

ridgeTipText

public java.lang.String ridgeTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getRidge

public double getRidge()
Get the value of Ridge.

Returns:
Value of Ridge.

setRidge

public void setRidge(double newRidge)
Set the value of Ridge.

Parameters:
newRidge - Value to assign to Ridge.

eliminateColinearAttributesTipText

public java.lang.String eliminateColinearAttributesTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getEliminateColinearAttributes

public boolean getEliminateColinearAttributes()
Get the value of EliminateColinearAttributes.

Returns:
Value of EliminateColinearAttributes.

setEliminateColinearAttributes

public void setEliminateColinearAttributes(boolean newEliminateColinearAttributes)
Set the value of EliminateColinearAttributes.

Parameters:
newEliminateColinearAttributes - Value to assign to EliminateColinearAttributes.

numParameters

public int numParameters()
Get the number of coefficients used in the model

Returns:
the number of coefficients

attributeSelectionMethodTipText

public java.lang.String attributeSelectionMethodTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setAttributeSelectionMethod

public void setAttributeSelectionMethod(SelectedTag method)
Sets the method used to select attributes for use in the linear regression.

Parameters:
method - the attribute selection method to use.

getAttributeSelectionMethod

public SelectedTag getAttributeSelectionMethod()
Gets the method used to select attributes for use in the linear regression.

Returns:
the method to use.

debugTipText

public java.lang.String debugTipText()
Returns the tip text for this property

Overrides:
debugTipText in class Classifier
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDebug

public void setDebug(boolean debug)
Controls whether debugging output will be printed

Overrides:
setDebug in class Classifier
Parameters:
debug - true if debugging output should be printed

getDebug

public boolean getDebug()
Returns whether debugging output will be printed

Overrides:
getDebug in class Classifier
Returns:
true if debugging output is on

deselectColinearAttributes

private boolean deselectColinearAttributes(boolean[] selectedAttributes,
                                           double[] coefficients)
Removes the attribute with the highest standardised coefficient greater than 1.5 from the selected attributes.

Parameters:
selectedAttributes - an array of flags indicating which attributes are included in the regression model
coefficients - an array of coefficients for the regression model
Returns:
true if an attribute was removed
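
The description above can be read as a single pass over the selected attributes. The following is a minimal sketch of that reading, not the class's actual code. It assumes that a standardised coefficient is the absolute value of coefficient * attribute standard deviation / class standard deviation, and that the coefficients array holds one entry per selected attribute, in attribute order. The class name ColinearSketch and the method name deselectLargest are hypothetical.

```java
final class ColinearSketch {

    /**
     * Deselects the attribute with the largest standardised coefficient,
     * provided that value exceeds 1.5. Returns true if one was removed.
     * Assumption: coefficients[] is indexed by position among the selected
     * attributes, not by attribute index.
     */
    static boolean deselectLargest(boolean[] selected, double[] coefficients,
                                   double[] stdDevs, double classStdDev) {
        double maxSC = 1.5;   // threshold taken from the method description
        int maxAttr = -1;     // attribute index of the current maximum
        int column = 0;       // position in the compact coefficients array
        for (int i = 0; i < selected.length; i++) {
            if (selected[i]) {
                double sc = Math.abs(coefficients[column] * stdDevs[i] / classStdDev);
                if (sc > maxSC) {
                    maxSC = sc;
                    maxAttr = i;
                }
                column++;
            }
        }
        if (maxAttr >= 0) {
            selected[maxAttr] = false;
            return true;
        }
        return false;
    }
}
```

A caller would typically refit the regression after each removal and repeat until the method returns false.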

findBestModel

private void findBestModel()
                    throws java.lang.Exception
Performs a greedy search for the best regression model using Akaike's criterion.

Throws:
java.lang.Exception - if regression can't be done
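
This page does not state which form of Akaike's criterion the search uses. As an illustration only, the sketch below scores candidate models with the textbook least-squares form, AIC = n * ln(SSE / n) + 2k, where n is the number of instances, SSE the sum of squared errors and k the number of parameters. A greedy search would keep the candidate with the lowest score at each step. The class name AkaikeSketch and its methods are hypothetical, and the formula is an assumption rather than this class's own.

```java
final class AkaikeSketch {

    /** Textbook least-squares AIC: n * ln(SSE / n) + 2k. Lower is better. */
    static double aic(double sumSquaredError, int numInstances, int numParameters) {
        return numInstances * Math.log(sumSquaredError / numInstances)
                + 2.0 * numParameters;
    }

    /**
     * Returns the index of the candidate with the lowest AIC.
     * sse[i] and numParams[i] describe candidate i; all candidates are
     * assumed to be fitted on the same numInstances training instances.
     */
    static int best(double[] sse, int[] numParams, int numInstances) {
        int bestIndex = 0;
        double bestScore = aic(sse[0], numInstances, numParams[0]);
        for (int i = 1; i < sse.length; i++) {
            double score = aic(sse[i], numInstances, numParams[i]);
            if (score < bestScore) {
                bestScore = score;
                bestIndex = i;
            }
        }
        return bestIndex;
    }
}
```

The criterion trades fit against size: a candidate with two extra parameters wins only if it reduces the error enough to offset the added penalty of 4.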

calculateSE

private double calculateSE(boolean[] selectedAttributes,
                           double[] coefficients)
                    throws java.lang.Exception
Calculate the squared error of a regression model on the training data

Parameters:
selectedAttributes - an array of flags indicating which attributes are included in the regression model
coefficients - an array of coefficients for the regression model
Returns:
the mean squared error on the training data
Throws:
java.lang.Exception - if there is a missing class value in the training data
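
The summary line speaks of "the squared error" while the Returns line speaks of "the mean squared error", and the page does not say which of the two is actually returned. The sketch below simply shows both quantities side by side for arrays of predicted and actual class values. The class name SquaredErrorSketch and its methods are hypothetical and are not part of this class.

```java
final class SquaredErrorSketch {

    /** Sum over all instances of (predicted - actual)^2. */
    static double sumSquaredError(double[] predicted, double[] actual) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double error = predicted[i] - actual[i];
            sum += error * error;
        }
        return sum;
    }

    /** Sum of squared errors divided by the number of instances. */
    static double meanSquaredError(double[] predicted, double[] actual) {
        return sumSquaredError(predicted, actual) / predicted.length;
    }
}
```

The two differ only by the constant factor 1/n, so either one ranks models identically when all models are evaluated on the same training data.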

regressionPrediction

private double regressionPrediction(Instance transformedInstance,
                                    boolean[] selectedAttributes,
                                    double[] coefficients)
                             throws java.lang.Exception
Calculate the dependent value for a given instance for a given regression model.

Parameters:
transformedInstance - the input instance
selectedAttributes - an array of flags indicating which attributes are included in the regression model
coefficients - an array of coefficients for the regression model
Returns:
the regression value for the instance.
Throws:
java.lang.Exception - if the class attribute of the input instance is not assigned
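
A prediction from a linear model is the intercept plus the sum of coefficient * value over the selected attributes. The sketch below assumes a compact coefficient layout, with one entry per selected non-class attribute in attribute order followed by the intercept as the last entry. That layout is an assumption, and the class name PredictionSketch is hypothetical. Plain double arrays stand in for the Instance argument.

```java
final class PredictionSketch {

    /**
     * Computes intercept + sum(coefficient * value) over selected attributes.
     * Assumption: coefficients[] has one entry per selected non-class
     * attribute, followed by the intercept.
     */
    static double predict(double[] values, boolean[] selected,
                          int classIndex, double[] coefficients) {
        double result = 0.0;
        int column = 0; // position in the compact coefficients array
        for (int j = 0; j < values.length; j++) {
            if (j != classIndex && selected[j]) {
                result += coefficients[column] * values[j];
                column++;
            }
        }
        return result + coefficients[column]; // last entry is the intercept
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 0.0};        // index 3 is the class
        boolean[] selected = {true, false, true, false};
        double[] coefficients = {2.0, 0.5, 1.0};       // two slopes + intercept
        // 2.0 * 1.0 + 0.5 * 3.0 + 1.0
        System.out.println(predict(values, selected, 3, coefficients)); // prints 4.5
    }
}
```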

doRegression

private double[] doRegression(boolean[] selectedAttributes)
                       throws java.lang.Exception
Calculate a linear regression using the selected attributes

Parameters:
selectedAttributes - an array of booleans where each element is true if the corresponding attribute should be included in the regression.
Returns:
an array of coefficients for the linear regression model.
Throws:
java.lang.Exception - if an error occurred during the regression.
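
To show what the ridge parameter does, here is a minimal, self-contained sketch of ridge regression. It is not this class's implementation. It centres the data, solves (X'X + ridge * I) b = X'y by Gaussian elimination with partial pivoting, and recovers the intercept from the means, so the intercept itself is not penalised. The class name RidgeSketch, the method fit and the choice of solver are all assumptions made for illustration.

```java
final class RidgeSketch {

    /** Returns one coefficient per column of x, followed by the intercept. */
    static double[] fit(double[][] x, double[] y, double ridge) {
        int n = x.length;
        int p = x[0].length;

        // Column means of x and the mean of y, used for centring.
        double[] xMean = new double[p];
        double yMean = 0.0;
        for (int i = 0; i < n; i++) {
            yMean += y[i] / n;
            for (int j = 0; j < p; j++) xMean[j] += x[i][j] / n;
        }

        // Augmented matrix [X'X + ridge * I | X'y] built from centred data.
        double[][] a = new double[p][p + 1];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < p; j++) {
                double xj = x[i][j] - xMean[j];
                for (int k = 0; k < p; k++) a[j][k] += xj * (x[i][k] - xMean[k]);
                a[j][p] += xj * (y[i] - yMean);
            }
        }
        for (int j = 0; j < p; j++) a[j][j] += ridge;

        // Forward elimination with partial pivoting.
        for (int col = 0; col < p; col++) {
            int pivot = col;
            for (int r = col + 1; r < p; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmp = a[col]; a[col] = a[pivot]; a[pivot] = tmp;
            for (int r = col + 1; r < p; r++) {
                double factor = a[r][col] / a[col][col];
                for (int k = col; k <= p; k++) a[r][k] -= factor * a[col][k];
            }
        }

        // Back substitution, then the intercept from the means.
        double[] result = new double[p + 1];
        for (int j = p - 1; j >= 0; j--) {
            double s = a[j][p];
            for (int k = j + 1; k < p; k++) s -= a[j][k] * result[k];
            result[j] = s / a[j][j];
        }
        double intercept = yMean;
        for (int j = 0; j < p; j++) intercept -= result[j] * xMean[j];
        result[p] = intercept;
        return result;
    }
}
```

With a ridge as small as the documented default of 1.0e-8, the result is practically the ordinary least-squares fit; the added diagonal term mainly keeps the system solvable when attributes are nearly collinear. Larger values shrink the coefficients towards zero.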

main

public static void main(java.lang.String[] argv)
Generates a linear regression function predictor.