java.lang.Object
  weka.classifiers.BVDecomposeSegCVSub
This class performs bias-variance decomposition on any classifier using the sub-sampled cross-validation procedure as specified in:
Geoffrey I. Webb & Paul Conilione (2002), Estimating bias and variance from data , School of Computer Science and Software Engineering, Monash University, Australia
The Kohavi and Wolpert definition of bias and variance is specified in:
R. Kohavi & D. Wolpert (1996), Bias plus variance decomposition for zero-one loss functions, in Proc. of the Thirteenth International Machine Learning Conference (ICML96).
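As a rough illustration of the Kohavi and Wolpert terms for a single instance under zero-one loss, the sketch below computes bias squared, variance and sigma squared from two class distributions. It follows the formulas in the cited paper, not this class's source code; the class name `KohaviWolpertSketch` and the method `decompose` are hypothetical and are not part of Weka.

```java
class KohaviWolpertSketch {
    /**
     * Hypothetical helper (not Weka API). Returns {bias^2, variance, sigma^2}
     * for one instance.
     * targetProbs[y]    = probability that the true class is y
     * predictedProbs[y] = fraction of trained classifiers that predicted y
     */
    static double[] decompose(double[] targetProbs, double[] predictedProbs) {
        if (targetProbs.length != predictedProbs.length) {
            throw new IllegalArgumentException("distributions must have equal length");
        }
        double sumSqDiff = 0.0;
        double sumPredSq = 0.0;
        double sumTargetSq = 0.0;
        for (int y = 0; y < targetProbs.length; y++) {
            double diff = targetProbs[y] - predictedProbs[y];
            sumSqDiff += diff * diff;
            sumPredSq += predictedProbs[y] * predictedProbs[y];
            sumTargetSq += targetProbs[y] * targetProbs[y];
        }
        double biasSquared = 0.5 * sumSqDiff;
        double variance = 0.5 * (1.0 - sumPredSq);
        double sigmaSquared = 0.5 * (1.0 - sumTargetSq);
        return new double[] {biasSquared, variance, sigmaSquared};
    }
}
```

When the true class is known with certainty, sigma squared is zero and bias squared plus variance equals the fraction of wrong predictions for that instance.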
The Webb definition of bias and variance is specified in:
Geoffrey I. Webb (2000), MultiBoosting: A Technique for Combining Boosting and Wagging, Machine Learning, 40(2), pages 159-196
Valid options are:
-c num
Specify the index of the class attribute (default last).
-D
Turn on debugging output.
-l num
Set the number of times each instance is to be classified (default 10).
-p num
Set the proportion of instances that are the same between any two
training sets. Training set size/(Dataset size - 1) < num < 1.0
(Default is Training set size/(Dataset size - 1) )
-s num
Set the seed for the dataset randomisation (default 1).
-t filename
Set the arff file to use for the decomposition (required).
-T num
Set the size of the training sets. Must be greater than 0 and
less than the size of the dataset. (default half of dataset size)
-W classname
Specify the full class name of a learner to perform the
decomposition on (required).
Options after -- are passed to the designated sub-learner.
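The constraints on -T and -p above can be sketched as a few small checks. This is only an illustration of the stated bounds, not the class's own validation code: the class `SegCVSubDefaults` and its methods are hypothetical, integer division for "half of dataset size" is an assumption, and the lower bound on -p is treated as inclusive because the documented default equals that bound.

```java
class SegCVSubDefaults {
    /** Assumed default for -T: half of the dataset size (integer division). */
    static int defaultTrainSize(int datasetSize) {
        return datasetSize / 2;
    }

    /** Lower bound (and documented default) for -p: trainSize / (datasetSize - 1). */
    static double minProportion(int trainSize, int datasetSize) {
        return (double) trainSize / (datasetSize - 1);
    }

    /** -T must be greater than 0 and less than the size of the dataset. */
    static boolean isValidTrainSize(int trainSize, int datasetSize) {
        return trainSize > 0 && trainSize < datasetSize;
    }

    /** -p must lie between the lower bound (assumed inclusive) and 1.0 (exclusive). */
    static boolean isValidProportion(double p, int trainSize, int datasetSize) {
        return p >= minProportion(trainSize, datasetSize) && p < 1.0;
    }
}
```

For a dataset of 100 instances, this sketch gives a default training set size of 50 and a default proportion of 50/99.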
Field Summary | |
protected Classifier |
m_Classifier
An instantiated base classifier used for getting and testing options. |
protected java.lang.String[] |
m_ClassifierOptions
The options to be passed to the base classifier. |
protected int |
m_ClassifyIterations
The number of times an instance is classified |
protected int |
m_ClassIndex
The index of the class attribute |
protected java.lang.String |
m_DataFileName
The name of the data file used for the decomposition |
protected boolean |
m_Debug
Debugging mode, gives extra output if true. |
protected double |
m_Error
The error rate |
protected double |
m_KWBias
The calculated Kohavi & Wolpert bias (squared) |
protected double |
m_KWSigma
The calculated Kohavi & Wolpert sigma |
protected double |
m_KWVariance
The calculated Kohavi & Wolpert variance |
protected double |
m_P
Proportion of instances common between any two training sets. |
protected int |
m_Seed
The random number seed |
protected int |
m_TrainSize
The training set size |
protected double |
m_WBias
The calculated Webb bias |
protected double |
m_WVariance
The calculated Webb variance |
Constructor Summary | |
BVDecomposeSegCVSub()
|
Method Summary | |
void |
decompose()
Carry out the bias-variance decomposition using the sub-sampled cross-validation method. |
java.util.Vector |
findCentralTendencies(double[] predProbs)
Finds the central tendency, given the classifications for an instance. |
Classifier |
getClassifier()
Gets the classifier being analysed |
int |
getClassifyIterations()
Gets the number of times an instance is classified |
int |
getClassIndex()
Get the index (starting from 1) of the attribute used as the class. |
java.lang.String |
getDataFileName()
Get the name of the data file used for the decomposition |
boolean |
getDebug()
Gets whether debugging is turned on |
double |
getError()
Get the calculated error rate |
double |
getKWBias()
Get the calculated bias squared according to the Kohavi and Wolpert definition |
double |
getKWSigma()
Get the calculated sigma according to the Kohavi and Wolpert definition |
double |
getKWVariance()
Get the calculated variance according to the Kohavi and Wolpert definition |
java.lang.String[] |
getOptions()
Gets the current settings of this BVDecomposeSegCVSub. |
double |
getP()
Get the proportion of instances that are common between two training sets. |
int |
getSeed()
Gets the random number seed |
int |
getTrainSize()
Get the training size |
double |
getWBias()
Get the calculated bias according to the Webb definition |
double |
getWVariance()
Get the calculated variance according to the Webb definition |
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options. |
static void |
main(java.lang.String[] args)
Test method for this class |
void |
randomize(int[] index,
java.util.Random random)
Accepts an array of ints and randomises the values in the array, using the random seed. |
void |
setClassifier(Classifier newClassifier)
Sets the classifier being analysed |
void |
setClassifyIterations(int classifyIterations)
Sets the number of times an instance is classified |
void |
setClassIndex(int classIndex)
Sets the index (starting from 1) of the attribute used as the class |
void |
setDataFileName(java.lang.String dataFileName)
Sets the name of the dataset file. |
void |
setDebug(boolean debug)
Sets debugging mode |
void |
setOptions(java.lang.String[] options)
Sets the OptionHandler's options using the given list. |
void |
setP(double proportion)
Set the proportion of instances that are common between two training sets used to train a classifier. |
void |
setSeed(int seed)
Sets the random number seed |
void |
setTrainSize(int size)
Set the training size. |
java.lang.String |
toString()
Returns description of the bias-variance decomposition results. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
protected boolean m_Debug
protected Classifier m_Classifier
protected java.lang.String[] m_ClassifierOptions
protected int m_ClassifyIterations
protected java.lang.String m_DataFileName
protected int m_ClassIndex
protected int m_Seed
protected double m_KWBias
protected double m_KWVariance
protected double m_KWSigma
protected double m_WBias
protected double m_WVariance
protected double m_Error
protected int m_TrainSize
protected double m_P
Constructor Detail |
public BVDecomposeSegCVSub()
Method Detail |
public java.util.Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Specified by: setOptions in interface OptionHandler
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
public void setClassifier(Classifier newClassifier)
Parameters: newClassifier - the Classifier to use.
public Classifier getClassifier()
public void setDebug(boolean debug)
Parameters: debug - true if debug output should be printed
public boolean getDebug()
public void setSeed(int seed)
public int getSeed()
public void setClassifyIterations(int classifyIterations)
Parameters: classifyIterations - number of times an instance is classified
public int getClassifyIterations()
public void setDataFileName(java.lang.String dataFileName)
Parameters: dataFileName - name of dataset file.
public java.lang.String getDataFileName()
public int getClassIndex()
public void setClassIndex(int classIndex)
Parameters: classIndex - the index (starting from 1) of the class attribute
public double getKWBias()
public double getWBias()
public double getKWVariance()
public double getWVariance()
public double getKWSigma()
public void setTrainSize(int size)
Parameters: size - the size of the training set
public int getTrainSize()
public void setP(double proportion)
Parameters: proportion - the proportion of instances that are common between training sets.
public double getP()
public double getError()
public void decompose() throws java.lang.Exception
Throws: java.lang.Exception - if the decomposition couldn't be carried out
public java.util.Vector findCentralTendencies(double[] predProbs)
For example, suppose instance 'x' can be assigned one of 3 classes, y = {1, 2, 3}, and is classified 10 times with the following results: '1' = 2 times, '2' = 5 times and '3' = 3 times. The central tendency is then '2'.
Note that this method returns a list of all classes that have the highest number of classifications. If several classes tie for the largest number of classifications, all of them are returned. For example, if 'x' is classified '1' = 4 times, '2' = 4 times and '3' = 2 times, then '1' and '2' are returned.
Parameters: predProbs - the array of classifications for a single instance.
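The tie-handling behaviour described for findCentralTendencies can be sketched as follows. This is a minimal illustration, not Weka's implementation: the class `CentralTendencySketch` is hypothetical, it returns a `List<Integer>` rather than a `java.util.Vector`, and it assumes the input array holds one count per class, with classes identified by 0-based index (so class '2' in the example above is index 1).

```java
import java.util.ArrayList;
import java.util.List;

class CentralTendencySketch {
    /** Returns the 0-based indices of every class that shares the highest count. */
    static List<Integer> centralTendencies(double[] classCounts) {
        List<Integer> modes = new ArrayList<>();
        double best = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < classCounts.length; c++) {
            if (classCounts[c] > best) {
                // New maximum: discard earlier candidates.
                best = classCounts[c];
                modes.clear();
                modes.add(c);
            } else if (classCounts[c] == best) {
                // Tie with the current maximum: keep both.
                modes.add(c);
            }
        }
        return modes;
    }

    public static void main(String[] args) {
        System.out.println(centralTendencies(new double[] {2, 5, 3})); // prints [1]
        System.out.println(centralTendencies(new double[] {4, 4, 2})); // prints [0, 1]
    }
}
```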
public java.lang.String toString()
public static void main(java.lang.String[] args)
Parameters: args - the command line arguments
public final void randomize(int[] index, java.util.Random random)
Parameters: index - the array of integers to randomise
random - the seeded Random instance used to randomise the array
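One common way to randomise an int array in place with a seeded java.util.Random is a Fisher-Yates shuffle, sketched below. Whether randomize uses exactly this algorithm is an assumption; the class `RandomizeSketch` is hypothetical and only mirrors the documented signature.

```java
import java.util.Random;

class RandomizeSketch {
    /** Shuffles index in place using a Fisher-Yates pass driven by random. */
    static void randomize(int[] index, Random random) {
        for (int j = index.length - 1; j > 0; j--) {
            // Pick a position in [0, j] and swap it into slot j.
            int k = random.nextInt(j + 1);
            int tmp = index[j];
            index[j] = index[k];
            index[k] = tmp;
        }
    }
}
```

Passing Random instances created with the same seed produces the same ordering, which is what makes a seeded run repeatable.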