java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.meta.ThresholdSelector
Class for selecting a threshold on a probability output by a distribution classifier. The threshold is set so that a given performance measure is optimized. Currently this is the F-measure. Performance is measured either on the training data, a hold-out set or using cross-validation. In addition, the probabilities returned by the base learner can have their range expanded so that the output probabilities will reside between 0 and 1 (this is useful if the scheme normally produces probabilities in a very narrow range).
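As a rough illustration of the threshold search described above, the following sketch tries each observed probability as a candidate threshold and keeps the one with the highest F-measure. It is a minimal, self-contained example and not Weka's implementation: the class FMeasureThresholdSketch, its methods, and the sample data are hypothetical, and treating a probability at or above the threshold as a positive prediction is an assumption.

```java
import java.util.Arrays;

/** Hypothetical sketch: pick the probability threshold that maximizes F-measure. */
class FMeasureThresholdSketch {

    /** F-measure when "positive" is predicted for prob >= threshold (assumed rule). */
    static double fMeasure(double[] probs, boolean[] isPositive, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probs.length; i++) {
            boolean predicted = probs[i] >= threshold;
            if (predicted && isPositive[i]) tp++;
            else if (predicted) fp++;
            else if (isPositive[i]) fn++;
        }
        // F = 2TP / (2TP + FP + FN); taken as 0 when there are no true positives.
        return tp == 0 ? 0.0 : (2.0 * tp) / (2.0 * tp + fp + fn);
    }

    /** Tries each observed probability as a candidate threshold. */
    static double bestThreshold(double[] probs, boolean[] isPositive) {
        double[] candidates = probs.clone();
        Arrays.sort(candidates);
        double best = 0.5;   // fallback if no candidate scores above zero
        double bestF = 0.0;
        for (double t : candidates) {
            double f = fMeasure(probs, isPositive, t);
            if (f > bestF) {  // strict comparison keeps the lowest threshold on ties
                bestF = f;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Probabilities for the designated class and the true labels (made-up data).
        double[] probs = {0.10, 0.40, 0.35, 0.80, 0.70};
        boolean[] labels = {false, false, true, true, true};
        System.out.println(bestThreshold(probs, labels)); // prints 0.35
    }
}
```

In this sample a threshold of 0.35 captures all three positives with one false positive, which gives the highest F-measure (6/7) of the five candidates.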
Valid options are:
-C num
The class for which the threshold is determined. Valid values are:
1, 2 (for first and second classes, respectively), 3 (for whichever
class is least frequent), 4 (for whichever class value is most
frequent), and 5 (for the first class named any of "yes","pos(itive)",
"1", or method 3 if no matches). (default 5).
-W classname
Specify the full class name of the base classifier.
-X num
Number of folds used for cross validation. If just a
hold-out set is used, this determines the size of the hold-out set
(default 3).
-R integer
Sets whether confidence range correction is applied. This can be used
to ensure the confidences range from 0 to 1. Use 0 for no range correction,
1 for correction based on the min/max values seen during threshold selection
(default 0).
-S seed
Random number seed (default 1).
-E integer
Sets the evaluation mode. Use 0 for evaluation using cross-validation,
1 for evaluation using hold-out set, and 2 for evaluation on the
training data (default 1).
Options after -- are passed to the designated sub-classifier.
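The five -C modes listed above can be read as a small decision rule. The sketch below is one way to express it under stated assumptions: DesignatedClassSketch and its methods are hypothetical names, name matching is case-insensitive and exact on "yes", "pos", "positive" and "1", and ties in frequency go to the first class. Weka's own matching rules may differ in these details.

```java
import java.util.Locale;

/** Hypothetical sketch of resolving the -C mode (1..5) to a 0-based class index. */
class DesignatedClassSketch {

    /** Index of the smallest count; the first class wins on ties (assumption). */
    static int leastFrequent(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] < counts[best]) best = i;
        }
        return best;
    }

    /** Index of the largest count; the first class wins on ties (assumption). */
    static int mostFrequent(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) best = i;
        }
        return best;
    }

    static int resolve(int mode, String[] classNames, int[] classCounts) {
        switch (mode) {
            case 1: return 0;                            // first class
            case 2: return 1;                            // second class
            case 3: return leastFrequent(classCounts);   // least frequent class
            case 4: return mostFrequent(classCounts);    // most frequent class
            case 5:
                // First class with a "positive-looking" name, else fall back to mode 3.
                for (int i = 0; i < classNames.length; i++) {
                    String name = classNames[i].toLowerCase(Locale.ROOT);
                    if (name.equals("yes") || name.equals("pos")
                            || name.equals("positive") || name.equals("1")) {
                        return i;
                    }
                }
                return leastFrequent(classCounts);
            default:
                throw new IllegalArgumentException("mode must be 1..5: " + mode);
        }
    }

    public static void main(String[] args) {
        String[] names = {"neg", "pos"};
        int[] counts = {90, 10};
        System.out.println(resolve(5, names, counts)); // prints 1, the index of "pos"
    }
}
```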
Field Summary
static int EVAL_CROSS_VALIDATION
static int EVAL_TRAINING_SET
static int EVAL_TUNED_SPLIT
protected double m_BestThreshold
    The threshold that led to the best performance.
protected double m_BestValue
    The best value that has been observed.
protected Classifier m_Classifier
    The generated base classifier.
protected int m_ClassMode
    Method to determine which class to optimize for.
protected int m_DesignatedClass
    Designated class value, determined during building.
protected int m_EvalMode
    The evaluation mode.
protected double m_HighThreshold
    The upper threshold used as the basis of correction.
protected double m_LowThreshold
    The lower threshold used as the basis of correction.
protected int m_NumXValFolds
    The number of folds used in cross-validation.
protected int m_RangeMode
    The range correction mode.
protected int m_Seed
    Random number seed.
protected static double MIN_VALUE
    The minimum value for the criterion.
static int OPTIMIZE_0
static int OPTIMIZE_1
static int OPTIMIZE_LFREQ
static int OPTIMIZE_MFREQ
static int OPTIMIZE_POS_NAME
static int RANGE_BOUNDS
static int RANGE_NONE
static Tag[] TAGS_EVAL
static Tag[] TAGS_OPTIMIZE
static Tag[] TAGS_RANGE
Fields inherited from class weka.classifiers.Classifier
    m_Debug
Fields inherited from interface weka.core.Drawable
    BayesNet, NOT_DRAWABLE, TREE
Constructor Summary
ThresholdSelector()
Method Summary
void buildClassifier(Instances instances)
    Generates the classifier.
private boolean checkForInstance(Instances data)
    Checks whether an instance of the designated class is in the subset.
java.lang.String classifierTipText()
java.lang.String designatedClassTipText()
double[] distributionForInstance(Instance instance)
    Calculates the class membership probabilities for the given test instance.
java.lang.String evaluationModeTipText()
protected void findThreshold(FastVector predictions)
    Finds the best threshold; this implementation searches for the highest F-measure.
Classifier getClassifier()
    Gets the Classifier used as the base classifier.
protected java.lang.String getClassifierSpec()
    Gets the classifier specification string, which contains the class name of the classifier and any options to the classifier.
SelectedTag getDesignatedClass()
    Gets the method to determine which class value to optimize.
SelectedTag getEvaluationMode()
    Gets the evaluation mode used.
int getNumXValFolds()
    Gets the number of folds used for cross-validation.
java.lang.String[] getOptions()
    Gets the current settings of the Classifier.
protected FastVector getPredictions(Instances instances, int mode, int numFolds)
    Collects the classifier predictions using the specified evaluation method.
SelectedTag getRangeCorrection()
    Gets the confidence range correction mode used.
int getSeed()
    Gets the random number seed.
java.lang.String globalInfo()
java.lang.String graph()
    Returns a graph describing the classifier (if possible).
int graphType()
    Returns the type of graph this classifier represents.
java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
    Main method for testing this class.
java.lang.String numXValFoldsTipText()
java.lang.String rangeCorrectionTipText()
java.lang.String seedTipText()
void setClassifier(Classifier newClassifier)
    Sets the Classifier for which the threshold is selected.
void setDesignatedClass(SelectedTag newMethod)
    Sets the method to determine which class value to optimize.
void setEvaluationMode(SelectedTag newMethod)
    Sets the evaluation mode used.
void setNumXValFolds(int newNumFolds)
    Sets the number of folds used for cross-validation.
void setOptions(java.lang.String[] options)
    Parses a given list of options.
void setRangeCorrection(SelectedTag newMethod)
    Sets the confidence range correction mode used.
void setSeed(int seed)
    Sets the seed for random number generation.
java.lang.String toString()
    Returns a description of the cross-validated classifier.
Methods inherited from class weka.classifiers.Classifier
    classifyInstance, debugTipText, forName, getDebug, makeCopies, setDebug
Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
public static final int RANGE_NONE
public static final int RANGE_BOUNDS
public static final Tag[] TAGS_RANGE
public static final int EVAL_TRAINING_SET
public static final int EVAL_TUNED_SPLIT
public static final int EVAL_CROSS_VALIDATION
public static final Tag[] TAGS_EVAL
public static final int OPTIMIZE_0
public static final int OPTIMIZE_1
public static final int OPTIMIZE_LFREQ
public static final int OPTIMIZE_MFREQ
public static final int OPTIMIZE_POS_NAME
public static final Tag[] TAGS_OPTIMIZE
protected Classifier m_Classifier
protected double m_HighThreshold
protected double m_LowThreshold
protected double m_BestThreshold
protected double m_BestValue
protected int m_NumXValFolds
protected int m_Seed
protected int m_DesignatedClass
protected int m_ClassMode
protected int m_EvalMode
protected int m_RangeMode
protected static final double MIN_VALUE
Constructor Detail
public ThresholdSelector()
Method Detail
protected FastVector getPredictions(Instances instances, int mode, int numFolds) throws java.lang.Exception
Parameters:
instances - the set of Instances to generate predictions for.
mode - the evaluation mode.
numFolds - the number of folds to use if not evaluating on the full training set.
Returns:
FastVector containing the predictions.
Throws:
java.lang.Exception - if an error occurs generating the predictions.
protected void findThreshold(FastVector predictions)
Parameters:
predictions - a FastVector containing the predictions.
public java.util.Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class Classifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-C num
The class for which the threshold is determined. Valid values are:
1, 2 (for first and second classes, respectively), 3 (for whichever
class is least frequent), 4 (for whichever class value is most
frequent), and 5 (for the first class named any of "yes","pos(itive)",
"1", or method 3 if no matches). (default 3).
-W classname
Specify the full class name of the classifier to perform cross-validation
selection on.
-X num
Number of folds used for cross validation. If just a
hold-out set is used, this determines the size of the hold-out set
(default 3).
-R integer
Sets whether confidence range correction is applied. This can be used
to ensure the confidences range from 0 to 1. Use 0 for no range correction,
1 for correction based on the min/max values seen during threshold
selection (default 0).
-S seed
Random number seed (default 1).
-E integer
Sets the evaluation mode. Use 0 for evaluation using cross-validation,
1 for evaluation using hold-out set, and 2 for evaluation on the
training data (default 1).
Options after -- are passed to the designated sub-classifier.
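The -R option above states only that correction is based on the min/max values seen during threshold selection. One plausible reading, shown here as an assumption rather than as Weka's actual formula, is a piecewise-linear warp that maps the lowest observed probability to 0, the selected threshold to 0.5, and the highest observed probability to 1. The class RangeCorrectionSketch and its parameter names are hypothetical.

```java
/** Hypothetical sketch of min/max range correction around a selected threshold. */
class RangeCorrectionSketch {

    /**
     * Warps prob so that low maps to 0, best to 0.5 and high to 1.
     * Values outside [low, high] are clamped to [0, 1].
     * Assumes low < best < high; this formula is an illustration only.
     */
    static double correct(double prob, double low, double best, double high) {
        if (!(low < best && best < high)) {
            throw new IllegalArgumentException("expected low < best < high");
        }
        double warped;
        if (prob > best) {
            // Upper half: stretch (best, high] onto (0.5, 1].
            warped = 0.5 + 0.5 * (prob - best) / (high - best);
        } else {
            // Lower half: stretch [low, best] onto [0, 0.5].
            warped = 0.5 * (prob - low) / (best - low);
        }
        return Math.max(0.0, Math.min(1.0, warped));
    }

    public static void main(String[] args) {
        // A base learner whose outputs only span 0.40 to 0.60 (made-up numbers).
        double low = 0.40, best = 0.45, high = 0.60;
        for (double p : new double[] {0.40, 0.45, 0.525, 0.60}) {
            System.out.println(p + " -> " + correct(p, low, best, high));
        }
    }
}
```

A side effect of this particular mapping is that the selected threshold always lands at 0.5, so a downstream consumer can keep using the conventional 0.5 cut-off.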
setOptions in interface OptionHandler
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class Classifier
public void buildClassifier(Instances instances) throws java.lang.Exception
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully
private boolean checkForInstance(Instances data) throws java.lang.Exception
Throws:
java.lang.Exception
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Throws:
java.lang.Exception - if instance could not be classified successfully
public java.lang.String globalInfo()
public java.lang.String designatedClassTipText()
public SelectedTag getDesignatedClass()
public void setDesignatedClass(SelectedTag newMethod)
Parameters:
newMethod - the new class selection mode.
public java.lang.String evaluationModeTipText()
public void setEvaluationMode(SelectedTag newMethod)
Parameters:
newMethod - the new evaluation mode.
public SelectedTag getEvaluationMode()
public java.lang.String rangeCorrectionTipText()
public void setRangeCorrection(SelectedTag newMethod)
Parameters:
newMethod - the new correction mode.
public SelectedTag getRangeCorrection()
public java.lang.String seedTipText()
public void setSeed(int seed)
Parameters:
seed - the random number seed
public int getSeed()
public java.lang.String numXValFoldsTipText()
public int getNumXValFolds()
public void setNumXValFolds(int newNumFolds)
Parameters:
newNumFolds - the number of folds used for cross-validation.
public java.lang.String classifierTipText()
public void setClassifier(Classifier newClassifier)
Parameters:
newClassifier - the Classifier to use.
public Classifier getClassifier()
protected java.lang.String getClassifierSpec()
public int graphType()
graphType in interface Drawable
public java.lang.String graph() throws java.lang.Exception
graph in interface Drawable
Throws:
java.lang.Exception - if the classifier cannot be graphed
public java.lang.String toString()
public static void main(java.lang.String[] argv)
Parameters:
argv - the options
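The option lists above note that options after -- are passed to the designated sub-classifier. A minimal sketch of that split is given below, using only the standard library. OptionSplitSketch is a hypothetical helper, not a Weka class, and the sample arguments (including the sub-classifier name and its -K flag) are illustrative only.

```java
import java.util.Arrays;

/** Hypothetical sketch: split an option array at the first "--" separator. */
class OptionSplitSketch {

    /** Returns { options before "--", options after "--" }. */
    static String[][] split(String[] argv) {
        for (int i = 0; i < argv.length; i++) {
            if (argv[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(argv, 0, i),
                    Arrays.copyOfRange(argv, i + 1, argv.length)
                };
            }
        }
        // No separator: everything belongs to the outer scheme.
        return new String[][] { argv.clone(), new String[0] };
    }

    public static void main(String[] args) {
        String[] argv = {"-C", "3", "-E", "0", "-W",
                         "weka.classifiers.bayes.NaiveBayes", "--", "-K"};
        String[][] parts = split(argv);
        System.out.println("own: " + Arrays.toString(parts[0]));
        System.out.println("sub: " + Arrays.toString(parts[1]));
    }
}
```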