Specify a threshold by which the AttributeSelection module can discard
attributes. Use in conjunction with -Q.
-Z
Turn on verbose output for monitoring the search
- Version:
- $Revision: 1.12 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
- See Also:
- Serialized Form
Field Summary |
private static int |
BACKWARD_RACE
|
private static int |
FORWARD_RACE
search types |
private static int |
LEAVE_ONE_OUT
|
private ASEvaluation |
m_ASEval
the attribute evaluator to generate the initial ranking when
doing a rank race |
private double |
m_bestMerit
holds the merit of the best subset found |
private int |
m_calculatedNumToSelect
|
private int |
m_classIndex
the class index |
private boolean |
m_debug
verbose output for monitoring the search and debugging |
private double |
m_delta
threshold for comparisons |
private Instances |
m_Instances
|
private int |
m_numAttribs
the number of attributes in the data |
private int |
m_numFolds
number of cross validation folds---equal to the number of instances
for leave-one-out cv |
private int |
m_numToSelect
The number of attributes to retain if a ranking is requested. -1
indicates that all attributes are to be retained. |
private int |
m_raceType
the selected search type |
private double[][] |
m_rankedAtts
The ranked list of attributes produced if m_rankingRequested is true |
private int |
m_rankedSoFar
The number of attributes ranked so far (if ranking is requested) |
private int[] |
m_Ranking
will hold the attribute ranking produced by the above attribute
evaluator if doing a rank search |
private boolean |
m_rankingRequested
If true then produce a ranked list of attributes by fully traversing
a forward hillclimb race |
private int |
m_samples
the number of samples above which to begin testing for similarity
between competing subsets |
private double |
m_sigLevel
the significance level for comparisons |
private HoldOutSubsetEvaluator |
m_theEvaluator
the subset evaluator to use |
private double |
m_threshold
the threshold for removing attributes if ranking is requested |
private int |
m_totalEvals
the total number of partially/fully evaluated subsets |
private int |
m_xvalType
the selected xval type |
private static int |
RANK_RACE
|
private static int |
SCHEMATA_RACE
|
static Tag[] |
TAGS_SELECTION
|
private static int |
TEN_FOLD
xval types |
static Tag[] |
XVALTAGS_SELECTION
|
Method Summary |
java.lang.String |
attributeEvaluatorTipText()
Returns the tip text for this property |
private int[] |
attributeList(char[] list)
Convert an attribute set to an array of indices |
java.lang.String |
debugTipText()
Returns the tip text for this property |
private void |
determineNumToSelectFromThreshold(double[][] ranking)
|
java.lang.String |
foldsTipText()
Returns the tip text for this property |
java.lang.String |
generateRankingTipText()
Returns the tip text for this property |
ASEvaluation |
getAttributeEvaluator()
Get the attribute evaluator used to generate the ranking. |
int |
getCalculatedNumToSelect()
Gets the calculated number of attributes to retain. |
boolean |
getDebug()
Get whether output is to be verbose |
SelectedTag |
getFoldsType()
Get the xfold type |
boolean |
getGenerateRanking()
Gets whether ranking has been requested. |
int |
getNumToSelect()
Gets the number of attributes to be retained. |
java.lang.String[] |
getOptions()
Gets the current settings of BestFirst. |
SelectedTag |
getRaceType()
Get the race type |
double |
getSelectionThreshold()
Returns the threshold so that the AttributeSelection module can
discard attributes from the ranking. |
double |
getSignificanceLevel()
Get the significance level |
double |
getThreshold()
Get the threshold |
java.lang.String |
globalInfo()
Returns a string describing this search method |
private int[] |
hillclimbRace(Instances data,
java.util.Random random)
Performs a hill climbing race---all single attribute changes to a
base subset are raced in parallel. |
java.util.Enumeration |
listOptions()
Returns an enumeration describing the available options. |
java.lang.String |
numToSelectTipText()
Returns the tip text for this property |
private java.lang.String |
printSets(char[][] raceSets)
Print an attribute set. |
private double[] |
raceSubsets(char[][] raceSets,
Instances data,
boolean baseSetIncluded,
java.util.Random random)
Races the leave-one-out cross validation errors of a set of
attribute subsets on a set of instances. |
java.lang.String |
raceTypeTipText()
Returns the tip text for this property |
double[][] |
rankedAttributes()
Returns a X by 2 list of attribute indexes and corresponding
evaluations from best (highest) to worst. |
private int[] |
rankRace(Instances data,
java.util.Random random)
Performs a rank race---race consisting of no attributes, the top
ranked attribute, the top two attributes etc. |
protected void |
resetOptions()
Reset the search method. |
private int[] |
schemataRace(Instances data,
java.util.Random random)
Performs a schemata race---a series of races in parallel. |
int[] |
search(ASEvaluation ASEval,
Instances data)
Searches the attribute subset space by racing cross validation
errors of competing subsets |
java.lang.String |
selectionThresholdTipText()
Returns the tip text for this property |
void |
setAttributeEvaluator(ASEvaluation newEvaluator)
Set the attribute evaluator to use for generating the ranking. |
void |
setDebug(boolean d)
Set whether verbose output should be generated. |
void |
setFoldsType(SelectedTag d)
Set the xfold type |
void |
setGenerateRanking(boolean doRank)
Records whether the user has requested a ranked list of attributes. |
void |
setNumToSelect(int n)
Specify the number of attributes to select from the ranked list
(if generating a ranking). -1
indicates that all attributes are to be retained. |
void |
setOptions(java.lang.String[] options)
Parses a given list of options.
|
void |
setRaceType(SelectedTag d)
Set the race type |
void |
setSelectionThreshold(double threshold)
Set the threshold by which the AttributeSelection module can discard
attributes. |
void |
setSignificanceLevel(double sig)
Sets the significance level to use |
void |
setThreshold(double t)
Sets the threshold for comparisons |
java.lang.String |
significanceLevelTipText()
Returns the tip text for this property |
java.lang.String |
thresholdTipText()
Returns the tip text for this property |
java.lang.String |
toString()
|
private double |
ttest(Stats c1,
Stats c2)
|
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
m_Instances
private Instances m_Instances
FORWARD_RACE
private static final int FORWARD_RACE
- search types
- See Also:
- Constant Field Values
BACKWARD_RACE
private static final int BACKWARD_RACE
- See Also:
- Constant Field Values
SCHEMATA_RACE
private static final int SCHEMATA_RACE
- See Also:
- Constant Field Values
RANK_RACE
private static final int RANK_RACE
- See Also:
- Constant Field Values
TAGS_SELECTION
public static final Tag[] TAGS_SELECTION
m_raceType
private int m_raceType
- the selected search type
TEN_FOLD
private static final int TEN_FOLD
- xval types
- See Also:
- Constant Field Values
LEAVE_ONE_OUT
private static final int LEAVE_ONE_OUT
- See Also:
- Constant Field Values
XVALTAGS_SELECTION
public static final Tag[] XVALTAGS_SELECTION
m_xvalType
private int m_xvalType
- the selected xval type
m_classIndex
private int m_classIndex
- the class index
m_numAttribs
private int m_numAttribs
- the number of attributes in the data
m_totalEvals
private int m_totalEvals
- the total number of partially/fully evaluated subsets
m_bestMerit
private double m_bestMerit
- holds the merit of the best subset found
m_theEvaluator
private HoldOutSubsetEvaluator m_theEvaluator
- the subset evaluator to use
m_sigLevel
private double m_sigLevel
- the significance level for comparisons
m_delta
private double m_delta
- threshold for comparisons
m_samples
private int m_samples
- the number of samples above which to begin testing for similarity
between competing subsets
m_numFolds
private int m_numFolds
- number of cross validation folds---equal to the number of instances
for leave-one-out cv
m_ASEval
private ASEvaluation m_ASEval
- the attribute evaluator to generate the initial ranking when
doing a rank race
m_Ranking
private int[] m_Ranking
- will hold the attribute ranking produced by the above attribute
evaluator if doing a rank search
m_debug
private boolean m_debug
- verbose output for monitoring the search and debugging
m_rankingRequested
private boolean m_rankingRequested
- If true then produce a ranked list of attributes by fully traversing
a forward hillclimb race
m_rankedAtts
private double[][] m_rankedAtts
- The ranked list of attributes produced if m_rankingRequested is true
m_rankedSoFar
private int m_rankedSoFar
- The number of attributes ranked so far (if ranking is requested)
m_numToSelect
private int m_numToSelect
- The number of attributes to retain if a ranking is requested. -1
indicates that all attributes are to be retained. Has precedence over
m_threshold
m_calculatedNumToSelect
private int m_calculatedNumToSelect
m_threshold
private double m_threshold
- the threshold for removing attributes if ranking is requested
RaceSearch
public RaceSearch()
globalInfo
public java.lang.String globalInfo()
- Returns a string describing this search method
- Returns:
- a description of the search method suitable for
displaying in the explorer/experimenter gui
raceTypeTipText
public java.lang.String raceTypeTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setRaceType
public void setRaceType(SelectedTag d)
- Set the race type
- Parameters:
d
- the type of race
getRaceType
public SelectedTag getRaceType()
- Get the race type
- Returns:
- the type of race
significanceLevelTipText
public java.lang.String significanceLevelTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setSignificanceLevel
public void setSignificanceLevel(double sig)
- Sets the significance level to use
- Parameters:
sig
- the significance level
getSignificanceLevel
public double getSignificanceLevel()
- Get the significance level
- Returns:
- the current significance level
thresholdTipText
public java.lang.String thresholdTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setThreshold
public void setThreshold(double t)
- Sets the threshold for comparisons
- Specified by:
setThreshold
in interface RankedOutputSearch
- Parameters:
t
- the threshold to use
getThreshold
public double getThreshold()
- Get the threshold
- Specified by:
getThreshold
in interface RankedOutputSearch
- Returns:
- the current threshold
foldsTipText
public java.lang.String foldsTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setFoldsType
public void setFoldsType(SelectedTag d)
- Set the xfold type
- Parameters:
d
- the type of xval
getFoldsType
public SelectedTag getFoldsType()
- Get the xfold type
- Returns:
- the type of xval
debugTipText
public java.lang.String debugTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setDebug
public void setDebug(boolean d)
- Set whether verbose output should be generated.
- Parameters:
d
- true if output is to be verbose.
getDebug
public boolean getDebug()
- Get whether output is to be verbose
- Returns:
- true if output will be verbose
attributeEvaluatorTipText
public java.lang.String attributeEvaluatorTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setAttributeEvaluator
public void setAttributeEvaluator(ASEvaluation newEvaluator)
- Set the attribute evaluator to use for generating the ranking.
- Parameters:
newEvaluator
- the attribute evaluator to use.
getAttributeEvaluator
public ASEvaluation getAttributeEvaluator()
- Get the attribute evaluator used to generate the ranking.
- Returns:
- the evaluator used to generate the ranking.
generateRankingTipText
public java.lang.String generateRankingTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setGenerateRanking
public void setGenerateRanking(boolean doRank)
- Records whether the user has requested a ranked list of attributes.
- Specified by:
setGenerateRanking
in interface RankedOutputSearch
- Parameters:
doRank
- true if ranking is requested
getGenerateRanking
public boolean getGenerateRanking()
- Gets whether ranking has been requested. This is used by the
AttributeSelection module to determine if rankedAttributes()
should be called.
- Specified by:
getGenerateRanking
in interface RankedOutputSearch
- Returns:
- true if ranking has been requested.
numToSelectTipText
public java.lang.String numToSelectTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setNumToSelect
public void setNumToSelect(int n)
- Specify the number of attributes to select from the ranked list
(if generating a ranking). -1
indicates that all attributes are to be retained.
- Specified by:
setNumToSelect
in interface RankedOutputSearch
- Parameters:
n
- the number of attributes to retain
getNumToSelect
public int getNumToSelect()
- Gets the number of attributes to be retained.
- Specified by:
getNumToSelect
in interface RankedOutputSearch
- Returns:
- the number of attributes to retain
getCalculatedNumToSelect
public int getCalculatedNumToSelect()
- Gets the calculated number of attributes to retain. This is the
actual number of attributes to retain. This is the same as
getNumToSelect if the user specifies a number which is not less
than zero. Otherwise it should be the number of attributes in the
(potentially transformed) data.
- Specified by:
getCalculatedNumToSelect
in interface RankedOutputSearch
selectionThresholdTipText
public java.lang.String selectionThresholdTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setSelectionThreshold
public void setSelectionThreshold(double threshold)
- Set the threshold by which the AttributeSelection module can discard
attributes.
- Parameters:
threshold
- the threshold.
getSelectionThreshold
public double getSelectionThreshold()
- Returns the threshold so that the AttributeSelection module can
discard attributes from the ranking.
listOptions
public java.util.Enumeration listOptions()
- Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface OptionHandler
- Returns:
- an enumeration of all the available options.
setOptions
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
- Parses a given list of options.
Valid options are:
-R
0 = forward, 1 = backward, 2 = schemata, 3 = rank.
-L
significance level to use for t-tests.
-T
threshold for considering mean errors of two subsets the same
-F
0 = 10 fold, 1 = leave-one-out (selected automatically for schemata race
-A
the attribute evaluator to use when doing a rank search
-Q
produce a ranked list of attributes. Selecting this option forces
the race type to be forward. Racing continues until *all* attributes
have been selected, thus producing a ranked list of attributes.
-N
Specify the number of attributes to retain. Overides any threshold.
Use in conjunction with -Q.
-J
Specify a threshold by which the AttributeSelection module can discard
attributes. Use in conjunction with -Q.
-Z
Turn on verbose output for monitoring the search
- Specified by:
setOptions
in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
java.lang.Exception
- if an option is not supported
getOptions
public java.lang.String[] getOptions()
- Gets the current settings of BestFirst.
- Specified by:
getOptions
in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions()
search
public int[] search(ASEvaluation ASEval,
Instances data)
throws java.lang.Exception
- Searches the attribute subset space by racing cross validation
errors of competing subsets
- Specified by:
search
in class ASSearch
- Parameters:
data
- the training instances.ASEval
- the attribute evaluator to guide the search
- Returns:
- an array (not necessarily ordered) of selected attribute indexes
- Throws:
java.lang.Exception
- if the search can't be completed
rankedAttributes
public double[][] rankedAttributes()
throws java.lang.Exception
- Description copied from interface:
RankedOutputSearch
- Returns a X by 2 list of attribute indexes and corresponding
evaluations from best (highest) to worst.
- Specified by:
rankedAttributes
in interface RankedOutputSearch
- Returns:
- the ranked list of attribute indexes in an array of ints
- Throws:
java.lang.Exception
- if the ranking can't be produced
determineNumToSelectFromThreshold
private void determineNumToSelectFromThreshold(double[][] ranking)
printSets
private java.lang.String printSets(char[][] raceSets)
- Print an attribute set.
schemataRace
private int[] schemataRace(Instances data,
java.util.Random random)
throws java.lang.Exception
- Performs a schemata race---a series of races in parallel.
- Parameters:
data
- the instances to estimate accuracy over.random
- a random number generator
- Returns:
- an array of selected attribute indices.
- Throws:
java.lang.Exception
ttest
private double ttest(Stats c1,
Stats c2)
throws java.lang.Exception
- Throws:
java.lang.Exception
rankRace
private int[] rankRace(Instances data,
java.util.Random random)
throws java.lang.Exception
- Performs a rank race---race consisting of no attributes, the top
ranked attribute, the top two attributes etc. The initial ranking
is determined by an attribute evaluator.
- Parameters:
data
- the instances to estimate accuracy overrandom
- a random number generator
- Returns:
- an array of selected attribute indices.
- Throws:
java.lang.Exception
hillclimbRace
private int[] hillclimbRace(Instances data,
java.util.Random random)
throws java.lang.Exception
- Performs a hill climbing race---all single attribute changes to a
base subset are raced in parallel. The winner is chosen and becomes
the new base subset and the process is repeated until there is no
improvement in error over the base subset.
- Parameters:
data
- the instances to estimate accuracy overrandom
- a random number generator
- Returns:
- an array of selected attribute indices.
- Throws:
java.lang.Exception
attributeList
private int[] attributeList(char[] list)
- Convert an attribute set to an array of indices
raceSubsets
private double[] raceSubsets(char[][] raceSets,
Instances data,
boolean baseSetIncluded,
java.util.Random random)
throws java.lang.Exception
- Races the leave-one-out cross validation errors of a set of
attribute subsets on a set of instances.
- Parameters:
raceSets
- a set of attribute subset specificationsdata
- the instances to use when cross validatingbaseSetIncluded
- true if the first attribute set is a
base set generated from the previous racerandom
- a random number generator
- Returns:
- the index of the winning subset
- Throws:
java.lang.Exception
- if an error occurs during cross validation
toString
public java.lang.String toString()
resetOptions
protected void resetOptions()
- Reset the search method.