weka.attributeSelection
Class RandomSearch

java.lang.Object
  extended by weka.attributeSelection.ASSearch
      extended by weka.attributeSelection.RandomSearch
All Implemented Interfaces:
OptionHandler, java.io.Serializable, StartSetHandler

public class RandomSearch
extends ASSearch
implements StartSetHandler, OptionHandler

Class for performing a random search.

Valid options are:

-P
Specify a starting set of attributes, e.g. 1,4,7-9.

-F
Percentage of the search space to consider (default = 25).

-V
Verbose output. Output new best subsets as the search progresses.

Version:
$Revision: 1.12 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  java.util.BitSet m_bestGroup
          the best feature set found during the search
private  double m_bestMerit
          the merit of the best subset found
private  int m_classIndex
          holds the class index
private  boolean m_hasClass
          does the data have a class
private  int m_iterations
          the number of iterations performed
private  int m_numAttribs
          number of attributes in the data
private  boolean m_onlyConsiderBetterAndSmaller
          only accept a feature set as being "better" than the best if its merit is better than or equal to the best, and it contains fewer features than the best (this allows LVF to be implemented).
private  java.util.Random m_random
          random number object
private  double m_searchSize
          percentage of the search space to consider
private  int m_seed
          seed for random number generation
private  int[] m_starting
          holds a starting set as an array of attributes.
private  Range m_startRange
          holds the start set as a range
private  boolean m_verbose
          output new best subsets as the search progresses
 
Constructor Summary
RandomSearch()
          Constructor
 
Method Summary
private  int[] attributeList(java.util.BitSet group)
          converts a BitSet into a list of attribute indexes
private  int countFeatures(java.util.BitSet featureSet)
          counts the number of features in a subset
private  java.util.BitSet generateRandomSubset()
          generates a random subset
 java.lang.String[] getOptions()
          Gets the current settings of RandomSearch.
 double getSearchPercent()
          get the percentage of the search space to consider
 java.lang.String getStartSet()
          Returns a list of attributes (and/or attribute ranges) as a String
 boolean getVerbose()
          get whether or not output is verbose
 java.lang.String globalInfo()
          Returns a string describing this search method
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
private  java.lang.String printSubset(java.util.BitSet temp)
          prints a subset as a series of attribute numbers
private  void resetOptions()
          resets to defaults
 int[] search(ASEvaluation ASEval, Instances data)
          Searches the attribute subset space randomly.
 java.lang.String searchPercentTipText()
          Returns the tip text for this property
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSearchPercent(double p)
          set the percentage of the search space to consider
 void setStartSet(java.lang.String startSet)
          Sets a starting set of attributes for the search.
 void setVerbose(boolean v)
          set whether or not to output new best subsets as the search proceeds
 java.lang.String startSetTipText()
          Returns the tip text for this property
private  java.lang.String startSetToString()
          converts the array of starting attributes to a string.
 java.lang.String toString()
          prints a description of the search
 java.lang.String verboseTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.attributeSelection.ASSearch
forName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_starting

private int[] m_starting
holds a starting set as an array of attributes.


m_startRange

private Range m_startRange
holds the start set as a range


m_bestGroup

private java.util.BitSet m_bestGroup
the best feature set found during the search


m_bestMerit

private double m_bestMerit
the merit of the best subset found


m_onlyConsiderBetterAndSmaller

private boolean m_onlyConsiderBetterAndSmaller
only accept a feature set as being "better" than the best if its merit is better than or equal to the best, and it contains fewer features than the best (this allows LVF to be implemented).
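
The acceptance rule described above can be illustrated with a minimal sketch. The names AcceptanceRuleSketch and accepts are hypothetical and not part of Weka, and the behaviour of the non-strict branch is an assumption, since this page only describes the strict case.

```java
import java.util.BitSet;

final class AcceptanceRuleSketch {

    /**
     * Hypothetical helper: decides whether a candidate subset should replace the current best.
     * Strict mode follows the field description: merit at least as good AND fewer features.
     * The non-strict branch is an assumption made for this sketch only.
     */
    static boolean accepts(double candidateMerit, BitSet candidate,
                           double bestMerit, BitSet best,
                           boolean onlyConsiderBetterAndSmaller) {
        boolean smaller = candidate.cardinality() < best.cardinality();
        if (onlyConsiderBetterAndSmaller) {
            return candidateMerit >= bestMerit && smaller;
        }
        // Assumed default: strictly better merit wins; ties go to the smaller subset.
        return candidateMerit > bestMerit || (candidateMerit == bestMerit && smaller);
    }

    public static void main(String[] args) {
        BitSet small = new BitSet();
        small.set(0);
        BitSet large = new BitSet();
        large.set(0);
        large.set(1);
        // Equal merit and fewer features: accepted in strict mode.
        System.out.println(accepts(0.8, small, 0.8, large, true));
        // Better merit but more features: rejected in strict mode.
        System.out.println(accepts(0.9, large, 0.8, small, true));
    }
}
```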


m_hasClass

private boolean m_hasClass
does the data have a class


m_classIndex

private int m_classIndex
holds the class index


m_numAttribs

private int m_numAttribs
number of attributes in the data


m_seed

private int m_seed
seed for random number generation


m_searchSize

private double m_searchSize
percentage of the search space to consider


m_iterations

private int m_iterations
the number of iterations performed


m_random

private java.util.Random m_random
random number object


m_verbose

private boolean m_verbose
output new best subsets as the search progresses

Constructor Detail

RandomSearch

public RandomSearch()
Constructor

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this search method

Returns:
a description of the search suitable for displaying in the Explorer/Experimenter GUI

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-P
Specify a starting set of attributes, e.g. 1,4,7-9.

-F
Percentage of the search space to consider (default = 25).

-V
Verbose output. Output new best subsets as the search progresses.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
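
To make the option list concrete, here is a self-contained sketch of how an array such as {"-P", "1,4,7-9", "-F", "10", "-V"} could be interpreted. OptionSketch and its fields are hypothetical stand-ins; this is not Weka's own parsing code, and rejecting unknown flags with IllegalArgumentException is an assumption of the sketch.

```java
final class OptionSketch {
    String startSet = "";        // -P: empty string means no start point
    double searchPercent = 25.0; // -F: default taken from the option description
    boolean verbose = false;     // -V: flag without a value

    /** Hypothetical parser for the three documented options. */
    static OptionSketch parse(String[] options) {
        OptionSketch parsed = new OptionSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-P":
                    parsed.startSet = valueAt(options, ++i, "-P");
                    break;
                case "-F":
                    parsed.searchPercent = Double.parseDouble(valueAt(options, ++i, "-F"));
                    break;
                case "-V":
                    parsed.verbose = true;
                    break;
                default:
                    // Assumption: this sketch rejects anything it does not recognise.
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    /** Returns the value following a flag, or fails if the array ends early. */
    private static String valueAt(String[] options, int index, String flag) {
        if (index >= options.length) {
            throw new IllegalArgumentException("Missing value for " + flag);
        }
        return options[index];
    }

    public static void main(String[] args) {
        OptionSketch o = parse(new String[] {"-P", "1,4,7-9", "-F", "10", "-V"});
        System.out.println(o.startSet + " " + o.searchPercent + " " + o.verbose);
    }
}
```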

startSetTipText

public java.lang.String startSetTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setStartSet

public void setStartSet(java.lang.String startSet)
                 throws java.lang.Exception
Sets a starting set of attributes for the search. It is the search method's responsibility to report this start set (if any) in its toString() method.

Specified by:
setStartSet in interface StartSetHandler
Parameters:
startSet - a string containing a list of attributes (and/or ranges), e.g. 1,2,6,10-15. An empty string ("") indicates no start point. If a start point is supplied, random search evaluates the start point and then looks for subsets that are as good as or better than the start point with the same or lower cardinality.
Throws:
java.lang.Exception - if start set can't be set.
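
The start set syntax can be illustrated by expanding a string such as 1,2,6,10-15 into individual attribute indexes. The sketch below is not Weka's Range class; StartSetSketch is a hypothetical name, and the conversion from 1-based attribute numbers to 0-based indexes is an assumption about how the numbers are interpreted.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class StartSetSketch {

    /** Expands a list like "1,4,7-9" into sorted, de-duplicated 0-based indexes. */
    static int[] parse(String startSet) {
        String trimmed = startSet.trim();
        if (trimmed.isEmpty()) {
            return new int[0]; // "" means no start point
        }
        TreeSet<Integer> indexes = new TreeSet<>();
        for (String token : trimmed.split(",")) {
            String t = token.trim();
            int dash = t.indexOf('-');
            int first = Integer.parseInt(dash < 0 ? t : t.substring(0, dash).trim());
            int last = dash < 0 ? first : Integer.parseInt(t.substring(dash + 1).trim());
            if (first < 1 || last < first) {
                throw new IllegalArgumentException("Bad attribute range: " + t);
            }
            for (int number = first; number <= last; number++) {
                indexes.add(number - 1); // assumed: user-facing numbers start at 1
            }
        }
        int[] result = new int[indexes.size()];
        int k = 0;
        for (int index : indexes) {
            result[k++] = index;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("1,4,7-9")));
    }
}
```

The sketch handles only plain numbers and dash ranges; any other keywords a real range parser might accept are left out.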

getStartSet

public java.lang.String getStartSet()
Returns a list of attributes (and/or attribute ranges) as a String

Specified by:
getStartSet in interface StartSetHandler
Returns:
a list of attributes (and/or attribute ranges)

verboseTipText

public java.lang.String verboseTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setVerbose

public void setVerbose(boolean v)
set whether or not to output new best subsets as the search proceeds

Parameters:
v - true if output is to be verbose

getVerbose

public boolean getVerbose()
get whether or not output is verbose

Returns:
true if output is set to verbose

searchPercentTipText

public java.lang.String searchPercentTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setSearchPercent

public void setSearchPercent(double p)
set the percentage of the search space to consider

Parameters:
p - percent of the search space (0 < p <= 100)
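
One way to read the percentage is as a fraction of all 2^n attribute subsets. The sketch below turns a percentage into an iteration count under that assumption; the formula, the class name SearchSizeSketch and the method iterations are illustrative and are not taken from this page.

```java
final class SearchSizeSketch {

    /**
     * Assumed relationship: iterations = (percent / 100) * 2^numAttributes,
     * where numAttributes counts the candidate (non-class) attributes.
     */
    static long iterations(double percent, int numAttributes) {
        if (!(percent > 0 && percent <= 100)) {
            throw new IllegalArgumentException("percent must satisfy 0 < p <= 100");
        }
        // The cast saturates at Long.MAX_VALUE for very large attribute counts.
        return (long) Math.floor((percent / 100.0) * Math.pow(2, numAttributes));
    }

    public static void main(String[] args) {
        // The default 25% of the subsets of 10 attributes.
        System.out.println(iterations(25.0, 10));
    }
}
```

Because the subset space doubles with every extra attribute, even a small percentage becomes a very large number of evaluations on wide data sets.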

getSearchPercent

public double getSearchPercent()
get the percentage of the search space to consider

Returns:
the percentage of the search space to consider

getOptions

public java.lang.String[] getOptions()
Gets the current settings of RandomSearch.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

startSetToString

private java.lang.String startSetToString()
converts the array of starting attributes to a string. getOptions uses this to return the actual attributes specified as the starting set. This is preferable to m_startRange.getRanges() because the same start set can be written in different ways on the command line (e.g. 1,2,3 and 1-3 are equivalent); emitting one canonical form keeps option strings stored in a database comparable.

Returns:
a comma-separated list of individual attribute numbers as a String
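
The idea of a canonical form can be sketched as follows: whatever way the start set was typed, the stored array of indexes is always written out as individual numbers. CanonicalStartSetSketch is a hypothetical name, and the 0-based to 1-based conversion is an assumption of the sketch.

```java
final class CanonicalStartSetSketch {

    /** Writes 0-based indexes as a comma-separated list of 1-based attribute numbers. */
    static String toCanonical(int[] zeroBasedIndexes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < zeroBasedIndexes.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(zeroBasedIndexes[i] + 1); // assumed: user-facing numbers start at 1
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both "1,2,3" and "1-3" would be stored as indexes {0, 1, 2}.
        System.out.println(toCanonical(new int[] {0, 1, 2}));
    }
}
```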

toString

public java.lang.String toString()
prints a description of the search

Returns:
a description of the search as a string

search

public int[] search(ASEvaluation ASEval,
                    Instances data)
             throws java.lang.Exception
Searches the attribute subset space randomly.

Specified by:
search in class ASSearch
Parameters:
ASEval - the attribute evaluator to guide the search
data - the training instances.
Returns:
an array (not necessarily ordered) of selected attribute indexes
Throws:
java.lang.Exception - if the search can't be completed
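
The overall shape of a random subset search can be sketched without Weka. In the example below a plain function from BitSet to double stands in for the attribute evaluator; RandomSearchSketch, toyMerit and the acceptance rule (better merit, or equal merit with fewer features) are assumptions made for illustration, not the class's actual implementation.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Random;
import java.util.function.ToDoubleFunction;

final class RandomSearchSketch {

    /** Toy merit for the demo: rewards subsets holding attributes 1 and 3, penalises size. */
    static double toyMerit(BitSet subset) {
        double reward = (subset.get(1) && subset.get(3)) ? 1.0 : 0.0;
        return reward - 0.01 * subset.cardinality();
    }

    /** Includes each attribute independently with probability 0.5. */
    static BitSet randomSubset(Random random, int numAttribs) {
        BitSet subset = new BitSet(numAttribs);
        for (int i = 0; i < numAttribs; i++) {
            if (random.nextBoolean()) {
                subset.set(i);
            }
        }
        return subset;
    }

    /** Draws random subsets and keeps the best one seen so far. */
    static int[] search(ToDoubleFunction<BitSet> evaluator, int numAttribs,
                        int iterations, long seed) {
        Random random = new Random(seed);
        BitSet best = randomSubset(random, numAttribs);
        double bestMerit = evaluator.applyAsDouble(best);
        for (int i = 0; i < iterations; i++) {
            BitSet candidate = randomSubset(random, numAttribs);
            double merit = evaluator.applyAsDouble(candidate);
            boolean smaller = candidate.cardinality() < best.cardinality();
            if (merit > bestMerit || (merit == bestMerit && smaller)) {
                best = candidate;
                bestMerit = merit;
            }
        }
        return best.stream().toArray(); // selected attribute indexes, ascending
    }

    public static void main(String[] args) {
        int[] selected = search(RandomSearchSketch::toyMerit, 5, 2000, 42L);
        System.out.println(Arrays.toString(selected));
    }
}
```

With only five attributes there are 32 subsets, so 2000 draws cover the space many times over; on realistic data the search sees only a sample of the space, which is why the result depends on the seed and the percentage explored.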

printSubset

private java.lang.String printSubset(java.util.BitSet temp)
prints a subset as a series of attribute numbers

Parameters:
temp - the subset to print
Returns:
a subset as a String of attribute numbers

attributeList

private int[] attributeList(java.util.BitSet group)
converts a BitSet into a list of attribute indexes

Parameters:
group - the BitSet to convert
Returns:
an array of attribute indexes
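
A conversion of this kind can be written with the standard BitSet API. The sketch below is one possible version, not the private method itself; AttributeListSketch is a hypothetical class name.

```java
import java.util.Arrays;
import java.util.BitSet;

final class AttributeListSketch {

    /** Collects the positions of all set bits, in ascending order. */
    static int[] attributeList(BitSet group) {
        int[] list = new int[group.cardinality()];
        int k = 0;
        for (int i = group.nextSetBit(0); i >= 0; i = group.nextSetBit(i + 1)) {
            list[k++] = i;
        }
        return list;
    }

    public static void main(String[] args) {
        BitSet group = new BitSet();
        group.set(0);
        group.set(3);
        group.set(5);
        System.out.println(Arrays.toString(attributeList(group)));
    }
}
```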

generateRandomSubset

private java.util.BitSet generateRandomSubset()
generates a random subset

Returns:
a random subset as a BitSet
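
A minimal sketch of random subset generation follows. It assumes each attribute is included independently with probability 0.5 and that the class attribute is never selected; both points, along with the names RandomSubsetSketch and randomSubset, are assumptions rather than documented behaviour.

```java
import java.util.BitSet;
import java.util.Random;

final class RandomSubsetSketch {

    /**
     * Builds a random subset over numAttribs attributes.
     * classIndex is skipped so the class attribute is never part of a subset;
     * pass -1 if the data has no class.
     */
    static BitSet randomSubset(Random random, int numAttribs, int classIndex) {
        BitSet subset = new BitSet(numAttribs);
        for (int i = 0; i < numAttribs; i++) {
            if (i != classIndex && random.nextBoolean()) {
                subset.set(i);
            }
        }
        return subset;
    }

    public static void main(String[] args) {
        // A fixed seed makes the sequence of subsets reproducible.
        Random random = new Random(1L);
        System.out.println(randomSubset(random, 8, 7));
    }
}
```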

countFeatures

private int countFeatures(java.util.BitSet featureSet)
counts the number of features in a subset

Parameters:
featureSet - the feature set for which to count the features
Returns:
the number of features in the subset

resetOptions

private void resetOptions()
resets to defaults