weka.attributeSelection
Class BestFirst

java.lang.Object
  extended byweka.attributeSelection.ASSearch
      extended byweka.attributeSelection.BestFirst
All Implemented Interfaces:
OptionHandler, java.io.Serializable, StartSetHandler

public class BestFirst
extends ASSearch
implements OptionHandler, StartSetHandler

Class for performing a best first search.

Valid options are:

-P
Specify a starting set of attributes. Eg 1,4,7-9.

-D <-1 = backward | 0 = bidirectional | 1 = forward>
Direction of the search. (default = 1).

-N
Number of non improving nodes to consider before terminating search. (default = 5).

-S
Size of lookup cache for evaluated subsets. Expressed as a multiple of the number of attributes in the data set. (default = 1).

Version:
$Revision: 1.23 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Nested Class Summary
 class BestFirst.Link2
          Class for a node in a linked list.
 class BestFirst.LinkedList2
          Class for handling a linked list.
 
Field Summary
private  double m_bestMerit
          holds the merit of the best subset found
private  int m_cacheSize
          holds the maximum size of the lookup cache for evaluated subsets
private  int m_classIndex
          holds the class index
private  boolean m_debug
          for debugging
private  boolean m_hasClass
          does the data have a class
private  int m_maxStale
          maximum number of stale nodes before terminating search
private  int m_numAttribs
          number of attributes in the data
private  int m_searchDirection
          0 == backward search, 1 == forward search, 2 == bidirectional
private  int[] m_starting
          holds an array of starting attributes
private  Range m_startRange
          holds the start set for the search as a Range
private  int m_totalEvals
          total number of subsets evaluated during a search
private static int SELECTION_BACKWARD
          search directions
private static int SELECTION_BIDIRECTIONAL
           
private static int SELECTION_FORWARD
           
static Tag[] TAGS_SELECTION
           
 
Constructor Summary
BestFirst()
          Constructor
 
Method Summary
private  int[] attributeList(java.util.BitSet group)
          converts a BitSet into a list of attribute indexes
 java.lang.String directionTipText()
          Returns the tip text for this property
 SelectedTag getDirection()
          Get the search direction
 int getLookupCacheSize()
          Return the maximum size of the evaluated subset cache (expressed as a multiplier for the number of attributes in a data set.
 java.lang.String[] getOptions()
          Gets the current settings of BestFirst.
 int getSearchTermination()
          Get the termination criterion (number of non-improving nodes).
 java.lang.String getStartSet()
          Returns a list of attributes (and or attribute ranges) as a String
 java.lang.String globalInfo()
          Returns a string describing this search method
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 java.lang.String lookupCacheSizeTipText()
          Returns the tip text for this property
private  void printGroup(java.util.BitSet tt, int numAttribs)
           
protected  void resetOptions()
          Reset options to default values
 int[] search(ASEvaluation ASEval, Instances data)
          Searches the attribute subset space by best first search
 java.lang.String searchTerminationTipText()
          Returns the tip text for this property
 void setDirection(SelectedTag d)
          Set the search direction
 void setLookupCacheSize(int size)
          Set the maximum size of the evaluated subset cache (hashtable).
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSearchTermination(int t)
          Set the numnber of non-improving nodes to consider before terminating search.
 void setStartSet(java.lang.String startSet)
          Sets a starting set of attributes for the search.
 java.lang.String startSetTipText()
          Returns the tip text for this property
private  java.lang.String startSetToString()
          converts the array of starting attributes to a string.
 java.lang.String toString()
          returns a description of the search as a String
 
Methods inherited from class weka.attributeSelection.ASSearch
forName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_maxStale

private int m_maxStale
maximum number of stale nodes before terminating search


m_searchDirection

private int m_searchDirection
0 == backward search, 1 == forward search, 2 == bidirectional


SELECTION_BACKWARD

private static final int SELECTION_BACKWARD
search directions

See Also:
Constant Field Values

SELECTION_FORWARD

private static final int SELECTION_FORWARD
See Also:
Constant Field Values

SELECTION_BIDIRECTIONAL

private static final int SELECTION_BIDIRECTIONAL
See Also:
Constant Field Values

TAGS_SELECTION

public static final Tag[] TAGS_SELECTION

m_starting

private int[] m_starting
holds an array of starting attributes


m_startRange

private Range m_startRange
holds the start set for the search as a Range


m_hasClass

private boolean m_hasClass
does the data have a class


m_classIndex

private int m_classIndex
holds the class index


m_numAttribs

private int m_numAttribs
number of attributes in the data


m_totalEvals

private int m_totalEvals
total number of subsets evaluated during a search


m_debug

private boolean m_debug
for debugging


m_bestMerit

private double m_bestMerit
holds the merit of the best subset found


m_cacheSize

private int m_cacheSize
holds the maximum size of the lookup cache for evaluated subsets

Constructor Detail

BestFirst

public BestFirst()
Constructor

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this search method

Returns:
a description of the search method suitable for displaying in the explorer/experimenter gui

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-P
Specify a starting set of attributes. Eg 1,4,7-9.

-D <-1 = backward | 0 = bidirectional | 1 = forward>
Direction of the search. (default = 1).

-N
Number of non improving nodes to consider before terminating search. (default = 5).

-S
Size of lookup cache for evaluated subsets. Expressed as a multiple of the number of attributes in the data set. (default = 1).

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

setLookupCacheSize

public void setLookupCacheSize(int size)
Set the maximum size of the evaluated subset cache (hashtable). This is expressed as a multiplier for the number of attributes in the data set. (default = 1).

Parameters:
size - the maximum size of the hashtable

getLookupCacheSize

public int getLookupCacheSize()
Return the maximum size of the evaluated subset cache (expressed as a multiplier for the number of attributes in a data set.

Returns:
the maximum size of the hashtable.

lookupCacheSizeTipText

public java.lang.String lookupCacheSizeTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

startSetTipText

public java.lang.String startSetTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setStartSet

public void setStartSet(java.lang.String startSet)
                 throws java.lang.Exception
Sets a starting set of attributes for the search. It is the search method's responsibility to report this start set (if any) in its toString() method.

Specified by:
setStartSet in interface StartSetHandler
Parameters:
startSet - a string containing a list of attributes (and or ranges), eg. 1,2,6,10-15.
Throws:
java.lang.Exception - if start set can't be set.

getStartSet

public java.lang.String getStartSet()
Returns a list of attributes (and or attribute ranges) as a String

Specified by:
getStartSet in interface StartSetHandler
Returns:
a list of attributes (and or attribute ranges)

searchTerminationTipText

public java.lang.String searchTerminationTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setSearchTermination

public void setSearchTermination(int t)
                          throws java.lang.Exception
Set the numnber of non-improving nodes to consider before terminating search.

Parameters:
t - the number of non-improving nodes
Throws:
java.lang.Exception - if t is less than 1

getSearchTermination

public int getSearchTermination()
Get the termination criterion (number of non-improving nodes).

Returns:
the number of non-improving nodes

directionTipText

public java.lang.String directionTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDirection

public void setDirection(SelectedTag d)
Set the search direction

Parameters:
d - the direction of the search

getDirection

public SelectedTag getDirection()
Get the search direction

Returns:
the direction of the search

getOptions

public java.lang.String[] getOptions()
Gets the current settings of BestFirst.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

startSetToString

private java.lang.String startSetToString()
converts the array of starting attributes to a string. This is used by getOptions to return the actual attributes specified as the starting set. This is better than using m_startRanges.getRanges() as the same start set can be specified in different ways from the command line---eg 1,2,3 == 1-3. This is to ensure that stuff that is stored in a database is comparable.

Returns:
a comma seperated list of individual attribute numbers as a String

toString

public java.lang.String toString()
returns a description of the search as a String

Returns:
a description of the search

printGroup

private void printGroup(java.util.BitSet tt,
                        int numAttribs)

search

public int[] search(ASEvaluation ASEval,
                    Instances data)
             throws java.lang.Exception
Searches the attribute subset space by best first search

Specified by:
search in class ASSearch
Parameters:
data - the training instances.
ASEval - the attribute evaluator to guide the search
Returns:
an array (not necessarily ordered) of selected attribute indexes
Throws:
java.lang.Exception - if the search can't be completed

resetOptions

protected void resetOptions()
Reset options to default values


attributeList

private int[] attributeList(java.util.BitSet group)
converts a BitSet into a list of attribute indexes

Parameters:
group - the BitSet to convert
Returns:
an array of attribute indexes