weka.attributeSelection
Class Ranker

java.lang.Object
  extended by weka.attributeSelection.ASSearch
      extended by weka.attributeSelection.Ranker
All Implemented Interfaces:
OptionHandler, RankedOutputSearch, java.io.Serializable, StartSetHandler

public class Ranker
extends ASSearch
implements RankedOutputSearch, StartSetHandler, OptionHandler

Class for ranking the attributes evaluated by an AttributeEvaluator. Valid options are:

-P
Specify a starting set of attributes, e.g. 1,4,7-9.

-T
Specify a threshold by which the AttributeSelection module can
discard attributes.

-N
Specify the number of attributes to retain. Overrides any threshold.

Version:
$Revision: 1.20 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  int[] m_attributeList
          Holds the ordered list of attributes
private  double[] m_attributeMerit
          Holds the list of attribute merit scores
private  int m_calculatedNumToSelect
          Used to compute the number to select
private  int m_classIndex
          Class index of the data if supervised evaluator
private  boolean m_hasClass
          Data has class attribute---if unsupervised evaluator then no class
private  int m_numAttribs
          The number of attributes
private  int m_numToSelect
          The number of attributes to select. -1 indicates that all attributes are to be retained.
private  int[] m_starting
          Holds the starting set as an array of attributes
private  Range m_startRange
          Holds the start set for the search as a range
private  double m_threshold
          A threshold by which to discard attributes---used by the AttributeSelection module
 
Constructor Summary
Ranker()
          Constructor
 
Method Summary
private  void determineNumToSelectFromThreshold(double[][] ranking)
           
private  void determineThreshFromNumToSelect(double[][] ranking)
           
 java.lang.String generateRankingTipText()
          Returns the tip text for this property
 int getCalculatedNumToSelect()
          Gets the calculated number to select.
 boolean getGenerateRanking()
          This is a dummy method.
 int getNumToSelect()
          Gets the number of attributes to be retained.
 java.lang.String[] getOptions()
          Gets the current settings of Ranker.
 java.lang.String getStartSet()
          Returns a list of attributes (and/or attribute ranges) as a String
 double getThreshold()
          Returns the threshold so that the AttributeSelection module can discard attributes from the ranking.
 java.lang.String globalInfo()
          Returns a string describing this search method
private  boolean inStarting(int feat)
           
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
 java.lang.String numToSelectTipText()
          Returns the tip text for this property
 double[][] rankedAttributes()
          Sorts the evaluated attribute list
protected  void resetOptions()
          Resets the options to their default values
 int[] search(ASEvaluation ASEval, Instances data)
          Kind of a dummy search algorithm.
 void setGenerateRanking(boolean doRank)
          This is a dummy set method---Ranker is ONLY capable of producing a ranked list of attributes for attribute evaluators.
 void setNumToSelect(int n)
          Specify the number of attributes to select from the ranked list. -1 indicates that all attributes are to be retained.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setStartSet(java.lang.String startSet)
          Sets a starting set of attributes for the search.
 void setThreshold(double threshold)
          Set the threshold by which the AttributeSelection module can discard attributes.
 java.lang.String startSetTipText()
          Returns the tip text for this property
private  java.lang.String startSetToString()
          Converts the array of starting attributes to a string.
 java.lang.String thresholdTipText()
          Returns the tip text for this property
 java.lang.String toString()
          Returns a description of the search as a String
 
Methods inherited from class weka.attributeSelection.ASSearch
forName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_starting

private int[] m_starting
Holds the starting set as an array of attributes


m_startRange

private Range m_startRange
Holds the start set for the search as a range


m_attributeList

private int[] m_attributeList
Holds the ordered list of attributes


m_attributeMerit

private double[] m_attributeMerit
Holds the list of attribute merit scores


m_hasClass

private boolean m_hasClass
Data has class attribute---if unsupervised evaluator then no class


m_classIndex

private int m_classIndex
Class index of the data if supervised evaluator


m_numAttribs

private int m_numAttribs
The number of attributes


m_threshold

private double m_threshold
A threshold by which to discard attributes---used by the AttributeSelection module


m_numToSelect

private int m_numToSelect
The number of attributes to select. -1 indicates that all attributes are to be retained. Has precedence over m_threshold


m_calculatedNumToSelect

private int m_calculatedNumToSelect
Used to compute the number to select

Constructor Detail

Ranker

public Ranker()
Constructor

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this search method

Returns:
a description of the search suitable for displaying in the explorer/experimenter gui

numToSelectTipText

public java.lang.String numToSelectTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setNumToSelect

public void setNumToSelect(int n)
Specify the number of attributes to select from the ranked list. -1 indicates that all attributes are to be retained.

Specified by:
setNumToSelect in interface RankedOutputSearch
Parameters:
n - the number of attributes to retain

getNumToSelect

public int getNumToSelect()
Gets the number of attributes to be retained.

Specified by:
getNumToSelect in interface RankedOutputSearch
Returns:
the number of attributes to retain

getCalculatedNumToSelect

public int getCalculatedNumToSelect()
Gets the calculated number to select. This might be computed from a threshold or, if a value less than 0 is set as the number to select, it is set to the number of attributes in the (transformed) data.

Specified by:
getCalculatedNumToSelect in interface RankedOutputSearch
Returns:
the calculated number of attributes to select
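
The rule described above can be sketched in plain Java. This is only an illustration, not the Weka source: the class name NumToSelectSketch, the {attributeIndex, merit} row layout of the ranking, the strict "greater than" comparison against the threshold, and the capping of an explicit count at the ranking length are all assumptions made for the example.

```java
final class NumToSelectSketch {

    /**
     * Hypothetical helper. Each ranking row is assumed to be
     * {attributeIndex, merit}, sorted from highest merit to lowest.
     */
    static int calculatedNumToSelect(double[][] ranking, int numToSelect, double threshold) {
        if (numToSelect >= 0) {
            // An explicit count has precedence over the threshold.
            // Capping at the ranking length is a simplifying assumption.
            return Math.min(numToSelect, ranking.length);
        }
        // Otherwise count the attributes whose merit exceeds the threshold.
        // With a very low threshold this is simply every attribute.
        int count = 0;
        for (double[] row : ranking) {
            if (row[1] > threshold) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[][] ranking = {{2, 0.9}, {0, 0.4}, {1, 0.1}};
        System.out.println(calculatedNumToSelect(ranking, -1, 0.3)); // 2
        System.out.println(calculatedNumToSelect(ranking, 1, 0.3));  // 1
    }
}
```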

thresholdTipText

public java.lang.String thresholdTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setThreshold

public void setThreshold(double threshold)
Set the threshold by which the AttributeSelection module can discard attributes.

Specified by:
setThreshold in interface RankedOutputSearch
Parameters:
threshold - the threshold.

getThreshold

public double getThreshold()
Returns the threshold so that the AttributeSelection module can discard attributes from the ranking.

Specified by:
getThreshold in interface RankedOutputSearch
Returns:
a threshold by which to discard attributes

generateRankingTipText

public java.lang.String generateRankingTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setGenerateRanking

public void setGenerateRanking(boolean doRank)
This is a dummy set method---Ranker is ONLY capable of producing a ranked list of attributes for attribute evaluators.

Specified by:
setGenerateRanking in interface RankedOutputSearch
Parameters:
doRank - this parameter is N/A and is ignored

getGenerateRanking

public boolean getGenerateRanking()
This is a dummy method. Ranker can ONLY be used with attribute evaluators and as such can only produce a ranked list of attributes

Specified by:
getGenerateRanking in interface RankedOutputSearch
Returns:
always true.

startSetTipText

public java.lang.String startSetTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setStartSet

public void setStartSet(java.lang.String startSet)
                 throws java.lang.Exception
Sets a starting set of attributes for the search. It is the search method's responsibility to report this start set (if any) in its toString() method.

Specified by:
setStartSet in interface StartSetHandler
Parameters:
startSet - a string containing a list of attributes (and/or ranges), e.g. 1,2,6,10-15.
Throws:
java.lang.Exception - if start set can't be set.

getStartSet

public java.lang.String getStartSet()
Returns a list of attributes (and/or attribute ranges) as a String

Specified by:
getStartSet in interface StartSetHandler
Returns:
a list of attributes (and/or attribute ranges)

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-P
Specify a starting set of attributes, e.g. 1,4,7-9.

-T
Specify a threshold by which the AttributeSelection module can
discard attributes.

-N
Specify the number of attributes to retain. Overrides any threshold.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
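
To make the shape of the options array concrete, here is a minimal, self-contained sketch of a parser for the three flags listed above. It is not the Weka implementation: the class RankerOptionsSketch and its fields are hypothetical, the default threshold of -Double.MAX_VALUE is an assumption, and the sketch throws IllegalArgumentException where the real method declares java.lang.Exception.

```java
final class RankerOptionsSketch {
    String startSet = "";                  // -P, e.g. "1,4,7-9"
    double threshold = -Double.MAX_VALUE;  // -T, assumed default: discard nothing
    int numToSelect = -1;                  // -N, -1 retains all attributes

    /** Parses flag/value pairs such as {"-T", "0.05", "-N", "3"}. */
    static RankerOptionsSketch parse(String[] options) {
        RankerOptionsSketch parsed = new RankerOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (!flag.equals("-P") && !flag.equals("-T") && !flag.equals("-N")) {
                throw new IllegalArgumentException("Unsupported option: " + flag);
            }
            if (i + 1 >= options.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = options[++i]; // consume the value that follows the flag
            switch (flag) {
                case "-P": parsed.startSet = value; break;
                case "-T": parsed.threshold = Double.parseDouble(value); break;
                default:   parsed.numToSelect = Integer.parseInt(value); break;
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        RankerOptionsSketch o = parse(new String[] {"-P", "1,4,7-9", "-T", "0.05", "-N", "3"});
        System.out.println(o.startSet + " " + o.threshold + " " + o.numToSelect);
    }
}
```

With a real Ranker instance, the same array of strings would be passed to setOptions(), and getOptions() would return an array in the same flag/value form.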

getOptions

public java.lang.String[] getOptions()
Gets the current settings of Ranker.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions()

startSetToString

private java.lang.String startSetToString()
Converts the array of starting attributes to a string. This is used by getOptions to return the actual attributes specified as the starting set. This is better than using m_startRange.getRanges() because the same start set can be specified in different ways from the command line---eg 1,2,3 == 1-3. This ensures that values stored in a database are comparable.

Returns:
a comma separated list of individual attribute numbers as a String
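
The point about 1,2,3 == 1-3 can be illustrated with a small stand-alone sketch that expands a start-set string into individual attribute numbers. It does not use Weka's Range class: the class StartSetSketch and the method canonical are hypothetical, only plain numbers and "a-b" ranges are handled, and sorting and de-duplicating the result is an assumption of this example.

```java
import java.util.StringJoiner;
import java.util.TreeSet;

final class StartSetSketch {

    /** Expands e.g. "1,4,7-9" into "1,4,7,8,9" (hypothetical helper). */
    static String canonical(String startSet) {
        TreeSet<Integer> attrs = new TreeSet<>(); // sorted, no duplicates
        for (String part : startSet.split(",")) {
            String p = part.trim();
            if (p.isEmpty()) {
                continue;
            }
            int dash = p.indexOf('-');
            // A single number is treated as a range of length one.
            int first = Integer.parseInt(dash < 0 ? p : p.substring(0, dash).trim());
            int last = Integer.parseInt(dash < 0 ? p : p.substring(dash + 1).trim());
            for (int a = first; a <= last; a++) {
                attrs.add(a);
            }
        }
        StringJoiner out = new StringJoiner(",");
        for (int a : attrs) {
            out.add(Integer.toString(a));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(canonical("1-3"));     // 1,2,3
        System.out.println(canonical("1,4,7-9")); // 1,4,7,8,9
    }
}
```

Because both "1,2,3" and "1-3" expand to the same string, two stored option strings can be compared directly.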

search

public int[] search(ASEvaluation ASEval,
                    Instances data)
             throws java.lang.Exception
Kind of a dummy search algorithm. Calls an attribute evaluator to evaluate each attribute not included in the start set and then sorts them to produce a ranked list of attributes.

Specified by:
search in class ASSearch
Parameters:
data - the training instances.
ASEval - the attribute evaluator to guide the search
Returns:
an array (not necessarily ordered) of selected attribute indexes
Throws:
java.lang.Exception - if the search can't be completed
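
The procedure described above (evaluate each attribute outside the start set, then sort by merit) can be sketched without any Weka types. In this illustration the evaluator is stood in for by an IntToDoubleFunction that maps an attribute index to its merit; the class RankSearchSketch, the 0-based indexes, and the exclusion of the class attribute are assumptions of the sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.function.IntToDoubleFunction;

final class RankSearchSketch {

    /** Returns attribute indexes ordered from highest merit to lowest. */
    static int[] search(IntToDoubleFunction evaluator, int numAttribs,
                        int classIndex, Set<Integer> startSet) {
        List<double[]> scored = new ArrayList<>();
        for (int i = 0; i < numAttribs; i++) {
            // Skip the class attribute (assumed) and anything in the start set.
            if (i == classIndex || startSet.contains(i)) {
                continue;
            }
            scored.add(new double[] {i, evaluator.applyAsDouble(i)});
        }
        // Sort by merit, best first.
        scored.sort(Comparator.comparingDouble((double[] row) -> row[1]).reversed());
        int[] ranked = new int[scored.size()];
        for (int i = 0; i < ranked.length; i++) {
            ranked[i] = (int) scored.get(i)[0];
        }
        return ranked;
    }

    public static void main(String[] args) {
        double[] merits = {0.2, 0.9, 0.5, 0.7};
        int[] ranked = search(i -> merits[i], 4, 3, Collections.singleton(0));
        System.out.println(Arrays.toString(ranked)); // [1, 2]
    }
}
```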

rankedAttributes

public double[][] rankedAttributes()
                            throws java.lang.Exception
Sorts the evaluated attribute list

Specified by:
rankedAttributes in interface RankedOutputSearch
Returns:
an array of sorted (highest eval to lowest) attribute indexes
Throws:
java.lang.Exception - if sorting can't be done.
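
As a rough illustration of the sorting step, the sketch below combines two parallel arrays (mirroring the roles of m_attributeList and m_attributeMerit) into a double[][] ordered from highest merit to lowest. The {attributeIndex, merit} row layout is an assumption, since the return description above mentions only attribute indexes, and RankedAttributesSketch is a hypothetical class.

```java
import java.util.Arrays;

final class RankedAttributesSketch {

    /** Builds rows of {attributeIndex, merit}, sorted best merit first. */
    static double[][] rankedAttributes(int[] attributeList, double[] attributeMerit) {
        if (attributeList.length != attributeMerit.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        double[][] ranked = new double[attributeList.length][];
        for (int i = 0; i < ranked.length; i++) {
            ranked[i] = new double[] {attributeList[i], attributeMerit[i]};
        }
        // Arrays.sort on object arrays is stable, so ties keep their input order.
        Arrays.sort(ranked, (a, b) -> Double.compare(b[1], a[1]));
        return ranked;
    }

    public static void main(String[] args) {
        double[][] ranked = rankedAttributes(new int[] {0, 1, 2}, new double[] {0.1, 0.8, 0.4});
        for (double[] row : ranked) {
            System.out.println((int) row[0] + " " + row[1]);
        }
    }
}
```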

determineNumToSelectFromThreshold

private void determineNumToSelectFromThreshold(double[][] ranking)

determineThreshFromNumToSelect

private void determineThreshFromNumToSelect(double[][] ranking)
                                     throws java.lang.Exception
Throws:
java.lang.Exception

toString

public java.lang.String toString()
Returns a description of the search as a String

Returns:
a description of the search

resetOptions

protected void resetOptions()
Resets the options to their default values


inStarting

private boolean inStarting(int feat)