weka.filters.unsupervised.instance
Class RemoveFolds

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.unsupervised.instance.RemoveFolds
All Implemented Interfaces:
OptionHandler, java.io.Serializable, UnsupervisedFilter

public class RemoveFolds
extends Filter
implements UnsupervisedFilter, OptionHandler

This filter takes a dataset and outputs a specified fold for cross-validation. If you want the folds to be stratified, use the supervised version. Valid options are:

-V
Specifies if inverse of selection is to be output.

-N number of folds
Specifies number of folds dataset is split into (default 10).

-F fold
Specifies which fold is selected. (default 1)

-S seed
Specifies a random number seed for shuffling the dataset. (default 0, don't randomize)
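
The options above can be illustrated with a small standalone sketch. It is not the Weka implementation: the class FoldSketch and its select method are hypothetical names, and the way rows are divided (contiguous folds, with the first folds receiving any leftover rows) is an assumption made for illustration. It also assumes that only a positive seed triggers shuffling and does not model the zero-folds case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of fold selection; not Weka code. */
public class FoldSketch {

    /**
     * Returns the rows of the chosen fold, or all other rows when inverse is true.
     * The fold index is 1-based, mirroring the -F option.
     */
    public static <T> List<T> select(List<T> data, int numFolds, int fold,
                                     boolean inverse, long seed) {
        if (numFolds < 1 || fold < 1 || fold > numFolds) {
            // Sketch-only precondition; the zero-folds case is not modeled here.
            throw new IllegalArgumentException("need 1 <= fold <= numFolds");
        }
        List<T> rows = new ArrayList<>(data);
        if (seed > 0) {
            // Assumption: a seed of 0 (the default) leaves the order untouched.
            Collections.shuffle(rows, new Random(seed));
        }
        int n = rows.size();
        int base = n / numFolds;   // minimum rows per fold
        int extra = n % numFolds;  // the first `extra` folds get one more row
        int index = fold - 1;
        int start = index * base + Math.min(index, extra);
        int size = base + (index < extra ? 1 : 0);

        List<T> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            boolean inFold = i >= start && i < start + size;
            if (inFold != inverse) {
                result.add(rows.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            data.add(i);
        }
        System.out.println(select(data, 3, 1, false, 0)); // [1, 2, 3, 4]
        System.out.println(select(data, 3, 1, true, 0));  // [5, 6, 7, 8, 9, 10]
    }
}
```

In this sketch, the selected fold plays the role of a test set, while the inverse selection (-V) yields the matching training set.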

Version:
$Revision: 1.1 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
private  int m_Fold
          Fold to output
private  boolean m_Inverse
          Indicates if inverse of selection is to be output.
private  int m_NumFolds
          Number of folds to split dataset into
private  long m_Seed
          Random number seed.
 
Fields inherited from class weka.filters.Filter
m_NewBatch
 
Constructor Summary
RemoveFolds()
           
 
Method Summary
 boolean batchFinished()
          Signify that this batch of input to the filter is finished.
 java.lang.String foldTipText()
          Returns the tip text for this property
 int getFold()
          Gets the fold which is selected.
 boolean getInvertSelection()
          Gets if selection is to be inverted.
 int getNumFolds()
          Gets the number of folds into which the dataset is to be split.
 java.lang.String[] getOptions()
          Gets the current settings of the filter.
 long getSeed()
          Gets the random number seed used for shuffling the dataset.
 java.lang.String globalInfo()
          Returns a string describing this filter
 java.lang.String invertSelectionTipText()
          Returns the tip text for this property
 java.util.Enumeration listOptions()
          Gets an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String numFoldsTipText()
          Returns the tip text for this property
 java.lang.String seedTipText()
          Returns the tip text for this property
 void setFold(int fold)
          Selects a fold.
 boolean setInputFormat(Instances instanceInfo)
          Sets the format of the input instances.
 void setInvertSelection(boolean inverse)
          Sets if selection is to be inverted.
 void setNumFolds(int numFolds)
          Sets the number of folds the dataset is split into.
 void setOptions(java.lang.String[] options)
          Parses the options for this object.
 void setSeed(long seed)
          Sets the random number seed for shuffling the dataset.
 
Methods inherited from class weka.filters.Filter
batchFilterFile, bufferInput, copyStringValues, copyStringValues, filterFile, flushInput, getInputFormat, getInputStringIndex, getOutputFormat, getOutputStringIndex, getStringIndices, input, inputFormat, inputFormatPeek, isOutputFormatDefined, numPendingOutput, output, outputFormat, outputFormatPeek, outputPeek, push, resetQueue, setOutputFormat, useFilter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_Inverse

private boolean m_Inverse
Indicates if inverse of selection is to be output.


m_NumFolds

private int m_NumFolds
Number of folds to split dataset into


m_Fold

private int m_Fold
Fold to output


m_Seed

private long m_Seed
Random number seed.

Constructor Detail

RemoveFolds

public RemoveFolds()
Method Detail

listOptions

public java.util.Enumeration listOptions()
Gets an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses the options for this object. Valid options are:

-V
Specifies if inverse of selection is to be output.

-N number of folds
Specifies number of folds dataset is split into (default 10).

-F fold
Specifies which fold is selected. (default 1)

-S seed
Specifies a random number seed for shuffling the dataset. (default 0, no randomizing)

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
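
To make the option format concrete, the following standalone sketch parses the same four flags into plain fields. It is an illustration only: OptionSketch, parse and next are hypothetical names, and the real filter relies on Weka's own option handling, which may treat malformed or unknown options differently (this sketch throws an IllegalArgumentException).

```java
/** Hypothetical parser for the -V, -N, -F and -S options; not Weka code. */
public class OptionSketch {
    public boolean inverse = false; // -V
    public int numFolds = 10;       // -N, default 10
    public int fold = 1;            // -F, default 1
    public long seed = 0;           // -S, default 0

    public static OptionSketch parse(String[] options) {
        OptionSketch s = new OptionSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-V":
                    s.inverse = true;
                    break;
                case "-N":
                    s.numFolds = Integer.parseInt(next(options, ++i));
                    break;
                case "-F":
                    s.fold = Integer.parseInt(next(options, ++i));
                    break;
                case "-S":
                    s.seed = Long.parseLong(next(options, ++i));
                    break;
                default:
                    // Sketch-only behavior for flags it does not know.
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return s;
    }

    /** Returns the value that follows a flag, or fails if it is missing. */
    private static String next(String[] options, int i) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value for " + options[i - 1]);
        }
        return options[i];
    }

    public static void main(String[] args) {
        OptionSketch s = parse(new String[] {"-V", "-N", "5", "-F", "2", "-S", "42"});
        System.out.println(s.inverse + " " + s.numFolds + " " + s.fold + " " + s.seed);
        // prints "true 5 2 42"
    }
}
```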

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the filter.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

globalInfo

public java.lang.String globalInfo()
Returns a string describing this filter

Returns:
a description of the filter suitable for displaying in the explorer/experimenter gui

invertSelectionTipText

public java.lang.String invertSelectionTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getInvertSelection

public boolean getInvertSelection()
Gets if selection is to be inverted.

Returns:
true if the selection is to be inverted

setInvertSelection

public void setInvertSelection(boolean inverse)
Sets if selection is to be inverted.

Parameters:
inverse - true if inversion is to be performed

numFoldsTipText

public java.lang.String numFoldsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getNumFolds

public int getNumFolds()
Gets the number of folds into which the dataset is to be split.

Returns:
the number of folds the dataset is to be split into.

setNumFolds

public void setNumFolds(int numFolds)
Sets the number of folds the dataset is split into. If the number of folds is zero, the dataset is not split into folds.

Parameters:
numFolds - number of folds dataset is to be split into
Throws:
java.lang.IllegalArgumentException - if number of folds is negative

foldTipText

public java.lang.String foldTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getFold

public int getFold()
Gets the fold which is selected.

Returns:
the fold which is selected

setFold

public void setFold(int fold)
Selects a fold.

Parameters:
fold - the fold to be selected.
Throws:
java.lang.IllegalArgumentException - if fold's index is smaller than 1
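
The argument checks documented for setNumFolds and setFold can be mirrored in a minimal standalone sketch. FoldSettingsSketch is a hypothetical class written for illustration, not the filter itself, and the exception messages are assumptions.

```java
/** Hypothetical holder mirroring the documented argument checks; not Weka code. */
public class FoldSettingsSketch {
    private int numFolds = 10;
    private int fold = 1;

    /** Rejects negative values; zero is accepted and means "do not split". */
    public void setNumFolds(int numFolds) {
        if (numFolds < 0) {
            throw new IllegalArgumentException("Number of folds must not be negative.");
        }
        this.numFolds = numFolds;
    }

    /** Rejects fold indices below 1, since folds are numbered from 1. */
    public void setFold(int fold) {
        if (fold < 1) {
            throw new IllegalArgumentException("Fold index must be at least 1.");
        }
        this.fold = fold;
    }

    public int getNumFolds() {
        return numFolds;
    }

    public int getFold() {
        return fold;
    }

    public static void main(String[] args) {
        FoldSettingsSketch settings = new FoldSettingsSketch();
        settings.setNumFolds(5);
        settings.setFold(2);
        try {
            settings.setFold(0); // invalid: folds are 1-based
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```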

seedTipText

public java.lang.String seedTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getSeed

public long getSeed()
Gets the random number seed used for shuffling the dataset.

Returns:
the random number seed

setSeed

public void setSeed(long seed)
Sets the random number seed for shuffling the dataset. A seed of 0 (the default) disables shuffling, and shuffling is also skipped if the seed is negative.

Parameters:
seed - the random number seed

setInputFormat

public boolean setInputFormat(Instances instanceInfo)
                       throws java.lang.Exception
Sets the format of the input instances.

Overrides:
setInputFormat in class Filter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true because outputFormat can be collected immediately
Throws:
java.lang.Exception - if the input format can't be set successfully

batchFinished

public boolean batchFinished()
Signifies that this batch of input to the filter is finished. output() may now be called to retrieve the filtered instances.

Overrides:
batchFinished in class Filter
Returns:
true if there are instances pending output
Throws:
java.lang.IllegalStateException - if no input structure has been defined

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - should contain arguments to the filter: use -h for help