weka.filters.unsupervised.attribute
Class AddExpression

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.unsupervised.attribute.AddExpression
All Implemented Interfaces:
OptionHandler, java.io.Serializable, StreamableFilter, UnsupervisedFilter

public class AddExpression
extends Filter
implements UnsupervisedFilter, StreamableFilter, OptionHandler

Applies a mathematical expression involving attributes and numeric constants to a dataset. The result of the expression is stored in a new attribute appended after the last attribute. Supported operators are: +, -, *, /, ^, log, abs, cos, exp, sqrt, floor, ceil, rint, tan, sin, (, ). Attributes are specified by prefixing the attribute number with 'a', e.g. a7 is attribute number 7 (numbering starts from 1).
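
To make the attribute numbering concrete, the sketch below computes the sample expression a1^2*a5/log(a7*4.0) by hand for one row of values and appends the result as a new last value. It does not use the filter. The class name ExpressionByHandSketch is hypothetical, and two points are assumptions rather than documented behavior: that log is the natural logarithm, and that the expression groups as ((a1^2)*a5)/log(a7*4.0).

```java
import java.util.Arrays;

final class ExpressionByHandSketch {

    /** Returns a copy of row with a1^2*a5/log(a7*4.0) appended as the last value. */
    static double[] appendResult(double[] row) {
        // Attribute numbers start from 1, so a1 is row[0], a5 is row[4], a7 is row[6].
        double a1 = row[0];
        double a5 = row[4];
        double a7 = row[6];
        double[] out = Arrays.copyOf(row, row.length + 1);
        // Assumption: log is the natural logarithm (Math.log).
        out[row.length] = Math.pow(a1, 2) * a5 / Math.log(a7 * 4.0);
        return out;
    }
}
```

The row must hold at least seven values, because the expression refers to a7.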

Valid filter-specific options are:

-E expression
Specify the expression to apply, e.g. a1^2*a5/log(a7*4.0).

-N name
Specify a name for the new attribute. Default is to name it with the expression provided with the -E option.

-D
Debug. Names the attribute with the postfix parse of the expression.

Version:
$Revision: 1.2 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz)
See Also:
Serialized Form

Nested Class Summary
private  class AddExpression.AttributeOperand
          Inner class handling an attribute index as an operand
private  class AddExpression.NumericOperand
          Inner class for storing numeric constant operands
private  class AddExpression.Operator
          Inner class for storing operators
 
Field Summary
private  java.lang.String m_attributeName
          Name of the new attribute.
private  boolean m_Debug
          If true, makes the attribute name equal to the postfix parse of the expression
private  java.lang.String m_infixExpression
          The infix expression
private  java.util.Stack m_operatorStack
          Operator stack
private  java.util.Vector m_postFixExpVector
          Holds the expression in postfix form
private  java.lang.String m_previousTok
          Holds the previous token
private  boolean m_signMod
          True if the next numeric constant or attribute index is negative
private static java.lang.String OPERATORS
          Supported operators. l = log, b = abs, c = cos, e = exp, s = sqrt, f = floor, h = ceil, r = rint, t = tan, n = sin
private static java.lang.String UNARY_FUNCTIONS
           
 
Fields inherited from class weka.filters.Filter
m_NewBatch
 
Constructor Summary
AddExpression()
           
 
Method Summary
private  void convertInfixToPostfix(java.lang.String infixExp)
          Converts a string containing a mathematical expression in infix form to postfix form.
 java.lang.String debugTipText()
          Returns the tip text for this property
private  void evaluateExpression(double[] vals)
          Evaluate the expression using the supplied array of attribute values.
 java.lang.String expressionTipText()
          Returns the tip text for this property
 boolean getDebug()
          Gets whether debug is set
 java.lang.String getExpression()
          Get the expression
 java.lang.String getName()
          Returns the name of the new attribute
 java.lang.String[] getOptions()
          Gets the current settings of the filter.
 java.lang.String globalInfo()
          Returns a string describing this filter
private  void handleOperand(java.lang.String tok)
          Handles the processing of an infix operand to postfix
private  void handleOperator(java.lang.String tok)
          Handles the processing of an infix operator to postfix
private  int infixPriority(char opp)
          Return the infix priority of an operator
 boolean input(Instance instance)
          Input an instance for filtering.
private  boolean isOperator(char tok)
          Returns true if a token is an operator
private  boolean isUnaryFunction(char tok)
          Returns true if a token is a unary function
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for testing this class.
 java.lang.String nameTipText()
          Returns the tip text for this property
 void setDebug(boolean d)
          Set debug mode.
 void setExpression(java.lang.String expr)
          Set the expression to apply
 boolean setInputFormat(Instances instanceInfo)
          Sets the format of the input instances.
 void setName(java.lang.String name)
          Set the name for the new attribute.
 void setOptions(java.lang.String[] options)
          Parses a list of options for this object.
private  int stackPriority(char opp)
          Return the stack priority of an operator
 
Methods inherited from class weka.filters.Filter
batchFilterFile, batchFinished, bufferInput, copyStringValues, copyStringValues, filterFile, flushInput, getInputFormat, getInputStringIndex, getOutputFormat, getOutputStringIndex, getStringIndices, inputFormat, inputFormatPeek, isOutputFormatDefined, numPendingOutput, output, outputFormat, outputFormatPeek, outputPeek, push, resetQueue, setOutputFormat, useFilter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_infixExpression

private java.lang.String m_infixExpression
The infix expression


m_operatorStack

private java.util.Stack m_operatorStack
Operator stack


OPERATORS

private static final java.lang.String OPERATORS
Supported operators. l = log, b = abs, c = cos, e = exp, s = sqrt, f = floor, h = ceil, r = rint, t = tan, n = sin

See Also:
Constant Field Values
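
The letter codes listed for OPERATORS suggest that each function name is represented by a single character, which would fit the char parameters of isOperator and isUnaryFunction. The sketch below illustrates one way such an encoding could be done with plain string replacement. It is an assumption about the approach, not the filter's actual code, and the class name OperatorCodesSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OperatorCodesSketch {

    // Function name to single-letter code, as listed in the OPERATORS description.
    private static final Map<String, String> CODES = new LinkedHashMap<>();
    static {
        CODES.put("log", "l");
        CODES.put("abs", "b");
        CODES.put("cos", "c");
        CODES.put("exp", "e");
        CODES.put("sqrt", "s");
        CODES.put("floor", "f");
        CODES.put("ceil", "h");
        CODES.put("rint", "r");
        CODES.put("tan", "t");
        CODES.put("sin", "n");
    }

    /** Replaces every function name in the expression with its one-letter code. */
    static String encode(String expression) {
        String result = expression;
        for (Map.Entry<String, String> entry : CODES.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

Under this scheme, a1^2*a5/log(a7*4.0) becomes a1^2*a5/l(a7*4.0), so every operator can be examined as a single char.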

UNARY_FUNCTIONS

private static final java.lang.String UNARY_FUNCTIONS
See Also:
Constant Field Values

m_postFixExpVector

private java.util.Vector m_postFixExpVector
Holds the expression in postfix form


m_signMod

private boolean m_signMod
True if the next numeric constant or attribute index is negative


m_previousTok

private java.lang.String m_previousTok
Holds the previous token


m_attributeName

private java.lang.String m_attributeName
Name of the new attribute. If set to the string "expression", the provided expression is used as the new attribute name.


m_Debug

private boolean m_Debug
If true, makes the attribute name equal to the postfix parse of the expression

Constructor Detail

AddExpression

public AddExpression()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this filter

Returns:
a description of the filter suitable for displaying in the explorer/experimenter gui

handleOperand

private void handleOperand(java.lang.String tok)
                    throws java.lang.Exception
Handles the processing of an infix operand to postfix

Parameters:
tok - the infix operand
Throws:
java.lang.Exception - if there is difficulty parsing the operand

handleOperator

private void handleOperator(java.lang.String tok)
                     throws java.lang.Exception
Handles the processing of an infix operator to postfix

Parameters:
tok - the infix operator
Throws:
java.lang.Exception - if there is difficulty parsing the operator

convertInfixToPostfix

private void convertInfixToPostfix(java.lang.String infixExp)
                            throws java.lang.Exception
Converts a string containing a mathematical expression in infix form to postfix form. The result is stored in the vector m_postFixExpVector.

Parameters:
infixExp - the infix expression to convert
Throws:
java.lang.Exception - if something goes wrong during the conversion
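
For readers unfamiliar with infix-to-postfix conversion, here is a minimal stack-based sketch (the shunting-yard technique). It is not the filter's implementation: the class InfixToPostfixSketch and its methods are hypothetical, the precedence values and the right-associativity of ^ are conventional assumptions that may differ from the filter's own infix and stack priorities, and leading signs such as -a1 are not handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class InfixToPostfixSketch {

    private static final List<String> FUNCTIONS = Arrays.asList(
        "log", "abs", "cos", "exp", "sqrt", "floor", "ceil", "rint", "tan", "sin");

    // Assumed precedence of binary operators; a higher number binds tighter.
    private static int precedence(String op) {
        switch (op) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            case "^": return 3;
            default: return -1; // not a binary operator
        }
    }

    private static boolean isOperand(String tok) {
        return tok.matches("a\\d+") || tok.matches("\\d+(\\.\\d+)?");
    }

    /** Splits the expression into names/numbers and single-character symbols. */
    static List<String> tokenize(String infix) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < infix.length()) {
            char c = infix.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetterOrDigit(c) || c == '.') {
                int start = i;
                while (i < infix.length() && (Character.isLetterOrDigit(infix.charAt(i))
                        || infix.charAt(i) == '.')) {
                    i++;
                }
                tokens.add(infix.substring(start, i));
            } else {
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    /** Converts an infix expression to a list of postfix tokens. */
    static List<String> toPostfix(String infix) {
        List<String> output = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String tok : tokenize(infix)) {
            if (isOperand(tok)) {
                output.add(tok);
            } else if (FUNCTIONS.contains(tok) || tok.equals("(")) {
                stack.push(tok);
            } else if (tok.equals(")")) {
                while (!stack.isEmpty() && !stack.peek().equals("(")) {
                    output.add(stack.pop());
                }
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced parentheses");
                }
                stack.pop(); // discard the "("
                if (!stack.isEmpty() && FUNCTIONS.contains(stack.peek())) {
                    output.add(stack.pop()); // the function applied to the group
                }
            } else if (precedence(tok) > 0) {
                // Pop operators that bind tighter, or equally for left-associative ones.
                while (!stack.isEmpty() && precedence(stack.peek()) > 0
                        && (precedence(stack.peek()) > precedence(tok)
                            || (precedence(stack.peek()) == precedence(tok)
                                && !tok.equals("^")))) {
                    output.add(stack.pop());
                }
                stack.push(tok);
            } else {
                throw new IllegalArgumentException("Unexpected token: " + tok);
            }
        }
        while (!stack.isEmpty()) {
            String top = stack.pop();
            if (top.equals("(")) {
                throw new IllegalArgumentException("Unbalanced parentheses");
            }
            output.add(top);
        }
        return output;
    }
}
```

With these assumptions, toPostfix("a1^2*a5/log(a7*4.0)") yields the tokens a1 2 ^ a5 * a7 4.0 * log /.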

evaluateExpression

private void evaluateExpression(double[] vals)
                         throws java.lang.Exception
Evaluate the expression using the supplied array of attribute values. The result is stored in the last element of the array. Assumes that the infix expression has been converted to postfix and stored in m_postFixExpVector

Parameters:
vals - the values to apply the expression to
Throws:
java.lang.Exception - if something goes wrong
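
The evaluation step can be illustrated with a small stack-based postfix evaluator that, as described above, writes its result into the last element of the value array. This is a sketch under assumptions, not the filter's code: the class PostfixEvalSketch is hypothetical, tokens are plain strings rather than the filter's operand and operator objects, and log is taken to be the natural logarithm.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PostfixEvalSketch {

    /** Evaluates postfix tokens against vals and writes the result to the last slot. */
    static void evaluate(String[] postfix, double[] vals) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String tok : postfix) {
            if (tok.matches("a\\d+")) {
                // Attribute reference: a1 is vals[0], a2 is vals[1], and so on.
                stack.push(vals[Integer.parseInt(tok.substring(1)) - 1]);
            } else if (tok.matches("\\d+(\\.\\d+)?")) {
                stack.push(Double.parseDouble(tok));
            } else if (tok.length() > 1) {
                // Remaining multi-character tokens are treated as unary functions.
                stack.push(applyUnary(tok, stack.pop()));
            } else {
                double right = stack.pop();
                double left = stack.pop();
                stack.push(applyBinary(tok, left, right));
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("Malformed postfix expression");
        }
        vals[vals.length - 1] = stack.pop();
    }

    private static double applyBinary(String op, double left, double right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right;
            case "^": return Math.pow(left, right);
            default: throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    private static double applyUnary(String fn, double x) {
        switch (fn) {
            case "log": return Math.log(x); // assumption: natural logarithm
            case "abs": return Math.abs(x);
            case "cos": return Math.cos(x);
            case "exp": return Math.exp(x);
            case "sqrt": return Math.sqrt(x);
            case "floor": return Math.floor(x);
            case "ceil": return Math.ceil(x);
            case "rint": return Math.rint(x);
            case "tan": return Math.tan(x);
            case "sin": return Math.sin(x);
            default: throw new IllegalArgumentException("Unknown function: " + fn);
        }
    }
}
```

The caller is expected to pass an array with one spare slot at the end for the result, for example {3.0, 4.0, 0.0} for a dataset with two input attributes.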

isOperator

private boolean isOperator(char tok)
Returns true if a token is an operator

Parameters:
tok - the token to check
Returns:
true if the supplied token is an operator

isUnaryFunction

private boolean isUnaryFunction(char tok)
Returns true if a token is a unary function

Parameters:
tok - the token to check
Returns:
true if the supplied token is a unary function

infixPriority

private int infixPriority(char opp)
Return the infix priority of an operator

Returns:
the infix priority

stackPriority

private int stackPriority(char opp)
Return the stack priority of an operator

Returns:
the stack priority

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a list of options for this object. Valid options are:

-E expression
Specify the expression to apply, e.g. a1^2*a5/log(a7*4.0).

-N name
Specify a name for the new attribute. Default is to name it with the expression provided with the -E option.

-D
Debug. Names the attribute with the postfix parse of the expression.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
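
Because the options are passed as an array of strings, each flag and its value occupy separate elements, and the whole expression is a single element. The helper below is a hypothetical illustration of assembling such an array from the three documented options. It does not call the filter, and the class and method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class AddExpressionOptionsSketch {

    /** Builds an option array using the documented -E, -N and -D flags. */
    static String[] buildOptions(String expression, String name, boolean debug) {
        List<String> options = new ArrayList<>();
        options.add("-E");
        options.add(expression); // the whole expression is one array element
        if (name != null) {
            options.add("-N");
            options.add(name);
        }
        if (debug) {
            options.add("-D"); // a flag with no value
        }
        return options.toArray(new String[0]);
    }
}
```

For instance, buildOptions("a1^2*a5/log(a7*4.0)", "ratio", false) produces the four elements -E, a1^2*a5/log(a7*4.0), -N, ratio, which is the shape setOptions is documented to accept.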

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the filter.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

nameTipText

public java.lang.String nameTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setName

public void setName(java.lang.String name)
Set the name for the new attribute. The string "expression" can be used to make the name of the new attribute equal to the expression provided.

Parameters:
name - the name of the new attribute

getName

public java.lang.String getName()
Returns the name of the new attribute

Returns:
the name of the new attribute

debugTipText

public java.lang.String debugTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDebug

public void setDebug(boolean d)
Set debug mode. Causes the new attribute to be named with the postfix parse of the expression

Parameters:
d - true if debug mode is to be used

getDebug

public boolean getDebug()
Gets whether debug is set

Returns:
true if debug is set

expressionTipText

public java.lang.String expressionTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setExpression

public void setExpression(java.lang.String expr)
Set the expression to apply

Parameters:
expr - a mathematical expression to apply

getExpression

public java.lang.String getExpression()
Get the expression

Returns:
the expression

setInputFormat

public boolean setInputFormat(Instances instanceInfo)
                       throws java.lang.Exception
Sets the format of the input instances.

Overrides:
setInputFormat in class Filter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true if the outputFormat may be collected immediately
Throws:
java.lang.Exception - if the format couldn't be set successfully

input

public boolean input(Instance instance)
              throws java.lang.Exception
Input an instance for filtering. Ordinarily the instance is processed and made available for output immediately. Some filters require all instances be read before producing output.

Overrides:
input in class Filter
Parameters:
instance - the input instance
Returns:
true if the filtered instance may now be collected with output().
Throws:
java.lang.IllegalStateException - if no input format has been defined.
java.lang.Exception - if there was a problem during the filtering.

main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - should contain arguments to the filter: use -h for help