weka.experiment
Class AveragingResultProducer

java.lang.Object
  extended by weka.experiment.AveragingResultProducer
All Implemented Interfaces:
AdditionalMeasureProducer, OptionHandler, ResultListener, ResultProducer, java.io.Serializable

public class AveragingResultProducer
extends java.lang.Object
implements ResultListener, ResultProducer, OptionHandler, AdditionalMeasureProducer

AveragingResultProducer takes the results from a ResultProducer and submits the average to the result listener. For non-numeric result fields, the first value is used.
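
The averaging rule described above can be illustrated with a small standalone sketch. It does not use any Weka classes and is not Weka's implementation: the class name AveragingSketch and its average method are hypothetical, and the sketch assumes every row has the same length and contains no missing (null) values in numeric columns.

```java
import java.util.Arrays;
import java.util.List;

/** Standalone illustration of the averaging rule; not part of Weka. */
final class AveragingSketch {

    /**
     * Averages Double fields column by column. Any other field takes
     * its value from the first row.
     */
    static Object[] average(List<Object[]> rows) {
        Object[] first = rows.get(0);
        Object[] out = new Object[first.length];
        for (int col = 0; col < first.length; col++) {
            if (first[col] instanceof Double) {
                double sum = 0.0;
                for (Object[] row : rows) {
                    sum += (Double) row[col];
                }
                out[col] = sum / rows.size();
            } else {
                out[col] = first[col]; // non-numeric: keep the first value
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object[]> folds = Arrays.asList(
            new Object[] {"J48", 90.0},
            new Object[] {"J48", 94.0},
            new Object[] {"J48", 92.0});
        System.out.println(Arrays.toString(average(folds))); // prints [J48, 92.0]
    }
}
```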

Version:
$Revision: 1.14 $
Author:
Len Trigg (trigg@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  java.lang.String[] m_AdditionalMeasures
          The names of any additional measures to look for in SplitEvaluators
protected  boolean m_CalculateStdDevs
          True if standard deviation fields should be produced
protected  java.lang.String m_CountFieldName
          The name of the field that will contain the number of results averaged over.
protected  int m_ExpectedResultsPerAverage
          The number of results expected to average over for each run
protected  Instances m_Instances
          The dataset of interest
protected  java.lang.String m_KeyFieldName
          The name of the key field to average over
protected  int m_KeyIndex
          The index of the field to average over in the result producer's key
protected  FastVector m_Keys
          Collects the keys from a single run
protected  ResultListener m_ResultListener
          The ResultListener to send results to
protected  ResultProducer m_ResultProducer
          The ResultProducer used to generate results
protected  FastVector m_Results
          Collects the results from a single run
 
Constructor Summary
AveragingResultProducer()
           
 
Method Summary
 void acceptResult(ResultProducer rp, java.lang.Object[] key, java.lang.Object[] result)
          Accepts results from a ResultProducer.
 java.lang.String calculateStdDevsTipText()
          Returns the tip text for this property
protected  void checkForDuplicateKeys(java.lang.Object[] template)
          Checks whether any duplicate results (with respect to a key template) were received.
protected  void checkForMultipleDifferences()
          Checks that the keys for a run only differ in one key field.
 java.lang.String[] determineColumnConstraints(ResultProducer rp)
          Determines if there are any constraints (imposed by the destination) on the result columns to be produced by resultProducers.
protected  java.lang.Object[] determineTemplate(int run)
          Simulates a run to collect the keys the sub-resultproducer could generate.
protected  void doAverageResult(java.lang.Object[] template)
          Asks the resultlistener whether an average result is required, and if so, calculates it.
 void doRun(int run)
          Gets the results for a specified run number.
 void doRunKeys(int run)
          Gets the keys for a specified run number.
 java.util.Enumeration enumerateMeasures()
          Returns an enumeration of any additional measure names that might be in the result producer
 java.lang.String expectedResultsPerAverageTipText()
          Returns the tip text for this property
protected  int findKeyIndex()
          Scans through the key field names of the result producer to find the index of the key field to average over.
 boolean getCalculateStdDevs()
          Get the value of CalculateStdDevs.
 java.lang.String getCompatibilityState()
          Gets a description of the internal settings of the result producer, sufficient for distinguishing a ResultProducer instance from another with different settings (ignoring those settings set through this interface).
 int getExpectedResultsPerAverage()
          Get the value of ExpectedResultsPerAverage.
 java.lang.String getKeyFieldName()
          Get the value of KeyFieldName.
 java.lang.String[] getKeyNames()
          Gets the names of each of the columns produced for a single run.
 java.lang.Object[] getKeyTypes()
          Gets the data types of each of the columns produced for a single run.
 double getMeasure(java.lang.String additionalMeasureName)
          Returns the value of the named measure
 java.lang.String[] getOptions()
          Gets the current settings of the result producer.
 java.lang.String[] getResultNames()
          Gets the names of each of the columns produced for a single run.
 ResultProducer getResultProducer()
          Get the ResultProducer.
 java.lang.Object[] getResultTypes()
          Gets the data types of each of the columns produced for a single run.
 java.lang.String globalInfo()
          Returns a string describing this result producer
 boolean isResultRequired(ResultProducer rp, java.lang.Object[] key)
          Determines whether the results for a specified key must be generated.
 java.lang.String keyFieldNameTipText()
          Returns the tip text for this property
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
protected  boolean matchesTemplate(java.lang.Object[] template, java.lang.Object[] test)
          Compares a key to a template to see whether they match.
 void postProcess()
          When this method is called, it indicates that no more requests to generate results for the current experiment will be sent.
 void postProcess(ResultProducer rp)
          When this method is called, it indicates that no more results will be sent that need to be grouped together in any way.
 void preProcess()
          Prepare to generate results.
 void preProcess(ResultProducer rp)
          Prepare for the results to be received.
 java.lang.String resultProducerTipText()
          Returns the tip text for this property
 void setAdditionalMeasures(java.lang.String[] additionalMeasures)
          Set a list of method names for additional measures to look for in SplitEvaluators.
 void setCalculateStdDevs(boolean newCalculateStdDevs)
          Set the value of CalculateStdDevs.
 void setExpectedResultsPerAverage(int newExpectedResultsPerAverage)
          Set the value of ExpectedResultsPerAverage.
 void setInstances(Instances instances)
          Sets the dataset that results will be obtained for.
 void setKeyFieldName(java.lang.String newKeyFieldName)
          Set the value of KeyFieldName.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setResultListener(ResultListener listener)
          Sets the object to send results of each run to.
 void setResultProducer(ResultProducer newResultProducer)
          Set the ResultProducer.
 java.lang.String toString()
          Gets a text description of the result producer.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Instances

protected Instances m_Instances
The dataset of interest


m_ResultListener

protected ResultListener m_ResultListener
The ResultListener to send results to


m_ResultProducer

protected ResultProducer m_ResultProducer
The ResultProducer used to generate results


m_AdditionalMeasures

protected java.lang.String[] m_AdditionalMeasures
The names of any additional measures to look for in SplitEvaluators


m_ExpectedResultsPerAverage

protected int m_ExpectedResultsPerAverage
The number of results expected to average over for each run


m_CalculateStdDevs

protected boolean m_CalculateStdDevs
True if standard deviation fields should be produced


m_CountFieldName

protected java.lang.String m_CountFieldName
The name of the field that will contain the number of results averaged over.


m_KeyFieldName

protected java.lang.String m_KeyFieldName
The name of the key field to average over


m_KeyIndex

protected int m_KeyIndex
The index of the field to average over in the result producer's key


m_Keys

protected FastVector m_Keys
Collects the keys from a single run


m_Results

protected FastVector m_Results
Collects the results from a single run

Constructor Detail

AveragingResultProducer

public AveragingResultProducer()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this result producer

Returns:
a description of the result producer suitable for displaying in the explorer/experimenter gui

findKeyIndex

protected int findKeyIndex()
Scans through the key field names of the result producer to find the index of the key field to average over. Sets the value of m_KeyIndex to the index, or -1 if no matching key field was found.

Returns:
the index of the key field to average over

determineColumnConstraints

public java.lang.String[] determineColumnConstraints(ResultProducer rp)
                                              throws java.lang.Exception
Determines if there are any constraints (imposed by the destination) on the result columns to be produced by resultProducers. Null should be returned if there are NO constraints, otherwise a list of column names should be returned as an array of Strings.

Specified by:
determineColumnConstraints in interface ResultListener
Parameters:
rp - the ResultProducer to which the constraints will apply
Returns:
an array of column names to which the ResultProducer's results will be restricted.
Throws:
java.lang.Exception - if constraints can't be determined

determineTemplate

protected java.lang.Object[] determineTemplate(int run)
                                        throws java.lang.Exception
Simulates a run to collect the keys the sub-resultproducer could generate. Does some checking on the keys and determines the template key.

Parameters:
run - the run number
Returns:
a template key (null for the field being averaged)
Throws:
java.lang.Exception - if an error occurs

doRunKeys

public void doRunKeys(int run)
               throws java.lang.Exception
Gets the keys for a specified run number. Different run numbers correspond to different randomizations of the data. Keys produced should be sent to the current ResultListener.

Specified by:
doRunKeys in interface ResultProducer
Parameters:
run - the run number to get keys for.
Throws:
java.lang.Exception - if a problem occurs while getting the keys

doRun

public void doRun(int run)
           throws java.lang.Exception
Gets the results for a specified run number. Different run numbers correspond to different randomizations of the data. Results produced should be sent to the current ResultListener.

Specified by:
doRun in interface ResultProducer
Parameters:
run - the run number to get results for.
Throws:
java.lang.Exception - if a problem occurs while getting the results

matchesTemplate

protected boolean matchesTemplate(java.lang.Object[] template,
                                  java.lang.Object[] test)
Compares a key to a template to see whether they match. Null fields in the template are ignored in the matching.

Parameters:
template - the template to match against
test - the key to test
Returns:
true if the test key matches the template on all non-null template fields
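
A minimal standalone sketch of this matching rule follows. It is an illustration rather than the Weka source: the class name TemplateMatchSketch is hypothetical, and returning false for arrays of different lengths is an assumption, since the documentation above does not say how that case is handled.

```java
/** Standalone illustration of template matching; not part of Weka. */
final class TemplateMatchSketch {

    /**
     * Returns true if the test key equals the template on every
     * non-null template field. Null template fields act as wildcards.
     */
    static boolean matchesTemplate(Object[] template, Object[] test) {
        if (template.length != test.length) {
            return false; // assumption: keys of different length never match
        }
        for (int i = 0; i < template.length; i++) {
            if (template[i] != null && !template[i].equals(test[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The second field (for example, the fold number) is being averaged,
        // so it is null in the template.
        Object[] template = {"iris", null, "J48"};
        System.out.println(matchesTemplate(template, new Object[] {"iris", "3", "J48"}));
        System.out.println(matchesTemplate(template, new Object[] {"iris", "3", "ZeroR"}));
    }
}
```

The first call prints true because the only differing position is the null wildcard; the second prints false because the third field differs.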

doAverageResult

protected void doAverageResult(java.lang.Object[] template)
                        throws java.lang.Exception
Asks the resultlistener whether an average result is required, and if so, calculates it.

Parameters:
template - the template to match keys against when calculating the average
Throws:
java.lang.Exception - if an error occurs

checkForDuplicateKeys

protected void checkForDuplicateKeys(java.lang.Object[] template)
                              throws java.lang.Exception
Checks whether any duplicate results (with respect to a key template) were received.

Parameters:
template - the template key.
Throws:
java.lang.Exception - if duplicate results are detected

checkForMultipleDifferences

protected void checkForMultipleDifferences()
                                    throws java.lang.Exception
Checks that the keys for a run differ in only one key field. A more sophisticated averager would submit multiple results when keys differ in more than one field; for now an exception is thrown. The check currently assumes that the greatest difference occurs between the first and last results received.

Throws:
java.lang.Exception - if the keys differ on fields other than the key averaging field
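
The first-versus-last comparison described above might look roughly like the following standalone sketch. It is not the Weka source: the class name KeyDifferenceSketch is hypothetical, the key list and key index are passed as parameters instead of being read from m_Keys and m_KeyIndex, and IllegalStateException stands in for the checked java.lang.Exception.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Standalone illustration of the difference check; not part of Weka. */
final class KeyDifferenceSketch {

    /**
     * Compares the first and last keys of a run and throws if they differ
     * on any field other than the one at keyIndex.
     */
    static void checkForMultipleDifferences(List<Object[]> keys, int keyIndex) {
        Object[] first = keys.get(0);
        Object[] last = keys.get(keys.size() - 1);
        for (int i = 0; i < first.length; i++) {
            if (i != keyIndex && !Objects.equals(first[i], last[i])) {
                throw new IllegalStateException(
                    "Keys differ on field " + i + ", which is not the averaged field");
            }
        }
    }

    public static void main(String[] args) {
        List<Object[]> keys = Arrays.asList(
            new Object[] {"iris", "1", "J48"},
            new Object[] {"iris", "2", "J48"});
        checkForMultipleDifferences(keys, 1); // only the averaged field differs
        System.out.println("keys are consistent");
    }
}
```

Because only the first and last keys are compared, a key in the middle of the run that differs elsewhere would go unnoticed, which is the limitation the description acknowledges.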

preProcess

public void preProcess(ResultProducer rp)
                throws java.lang.Exception
Prepare for the results to be received.

Specified by:
preProcess in interface ResultListener
Parameters:
rp - the ResultProducer that will generate the results
Throws:
java.lang.Exception - if an error occurs during preprocessing.

preProcess

public void preProcess()
                throws java.lang.Exception
Prepare to generate results. The ResultProducer should call preProcess(this) on the ResultListener it is to send results to.

Specified by:
preProcess in interface ResultProducer
Throws:
java.lang.Exception - if an error occurs during preprocessing.

postProcess

public void postProcess(ResultProducer rp)
                 throws java.lang.Exception
When this method is called, it indicates that no more results will be sent that need to be grouped together in any way.

Specified by:
postProcess in interface ResultListener
Parameters:
rp - the ResultProducer that generated the results
Throws:
java.lang.Exception - if an error occurs

postProcess

public void postProcess()
                 throws java.lang.Exception
When this method is called, it indicates that no more requests to generate results for the current experiment will be sent. The ResultProducer should call postProcess(this) on the ResultListener it is to send results to.

Specified by:
postProcess in interface ResultProducer
Throws:
java.lang.Exception - if an error occurs

acceptResult

public void acceptResult(ResultProducer rp,
                         java.lang.Object[] key,
                         java.lang.Object[] result)
                  throws java.lang.Exception
Accepts results from a ResultProducer.

Specified by:
acceptResult in interface ResultListener
Parameters:
rp - the ResultProducer that generated the results
key - an array of Objects (Strings or Doubles) that uniquely identify a result for a given ResultProducer with given compatibilityState
result - the results stored in an array. The objects stored in the array may be Strings, Doubles, or null (for the missing value).
Throws:
java.lang.Exception - if the result could not be accepted.

isResultRequired

public boolean isResultRequired(ResultProducer rp,
                                java.lang.Object[] key)
                         throws java.lang.Exception
Determines whether the results for a specified key must be generated.

Specified by:
isResultRequired in interface ResultListener
Parameters:
rp - the ResultProducer wanting to generate the results
key - an array of Objects (Strings or Doubles) that uniquely identify a result for a given ResultProducer with given compatibilityState
Returns:
true if the result should be generated
Throws:
java.lang.Exception - if it could not be determined if the result is needed.

getKeyNames

public java.lang.String[] getKeyNames()
                               throws java.lang.Exception
Gets the names of each of the columns produced for a single run.

Specified by:
getKeyNames in interface ResultProducer
Returns:
an array containing the name of each column
Throws:
java.lang.Exception - if key names cannot be generated

getKeyTypes

public java.lang.Object[] getKeyTypes()
                               throws java.lang.Exception
Gets the data types of each of the columns produced for a single run. This method should really be static.

Specified by:
getKeyTypes in interface ResultProducer
Returns:
an array containing objects of the type of each column. The objects should be Strings, or Doubles.
Throws:
java.lang.Exception - if the key types could not be determined (perhaps because of a problem from a nested sub-resultproducer)

getResultNames

public java.lang.String[] getResultNames()
                                  throws java.lang.Exception
Gets the names of each of the columns produced for a single run. A new result field is added for the number of results used to produce each average. If only averages are being produced, the names are not altered. If standard deviations are also produced, "Avg_" and "Dev_" are prepended to each average and deviation field respectively.

Specified by:
getResultNames in interface ResultProducer
Returns:
an array containing the name of each column
Throws:
java.lang.Exception - if the result names could not be determined (perhaps because of a problem from a nested sub-resultproducer)
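
The naming scheme can be sketched in standalone Java as follows. Several details are assumptions that the description above does not settle: the count field is placed first, each "Avg_" name is followed immediately by its "Dev_" name, non-numeric fields keep their original names, and "Num_Fold" is only a placeholder for the count field name. The class ResultNamesSketch is hypothetical and not part of Weka.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration of result-name generation; not part of Weka. */
final class ResultNamesSketch {

    /**
     * Builds the output column names from the sub-producer's names.
     * numeric[i] says whether names[i] is a numeric (averaged) field.
     */
    static List<String> resultNames(String countField, String[] names,
                                    boolean[] numeric, boolean stdDevs) {
        List<String> out = new ArrayList<>();
        out.add(countField); // assumption: the count field comes first
        for (int i = 0; i < names.length; i++) {
            if (stdDevs && numeric[i]) {
                out.add("Avg_" + names[i]);
                out.add("Dev_" + names[i]);
            } else {
                out.add(names[i]); // unchanged when only averages are produced
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] names = {"Percent_correct", "Scheme_summary"};
        boolean[] numeric = {true, false};
        System.out.println(resultNames("Num_Fold", names, numeric, false));
        System.out.println(resultNames("Num_Fold", names, numeric, true));
    }
}
```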

getResultTypes

public java.lang.Object[] getResultTypes()
                                  throws java.lang.Exception
Gets the data types of each of the columns produced for a single run.

Specified by:
getResultTypes in interface ResultProducer
Returns:
an array containing objects of the type of each column. The objects should be Strings, or Doubles.
Throws:
java.lang.Exception - if the result types could not be determined (perhaps because of a problem from a nested sub-resultproducer)

getCompatibilityState

public java.lang.String getCompatibilityState()
Gets a description of the internal settings of the result producer, sufficient for distinguishing a ResultProducer instance from another with different settings (ignoring those settings set through this interface). For example, a cross-validation ResultProducer may have a setting for the number of folds. For a given state, the results produced should be compatible. Typically if a ResultProducer is an OptionHandler, this string will represent the command line arguments required to set the ResultProducer to that state.

Specified by:
getCompatibilityState in interface ResultProducer
Returns:
the description of the ResultProducer state, or null if no state is defined

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-F name
The field name that will be unique for a run (default "Fold")

-X num_results
The expected number of results per run. (default 10)

-S
Calculate standard deviations. (default only averages)

-W classname
Specify the full class name of the result producer.

All options after -- will be passed to the result producer.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
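
As a sketch of how such an options array could be assembled, the standalone helper below builds the String[] from the flags listed above. The class OptionsSketch and its buildOptions method are hypothetical and not part of Weka. The nested producer class name and the "-X 10" passed after "--" are illustrative values only; which options the nested producer accepts depends on that class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Standalone helper that assembles an options array; not part of Weka. */
final class OptionsSketch {

    static String[] buildOptions(String keyField, int expectedResults,
                                 boolean stdDevs, String producerClass,
                                 String... producerOptions) {
        List<String> opts = new ArrayList<>(Arrays.asList(
            "-F", keyField,
            "-X", Integer.toString(expectedResults)));
        if (stdDevs) {
            opts.add("-S");
        }
        opts.add("-W");
        opts.add(producerClass);
        if (producerOptions.length > 0) {
            opts.add("--"); // everything after this goes to the nested producer
            opts.addAll(Arrays.asList(producerOptions));
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions("Fold", 10, true,
            "weka.experiment.CrossValidationResultProducer", "-X", "10");
        System.out.println(String.join(" ", options));
    }
}
```

The resulting array is in the shape that setOptions expects, so it could be passed to setOptions(options) on an AveragingResultProducer instance.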

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the result producer.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

setAdditionalMeasures

public void setAdditionalMeasures(java.lang.String[] additionalMeasures)
Set a list of method names for additional measures to look for in SplitEvaluators. If the experiment is of the type that iterates over a set of properties, this list could contain many measures, of which only a subset may be producible by the current ResultProducer.

Specified by:
setAdditionalMeasures in interface ResultProducer
Parameters:
additionalMeasures - an array of measure names, null if none

enumerateMeasures

public java.util.Enumeration enumerateMeasures()
Returns an enumeration of any additional measure names that might be in the result producer

Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
Returns:
an enumeration of the measure names

getMeasure

public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure

Specified by:
getMeasure in interface AdditionalMeasureProducer
Parameters:
additionalMeasureName - the name of the measure to query for its value
Returns:
the value of the named measure
Throws:
java.lang.IllegalArgumentException - if the named measure is not supported

setInstances

public void setInstances(Instances instances)
Sets the dataset that results will be obtained for.

Specified by:
setInstances in interface ResultProducer
Parameters:
instances - a value of type 'Instances'.

calculateStdDevsTipText

public java.lang.String calculateStdDevsTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getCalculateStdDevs

public boolean getCalculateStdDevs()
Get the value of CalculateStdDevs.

Returns:
Value of CalculateStdDevs.

setCalculateStdDevs

public void setCalculateStdDevs(boolean newCalculateStdDevs)
Set the value of CalculateStdDevs.

Parameters:
newCalculateStdDevs - Value to assign to CalculateStdDevs.

expectedResultsPerAverageTipText

public java.lang.String expectedResultsPerAverageTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getExpectedResultsPerAverage

public int getExpectedResultsPerAverage()
Get the value of ExpectedResultsPerAverage.

Returns:
Value of ExpectedResultsPerAverage.

setExpectedResultsPerAverage

public void setExpectedResultsPerAverage(int newExpectedResultsPerAverage)
Set the value of ExpectedResultsPerAverage.

Parameters:
newExpectedResultsPerAverage - Value to assign to ExpectedResultsPerAverage.

keyFieldNameTipText

public java.lang.String keyFieldNameTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getKeyFieldName

public java.lang.String getKeyFieldName()
Get the value of KeyFieldName.

Returns:
Value of KeyFieldName.

setKeyFieldName

public void setKeyFieldName(java.lang.String newKeyFieldName)
Set the value of KeyFieldName.

Parameters:
newKeyFieldName - Value to assign to KeyFieldName.

setResultListener

public void setResultListener(ResultListener listener)
Sets the object to send results of each run to.

Specified by:
setResultListener in interface ResultProducer
Parameters:
listener - a value of type 'ResultListener'

resultProducerTipText

public java.lang.String resultProducerTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getResultProducer

public ResultProducer getResultProducer()
Get the ResultProducer.

Returns:
the ResultProducer.

setResultProducer

public void setResultProducer(ResultProducer newResultProducer)
Set the ResultProducer.

Parameters:
newResultProducer - new ResultProducer to use.

toString

public java.lang.String toString()
Gets a text description of the result producer.

Returns:
a text description of the result producer.