Environment for DeveLoping KDD-Applications Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm
Class DependencyDerivator<V extends RealVector<V,?>,D extends Distance<D>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
          extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<O>
              extended by de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm<V,D>
                  extended by de.lmu.ifi.dbs.elki.algorithm.DependencyDerivator<V,D>
Type Parameters:
V - the type of RealVector handled by this Algorithm
D - the type of Distance used by this Algorithm
All Implemented Interfaces:
Algorithm<V>, Loggable, Parameterizable

public class DependencyDerivator<V extends RealVector<V,?>,D extends Distance<D>>
extends DistanceBasedAlgorithm<V,D>

The dependency derivator computes quantitative linear dependencies among the attributes of a given dataset, based on a linear correlation PCA.
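As an illustration of the idea, the following minimal sketch derives one linear dependency from two-dimensional points. It is not the ELKI implementation: the class name DependencySketch2D and its method are hypothetical, the eigen decomposition is the closed form for a 2x2 covariance matrix, and it assumes that the eigenvector with the smallest eigenvalue serves as the normal vector n of the fitted line, so that the dependency reads n·x = n·centroid.

```java
import java.util.Locale;

// Hypothetical illustration class, not part of ELKI.
class DependencySketch2D {

  /** Returns {n0, n1, rhs} such that n0*x + n1*y = rhs describes the fitted line. */
  static double[] deriveDependency(double[][] points) {
    int n = points.length;
    // Centroid of the data.
    double mx = 0, my = 0;
    for (double[] p : points) {
      mx += p[0];
      my += p[1];
    }
    mx /= n;
    my /= n;
    // Covariance matrix [[a, b], [b, c]].
    double a = 0, b = 0, c = 0;
    for (double[] p : points) {
      double dx = p[0] - mx, dy = p[1] - my;
      a += dx * dx;
      b += dx * dy;
      c += dy * dy;
    }
    a /= n;
    b /= n;
    c /= n;
    // Smallest eigenvalue of a symmetric 2x2 matrix (closed form).
    double lambdaMin = (a + c) / 2 - Math.sqrt((a - c) * (a - c) / 4 + b * b);
    // Matching eigenvector; an axis-parallel case needs separate handling.
    double n0, n1;
    if (Math.abs(b) < 1e-12) {
      if (a <= c) { n0 = 1; n1 = 0; } else { n0 = 0; n1 = 1; }
    } else {
      n0 = b;
      n1 = lambdaMin - a;
    }
    double len = Math.hypot(n0, n1);
    n0 /= len;
    n1 /= len;
    // The line passes through the centroid: n . x = n . centroid.
    return new double[] { n0, n1, n0 * mx + n1 * my };
  }

  public static void main(String[] args) {
    // Points lying exactly on y = 2x + 1.
    double[][] points = { { 0, 1 }, { 1, 3 }, { 2, 5 }, { 3, 7 } };
    double[] eq = deriveDependency(points);
    System.out.println(String.format(Locale.ROOT,
        "%.4f * x + %.4f * y = %.4f", eq[0], eq[1], eq[2]));
  }
}
```

Dividing the printed equation by the coefficient of y recovers the familiar slope and intercept form. In higher dimensions a general eigen solver would replace the closed form.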

Reference:
E. Achtert, C. Boehm, H.-P. Kriegel, P. Kroeger, A. Zimek: Deriving Quantitative Dependencies for Correlation Clusters.
In Proc. 12th Int. Conf. on Knowledge Discovery and Data Mining (KDD '06), Philadelphia, PA 2006.

Author:
Arthur Zimek

Field Summary
 NumberFormat NF
          Number format for output of solution.
static OptionID OUTPUT_ACCURACY_ID
          OptionID for OUTPUT_ACCURACY_PARAM
private  IntParameter OUTPUT_ACCURACY_PARAM
          Parameter to specify the number of fraction digits used for output accuracy; must be an integer greater than or equal to 0.
private  LinearLocalPCA<V> pca
          Holds the object performing the pca.
private  Flag RANDOM_SAMPLE_FLAG
          Flag to use a random sample; if the flag is not set, a kNN query around the centroid is used instead.
static OptionID SAMPLE_SIZE_ID
          OptionID for SAMPLE_SIZE_PARAM
private  IntParameter SAMPLE_SIZE_PARAM
          Optional parameter to specify the threshold for the size of the random sample to use; must be an integer greater than 0.
private  Integer sampleSize
          Holds the value of SAMPLE_SIZE_PARAM.
private  CorrelationAnalysisSolution<V> solution
          Holds the solution.
 
Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
DISTANCE_FUNCTION_ID, DISTANCE_FUNCTION_PARAM
 
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
optionHandler
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug
 
Constructor Summary
DependencyDerivator()
          Provides a dependency derivator, adding the parameters OUTPUT_ACCURACY_PARAM and SAMPLE_SIZE_PARAM and the flag RANDOM_SAMPLE_FLAG to the option handler, in addition to the parameters of the superclass.
 
Method Summary
 List<AttributeSettings> getAttributeSettings()
          Calls DistanceBasedAlgorithm.getAttributeSettings() and adds the attribute settings of the pca to the returned attribute settings.
 Description getDescription()
          Returns a description of the algorithm.
 CorrelationAnalysisSolution<V> getResult()
          Returns the result of the algorithm.
 void runInTime(Database<V> db)
          Runs the pca.
 String[] setParameters(String[] args)
          Calls DistanceBasedAlgorithm#setParameters(args) and additionally sets the values of the parameters OUTPUT_ACCURACY_PARAM and SAMPLE_SIZE_PARAM.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
getDistanceFunction
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
description, isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
addOption, checkGlobalParameterConstraints, deleteOption, description, description, getParameters, getParameterValue, getPossibleOptions, inlineDescription, isSet, setParameters
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, message, progress, progress, progress, verbose, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable
checkGlobalParameterConstraints, getParameters, getPossibleOptions, inlineDescription
 

Field Detail

OUTPUT_ACCURACY_ID

public static final OptionID OUTPUT_ACCURACY_ID
OptionID for OUTPUT_ACCURACY_PARAM


OUTPUT_ACCURACY_PARAM

private final IntParameter OUTPUT_ACCURACY_PARAM
Parameter to specify the number of fraction digits used for output accuracy; must be an integer greater than or equal to 0.

Default value: 4

Key: -derivator.accuracy
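A fraction-digit setting of this kind is commonly applied to a java.text.NumberFormat. The sketch below shows one way to do that; the class OutputAccuracySketch and its method are hypothetical, and how this class actually configures its NF field is an assumption.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical illustration class, not part of ELKI.
class OutputAccuracySketch {

  /** Builds a number format that always prints exactly the given number of fraction digits. */
  static NumberFormat forAccuracy(int fractionDigits) {
    if (fractionDigits < 0) {
      throw new IllegalArgumentException("fraction digits must be >= 0");
    }
    // A fixed locale keeps the decimal separator predictable.
    NumberFormat nf = NumberFormat.getInstance(Locale.US);
    nf.setMaximumFractionDigits(fractionDigits);
    nf.setMinimumFractionDigits(fractionDigits);
    nf.setGroupingUsed(false);
    return nf;
  }

  public static void main(String[] args) {
    NumberFormat nf = forAccuracy(4); // 4 is the documented default
    System.out.println(nf.format(0.123456));
    System.out.println(nf.format(2.0));
  }
}
```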


SAMPLE_SIZE_ID

public static final OptionID SAMPLE_SIZE_ID
OptionID for SAMPLE_SIZE_PARAM


SAMPLE_SIZE_PARAM

private final IntParameter SAMPLE_SIZE_PARAM
Optional parameter to specify the threshold for the size of the random sample to use; must be an integer greater than 0.

Default value: the size of the complete dataset

Key: -derivator.sampleSize


sampleSize

private Integer sampleSize
Holds the value of SAMPLE_SIZE_PARAM.


RANDOM_SAMPLE_FLAG

private final Flag RANDOM_SAMPLE_FLAG
Flag to use a random sample; if the flag is not set, a kNN query around the centroid is used instead.

Key: -derivator.randomSample
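The two sampling strategies can be sketched as follows. This is an illustration only: the class SampleSelectionSketch is hypothetical, it uses plain double[] points with Euclidean distance instead of the configurable distance function, and the seed parameter exists only to make the random branch reproducible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class, not part of ELKI.
class SampleSelectionSketch {

  /** Mean of all points, computed per dimension. */
  static double[] centroid(List<double[]> data) {
    double[] c = new double[data.get(0).length];
    for (double[] p : data) {
      for (int i = 0; i < c.length; i++) {
        c[i] += p[i] / data.size();
      }
    }
    return c;
  }

  /** Squared Euclidean distance; sufficient for ranking neighbors. */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return sum;
  }

  /** Picks sampleSize points: at random if randomSample is set, else those nearest to the centroid. */
  static List<double[]> select(List<double[]> data, int sampleSize, boolean randomSample, long seed) {
    if (sampleSize <= 0) {
      throw new IllegalArgumentException("sample size must be > 0");
    }
    int k = Math.min(sampleSize, data.size());
    List<double[]> copy = new ArrayList<>(data);
    if (randomSample) {
      Collections.shuffle(copy, new Random(seed));
    } else {
      final double[] c = centroid(data);
      copy.sort(Comparator.comparingDouble((double[] p) -> squaredDistance(p, c)));
    }
    return new ArrayList<>(copy.subList(0, k));
  }
}
```

Sorting the whole dataset is the simplest way to express the nearest-neighbor branch; an index-backed kNN query would avoid the full sort on large data.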


pca

private LinearLocalPCA<V extends RealVector<V,?>> pca
Holds the object performing the pca.


solution

private CorrelationAnalysisSolution<V extends RealVector<V,?>> solution
Holds the solution.


NF

public final NumberFormat NF
Number format for output of solution.

Constructor Detail

DependencyDerivator

public DependencyDerivator()
Provides a dependency derivator, adding the parameters OUTPUT_ACCURACY_PARAM and SAMPLE_SIZE_PARAM and the flag RANDOM_SAMPLE_FLAG to the option handler, in addition to the parameters of the superclass.

Method Detail

getDescription

public Description getDescription()
Description copied from interface: Algorithm
Returns a description of the algorithm.

Returns:
a description of the algorithm
See Also:
Algorithm.getDescription()

runInTime

public void runInTime(Database<V> db)
               throws IllegalStateException
Runs the pca.

Specified by:
runInTime in class AbstractAlgorithm<V extends RealVector<V,?>>
Parameters:
db - the database
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method has not been called).
See Also:
AbstractAlgorithm.runInTime(Database)

getResult

public CorrelationAnalysisSolution<V> getResult()
Description copied from interface: Algorithm
Returns the result of the algorithm.

Returns:
the result of the algorithm
See Also:
Algorithm.getResult()

setParameters

public String[] setParameters(String[] args)
                       throws ParameterException
Calls DistanceBasedAlgorithm#setParameters(args) and additionally sets the values of the parameters OUTPUT_ACCURACY_PARAM and SAMPLE_SIZE_PARAM. The remaining parameters are passed to the pca.

Specified by:
setParameters in interface Parameterizable
Overrides:
setParameters in class DistanceBasedAlgorithm<V extends RealVector<V,?>,D extends Distance<D>>
Parameters:
args - the parameters according to which the attributes are set
Returns:
String[] an array containing the unused parameters
Throws:
ParameterException - in case of wrong parameter-setting
See Also:
Parameterizable.setParameters(String[])
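The contract of consuming known options and returning the unused ones can be sketched as below. The class ParameterSketch is hypothetical and does not use the ELKI option handler; the three keys are the ones documented for this class, handling the flag in the same method is an assumption, and IllegalArgumentException stands in for ParameterException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of ELKI.
class ParameterSketch {
  int accuracy = 4;             // documented default
  Integer sampleSize = null;    // null stands for "the complete dataset"
  boolean randomSample = false; // flag not set

  /** Consumes the derivator options and returns every argument it did not use. */
  String[] setParameters(String[] args) {
    List<String> unused = new ArrayList<>();
    for (int i = 0; i < args.length; i++) {
      switch (args[i]) {
        case "-derivator.accuracy":
          accuracy = requireInt(args, ++i, 0);
          break;
        case "-derivator.sampleSize":
          sampleSize = requireInt(args, ++i, 1);
          break;
        case "-derivator.randomSample":
          randomSample = true;
          break;
        default:
          // Left for another component, e.g. the pca.
          unused.add(args[i]);
      }
    }
    return unused.toArray(new String[0]);
  }

  /** Reads the integer at position i and enforces the lower bound. */
  private static int requireInt(String[] args, int i, int min) {
    if (i >= args.length) {
      throw new IllegalArgumentException("missing value for " + args[i - 1]);
    }
    int value = Integer.parseInt(args[i]);
    if (value < min) {
      throw new IllegalArgumentException(args[i - 1] + " must be >= " + min);
    }
    return value;
  }
}
```

Given the hypothetical arguments -derivator.accuracy 2 -pca.foo bar, the sketch sets accuracy to 2 and returns the two unrecognized tokens, which a caller could then hand to the next parameterizable component.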

getAttributeSettings

public List<AttributeSettings> getAttributeSettings()
Calls DistanceBasedAlgorithm.getAttributeSettings() and adds the attribute settings of the pca to the returned attribute settings.

Specified by:
getAttributeSettings in interface Parameterizable
Overrides:
getAttributeSettings in class DistanceBasedAlgorithm<V extends RealVector<V,?>,D extends Distance<D>>
Returns:
the setting of the attributes of the parameterizable
See Also:
Parameterizable.getAttributeSettings()

Release 0.1 (2008-07-10_1838)