Environment for
DeveLoping
KDD-Applications
Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.statistics
Class DistanceStatisticsWithClasses<V extends RealVector<V,?>,D extends NumberDistance<D,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
          extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<O,R>
              extended by de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm<V,D,CollectionResult<DoubleVector>>
                  extended by de.lmu.ifi.dbs.elki.algorithm.statistics.DistanceStatisticsWithClasses<V,D>
Type Parameters:
V - Vector type
D - Distance type
All Implemented Interfaces:
Algorithm<V,CollectionResult<DoubleVector>>, Parameterizable

public class DistanceStatisticsWithClasses<V extends RealVector<V,?>,D extends NumberDistance<D,?>>
extends DistanceBasedAlgorithm<V,D,CollectionResult<DoubleVector>>

Algorithm to gather statistics over the distance distribution in the data set.
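
As a rough illustration of what gathering statistics over a distance distribution can involve, the following sketch computes the exact minimum and maximum pairwise distance and then counts all pairwise distances in equal-width bins. It is written in plain Java and does not use the ELKI API: the class name DistanceHistogramSketch, the methods euclidean, minMax and histogram, and the choice of Euclidean distance are all assumptions made for this example, and the actual implementation of this class may differ. Sampling is not shown.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; not the ELKI implementation.
public class DistanceHistogramSketch {
    /** Euclidean distance between two vectors of equal length (assumed distance function). */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Exact minimum and maximum over all pairs of points; returns {min, max}. */
    static double[] minMax(double[][] data) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < data.length; i++) {
            for (int j = i + 1; j < data.length; j++) {
                double d = euclidean(data[i], data[j]);
                min = Math.min(min, d);
                max = Math.max(max, d);
            }
        }
        return new double[] { min, max };
    }

    /** Counts all pairwise distances in numbin equal-width bins spanning [min, max]. */
    static int[] histogram(double[][] data, int numbin) {
        double[] mm = minMax(data);
        double range = mm[1] - mm[0];
        int[] counts = new int[numbin];
        for (int i = 0; i < data.length; i++) {
            for (int j = i + 1; j < data.length; j++) {
                double d = euclidean(data[i], data[j]);
                // If all distances are equal, everything falls into bin 0.
                int bin = range > 0 ? (int) ((d - mm[0]) / range * numbin) : 0;
                // The maximum distance would index one past the end, so clamp it.
                counts[Math.min(bin, numbin - 1)]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        double[][] data = { { 0, 0 }, { 3, 4 }, { 6, 8 } };
        // Pairwise distances are 5, 10 and 5, so this prints [2, 1].
        System.out.println(Arrays.toString(histogram(data, 2)));
    }
}
```

The sketch makes two passes over all pairs, one to find the value range and one to fill the bins, which costs quadratic time in the number of points. Sampling is one way to reduce the cost of a range-finding pass on larger data sets.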

Author:
Erich Schubert

Field Summary
static OptionID HISTOGRAM_BINS_ID
          OptionID for HISTOGRAM_BINS_OPTION
private  IntParameter HISTOGRAM_BINS_OPTION
          Option to configure the number of bins to use.
private  int numbin
          Number of bins to use in sampling.
private  CollectionResult<DoubleVector> result
           
private  boolean sampling
          Whether sampling is enabled.
private  Flag SAMPLING_FLAG
          Flag to enable sampling. Key: -h
static OptionID SAMPLING_ID
          OptionID for SAMPLING_FLAG
 
Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
DISTANCE_FUNCTION_ID, DISTANCE_FUNCTION_PARAM
 
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
optionHandler
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
DistanceStatisticsWithClasses()
          Empty constructor.
 
Method Summary
private  DoubleMinMax exactMinMax(Database<V> database, DistanceFunction<V,D> distFunc)
           
 Description getDescription()
          Describes the algorithm and its use.
 CollectionResult<DoubleVector> getResult()
          Returns the result object.
protected  CollectionResult<DoubleVector> runInTime(Database<V> database)
          Iterates over all points in the database.
private  DoubleMinMax sampleMinMax(Database<V> database, DistanceFunction<V,D> distFunc)
           
 List<String> setParameters(List<String> args)
          Calls the super method and instantiates DistanceBasedAlgorithm.distanceFunction according to the value of parameter DistanceBasedAlgorithm.DISTANCE_FUNCTION_PARAM.
private  void shrinkHeap(TreeSet<FCPair<Double,Integer>> hotset, int k)
           
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
getDistanceFunction
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
addOption, addParameterizable, addParameterizable, checkGlobalParameterConstraints, collectOptions, getAttributeSettings, getParameters, rememberParametersExcept, removeOption, removeParameterizable, shortDescription
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable
checkGlobalParameterConstraints, collectOptions, getParameters, shortDescription
 

Field Detail

result

private CollectionResult<DoubleVector> result

SAMPLING_ID

public static final OptionID SAMPLING_ID
OptionID for SAMPLING_FLAG


SAMPLING_FLAG

private final Flag SAMPLING_FLAG
Flag to enable sampling

Key: -h


HISTOGRAM_BINS_ID

public static final OptionID HISTOGRAM_BINS_ID
OptionID for HISTOGRAM_BINS_OPTION


HISTOGRAM_BINS_OPTION

private final IntParameter HISTOGRAM_BINS_OPTION
Option to configure the number of bins to use.


numbin

private int numbin
Number of bins to use in sampling.


sampling

private boolean sampling
Whether sampling is enabled.

Constructor Detail

DistanceStatisticsWithClasses

public DistanceStatisticsWithClasses()
Empty constructor. Nothing to do.

Method Detail

runInTime

protected CollectionResult<DoubleVector> runInTime(Database<V> database)
                                            throws IllegalStateException
Iterates over all points in the database.

Specified by:
runInTime in class AbstractAlgorithm<V extends RealVector<V,?>,CollectionResult<DoubleVector>>
Parameters:
database - the database to run the algorithm on
Returns:
the Result computed by this algorithm
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(List<String>) method has not been called).

sampleMinMax

private DoubleMinMax sampleMinMax(Database<V> database,
                                  DistanceFunction<V,D> distFunc)

exactMinMax

private DoubleMinMax exactMinMax(Database<V> database,
                                 DistanceFunction<V,D> distFunc)

shrinkHeap

private void shrinkHeap(TreeSet<FCPair<Double,Integer>> hotset,
                        int k)

getDescription

public Description getDescription()
Describes the algorithm and its use.

Returns:
a description of the algorithm

getResult

public CollectionResult<DoubleVector> getResult()
Returns the result object.

Returns:
the result of the algorithm

setParameters

public List<String> setParameters(List<String> args)
                           throws ParameterException
Description copied from class: DistanceBasedAlgorithm
Calls the super method and instantiates DistanceBasedAlgorithm.distanceFunction according to the value of parameter DistanceBasedAlgorithm.DISTANCE_FUNCTION_PARAM. The remaining parameters are passed to the DistanceBasedAlgorithm.distanceFunction.

Specified by:
setParameters in interface Parameterizable
Overrides:
setParameters in class DistanceBasedAlgorithm<V extends RealVector<V,?>,D extends NumberDistance<D,?>,CollectionResult<DoubleVector>>
Parameters:
args - the parameters from which the attributes are set
Returns:
a list containing the unused parameters
Throws:
ParameterException - if a parameter is set incorrectly
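
The contract described above, where a component consumes the parameters it recognizes and returns the unused ones so they can be passed on, can be sketched as follows. This is a plain Java illustration and not the ELKI option handling: the class ParameterPassingSketch, the option names -bins and -sampling, and the default bin count are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not the ELKI option handling.
public class ParameterPassingSketch {
    int numbin = 20;          // assumed default, not taken from the source
    boolean sampling = false;

    /** Consumes the options this class knows and returns all others unchanged and in order. */
    List<String> setParameters(List<String> args) {
        List<String> remaining = new ArrayList<String>();
        for (int i = 0; i < args.size(); i++) {
            String arg = args.get(i);
            if (arg.equals("-bins")) {            // hypothetical option name
                if (i + 1 >= args.size()) {
                    throw new IllegalArgumentException("Missing value for -bins");
                }
                numbin = Integer.parseInt(args.get(++i));
            } else if (arg.equals("-sampling")) { // hypothetical flag name
                sampling = true;
            } else {
                // Not recognized here; leave it for the next component.
                remaining.add(arg);
            }
        }
        return remaining;
    }
}
```

A caller would hand the returned list to the next parameterizable component, for example a distance function, and could treat anything left over at the end of the chain as an error.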

Release 0.2 (2009-07-06_1820)