Environment for DeveLoping KDD-Applications Supported by Index-Structures

de.lmu.ifi.dbs.elki.preprocessing
Class DiSHPreprocessor<V extends RealVector<V,N>,N extends Number>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
          extended by de.lmu.ifi.dbs.elki.preprocessing.DiSHPreprocessor<V,N>
Type Parameters:
V - Vector type
N - Number type
All Implemented Interfaces:
PreferenceVectorPreprocessor<V>, Preprocessor<V>, Parameterizable

public class DiSHPreprocessor<V extends RealVector<V,N>,N extends Number>
extends AbstractParameterizable
implements PreferenceVectorPreprocessor<V>

Preprocessor that assigns DiSH preference vectors to the objects of a database.

Author:
Elke Achtert

Nested Class Summary
static class DiSHPreprocessor.Strategy
          Available strategies for determination of the preference vector.
 
Field Summary
private static String CONDITION
          Description for the determination of the preference vector.
static DoubleDistance DEFAULT_EPSILON
          The default value for epsilon.
static DiSHPreprocessor.Strategy DEFAULT_STRATEGY
          Default strategy.
private  DoubleDistance[] epsilon
          The epsilon value for each dimension.
static OptionID EPSILON_ID
          OptionID for EPSILON_PARAM
protected  DoubleListParameter EPSILON_PARAM
          Parameter Epsilon.
private  int minpts
          Threshold for minimum number of points in the neighborhood.
static OptionID MINPTS_ID
          OptionID for MINPTS_PARAM
static String MINPTS_P
          Option name
protected  IntParameter MINPTS_PARAM
          Parameter Minpts.
private  DiSHPreprocessor.Strategy strategy
          The strategy to determine the preference vector.
static OptionID STRATEGY_ID
          OptionID for STRATEGY_PARAM
private  PatternParameter STRATEGY_PARAM
          Parameter Strategy.
 
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
optionHandler
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
DiSHPreprocessor()
          Provides a new DiSHPreprocessor that computes the preference vectors of the objects of a database.
 
Method Summary
private  BitSet determinePreferenceVector(Database<V> database, Set<Integer>[] neighborIDs, StringBuffer msg)
          Determines the preference vector according to the specified neighbor ids.
private  BitSet determinePreferenceVectorByApriori(Database<V> database, Set<Integer>[] neighborIDs, StringBuffer msg)
          Determines the preference vector with the apriori strategy.
private  BitSet determinePreferenceVectorByMaxIntersection(Set<Integer>[] neighborIDs, StringBuffer msg)
          Determines the preference vector with the max intersection strategy.
 DoubleDistance[] getEpsilon()
          Returns the value of the epsilon parameter.
 int getMinpts()
          Returns minpts.
private  DimensionSelectingDistanceFunction<N,V>[] initDistanceFunctions(Database<V> database, int dimensionality, boolean verbose, boolean time)
          Initializes the dimension-selecting distance functions used to determine the preference vectors.
private  int max(Map<Integer,Set<Integer>> candidates)
          Returns the index of the set with the maximum size contained in the specified map.
private  int maxIntersection(Map<Integer,Set<Integer>> candidates, Set<Integer> set, Set<Integer> result)
          Returns the index of the set in the specified map that has the largest intersection with the specified set.
 void run(Database<V> database, boolean verbose, boolean time)
          This method executes the actual preprocessing step of this Preprocessor for the objects of the specified database.
 List<String> setParameters(List<String> args)
          Grabs all specified options from the option handler.
 String shortDescription()
          Returns a short description of the class.
 
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
addOption, addParameterizable, addParameterizable, checkGlobalParameterConstraints, collectOptions, getAttributeSettings, getParameters, rememberParametersExcept, removeOption, removeParameterizable
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable
checkGlobalParameterConstraints, collectOptions, getParameters
 

Field Detail

DEFAULT_EPSILON

public static final DoubleDistance DEFAULT_EPSILON
The default value for epsilon.


EPSILON_ID

public static final OptionID EPSILON_ID
OptionID for EPSILON_PARAM


MINPTS_P

public static final String MINPTS_P
Option name

See Also:
Constant Field Values

CONDITION

private static final String CONDITION
Description for the determination of the preference vector.

See Also:
Constant Field Values

MINPTS_ID

public static final OptionID MINPTS_ID
OptionID for MINPTS_PARAM


DEFAULT_STRATEGY

public static DiSHPreprocessor.Strategy DEFAULT_STRATEGY
Default strategy.


STRATEGY_ID

public static final OptionID STRATEGY_ID
OptionID for STRATEGY_PARAM


EPSILON_PARAM

protected final DoubleListParameter EPSILON_PARAM
Parameter Epsilon.


epsilon

private DoubleDistance[] epsilon
The epsilon value for each dimension.


MINPTS_PARAM

protected final IntParameter MINPTS_PARAM
Parameter Minpts.


minpts

private int minpts
Threshold for minimum number of points in the neighborhood.


STRATEGY_PARAM

private final PatternParameter STRATEGY_PARAM
Parameter Strategy.


strategy

private DiSHPreprocessor.Strategy strategy
The strategy to determine the preference vector.

Constructor Detail

DiSHPreprocessor

public DiSHPreprocessor()
Provides a new DiSHPreprocessor that computes the preference vectors of the objects of a database.

Method Detail

run

public void run(Database<V> database,
                boolean verbose,
                boolean time)
Description copied from interface: Preprocessor
This method executes the actual preprocessing step of this Preprocessor for the objects of the specified database.

Specified by:
run in interface Preprocessor<V extends RealVector<V,N>>
Parameters:
database - the database for which the preprocessing is performed
verbose - flag to allow verbose messages while performing the algorithm
time - flag to request output of performance time

shortDescription

public String shortDescription()
Description copied from class: AbstractParameterizable
Returns a short description of the class.

Specified by:
shortDescription in interface Parameterizable
Overrides:
shortDescription in class AbstractParameterizable
Returns:
Description of the class

setParameters

public List<String> setParameters(List<String> args)
                           throws ParameterException
Description copied from class: AbstractParameterizable
Grabs all specified options from the option handler. Any extending class should call this method first, then set its own required parameters, and finally return the list it received from this method without further changes. An example of overriding this method that takes advantage of the options previously defined in superclasses would be:

 {
   List<String> remainingParameters = super.setParameters(args);
   // set parameters for your class
   // for example like this:
   if(isSet(MY_PARAM_VALUE_PARAM))
   {
      myParamValue = getParameterValue(MY_PARAM_VALUE_PARAM);
   }
   .
   .
   .
   return remainingParameters;
   // or in case of attributes requesting parameters themselves
   // return parameterizableAttribute.setParameters(remainingParameters);
 }
 

Specified by:
setParameters in interface Parameterizable
Overrides:
setParameters in class AbstractParameterizable
Parameters:
args - the parameters used to set the attributes
Returns:
a list containing the unused parameters
Throws:
ParameterException - if a parameter setting is invalid
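
The override rule described above can be tried out with a small self-contained sketch. BaseComponent, MinptsComponent, and the "-minpts" option name are hypothetical stand-ins invented for illustration; they are not ELKI classes and do not reproduce its option handler. The sketch assumes options of the form "-name value".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a parameterizable base class (not ELKI's API).
class BaseComponent {
  // Names of the options (each followed by one value) this object understands.
  protected final List<String> knownOptions = new ArrayList<>();
  // Values grabbed from the argument list, keyed by option name.
  protected final Map<String, String> values = new HashMap<>();

  // Grabs every known "-name value" pair and returns the unused arguments.
  public List<String> setParameters(List<String> args) {
    List<String> remaining = new ArrayList<>();
    for (int i = 0; i < args.size(); i++) {
      String arg = args.get(i);
      if (knownOptions.contains(arg) && i + 1 < args.size()) {
        values.put(arg, args.get(++i)); // consume the option and its value
      } else {
        remaining.add(arg);
      }
    }
    return remaining;
  }
}

// Hypothetical subclass that follows the "call super first" rule.
class MinptsComponent extends BaseComponent {
  private int minpts = 1; // illustrative default

  MinptsComponent() {
    knownOptions.add("-minpts"); // illustrative option name
  }

  @Override
  public List<String> setParameters(List<String> args) {
    List<String> remaining = super.setParameters(args);
    if (values.containsKey("-minpts")) {
      minpts = Integer.parseInt(values.get("-minpts"));
    }
    return remaining; // passed on unchanged
  }

  public int getMinpts() {
    return minpts;
  }
}
```

Calling setParameters with "-minpts", "5", "-other", "x" on a MinptsComponent sets minpts to 5 and hands "-other", "x" back to the caller, which can pass them on to another component.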

determinePreferenceVector

private BitSet determinePreferenceVector(Database<V> database,
                                         Set<Integer>[] neighborIDs,
                                         StringBuffer msg)
                                  throws ParameterException,
                                         UnableToComplyException
Determines the preference vector according to the specified neighbor ids.

Parameters:
database - the database storing the objects
neighborIDs - the list of ids of the neighbors in each dimension
msg - a string buffer for debug messages
Returns:
the preference vector
Throws:
ParameterException
UnableToComplyException

determinePreferenceVectorByApriori

private BitSet determinePreferenceVectorByApriori(Database<V> database,
                                                  Set<Integer>[] neighborIDs,
                                                  StringBuffer msg)
                                           throws ParameterException,
                                                  UnableToComplyException
Determines the preference vector with the apriori strategy.

Parameters:
database - the database storing the objects
neighborIDs - the list of ids of the neighbors in each dimension
msg - a string buffer for debug messages
Returns:
the preference vector
Throws:
ParameterException
UnableToComplyException
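
One way to picture an apriori-style strategy is a level-wise search over sets of dimensions, where the support of a set is the number of ids shared by the neighborhoods of all its dimensions. The sketch below is an assumption about the general idea, not the code of this class: the class name AprioriPreferenceSketch, the use of a list indexed by dimension instead of the Set<Integer>[] array, the "at least minpts" threshold, and the tie-breaking by highest support are all illustrative choices.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and thresholds are assumptions.
final class AprioriPreferenceSketch {
  static BitSet preferenceVector(List<Set<Integer>> neighborIDs, int minpts) {
    int dim = neighborIDs.size();
    // Level 1: single dimensions whose neighborhood holds at least minpts ids.
    Map<BitSet, Set<Integer>> level = new LinkedHashMap<>();
    List<Integer> frequentDims = new ArrayList<>();
    for (int d = 0; d < dim; d++) {
      if (neighborIDs.get(d).size() >= minpts) {
        BitSet single = new BitSet(dim);
        single.set(d);
        level.put(single, neighborIDs.get(d));
        frequentDims.add(d);
      }
    }
    BitSet best = new BitSet(dim);
    while (!level.isEmpty()) {
      // All sets on one level have the same cardinality, so the last
      // non-empty level holds the largest sets; keep the best supported one.
      int bestSupport = -1;
      for (Map.Entry<BitSet, Set<Integer>> e : level.entrySet()) {
        if (e.getValue().size() > bestSupport) {
          best = e.getKey();
          bestSupport = e.getValue().size();
        }
      }
      // Next level: extend each set by one frequent dimension above its
      // highest member, so that every dimension set is generated only once.
      Map<BitSet, Set<Integer>> next = new LinkedHashMap<>();
      for (Map.Entry<BitSet, Set<Integer>> e : level.entrySet()) {
        int highest = e.getKey().length() - 1;
        for (int d : frequentDims) {
          if (d <= highest) continue;
          Set<Integer> common = new HashSet<>(e.getValue());
          common.retainAll(neighborIDs.get(d));
          if (common.size() >= minpts) {
            BitSet extended = (BitSet) e.getKey().clone();
            extended.set(d);
            next.put(extended, common);
          }
        }
      }
      level = next;
    }
    return best;
  }
}
```

Because support can only shrink when a dimension is added, discarding a set that falls below minpts also discards all of its supersets, which is what keeps the search level-wise.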

determinePreferenceVectorByMaxIntersection

private BitSet determinePreferenceVectorByMaxIntersection(Set<Integer>[] neighborIDs,
                                                          StringBuffer msg)
Determines the preference vector with the max intersection strategy.

Parameters:
neighborIDs - the list of ids of the neighbors in each dimension
msg - a string buffer for debug messages
Returns:
the preference vector
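
A max intersection strategy can be read as a greedy procedure: start with the dimension that has the largest neighborhood, then repeatedly add the dimension whose neighborhood shares the most ids with the current intersection, stopping once the intersection becomes too small. The sketch below follows that reading under stated assumptions: the class name MaxIntersectionSketch, the list indexed by dimension (in place of the Set<Integer>[] array), the "at least minpts" threshold, and the tie-breaking toward the lower dimension are illustrative and not taken from this class.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch only; names and thresholds are assumptions.
final class MaxIntersectionSketch {
  static BitSet preferenceVector(List<Set<Integer>> neighborIDs, int minpts) {
    BitSet preference = new BitSet(neighborIDs.size());
    // Candidates: dimensions whose neighborhood is large enough on its own.
    // A TreeMap keeps the iteration order, and thus tie-breaking, deterministic.
    Map<Integer, Set<Integer>> candidates = new TreeMap<>();
    for (int d = 0; d < neighborIDs.size(); d++) {
      if (neighborIDs.get(d).size() >= minpts) {
        candidates.put(d, neighborIDs.get(d));
      }
    }
    if (candidates.isEmpty()) {
      return preference;
    }
    // Start with the dimension that has the largest neighborhood.
    int start = -1;
    for (Map.Entry<Integer, Set<Integer>> e : candidates.entrySet()) {
      if (start < 0 || e.getValue().size() > candidates.get(start).size()) {
        start = e.getKey();
      }
    }
    Set<Integer> current = new HashSet<>(candidates.remove(start));
    preference.set(start);

    while (!candidates.isEmpty()) {
      // Find the candidate sharing the most ids with the current intersection.
      int bestDim = -1;
      Set<Integer> bestCommon = null;
      for (Map.Entry<Integer, Set<Integer>> e : candidates.entrySet()) {
        Set<Integer> common = new HashSet<>(current);
        common.retainAll(e.getValue());
        if (bestCommon == null || common.size() > bestCommon.size()) {
          bestDim = e.getKey();
          bestCommon = common;
        }
      }
      candidates.remove(bestDim);
      if (bestCommon.size() < minpts) {
        break; // no remaining dimension can keep the intersection large enough
      }
      preference.set(bestDim);
      current = bestCommon;
    }
    return preference;
  }
}
```

The greedy choice makes this cheaper than an exhaustive search over dimension sets, at the price of possibly missing a larger set of dimensions that a different starting dimension would have reached.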

max

private int max(Map<Integer,Set<Integer>> candidates)
Returns the index of the set with the maximum size contained in the specified map.

Parameters:
candidates - the map containing the sets
Returns:
the index of the set with the maximum size

maxIntersection

private int maxIntersection(Map<Integer,Set<Integer>> candidates,
                            Set<Integer> set,
                            Set<Integer> result)
Returns the index of the set in the specified map that has the largest intersection with the specified set.

Parameters:
candidates - the map containing the sets
set - the set to intersect with
result - the set to put the result in
Returns:
the index of the set with the maximum intersection

initDistanceFunctions

private DimensionSelectingDistanceFunction<N,V>[] initDistanceFunctions(Database<V> database,
                                                                        int dimensionality,
                                                                        boolean verbose,
                                                                        boolean time)
                                                                 throws ParameterException
Initializes the dimension-selecting distance functions used to determine the preference vectors.

Parameters:
database - the database storing the objects
dimensionality - the dimensionality of the objects
verbose - flag to allow verbose messages while performing the algorithm
time - flag to request output of performance time
Returns:
the dimension-selecting distance functions used to determine the preference vectors
Throws:
ParameterException
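
To make the role of one distance function per dimension concrete, the following sketch computes, for a single query object, the ids of its neighbors in each dimension separately. It is a simplified illustration, not this class's implementation: it assumes the data is a plain double[][] whose row index serves as the object id, that the distance in dimension d is the absolute difference of the d-th coordinates, and that epsilon holds one threshold per dimension. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; data layout and distance are assumptions.
final class PerDimensionNeighbors {
  // For every dimension d, collects the ids (row indexes) of all points whose
  // d-th coordinate lies within epsilon[d] of the query point's d-th
  // coordinate. The query point itself is always included.
  static List<Set<Integer>> neighborIDs(double[][] data, int queryId, double[] epsilon) {
    List<Set<Integer>> result = new ArrayList<>();
    for (int d = 0; d < epsilon.length; d++) {
      Set<Integer> ids = new HashSet<>();
      for (int id = 0; id < data.length; id++) {
        if (Math.abs(data[id][d] - data[queryId][d]) <= epsilon[d]) {
          ids.add(id);
        }
      }
      result.add(ids);
    }
    return result;
  }
}
```

A result of this shape, one id set per dimension, corresponds to the neighborIDs argument that the preference vector methods of this class take as an array.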

getEpsilon

public DoubleDistance[] getEpsilon()
Returns the value of the epsilon parameter.

Returns:
the value of the epsilon parameter

getMinpts

public int getMinpts()
Returns minpts.

Returns:
minpts

Release 0.2.1 (2009-07-13_1605)