java.lang.Object
  de.lmu.ifi.dbs.elki.logging.AbstractLoggable
    de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
      de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<ParameterizationFunction>
        de.lmu.ifi.dbs.elki.algorithm.clustering.correlation.CASH
public class CASH
Subspace clustering algorithm based on the Hough transform. Todo (elke): hierarchy (later).
Field Summary | |
---|---|
private boolean |
adjust
Flag indicating that an adjustment of the applied heuristic for choosing an interval is performed after an interval is selected. |
private Flag |
ADJUST_FLAG
Flag to indicate that an adjustment of the applied heuristic for choosing an interval is performed after an interval is selected. |
private Database<ParameterizationFunction> |
database
The database holding the objects. |
private double |
jitter
The maximum allowed jitter for distance values. |
private DoubleParameter |
JITTER_PARAM
Parameter to specify the maximum jitter for distance values, must be a double greater than 0. |
private int |
maxLevel
The maximum level for splitting the hypercube. |
private IntParameter |
MAXLEVEL_PARAM
Parameter to specify the maximum level for splitting the hypercube, must be an integer greater than 0. |
private int |
minDim
The minimum dimensionality for the subspaces to be found. |
private IntParameter |
MINDIM_PARAM
Parameter to specify the minimum dimensionality of the subspaces to be found, must be an integer greater than 0. |
private int |
minPts
Minimum points in a cluster. |
private IntParameter |
MINPTS_PARAM
Parameter to specify the threshold for minimum number of points in a cluster, must be an integer greater than 0. |
private int |
noiseDim
Holds the dimensionality for noise. |
private Set<Integer> |
processedIDs
Holds a set of processed ids. |
private CASHResult |
result
The result. |
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable |
---|
optionHandler |
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable |
---|
debug |
Constructor Summary | |
---|---|
CASH()
Provides a new CASH algorithm. |
Method Summary | |
---|---|
private Database<ParameterizationFunction> |
buildDB(int dim,
Matrix basis,
Set<Integer> ids,
Database<ParameterizationFunction> database)
Builds a (dim-1)-dimensional database where the objects are projected into the specified subspace. |
private Database<RealVector> |
buildDerivatorDB(Database<ParameterizationFunction> database,
CASHInterval interval)
Builds a database for the derivator consisting of the ids in the specified interval. |
private Matrix |
determineBasis(double[] alpha)
Determines a basis defining a subspace described by the specified alpha values. |
private double[] |
determineMinMaxDistance(Database<ParameterizationFunction> database,
int dimensionality)
Determines the minimum and maximum function value of all parametrization functions stored in the specified database. |
private CASHInterval |
determineNextIntervalAtMaxLevel(DefaultHeap<Integer,CASHInterval> heap)
Determines the next "best" interval at maximum level, i.e. the next interval containing the most unprocessed objects. |
private CASHInterval |
doDetermineNextIntervalAtMaxLevel(DefaultHeap<Integer,CASHInterval> heap)
Recursive helper method to determine the next "best" interval at maximum level, i.e. the next interval containing the most unprocessed objects. |
private SubspaceClusterMap |
doRun(Database<ParameterizationFunction> database,
Progress progress)
Runs the CASH algorithm on the specified database; this method is called recursively until only noise is left. |
private Set<Integer> |
getDatabaseIDs(Database<ParameterizationFunction> database)
Returns the set of ids belonging to the specified database. |
Description |
getDescription()
Returns a description of the algorithm. |
Result<ParameterizationFunction> |
getResult()
Returns the result of the algorithm. |
private void |
initHeap(DefaultHeap<Integer,CASHInterval> heap,
Database<ParameterizationFunction> database,
int dim,
Set<Integer> ids)
Initializes the heap with the root intervals. |
private ParameterizationFunction |
project(Matrix basis,
ParameterizationFunction f)
Projects the specified parametrization function into the subspace described by the given basis. |
private Matrix |
runDerivator(Database<ParameterizationFunction> database,
int dim,
CASHInterval interval,
Set<Integer> ids)
Runs the derivator on the specified interval and assigns to the resulting model all points whose distance to it is less than the standard deviation of the derivator model. |
protected void |
runInTime(Database<ParameterizationFunction> database)
The run method, encapsulated in a runtime measurement. |
String[] |
setParameters(String[] args)
Grabs all specified options from the option handler and sets the values for the flags AbstractAlgorithm.VERBOSE_FLAG and AbstractAlgorithm.TIME_FLAG. |
private double |
sinusProduct(int start,
int end,
double[] alpha)
Computes the product of the sine values of the specified angles from the start index to the end index. |
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm |
---|
description, isTime, isVerbose, run, setTime, setVerbose |
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable |
---|
addOption, checkGlobalParameterConstraints, deleteOption, description, description, getAttributeSettings, getParameters, getParameterValue, getPossibleOptions, inlineDescription, isSet, setParameters |
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable |
---|
debugFine, debugFiner, debugFinest, exception, message, progress, progress, progress, verbose, verbose, warning |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable |
---|
checkGlobalParameterConstraints, getAttributeSettings, getParameters, getPossibleOptions, inlineDescription |
Field Detail |
---|
private final IntParameter MINPTS_PARAM
Key: -cash.minpts
private final IntParameter MAXLEVEL_PARAM
Key: -cash.maxlevel
private final IntParameter MINDIM_PARAM
Default value: 1
Key: -cash.mindim
private final DoubleParameter JITTER_PARAM
Key: -cash.jitter
private final Flag ADJUST_FLAG
Key: -cash.adjust
private CASHResult result
private int minPts
private boolean adjust
private int maxLevel
private int minDim
private double jitter
private int noiseDim
private Set<Integer> processedIDs
private Database<ParameterizationFunction> database
Constructor Detail |
---|
public CASH()
Method Detail |
---|
protected void runInTime(Database<ParameterizationFunction> database) throws IllegalStateException
runInTime in class AbstractAlgorithm<ParameterizationFunction>
database - the database to run the algorithm on
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method has not been called).
public Result<ParameterizationFunction> getResult()
public Description getDescription()
public String[] setParameters(String[] args) throws ParameterException
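Description copied from class AbstractAlgorithm: Grabs all specified options from the option handler and sets the values for the flags AbstractAlgorithm.VERBOSE_FLAG and AbstractAlgorithm.TIME_FLAG.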
Any extending class should call this method first and return the returned array without further changes, but only after setting any further required parameters. An example of overriding this method that takes advantage of the options previously defined in superclasses would be:
{
  String[] remainingParameters = super.setParameters(args);
  // set parameters for your class
  // for example like this:
  if(isSet(MY_PARAM_VALUE_PARAM)) {
    myParamValue = getParameterValue(MY_PARAM_VALUE_PARAM);
  }
  . . .
  return remainingParameters;
  // or in case of attributes requesting parameters themselves
  // return parameterizableAttribute.setParameters(remainingParameters);
}
setParameters in interface Parameterizable
setParameters in class AbstractAlgorithm<ParameterizationFunction>
args - the parameters used to set the attributes
ParameterException - in case of a wrong parameter setting
See also: Parameterizable.setParameters(String[])
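The "call super first, then read your own options" rule can be tried out without ELKI on the classpath. The following is a minimal sketch under stated assumptions: `BaseAlgorithm`, `MyAlgorithm`, and the option bookkeeping are hypothetical stand-ins, not ELKI's `AbstractAlgorithm` or its option handler. The method names `addOption`, `isSet`, and `getParameterValue` mirror names listed on this page, but their signatures here are invented for illustration, and the keys `-verbose` and `-time` are assumed. Only `-cash.minpts` and its "greater than 0" constraint come from the documentation above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical base class: grabs every registered option, returns the rest. */
class BaseAlgorithm {
    // key -> true if the option expects a value, false if it is a plain flag
    private final Map<String, Boolean> registered = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    BaseAlgorithm() {
        addOption("-verbose", false); // assumed key, not taken from the page
        addOption("-time", false);    // assumed key, not taken from the page
    }

    protected final void addOption(String key, boolean takesValue) {
        registered.put(key, takesValue);
    }

    protected final boolean isSet(String key) {
        return values.containsKey(key);
    }

    protected final String getParameterValue(String key) {
        return values.get(key);
    }

    String[] setParameters(String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            Boolean takesValue = registered.get(args[i]);
            if (takesValue == null) {
                remaining.add(args[i]); // unknown here: hand it back to the caller
            } else if (takesValue) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("missing value for " + args[i]);
                }
                values.put(args[i], args[i + 1]);
                i++; // skip the value that was just consumed
            } else {
                values.put(args[i], "true");
            }
        }
        return remaining.toArray(new String[0]);
    }
}

/** Hypothetical subclass following the override rule described above. */
class MyAlgorithm extends BaseAlgorithm {
    static final String MINPTS_KEY = "-cash.minpts";
    int minPts;

    MyAlgorithm() {
        addOption(MINPTS_KEY, true);
    }

    @Override
    String[] setParameters(String[] args) {
        String[] remainingParameters = super.setParameters(args);
        if (isSet(MINPTS_KEY)) {
            minPts = Integer.parseInt(getParameterValue(MINPTS_KEY));
            if (minPts <= 0) {
                throw new IllegalArgumentException(MINPTS_KEY + " must be greater than 0");
            }
        }
        return remainingParameters; // returned without further changes
    }
}
```

Because the subclass registers its option in the constructor, the superclass call already consumes it, which is why the subclass can return the remaining array untouched.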
private SubspaceClusterMap doRun(Database<ParameterizationFunction> database, Progress progress) throws UnableToComplyException, ParameterException, NonNumericFeaturesException
database - the current database to run the CASH algorithm on
progress - the progress object for verbose messages
UnableToComplyException - if a database-related error occurs
ParameterException - if the parameter setting is wrong
NonNumericFeaturesException - if non-numeric feature vectors are used
private void initHeap(DefaultHeap<Integer,CASHInterval> heap, Database<ParameterizationFunction> database, int dim, Set<Integer> ids)
heap - the heap to be initialized
database - the database storing the parameterization functions
dim - the dimensionality of the database
ids - the ids of the database
private Database<ParameterizationFunction> buildDB(int dim, Matrix basis, Set<Integer> ids, Database<ParameterizationFunction> database) throws UnableToComplyException
dim - the dimensionality of the database
basis - the basis defining the subspace
ids - the ids for the new database
database - the database storing the parameterization functions
UnableToComplyException - if a database-related error occurs
private ParameterizationFunction project(Matrix basis, ParameterizationFunction f)
basis - the basis defining the subspace
f - the parametrization function to be projected
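As a rough illustration of what projecting into a subspace described by a basis means, the sketch below computes the coordinates of a point with respect to the columns of a basis matrix. It is written under assumptions: plain `double[][]` arrays stand in for ELKI's `Matrix`, the basis columns are taken to be orthonormal, and `ProjectionSketch` is a hypothetical name. How `project` applies this to a `ParameterizationFunction` internally is not shown on this page.

```java
import java.util.Arrays;

class ProjectionSketch {
    /**
     * Returns the coordinates of point p with respect to the columns of basis.
     * basis has p.length rows and k columns; the columns are assumed orthonormal.
     */
    static double[] project(double[][] basis, double[] p) {
        int k = basis[0].length;
        double[] projected = new double[k];
        for (int col = 0; col < k; col++) {
            double sum = 0.0;
            for (int row = 0; row < p.length; row++) {
                sum += p[row] * basis[row][col]; // dot product with one basis column
            }
            projected[col] = sum;
        }
        return projected;
    }

    public static void main(String[] args) {
        // The first two unit vectors span a 2-dimensional subspace of 3-D space.
        double[][] basis = { {1, 0}, {0, 1}, {0, 0} };
        double[] q = project(basis, new double[] {3, 4, 5});
        System.out.println(Arrays.toString(q)); // prints [3.0, 4.0]
    }
}
```

A 3-dimensional point projected onto a 2-column basis yields 2 coordinates, which matches the drop from dim to dim-1 that `buildDB` describes.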
private Matrix determineBasis(double[] alpha)
alpha - the alpha values
private double sinusProduct(int start, int end, double[] alpha)
start - the index to start at
end - the index to end at
alpha - the array of angles
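To make the role of the sine product concrete, here is a minimal sketch of how angles can be turned into a unit normal vector using spherical coordinates, and how a point's function value at those angles can then be read off as a dot product. Several things are assumptions rather than facts from this page: that the end index is exclusive, that d-1 angles describe a d-dimensional normal vector with the last angle fixed to 0, and the helper names `normalVector` and `functionValue`, which are hypothetical.

```java
class SphericalSketch {
    /** Product of sin(alpha[j]) for start <= j < end (exclusive end is an assumption). */
    static double sinusProduct(int start, int end, double[] alpha) {
        double result = 1.0;
        for (int j = start; j < end; j++) {
            result *= Math.sin(alpha[j]);
        }
        return result;
    }

    /** Unit normal vector in d = alpha.length + 1 dimensions (hypothetical helper). */
    static double[] normalVector(double[] alpha) {
        int d = alpha.length + 1;
        double[] n = new double[d];
        for (int i = 0; i < d; i++) {
            // The last component has no angle of its own: cos(0) = 1.
            double cos = (i < alpha.length) ? Math.cos(alpha[i]) : 1.0;
            n[i] = sinusProduct(0, i, alpha) * cos;
        }
        return n;
    }

    /** Dot product of point p with the normal vector at the given angles (hypothetical helper). */
    static double functionValue(double[] p, double[] alpha) {
        double[] n = normalVector(alpha);
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += p[i] * n[i];
        }
        return sum;
    }
}
```

In two dimensions this reduces to n = (cos α, sin α), so the value for a point (x, y) is x·cos α + y·sin α, the familiar Hough-transform line parameterization.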
private CASHInterval determineNextIntervalAtMaxLevel(DefaultHeap<Integer,CASHInterval> heap)
heap - the heap storing the intervals
private CASHInterval doDetermineNextIntervalAtMaxLevel(DefaultHeap<Integer,CASHInterval> heap)
heap - the heap storing the intervals
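The recursive search for the next "best" interval at maximum level can be sketched in a simplified one-dimensional form. Everything below is an assumption made for illustration: `java.util.PriorityQueue` replaces ELKI's `DefaultHeap`, the hypothetical `Interval` class holds plain numbers instead of a `CASHInterval` with object ids, and splitting simply halves the interval. The bookkeeping of already processed ids is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class IntervalSearchSketch {
    /** Simplified 1-D interval [lo, hi) holding the candidate values that fall into it. */
    static final class Interval {
        final double lo;
        final double hi;
        final int level;
        final List<Double> values = new ArrayList<>();

        Interval(double lo, double hi, int level, List<Double> candidates) {
            this.lo = lo;
            this.hi = hi;
            this.level = level;
            for (double v : candidates) {
                if (v >= lo && v < hi) {
                    values.add(v);
                }
            }
        }
    }

    /** Max-heap: the interval containing the most values is polled first. */
    static PriorityQueue<Interval> newHeap() {
        return new PriorityQueue<>(
            Comparator.comparingInt((Interval i) -> i.values.size()).reversed());
    }

    /** Polls the fullest interval and splits it until the maximum level is reached. */
    static Interval nextIntervalAtMaxLevel(PriorityQueue<Interval> heap, int maxLevel, int minPts) {
        Interval best = heap.poll();
        if (best == null || best.values.size() < minPts) {
            return null; // nothing left that could still form a cluster
        }
        if (best.level >= maxLevel) {
            return best;
        }
        double mid = (best.lo + best.hi) / 2;
        heap.add(new Interval(best.lo, mid, best.level + 1, best.values));
        heap.add(new Interval(mid, best.hi, best.level + 1, best.values));
        return nextIntervalAtMaxLevel(heap, maxLevel, minPts);
    }
}
```

Because the heap is ordered by the number of contained values, an interval below `minPts` at the top means every remaining interval is also too sparse, so the search can stop there.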
private Set<Integer> getDatabaseIDs(Database<ParameterizationFunction> database)
database - the database containing the parametrization functions
private double[] determineMinMaxDistance(Database<ParameterizationFunction> database, int dimensionality)
database - the database containing the parametrization functions
dimensionality - the dimensionality of the database
private Matrix runDerivator(Database<ParameterizationFunction> database, int dim, CASHInterval interval, Set<Integer> ids) throws UnableToComplyException, ParameterException
database - the database containing the parametrization functions
interval - the interval to build the model from
dim - the dimensionality of the database
ids - an empty set that receives the assigned ids
UnableToComplyException - if a database-related error occurs
ParameterException - if the parameter setting is wrong
private Database<RealVector> buildDerivatorDB(Database<ParameterizationFunction> database, CASHInterval interval) throws UnableToComplyException
database - the database storing the parameterization functions
interval - the interval to build the database from
UnableToComplyException - if a database-related error occurs