Environment for DeveLoping KDD-Applications Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.clustering.correlation
Class COPAC<V extends NumberVector<V,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<V,Clustering<Model>>
          extended by de.lmu.ifi.dbs.elki.algorithm.clustering.correlation.COPAC<V>
Type Parameters:
V - the type of NumberVector handled by this Algorithm
All Implemented Interfaces:
Algorithm<V,Clustering<Model>>, ClusteringAlgorithm<Clustering<Model>,V>, Parameterizable

@Title(value="COPAC: COrrelation PArtition Clustering")
@Description(value="Partitions a database according to the correlation dimension of its objects and performs a clustering algorithm over the partitions.")
@Reference(authors="E. Achtert, C. B\u00f6hm, H.-P. Kriegel, P. Kr\u00f6ger, A. Zimek",
           title="Robust, Complete, and Efficient Correlation Clustering",
           booktitle="Proc. 7th SIAM International Conference on Data Mining (SDM\'07), Minneapolis, MN, 2007",
           url="http://www.siam.org/proceedings/datamining/2007/dm07_037achtert.pdf")
public class COPAC<V extends NumberVector<V,?>>
extends AbstractAlgorithm<V,Clustering<Model>>
implements ClusteringAlgorithm<Clustering<Model>,V>

Provides the COPAC algorithm, an algorithm to partition a database according to the correlation dimension of its objects and to then perform an arbitrary clustering algorithm over the partitions.

Reference: Achtert E., Böhm C., Kriegel H.-P., Kröger P., Zimek A.: Robust, Complete, and Efficient Correlation Clustering.
In Proc. 7th SIAM International Conference on Data Mining (SDM'07), Minneapolis, MN, 2007
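
The partitioning step described above can be illustrated with a minimal sketch. It is not the ELKI implementation: the class name CopacPartitionSketch, the method partitionByDimension, and the use of array indices as object IDs are assumptions made for illustration, and the correlation dimension of each object is assumed to have been computed already.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; names are hypothetical and not part of ELKI.
class CopacPartitionSketch {
  /**
   * Groups object indices by their correlation dimension.
   *
   * @param dims dims[i] is the (assumed precomputed) correlation dimension of object i
   * @return map from correlation dimension to the indices of the objects having it
   */
  static Map<Integer, List<Integer>> partitionByDimension(int[] dims) {
    // TreeMap keeps the partitions ordered by correlation dimension.
    Map<Integer, List<Integer>> partitions = new TreeMap<>();
    for (int id = 0; id < dims.length; id++) {
      partitions.computeIfAbsent(dims[id], k -> new ArrayList<>()).add(id);
    }
    return partitions;
  }
}
```

Each resulting partition would then be handed to the configured clustering algorithm independently of the others.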

Author:
Arthur Zimek

Field Summary
static OptionID PARTITION_ALGORITHM_ID
          OptionID for PARTITION_ALGORITHM_PARAM
protected  ObjectParameter<ClusteringAlgorithm<Clustering<Model>,V>> PARTITION_ALGORITHM_PARAM
          Parameter to specify the clustering algorithm to apply to each partition, must extend ClusteringAlgorithm.
static OptionID PARTITION_DB_ID
          OptionID for PARTITION_DB_PARAM
private  ClassParameter<Database<V>> PARTITION_DB_PARAM
          Parameter to specify the database class for each partition, must extend Database.
static OptionID PARTITION_DISTANCE_ID
          OptionID for PARTITION_DISTANCE_PARAM
protected  ObjectParameter<LocalPCAPreprocessorBasedDistanceFunction<V,?,?>> PARTITION_DISTANCE_PARAM
          Parameter to specify the distance function to use inside the partitions, see AbstractPreprocessorBasedDistanceFunction.
private  ClusteringAlgorithm<Clustering<Model>,V> partitionAlgorithm
          Holds the instance of the partitioning algorithm specified by PARTITION_ALGORITHM_PARAM.
private  Class<? extends Database<V>> partitionDatabase
          Holds the class of the partition database specified by PARTITION_DB_PARAM.
private  Collection<Pair<OptionID,Object>> partitionDatabaseParameters
          Holds the parameters of the partition databases.
private  PreprocessorBasedDistanceFunction<V,?,?> partitionDistanceFunction
          Holds the instance of the preprocessor-based distance function specified by PARTITION_DISTANCE_PARAM.
private  LocalPCAPreprocessor<V> preprocessor
          Holds the instance of the preprocessor specified by PREPROCESSOR_PARAM.
static OptionID PREPROCESSOR_ID
          OptionID for PREPROCESSOR_PARAM
private  ClassParameter<LocalPCAPreprocessor<V>> PREPROCESSOR_PARAM
          Parameter to specify the local PCA preprocessor used to derive the partition criterion, must extend LocalPCAPreprocessor.
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
COPAC(Parameterization config)
          Constructor, adhering to Parameterizable.
 
Method Summary
 ClusteringAlgorithm<Clustering<Model>,V> getPartitionAlgorithm()
          Returns the partition algorithm.
protected  Clustering<Model> runInTime(Database<V> database)
          Performs the COPAC algorithm on the given database.
private  Clustering<Model> runPartitionAlgorithm(Database<V> database, Map<Integer,List<Integer>> partitionMap)
          Runs the partition algorithm and creates the result.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm
run
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm
setTime, setVerbose
 

Field Detail

PREPROCESSOR_ID

public static final OptionID PREPROCESSOR_ID
OptionID for PREPROCESSOR_PARAM


PREPROCESSOR_PARAM

private final ClassParameter<LocalPCAPreprocessor<V>> PREPROCESSOR_PARAM
Parameter to specify the local PCA preprocessor used to derive the partition criterion, must extend LocalPCAPreprocessor.

Key: -copac.preprocessor
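
As a rough illustration of how a local PCA can yield a partition criterion, the sketch below counts how many of the largest eigenvalues are needed to explain a share alpha of the total variance. This is a commonly used criterion in local-PCA-based correlation clustering, shown here under assumptions: the class and method names and the alpha parameter are hypothetical, and the criterion actually applied depends on the configured LocalPCAPreprocessor.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and not part of ELKI.
class CorrelationDimensionSketch {
  /**
   * Returns the smallest number of largest eigenvalues whose sum reaches
   * at least alpha times the sum of all eigenvalues.
   *
   * @param eigenvalues eigenvalues of a local covariance matrix, in any order
   * @param alpha share of the total variance to explain, e.g. 0.85
   */
  static int correlationDimension(double[] eigenvalues, double alpha) {
    double[] sorted = eigenvalues.clone();
    Arrays.sort(sorted); // ascending order; iterated from the end below
    double total = 0.0;
    for (double ev : sorted) {
      total += ev;
    }
    double explained = 0.0;
    int count = 0;
    for (int i = sorted.length - 1; i >= 0; i--) {
      explained += sorted[i];
      count++;
      if (explained >= alpha * total) {
        return count;
      }
    }
    return sorted.length;
  }
}
```

Objects whose neighborhoods yield the same count would fall into the same partition.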


preprocessor

private LocalPCAPreprocessor<V> preprocessor
Holds the instance of the preprocessor specified by PREPROCESSOR_PARAM.


PARTITION_DISTANCE_ID

public static final OptionID PARTITION_DISTANCE_ID
OptionID for PARTITION_DISTANCE_PARAM


PARTITION_DISTANCE_PARAM

protected final ObjectParameter<LocalPCAPreprocessorBasedDistanceFunction<V,?,?>> PARTITION_DISTANCE_PARAM
Parameter to specify the distance function to use inside the partitions, see AbstractPreprocessorBasedDistanceFunction.

Key: -copac.partitionDistance


partitionDistanceFunction

private PreprocessorBasedDistanceFunction<V,?,?> partitionDistanceFunction
Holds the instance of the preprocessor-based distance function specified by PARTITION_DISTANCE_PARAM.


PARTITION_ALGORITHM_ID

public static final OptionID PARTITION_ALGORITHM_ID
OptionID for PARTITION_ALGORITHM_PARAM


PARTITION_ALGORITHM_PARAM

protected final ObjectParameter<ClusteringAlgorithm<Clustering<Model>,V>> PARTITION_ALGORITHM_PARAM
Parameter to specify the clustering algorithm to apply to each partition, must extend ClusteringAlgorithm.

Key: -copac.partitionAlgorithm


partitionAlgorithm

private ClusteringAlgorithm<Clustering<Model>,V> partitionAlgorithm
Holds the instance of the partitioning algorithm specified by PARTITION_ALGORITHM_PARAM.


PARTITION_DB_ID

public static final OptionID PARTITION_DB_ID
OptionID for PARTITION_DB_PARAM


PARTITION_DB_PARAM

private final ClassParameter<Database<V>> PARTITION_DB_PARAM
Parameter to specify the database class for each partition, must extend Database.

Key: -copac.partitionDB


partitionDatabase

private Class<? extends Database<V>> partitionDatabase
Holds the class of the partition database specified by PARTITION_DB_PARAM.


partitionDatabaseParameters

private Collection<Pair<OptionID,Object>> partitionDatabaseParameters
Holds the parameters of the partition databases.
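
The four option keys listed above could be assembled into an argument list along the following lines. This is a sketch under assumptions: it presumes that each key is followed by its value as the next argument, the values are left as placeholders rather than real class names, and the helper class CopacArgsSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the helper and the placeholder values are hypothetical.
class CopacArgsSketch {
  /** Flattens key/value options into a String[] of alternating keys and values. */
  static String[] buildArgs(Map<String, String> options) {
    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> e : options.entrySet()) {
      args.add(e.getKey());
      args.add(e.getValue());
    }
    return args.toArray(new String[0]);
  }

  /** The keys are taken from this page; the values are placeholders to be filled in. */
  static Map<String, String> exampleOptions() {
    // LinkedHashMap preserves insertion order, so the arguments stay readable.
    Map<String, String> options = new LinkedHashMap<>();
    options.put("-copac.preprocessor", "<class extending LocalPCAPreprocessor>");
    options.put("-copac.partitionDistance", "<distance function class>");
    options.put("-copac.partitionAlgorithm", "<class extending ClusteringAlgorithm>");
    options.put("-copac.partitionDB", "<class extending Database>");
    return options;
  }
}
```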

Constructor Detail

COPAC

public COPAC(Parameterization config)
Constructor, adhering to Parameterizable.

Parameters:
config - Parameterization
Method Detail

runInTime

protected Clustering<Model> runInTime(Database<V> database)
                               throws IllegalStateException
Performs the COPAC algorithm on the given database.

Specified by:
runInTime in class AbstractAlgorithm<V,Clustering<Model>>
Parameters:
database - the database to run the algorithm on
Returns:
the Result computed by this algorithm
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method has not been called).

runPartitionAlgorithm

private Clustering<Model> runPartitionAlgorithm(Database<V> database,
                                                Map<Integer,List<Integer>> partitionMap)
Runs the partition algorithm and creates the result.

Parameters:
database - the database to run this algorithm on
partitionMap - the map of partition IDs to object IDs
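
A minimal sketch of how such a partition map might be consumed follows. It is not the ELKI implementation: the class PerPartitionClusteringSketch is hypothetical, clusters are represented as plain lists of object IDs, and the partition algorithm is modeled as a java.util.function.Function purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only; names are hypothetical and not part of ELKI.
class PerPartitionClusteringSketch {
  /**
   * Applies a clustering function to each partition separately and
   * concatenates the clusters found in all partitions.
   *
   * @param partitionMap map of partition IDs to object IDs
   * @param clusterer stand-in for the partition algorithm; maps the object IDs
   *        of one partition to a list of clusters
   */
  static List<List<Integer>> clusterPartitions(
      Map<Integer, List<Integer>> partitionMap,
      Function<List<Integer>, List<List<Integer>>> clusterer) {
    List<List<Integer>> result = new ArrayList<>();
    for (Map.Entry<Integer, List<Integer>> partition : partitionMap.entrySet()) {
      // Each partition is clustered in isolation from the others.
      result.addAll(clusterer.apply(partition.getValue()));
    }
    return result;
  }
}
```

Because the clusterer only ever sees one partition at a time, no cluster in the combined result mixes objects from different partitions.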

getPartitionAlgorithm

public ClusteringAlgorithm<Clustering<Model>,V> getPartitionAlgorithm()
Returns the partition algorithm.

Returns:
the specified partition algorithm

Release 0.3 (2010-03-31_1612)