de.lmu.ifi.dbs.elki.algorithm.clustering.correlation
Class COPAC<V extends NumberVector<V,?>,D extends Distance<D>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<Clustering<Model>>
      extended by de.lmu.ifi.dbs.elki.algorithm.clustering.correlation.COPAC<V,D>
Type Parameters:
V - the type of NumberVector handled by this Algorithm
D - the type of Distance used by the partition distance function
All Implemented Interfaces:
Algorithm, ClusteringAlgorithm<Clustering<Model>>, InspectionUtilFrequentlyScanned, Parameterizable

@Title(value="COPAC: COrrelation PArtition Clustering")
@Description(value="Partitions a database according to the correlation dimension of its objects and performs a clustering algorithm over the partitions.")
@Reference(authors="E. Achtert, C. B\u00f6hm, H.-P. Kriegel, P. Kr\u00f6ger, A. Zimek",
           title="Robust, Complete, and Efficient Correlation Clustering",
           booktitle="Proc. 7th SIAM International Conference on Data Mining (SDM\'07), Minneapolis, MN, 2007",
           url="http://www.siam.org/proceedings/datamining/2007/dm07_037achtert.pdf")
public class COPAC<V extends NumberVector<V,?>,D extends Distance<D>>
extends AbstractAlgorithm<Clustering<Model>>
implements ClusteringAlgorithm<Clustering<Model>>

Provides the COPAC algorithm, an algorithm to partition a database according to the correlation dimension of its objects and to then perform an arbitrary clustering algorithm over the partitions.

Reference: Achtert E., Böhm C., Kriegel H.-P., Kröger P., Zimek A.: Robust, Complete, and Efficient Correlation Clustering.
In Proc. 7th SIAM International Conference on Data Mining (SDM'07), Minneapolis, MN, 2007
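
The partitioning step described above can be illustrated with a minimal sketch. It does not use the ELKI API: the class CopacPartitionSketch, the method partitionByDimension and the precomputed correlationDimension array are hypothetical names chosen for illustration, and plain integer indices stand in for database object IDs. The sketch only shows the grouping of objects by correlation dimension, not the local PCA that derives that dimension.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CopacPartitionSketch {
  /**
   * Groups object indices by their correlation dimension.
   * Assumption: correlationDimension[i] was already derived for object i
   * (in COPAC this comes from the local PCA preprocessor).
   */
  public static Map<Integer, List<Integer>> partitionByDimension(int[] correlationDimension) {
    // TreeMap keeps the partitions ordered by correlation dimension.
    Map<Integer, List<Integer>> partitions = new TreeMap<>();
    for (int id = 0; id < correlationDimension.length; id++) {
      partitions.computeIfAbsent(correlationDimension[id], k -> new ArrayList<>()).add(id);
    }
    return partitions;
  }

  public static void main(String[] args) {
    int[] dims = {1, 2, 1, 3, 2};
    // prints {1=[0, 2], 2=[1, 4], 3=[3]}
    System.out.println(partitionByDimension(dims));
  }
}
```

Each resulting partition would then be handed to the configured clustering algorithm separately.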


Nested Class Summary
static class COPAC.Parameterizer<V extends NumberVector<V,?>,D extends Distance<D>>
          Parameterization class.
 
Field Summary
private static Logging logger
          The logger for this class.
static OptionID PARTITION_ALGORITHM_ID
          Parameter to specify the clustering algorithm to apply to each partition, must implement ClusteringAlgorithm.
static OptionID PARTITION_DISTANCE_ID
          Parameter to specify the distance function to use inside the partitions (see AbstractIndexBasedDistanceFunction).
private  Class<? extends ClusteringAlgorithm<Clustering<Model>>> partitionAlgorithm
          Holds the class of the algorithm to run on each partition.
private  Collection<Pair<OptionID,Object>> partitionAlgorithmParameters
          Holds the parameters of the algorithm to run on each partition.
private  FilteredLocalPCABasedDistanceFunction<V,?,D> partitionDistanceFunction
          Holds the instance of the preprocessed distance function specified by PARTITION_DISTANCE_ID.
private  FilteredLocalPCABasedDistanceFunction.Instance<V,LocalProjectionIndex<V,?>,D> partitionDistanceQuery
          The last used distance query.
static OptionID PREPROCESSOR_ID
          Parameter to specify the local PCA preprocessor to derive partition criterion, must extend AbstractFilteredPCAIndex.
 
Constructor Summary
COPAC(FilteredLocalPCABasedDistanceFunction<V,?,D> partitionDistanceFunction, Class<? extends ClusteringAlgorithm<Clustering<Model>>> partitionAlgorithm, Collection<Pair<OptionID,Object>> partitionAlgorithmParameters)
          Constructor.
 
Method Summary
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  Logging getLogger()
          Get the (STATIC) logger for this class.
 ClusteringAlgorithm<Clustering<Model>> getPartitionAlgorithm(DistanceQuery<V,D> query)
          Returns the partition algorithm.
 FilteredLocalPCABasedDistanceFunction.Instance<V,LocalProjectionIndex<V,?>,D> getPartitionDistanceQuery()
          Get the last used distance query (to expose access to the preprocessor). Used by ERiC.
 Clustering<Model> run(Relation<V> relation)
          Performs the COPAC algorithm on the given database.
private  Clustering<Model> runPartitionAlgorithm(Relation<V> relation, Map<Integer,DBIDs> partitionMap, DistanceQuery<V,D> query)
          Runs the partition algorithm and creates the result.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
makeParameterDistanceFunction, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm
run
 

Field Detail

logger

private static final Logging logger
The logger for this class.


PREPROCESSOR_ID

public static final OptionID PREPROCESSOR_ID
Parameter to specify the local PCA preprocessor to derive partition criterion, must extend AbstractFilteredPCAIndex.

Key: -copac.preprocessor


PARTITION_DISTANCE_ID

public static final OptionID PARTITION_DISTANCE_ID
Parameter to specify the distance function to use inside the partitions (see AbstractIndexBasedDistanceFunction).

Default value: LocallyWeightedDistanceFunction

Key: -copac.partitionDistance


PARTITION_ALGORITHM_ID

public static final OptionID PARTITION_ALGORITHM_ID
Parameter to specify the clustering algorithm to apply to each partition, must implement ClusteringAlgorithm.

Key: -copac.partitionAlgorithm
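
As a small illustration of how the three option keys documented above fit together, the following sketch assembles them into a flat key/value argument list. It is an assumption-laden example, not ELKI code: the class CopacOptionsSketch and the method buildArgs are hypothetical, the values in angle brackets are placeholders to be replaced by real class names, and how such a list is passed to ELKI is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

public class CopacOptionsSketch {
  /** Builds a flat key/value argument list from the three documented option keys. */
  public static List<String> buildArgs(String preprocessor, String distance, String algorithm) {
    List<String> args = new ArrayList<>();
    args.add("-copac.preprocessor");       // key of PREPROCESSOR_ID
    args.add(preprocessor);
    args.add("-copac.partitionDistance");  // key of PARTITION_DISTANCE_ID
    args.add(distance);
    args.add("-copac.partitionAlgorithm"); // key of PARTITION_ALGORITHM_ID
    args.add(algorithm);
    return args;
  }

  public static void main(String[] argv) {
    // Placeholder values; only LocallyWeightedDistanceFunction is taken from
    // the documentation (it is the documented default distance function).
    List<String> args = buildArgs(
        "<class extending AbstractFilteredPCAIndex>",
        "LocallyWeightedDistanceFunction",
        "<class implementing ClusteringAlgorithm>");
    System.out.println(String.join(" ", args));
  }
}
```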


partitionDistanceFunction

private FilteredLocalPCABasedDistanceFunction<V,?,D> partitionDistanceFunction
Holds the instance of the preprocessed distance function specified by PARTITION_DISTANCE_ID.


partitionAlgorithm

private Class<? extends ClusteringAlgorithm<Clustering<Model>>> partitionAlgorithm
Holds the class of the algorithm to run on each partition.


partitionAlgorithmParameters

private Collection<Pair<OptionID,Object>> partitionAlgorithmParameters
Holds the parameters of the algorithm to run on each partition.


partitionDistanceQuery

private FilteredLocalPCABasedDistanceFunction.Instance<V,LocalProjectionIndex<V,?>,D> partitionDistanceQuery
The last used distance query.

Constructor Detail

COPAC

public COPAC(FilteredLocalPCABasedDistanceFunction<V,?,D> partitionDistanceFunction,
             Class<? extends ClusteringAlgorithm<Clustering<Model>>> partitionAlgorithm,
             Collection<Pair<OptionID,Object>> partitionAlgorithmParameters)
Constructor.

Parameters:
partitionDistanceFunction - Distance function
partitionAlgorithm - Algorithm to use on partitions
partitionAlgorithmParameters - Parameters for Algorithm to run on partitions
Method Detail

run

public Clustering<Model> run(Relation<V> relation)
                      throws IllegalStateException
Performs the COPAC algorithm on the given database.

Parameters:
relation - Relation to process
Returns:
Clustering result
Throws:
IllegalStateException

runPartitionAlgorithm

private Clustering<Model> runPartitionAlgorithm(Relation<V> relation,
                                                Map<Integer,DBIDs> partitionMap,
                                                DistanceQuery<V,D> query)
Runs the partition algorithm and creates the result.

Parameters:
relation - the database to run this algorithm on
partitionMap - the map of partition IDs to object ids
query - The preprocessor based query function
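
The idea of running one clustering algorithm per partition and combining the results can be sketched as follows. This is a simplified illustration under stated assumptions, not the ELKI implementation: the class PartitionRunSketch and the method runOnPartitions are hypothetical, integer lists stand in for DBIDs, a plain Function stands in for the configured ClusteringAlgorithm, and the distance query is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

public class PartitionRunSketch {
  /** Applies the clusterer to every partition and concatenates the clusters it returns. */
  public static List<List<Integer>> runOnPartitions(
      Map<Integer, List<Integer>> partitionMap,
      Function<List<Integer>, List<List<Integer>>> clusterer) {
    List<List<Integer>> result = new ArrayList<>();
    for (Map.Entry<Integer, List<Integer>> partition : partitionMap.entrySet()) {
      // Each partition is clustered independently of the others.
      result.addAll(clusterer.apply(partition.getValue()));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<Integer, List<Integer>> partitionMap = new TreeMap<>();
    partitionMap.put(1, Arrays.asList(0, 2));
    partitionMap.put(2, Arrays.asList(1, 4));
    partitionMap.put(3, Arrays.asList(3));
    // Trivial stand-in clusterer: every partition becomes exactly one cluster.
    Function<List<Integer>, List<List<Integer>>> clusterer =
        ids -> Collections.singletonList(ids);
    // prints [[0, 2], [1, 4], [3]]
    System.out.println(runOnPartitions(partitionMap, clusterer));
  }
}
```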

getPartitionAlgorithm

public ClusteringAlgorithm<Clustering<Model>> getPartitionAlgorithm(DistanceQuery<V,D> query)
Returns the partition algorithm.

Parameters:
query - the distance query
Returns:
the specified partition algorithm

getPartitionDistanceQuery

public FilteredLocalPCABasedDistanceFunction.Instance<V,LocalProjectionIndex<V,?>,D> getPartitionDistanceQuery()
Get the last used distance query (to expose access to the preprocessor). Used by ERiC. TODO: migrate to factory pattern!

Returns:
distance query

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<Clustering<Model>>
Returns:
Type restriction

getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.

Specified by:
getLogger in class AbstractAlgorithm<Clustering<Model>>
Returns:
the static logger

Release 0.4.0 (2011-09-20_1324)