de.lmu.ifi.dbs.elki.algorithm.clustering
Class SLINK<O,D extends Distance<D>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<R>
      extended by de.lmu.ifi.dbs.elki.algorithm.AbstractDistanceBasedAlgorithm<O,D,Result>
          extended by de.lmu.ifi.dbs.elki.algorithm.clustering.SLINK<O,D>
Type Parameters:
O - the type of DatabaseObject the algorithm is applied on
D - the type of Distance used
All Implemented Interfaces:
Algorithm, InspectionUtilFrequentlyScanned, Parameterizable

@Title(value="SLINK: Single Link Clustering")
@Description(value="Hierarchical clustering algorithm based on single-link connectivity.")
@Reference(authors="R. Sibson",
           title="SLINK: An optimally efficient algorithm for the single-link cluster method",
           booktitle="The Computer Journal 16 (1973), No. 1, p. 30-34.",
           url="http://dx.doi.org/10.1093/comjnl/16.1.30")
public class SLINK<O,D extends Distance<D>>
extends AbstractDistanceBasedAlgorithm<O,D,Result>

Efficient implementation of the Single-Link Algorithm SLINK of R. Sibson.

Reference: R. Sibson: SLINK: An optimally efficient algorithm for the single-link cluster method.
In: The Computer Journal 16 (1973), No. 1, p. 30-34.


Nested Class Summary
private static class SLINK.CompareByLambda<D extends Distance<D>>
          Order a DBID collection by the lambda value.
static class SLINK.Parameterizer<O,D extends Distance<D>>
          Parameterization class.
 
Field Summary
private  WritableDataStore<D> lambda
          The values of the function Lambda of the pointer representation.
private static Logging logger
          The logger for this class.
private  Integer minclusters
          Minimum number of clusters to extract
private  WritableDataStore<DBID> pi
          The values of the function Pi of the pointer representation.
static OptionID SLINK_MINCLUSTERS_ID
          The minimum number of clusters to extract
 
Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractDistanceBasedAlgorithm
DISTANCE_FUNCTION_ID
 
Constructor Summary
SLINK(DistanceFunction<? super O,D> distanceFunction, Integer minclusters)
          Constructor.
 
Method Summary
private  Cluster<DendrogramModel<D>> createParent(String name, Cluster<DendrogramModel<D>> leftChild, Cluster<DendrogramModel<D>> rightChild, D distance, ModifiableHierarchy<Cluster<DendrogramModel<D>>> hier)
           
private  Clustering<Model> extractClusters_erich(DBIDs ids, DataStore<DBID> pi, DataStore<D> lambda, int minclusters)
          Extract all clusters from the pi-lambda-representation.
private  Clustering<DendrogramModel<D>> extractClusters(DBIDs ids, DataStore<DBID> pi, DataStore<D> lambda, int minclusters)
          Extract all clusters from the pi-lambda-representation.
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  Logging getLogger()
          Get the (STATIC) logger for this class.
private  Cluster<DendrogramModel<D>> lastAncestor(Cluster<DendrogramModel<D>> cluster, ModifiableHierarchy<Cluster<DendrogramModel<D>>> hier)
          Determines recursively the last ancestor of the specified cluster.
private  DBID lastObjectInCluster(DBID id, D stopdist, DataStore<DBID> pi, DataStore<D> lambda)
           
private  Cluster<DendrogramModel<D>> root(Map<DBID,ModifiableDBIDs> cluster_ids, Map<DBID,D> cluster_distances, DataStore<DBID> pi, DataStore<D> lambda, ModifiableHierarchy<Cluster<DendrogramModel<D>>> hier, FiniteProgress progress)
           
 Result run(Database database, Relation<O> relation)
          Performs the SLINK algorithm on the given database.
private  void step1(DBID newID)
          First step: Initialize P(id) = id, L(id) = infinity.
private  void step2(DBID newID, DBIDs processedIDs, DistanceQuery<O,D> distFunc, WritableDataStore<D> m)
          Second step: Determine the pairwise distances from all objects in the pointer representation to the new object with the specified id.
private  void step3(DBID newID, DBIDs processedIDs, WritableDataStore<D> m)
          Third step: Determine the values for P and L
private  void step4(DBID newID, DBIDs processedIDs)
          Fourth step: Actualize the clusters if necessary
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractDistanceBasedAlgorithm
getDistanceFunction
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
makeParameterDistanceFunction, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

private static final Logging logger
The logger for this class.


SLINK_MINCLUSTERS_ID

public static final OptionID SLINK_MINCLUSTERS_ID
The minimum number of clusters to extract


pi

private WritableDataStore<DBID> pi
The values of the function Pi of the pointer representation.


lambda

private WritableDataStore<D extends Distance<D>> lambda
The values of the function Lambda of the pointer representation.


minclusters

private Integer minclusters
Minimum number of clusters to extract

Constructor Detail

SLINK

public SLINK(DistanceFunction<? super O,D> distanceFunction,
             Integer minclusters)
Constructor.

Parameters:
distanceFunction - Distance function
minclusters - Minimum clusters to extract. Can be null
Method Detail

run

public Result run(Database database,
                  Relation<O> relation)
Performs the SLINK algorithm on the given database.


step1

private void step1(DBID newID)
First step: Initialize P(id) = id, L(id) = infinity.

Parameters:
newID - the id of the object to be inserted into the pointer representation

step2

private void step2(DBID newID,
                   DBIDs processedIDs,
                   DistanceQuery<O,D> distFunc,
                   WritableDataStore<D> m)
Second step: Determine the pairwise distances from all objects in the pointer representation to the new object with the specified id.

Parameters:
newID - the id of the object to be inserted into the pointer representation
processedIDs - the already processed ids
distFunc - Distance function to use

step3

private void step3(DBID newID,
                   DBIDs processedIDs,
                   WritableDataStore<D> m)
Third step: Determine the values for P and L

Parameters:
newID - the id of the object to be inserted into the pointer representation
processedIDs - the already processed ids

step4

private void step4(DBID newID,
                   DBIDs processedIDs)
Fourth step: Actualize the clusters if necessary

Parameters:
newID - the id of the current object
processedIDs - the already processed ids

lastObjectInCluster

private DBID lastObjectInCluster(DBID id,
                                 D stopdist,
                                 DataStore<DBID> pi,
                                 DataStore<D> lambda)

extractClusters

private Clustering<DendrogramModel<D>> extractClusters(DBIDs ids,
                                                       DataStore<DBID> pi,
                                                       DataStore<D> lambda,
                                                       int minclusters)
Extract all clusters from the pi-lambda-representation.

Parameters:
ids - Object ids to process
pi - Pi store
lambda - Lambda store
minclusters - Minimum number of clusters to extract
Returns:
Hierarchical clustering

root

private Cluster<DendrogramModel<D>> root(Map<DBID,ModifiableDBIDs> cluster_ids,
                                         Map<DBID,D> cluster_distances,
                                         DataStore<DBID> pi,
                                         DataStore<D> lambda,
                                         ModifiableHierarchy<Cluster<DendrogramModel<D>>> hier,
                                         FiniteProgress progress)

lastAncestor

private Cluster<DendrogramModel<D>> lastAncestor(Cluster<DendrogramModel<D>> cluster,
                                                 ModifiableHierarchy<Cluster<DendrogramModel<D>>> hier)
Determines recursively the last ancestor of the specified cluster.

Parameters:
cluster - the child
hier - the cluster hierarchy
Returns:
the (currently) last ancestor

createParent

private Cluster<DendrogramModel<D>> createParent(String name,
                                                 Cluster<DendrogramModel<D>> leftChild,
                                                 Cluster<DendrogramModel<D>> rightChild,
                                                 D distance,
                                                 ModifiableHierarchy<Cluster<DendrogramModel<D>>> hier)

extractClusters_erich

private Clustering<Model> extractClusters_erich(DBIDs ids,
                                                DataStore<DBID> pi,
                                                DataStore<D> lambda,
                                                int minclusters)
Extract all clusters from the pi-lambda-representation.

Parameters:
ids - Object ids to process
pi - Pi store
lambda - Lambda store
minclusters - Minimum number of clusters to extract
Returns:
Hierarchical clustering

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<Result>
Returns:
Type restriction

getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.

Specified by:
getLogger in class AbstractAlgorithm<Result>
Returns:
the static logger

Release 0.4.0 (2011-09-20_1324)