Environment for DeveLoping KDD-Applications Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.clustering
Class ByLabelClustering<O extends DatabaseObject>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<O,Clustering<Model>>
          extended by de.lmu.ifi.dbs.elki.algorithm.clustering.ByLabelClustering<O>
Type Parameters:
O - Object type
All Implemented Interfaces:
Algorithm<O,Clustering<Model>>, ClusteringAlgorithm<Clustering<Model>,O>, Parameterizable

@Title(value="Clustering by label")
@Description(value="Cluster points by a (pre-assigned!) label. For comparing results with a reference clustering.")
public class ByLabelClustering<O extends DatabaseObject>
extends AbstractAlgorithm<O,Clustering<Model>>
implements ClusteringAlgorithm<Clustering<Model>,O>

Pseudo-clustering using labels. This "algorithm" puts elements into the same cluster when their labels agree. That is, it simply reuses a predefined clustering, and is mostly useful for testing and evaluation (e.g. comparing the result of a real algorithm to a reference result / gold standard). If an object should be assigned to multiple clusters, the labels naming those clusters must be separated by blanks and the flag MULTIPLE_FLAG must be set.

TODO: handling of data sets with no labels?
TODO: Noise handling (e.g. allow the user to specify a noise label pattern?)
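
The grouping described above can be illustrated with a small stand-alone sketch. It does not use the ELKI classes: the class name LabelGroupingSketch, the method groupByLabel, and the plain Map from object id to label string are all hypothetical stand-ins for the database, and every object is assumed to carry a non-null label (the unlabeled case is an open TODO in this class).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch; not part of ELKI. */
final class LabelGroupingSketch {
  /**
   * Groups object ids by label.
   *
   * @param labels   object id to label string (assumed non-null labels)
   * @param multiple if true, a label is split at blanks and the object is
   *                 put into one cluster per part; if false, the whole
   *                 label string names exactly one cluster
   * @return mapping of cluster label to the ids assigned to it
   */
  static Map<String, Collection<Integer>> groupByLabel(Map<Integer, String> labels, boolean multiple) {
    Map<String, Collection<Integer>> clusters = new LinkedHashMap<>();
    for (Map.Entry<Integer, String> e : labels.entrySet()) {
      String label = e.getValue();
      if (multiple) {
        // Split at runs of whitespace; skip the empty part an empty label yields.
        for (String part : label.trim().split("\\s+")) {
          if (!part.isEmpty()) {
            clusters.computeIfAbsent(part, k -> new ArrayList<>()).add(e.getKey());
          }
        }
      } else {
        clusters.computeIfAbsent(label, k -> new ArrayList<>()).add(e.getKey());
      }
    }
    return clusters;
  }

  public static void main(String[] args) {
    Map<Integer, String> labels = new LinkedHashMap<>();
    labels.put(1, "a");
    labels.put(2, "a b");
    labels.put(3, "b");
    System.out.println(groupByLabel(labels, false)); // {a=[1], a b=[2], b=[3]}
    System.out.println(groupByLabel(labels, true));  // {a=[1, 2], b=[2, 3]}
  }
}
```

Without the multiple option, the label "a b" forms a cluster of its own; with it, object 2 belongs to both "a" and "b".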

Author:
Erich Schubert

Field Summary
private  boolean multiple
          Holds the value of MULTIPLE_FLAG.
private  Flag MULTIPLE_FLAG
          Flag to indicate that multiple cluster assignment is possible.
static OptionID MULTIPLE_ID
          OptionID for MULTIPLE_FLAG
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
ByLabelClustering()
          Constructor without parameters
ByLabelClustering(Parameterization config)
          Constructor, adhering to Parameterizable
 
Method Summary
private  void assign(HashMap<String,Collection<Integer>> labelMap, String label, Integer id)
          Assigns the specified id to the labelMap according to its label
private  HashMap<String,Collection<Integer>> multipleAssignment(Database<O> database)
          Assigns the objects of the database to multiple clusters according to their labels.
protected  Clustering<Model> runInTime(Database<O> database)
          Run the actual clustering algorithm.
 void setMultiple(boolean multiple)
          Sets the multiple flag, which indicates that assignment to multiple clusters is possible.
private  HashMap<String,Collection<Integer>> singleAssignment(Database<O> database)
          Assigns the objects of the database to single clusters according to their labels.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm
run
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm
setTime, setVerbose
 

Field Detail

MULTIPLE_ID

public static final OptionID MULTIPLE_ID
OptionID for MULTIPLE_FLAG


MULTIPLE_FLAG

private final Flag MULTIPLE_FLAG
Flag to indicate that multiple cluster assignment is possible. If an assignment to multiple clusters is desired, the labels indicating the clusters need to be separated by blanks.

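Key: -clique.prune (as printed; this key appears to be carried over from the CLIQUE documentation rather than being specific to this class)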


multiple

private boolean multiple
Holds the value of MULTIPLE_FLAG.

Constructor Detail

ByLabelClustering

public ByLabelClustering(Parameterization config)
Constructor, adhering to Parameterizable

Parameters:
config - Parameterization

ByLabelClustering

public ByLabelClustering()
Constructor without parameters

Method Detail

runInTime

protected Clustering<Model> runInTime(Database<O> database)
                               throws IllegalStateException
Run the actual clustering algorithm.

Specified by:
runInTime in class AbstractAlgorithm<O extends DatabaseObject,Clustering<Model>>
Parameters:
database - The database to process
Returns:
the Result computed by this algorithm
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method was not called).

singleAssignment

private HashMap<String,Collection<Integer>> singleAssignment(Database<O> database)
Assigns the objects of the database to single clusters according to their labels.

Parameters:
database - the database storing the objects
Returns:
a mapping of labels to ids

multipleAssignment

private HashMap<String,Collection<Integer>> multipleAssignment(Database<O> database)
Assigns the objects of the database to multiple clusters according to their labels.

Parameters:
database - the database storing the objects
Returns:
a mapping of labels to ids

assign

private void assign(HashMap<String,Collection<Integer>> labelMap,
                    String label,
                    Integer id)
Assigns the specified id to the labelMap according to its label

Parameters:
labelMap - the mapping of label to ids
label - the label of the object to be assigned
id - the id of the object to be assigned
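
The method body is not shown on this page. A minimal sketch of what a helper with this signature might do is given below, assuming it creates the id collection for a label on first use and appends to it afterwards. The wrapper class AssignSketch, the static modifier, and the choice of ArrayList are assumptions made for illustration, not the actual ELKI implementation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;

/** Hypothetical stand-alone sketch; not the ELKI source. */
final class AssignSketch {
  /** Adds id to the collection stored under label, creating it if needed. */
  static void assign(HashMap<String, Collection<Integer>> labelMap, String label, Integer id) {
    Collection<Integer> ids = labelMap.get(label);
    if (ids == null) {
      // First object seen with this label: start a new cluster.
      ids = new ArrayList<Integer>();
      labelMap.put(label, ids);
    }
    ids.add(id);
  }

  public static void main(String[] args) {
    HashMap<String, Collection<Integer>> labelMap = new HashMap<String, Collection<Integer>>();
    assign(labelMap, "a", 1);
    assign(labelMap, "a", 2);
    assign(labelMap, "b", 3);
    System.out.println(labelMap.get("a")); // [1, 2]
    System.out.println(labelMap.get("b")); // [3]
  }
}
```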

setMultiple

public void setMultiple(boolean multiple)
Sets the multiple flag, which indicates that assignment to multiple clusters is possible.

Parameters:
multiple - the flag to be set

Release 0.3 (2010-03-31_1612)