Environment for
DeveLoping
KDD-Applications
Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.clustering.correlation
Class ORCLUS<V extends RealVector<V,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
          extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<V>
              extended by de.lmu.ifi.dbs.elki.algorithm.clustering.subspace.ProjectedClustering<V>
                  extended by de.lmu.ifi.dbs.elki.algorithm.clustering.correlation.ORCLUS<V>
Type Parameters:
V - the type of Realvector handled by this Algorithm
All Implemented Interfaces:
Algorithm<V>, Clustering<V>, Loggable, Parameterizable

public class ORCLUS<V extends RealVector<V,?>>
extends ProjectedClustering<V>

ORCLUS provides the ORCLUS algorithm.

Author:
Elke Achtert

Nested Class Summary
private  class ORCLUS.Cluster
          Encapsulates the attributes of a cluster.
private  class ORCLUS.ProjectedEnergy
           
 
Field Summary
private  double alpha
          Holds alpha.
private  DoubleParameter ALPHA_PARAM
          Parameter to specify the factor for reducing the number of current clusters in each iteration, must be an integer greater than 0 and less than 1.
 
Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.subspace.ProjectedClustering
K_I_ID, K_ID, L_ID
 
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
optionHandler
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug
 
Constructor Summary
ORCLUS()
          Adds the parameter ALPHA_PARAM additionally to the parameters provided by super-classes.
 
Method Summary
private  void assign(Database<V> database, List<ORCLUS.Cluster> clusters)
          Creates a partitioning of the database by assigning each object to its closest seed.
private  Matrix findBasis(Database<V> database, ORCLUS.Cluster cluster, int dim)
          Finds the basis of the subspace of dimensionality dim for the specified cluster.
 Description getDescription()
          Returns a description of the algorithm.
private  List<ORCLUS.Cluster> initialSeeds(Database<V> database, int k)
          Initializes the list of seeds wit a random sample of size k.
private  void merge(Database<V> database, List<ORCLUS.Cluster> clusters, int k_new, int d_new)
          Reduces the number of seeds to k_new
private  ORCLUS.ProjectedEnergy projectedEnergy(Database<V> database, ORCLUS.Cluster c_i, ORCLUS.Cluster c_j, int i, int j, int dim)
          Computes the projected energy of the specified clusters.
private  V projection(ORCLUS.Cluster c, V o)
          Returns the projection of real vector o in the subspace of cluster c.
protected  void runInTime(Database<V> database)
          The run method encapsulated in measure of runtime.
 String[] setParameters(String[] args)
          Sets the parameters ALPHA_PARAM additionally to the parameters set by the super-class' method.
private  ORCLUS.Cluster union(Database<V> database, ORCLUS.Cluster c1, ORCLUS.Cluster c2, int dim)
          Returns the union of the two specified clusters.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.subspace.ProjectedClustering
getDistanceFunction, getK_i, getK, getL, getResult, setResult
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
description, isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
addOption, checkGlobalParameterConstraints, deleteOption, description, description, getAttributeSettings, getParameters, getParameterValue, getPossibleOptions, inlineDescription, isSet, setParameters
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, message, progress, progress, progress, verbose, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm
run, setTime, setVerbose
 
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable
checkGlobalParameterConstraints, description, getAttributeSettings, getParameters, getPossibleOptions, inlineDescription
 

Field Detail

ALPHA_PARAM

private final DoubleParameter ALPHA_PARAM
Parameter to specify the factor for reducing the number of current clusters in each iteration, must be an integer greater than 0 and less than 1.

Default value: 0.5

Key: -orclus.alpha


alpha

private double alpha
Holds alpha.

Constructor Detail

ORCLUS

public ORCLUS()
Adds the parameter ALPHA_PARAM additionally to the parameters provided by super-classes.

Method Detail

runInTime

protected void runInTime(Database<V> database)
                  throws IllegalStateException
Description copied from class: AbstractAlgorithm
The run method encapsulated in measure of runtime. An extending class needs not to take care of runtime itself.

Specified by:
runInTime in class AbstractAlgorithm<V extends RealVector<V,?>>
Parameters:
database - the database to run the algorithm on
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method has been failed to be called).
See Also:
AbstractAlgorithm.runInTime(Database)

getDescription

public Description getDescription()
Description copied from interface: Algorithm
Returns a description of the algorithm.

Returns:
a description of the algorithm
See Also:
Algorithm.getDescription()

setParameters

public String[] setParameters(String[] args)
                       throws ParameterException
Sets the parameters ALPHA_PARAM additionally to the parameters set by the super-class' method.

Specified by:
setParameters in interface Parameterizable
Overrides:
setParameters in class ProjectedClustering<V extends RealVector<V,?>>
Parameters:
args - parameters to set the attributes accordingly to
Returns:
String[] an array containing the unused parameters
Throws:
ParameterException - in case of wrong parameter-setting
See Also:
Parameterizable.setParameters(String[])

initialSeeds

private List<ORCLUS.Cluster> initialSeeds(Database<V> database,
                                          int k)
Initializes the list of seeds wit a random sample of size k.

Parameters:
database - the database holding the objects
k - the size of the random sample
Returns:
the initial seed list

assign

private void assign(Database<V> database,
                    List<ORCLUS.Cluster> clusters)
Creates a partitioning of the database by assigning each object to its closest seed.

Parameters:
database - the database holding the objects
clusters - the array of clusters to which the objects should be assigned to

findBasis

private Matrix findBasis(Database<V> database,
                         ORCLUS.Cluster cluster,
                         int dim)
Finds the basis of the subspace of dimensionality dim for the specified cluster.

Parameters:
database - the database to run the algorithm on
cluster - the cluster
dim - the dimensionality of the subspace
Returns:
matrix defining the basis of the subspace for the specified cluster

merge

private void merge(Database<V> database,
                   List<ORCLUS.Cluster> clusters,
                   int k_new,
                   int d_new)
Reduces the number of seeds to k_new

Parameters:
database - the database holding the objects
clusters - the set of current seeds
k_new - the new number of seeds
d_new - the new dimensionality of the subspaces for each seed

projectedEnergy

private ORCLUS.ProjectedEnergy projectedEnergy(Database<V> database,
                                               ORCLUS.Cluster c_i,
                                               ORCLUS.Cluster c_j,
                                               int i,
                                               int j,
                                               int dim)
Computes the projected energy of the specified clusters. The projected energy is given by the mean square distance of the points to the centroid of the union cluster c, when all points in c are projected to the subspace of c.

Parameters:
database - the database holding the objects
c_i - the first cluster
c_j - the second cluster
i - the index of cluster c_i in the cluster list
j - the index of cluster c_j in the cluster list
dim - the dimensionality of the clusters
Returns:
the projected energy of the specified cluster

union

private ORCLUS.Cluster union(Database<V> database,
                             ORCLUS.Cluster c1,
                             ORCLUS.Cluster c2,
                             int dim)
Returns the union of the two specified clusters.

Parameters:
database - the database holding the objects
c1 - the first cluster
c2 - the second cluster
dim - the dimensionality of the union cluster
Returns:
the union of the two specified clusters

projection

private V projection(ORCLUS.Cluster c,
                     V o)
Returns the projection of real vector o in the subspace of cluster c.

Parameters:
c - the cluster
o - the double vector
Returns:
the projection of double vector o in the subspace of cluster c

Release 0.1 (2008-07-10_1838)