java.lang.Object
de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<R>
de.lmu.ifi.dbs.elki.algorithm.clustering.AbstractProjectedClustering<Clustering<Model>,V>
de.lmu.ifi.dbs.elki.algorithm.clustering.correlation.ORCLUS<V>
Type Parameters:
V - the type of NumberVector handled by this Algorithm
@Title(value="ORCLUS: Arbitrarily ORiented projected CLUSter generation")
@Description(value="Algorithm to find correlation clusters in high dimensional spaces.")
@Reference(authors="C. C. Aggarwal, P. S. Yu",
title="Finding Generalized Projected Clusters in High Dimensional Spaces",
booktitle="Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD \'00)",
url="http://dx.doi.org/10.1145/342009.335383")
public class ORCLUS<V extends NumberVector<V,?>>
Implements the ORCLUS algorithm, which finds arbitrarily oriented correlation clusters in high-dimensional spaces.
Reference: C. C. Aggarwal, P. S. Yu: Finding Generalized Projected Clusters
in High Dimensional Spaces.
In: Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD '00).
| Nested Class Summary | |
|---|---|
| private class | ORCLUS.ORCLUSCluster: Encapsulates the attributes of a cluster. |
| static class | ORCLUS.Parameterizer<V extends NumberVector<V,?>>: Parameterization class. |
| private class | ORCLUS.ProjectedEnergy: Encapsulates the projected energy for a cluster. |
| Field Summary | |
|---|---|
| private double | alpha: Holds the value of ALPHA_ID. |
| static OptionID | ALPHA_ID: Parameter to specify the factor for reducing the number of current clusters in each iteration; must be a real number greater than 0 and less than 1. |
| private static Logging | logger: The logger for this class. |
| private PCARunner<V> | pca: The PCA utility object. |
| private Long | seed: Holds the value of SEED_ID. |
| static OptionID | SEED_ID: Parameter to specify the random generator seed. |
| Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.AbstractProjectedClustering |
|---|
| k, k_i, K_I_ID, K_ID, l, L_ID |
| Constructor Summary | |
|---|---|
| ORCLUS(int k, int k_i, int l, double alpha, long seed, PCARunner<V> pca) | Java constructor. |
| Method Summary | |
|---|---|
| private void | assign(Relation<V> database, DistanceQuery<V,DoubleDistance> distFunc, List<ORCLUS.ORCLUSCluster> clusters): Creates a partitioning of the database by assigning each object to its closest seed. |
| private Matrix | findBasis(Relation<V> database, DistanceQuery<V,DoubleDistance> distFunc, ORCLUS.ORCLUSCluster cluster, int dim): Finds the basis of the subspace of dimensionality dim for the specified cluster. |
| TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
| protected Logging | getLogger(): Get the (STATIC) logger for this class. |
| private List<ORCLUS.ORCLUSCluster> | initialSeeds(Relation<V> database, int k): Initializes the list of seeds with a random sample of size k. |
| private void | merge(Relation<V> database, DistanceQuery<V,DoubleDistance> distFunc, List<ORCLUS.ORCLUSCluster> clusters, int k_new, int d_new, IndefiniteProgress cprogress): Reduces the number of seeds to k_new. |
| private ORCLUS.ProjectedEnergy | projectedEnergy(Relation<V> database, DistanceQuery<V,DoubleDistance> distFunc, ORCLUS.ORCLUSCluster c_i, ORCLUS.ORCLUSCluster c_j, int i, int j, int dim): Computes the projected energy of the specified clusters. |
| private V | projection(ORCLUS.ORCLUSCluster c, V o, V factory): Returns the projection of real vector o in the subspace of cluster c. |
| Clustering<Model> | run(Database database, Relation<V> relation): Performs the ORCLUS algorithm on the given database. |
| private ORCLUS.ORCLUSCluster | union(Relation<V> database, DistanceQuery<V,DoubleDistance> distFunc, ORCLUS.ORCLUSCluster c1, ORCLUS.ORCLUSCluster c2, int dim): Returns the union of the two specified clusters. |
| Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.clustering.AbstractProjectedClustering |
|---|
| getDistanceFunction, getDistanceQuery |
| Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm |
|---|
| makeParameterDistanceFunction, run |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm |
|---|
| run |
| Field Detail |
|---|
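private static final Logging logger
The logger for this class.
public static final OptionID ALPHA_ID
Parameter to specify the factor for reducing the number of current clusters in each iteration; must be a real number greater than 0 and less than 1.
Default value: 0.5
Key: -orclus.alpha
public static final OptionID SEED_ID
Parameter to specify the random generator seed.
private double alpha
Holds the value of ALPHA_ID.
private Long seed
Holds the value of SEED_ID.
private PCARunner<V extends NumberVector<V,?>> pca
The PCA utility object.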
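The role of alpha as a per-iteration reduction factor can be illustrated with a small standalone sketch. It is not taken from the ELKI source: the class and method names are hypothetical, the integer truncation is an assumption, and the dimensionality factor beta follows the schedule described in the referenced ORCLUS paper rather than anything stated on this page. `kStart` stands for the initial number of seeds and must be larger than `k`.

```java
/** Hypothetical helper, not part of ELKI: sketches an alpha-driven reduction schedule. */
final class OrclusScheduleSketch {

  /** Shrinks the current seed count by the factor alpha, but never below the target k. */
  static int nextClusterCount(int current, int k, double alpha) {
    return Math.max(k, (int) (current * alpha));
  }

  /**
   * Assumed dimensionality factor (following the ORCLUS paper): chosen so that the
   * dimensionality decays from dim towards l while the seed count decays from
   * kStart towards k. Requires kStart > k.
   */
  static double beta(int dim, int l, int kStart, int k, double alpha) {
    return Math.exp(-Math.log(dim / (double) l) * Math.log(1 / alpha)
        / Math.log(kStart / (double) k));
  }

  /** Shrinks the current dimensionality by the factor b, but never below the target l. */
  static int nextDimensionality(int current, int l, double b) {
    return Math.max(l, (int) (current * b));
  }

  public static void main(String[] args) {
    int k = 3, l = 2, dim = 10, kStart = 24; // illustrative values only
    double alpha = 0.5;
    double b = beta(dim, l, kStart, k, alpha);
    int kc = kStart, dc = dim;
    while (kc > k) {
      kc = nextClusterCount(kc, k, alpha);
      dc = nextDimensionality(dc, l, b);
      System.out.println("seeds=" + kc + ", dimensionality=" + dc);
    }
  }
}
```

With alpha strictly between 0 and 1 the seed count strictly decreases until it reaches k, so the loop terminates.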
| Constructor Detail |
|---|
public ORCLUS(int k,
int k_i,
int l,
double alpha,
long seed,
PCARunner<V> pca)
Parameters:
k - k Parameter
k_i - k_i Parameter
l - l Parameter
alpha - Alpha Parameter
seed - Seed parameter
pca - PCA runner
| Method Detail |
|---|
public Clustering<Model> run(Database database,
Relation<V> relation)
throws IllegalStateException
Performs the ORCLUS algorithm on the given database.
Throws:
IllegalStateException
private List<ORCLUS.ORCLUSCluster> initialSeeds(Relation<V> database,
int k)
Initializes the list of seeds with a random sample of size k.
Parameters:
database - the database holding the objects
k - the size of the random sample
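A seeded random sample of size k can be sketched in plain Java as follows. This is only an illustration under assumptions: objects are represented by integer indices instead of ELKI database IDs, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of ELKI: draws a reproducible random sample of indices. */
final class SeedSamplingSketch {

  /** Returns k distinct indices from the range [0, n), chosen with a seeded generator. */
  static List<Integer> sampleIndices(int n, int k, long seed) {
    if (k < 0 || k > n) {
      throw new IllegalArgumentException("k must be between 0 and n");
    }
    List<Integer> all = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      all.add(i);
    }
    // Shuffling with a fixed seed makes the sample reproducible across runs.
    Collections.shuffle(all, new Random(seed));
    return new ArrayList<>(all.subList(0, k));
  }

  public static void main(String[] args) {
    System.out.println(sampleIndices(10, 3, 42L));
  }
}
```

Passing the same seed twice yields the same sample, which is the practical purpose of a seed parameter such as SEED_ID.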
private void assign(Relation<V> database,
DistanceQuery<V,DoubleDistance> distFunc,
List<ORCLUS.ORCLUSCluster> clusters)
Creates a partitioning of the database by assigning each object to its closest seed.
Parameters:
database - the database holding the objects
distFunc - distance function
clusters - the list of clusters to which the objects should be assigned
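The assignment step can be pictured with the following simplified sketch. It uses plain arrays and full-space squared Euclidean distance, which is an assumption made for brevity; the actual method works with a DistanceQuery and per-cluster subspaces. All names are hypothetical.

```java
/** Hypothetical helper, not part of ELKI: assigns each point to its closest seed. */
final class ClosestSeedSketch {

  /** Squared Euclidean distance between two vectors of equal length. */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return sum;
  }

  /** Returns, for every point, the index of the seed with the smallest distance. */
  static int[] assign(double[][] points, double[][] seeds) {
    int[] owner = new int[points.length];
    for (int p = 0; p < points.length; p++) {
      int best = 0;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int s = 0; s < seeds.length; s++) {
        double dist = squaredDistance(points[p], seeds[s]);
        if (dist < bestDist) {
          bestDist = dist;
          best = s;
        }
      }
      owner[p] = best;
    }
    return owner;
  }

  public static void main(String[] args) {
    double[][] points = { { 0, 0 }, { 10, 10 }, { 1, 1 } };
    double[][] seeds = { { 0, 0 }, { 9, 9 } };
    System.out.println(java.util.Arrays.toString(assign(points, seeds)));
  }
}
```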
private Matrix findBasis(Relation<V> database,
DistanceQuery<V,DoubleDistance> distFunc,
ORCLUS.ORCLUSCluster cluster,
int dim)
Finds the basis of the subspace of dimensionality dim for the specified cluster.
Parameters:
database - the database to run the algorithm on
distFunc - the distance function
cluster - the cluster
dim - the dimensionality of the subspace
private void merge(Relation<V> database,
DistanceQuery<V,DoubleDistance> distFunc,
List<ORCLUS.ORCLUSCluster> clusters,
int k_new,
int d_new,
IndefiniteProgress cprogress)
Reduces the number of seeds to k_new.
Parameters:
database - the database holding the objects
distFunc - the distance function
clusters - the set of current seeds
k_new - the new number of seeds
d_new - the new dimensionality of the subspaces for each seed
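One way to reduce a set of seeds to k_new is a greedy loop that repeatedly merges the pair whose union has the lowest energy. The sketch below shows only that loop shape; it is an assumption about the general technique, not a copy of the ELKI implementation. The energy function is passed in by the caller, and the `variance` helper is a full-space stand-in for a projected energy. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper, not part of ELKI: greedy pairwise merging down to kNew clusters. */
final class GreedyMergeSketch {

  /** Merges the cheapest pair (by energy of the union) until only kNew clusters remain. */
  static List<List<double[]>> merge(List<List<double[]>> clusters, int kNew,
      ToDoubleFunction<List<double[]>> energy) {
    if (kNew < 1) {
      throw new IllegalArgumentException("kNew must be at least 1");
    }
    List<List<double[]>> current = new ArrayList<>();
    for (List<double[]> c : clusters) {
      current.add(new ArrayList<>(c)); // copy so the input is left untouched
    }
    while (current.size() > kNew) {
      int bestI = -1, bestJ = -1;
      double best = Double.POSITIVE_INFINITY;
      for (int i = 0; i < current.size(); i++) {
        for (int j = i + 1; j < current.size(); j++) {
          List<double[]> union = new ArrayList<>(current.get(i));
          union.addAll(current.get(j));
          double e = energy.applyAsDouble(union);
          if (e < best) {
            best = e;
            bestI = i;
            bestJ = j;
          }
        }
      }
      current.get(bestI).addAll(current.get(bestJ));
      current.remove(bestJ); // removes by index; bestJ > bestI, so bestI stays valid
    }
    return current;
  }

  /** Stand-in energy: mean squared distance of the points to their centroid. */
  static double variance(List<double[]> points) {
    int dim = points.get(0).length;
    double[] centroid = new double[dim];
    for (double[] p : points) {
      for (int d = 0; d < dim; d++) {
        centroid[d] += p[d] / points.size();
      }
    }
    double total = 0;
    for (double[] p : points) {
      for (int d = 0; d < dim; d++) {
        total += (p[d] - centroid[d]) * (p[d] - centroid[d]);
      }
    }
    return total / points.size();
  }

  public static void main(String[] args) {
    List<List<double[]>> clusters = new ArrayList<>();
    clusters.add(List.of(new double[] { 0, 0 }));
    clusters.add(List.of(new double[] { 0.1, 0 }));
    clusters.add(List.of(new double[] { 10, 10 }));
    System.out.println(merge(clusters, 2, GreedyMergeSketch::variance).size());
  }
}
```

Recomputing every pairwise energy in each round is quadratic in the number of clusters; the sketch favors clarity over efficiency.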
private ORCLUS.ProjectedEnergy projectedEnergy(Relation<V> database,
DistanceQuery<V,DoubleDistance> distFunc,
ORCLUS.ORCLUSCluster c_i,
ORCLUS.ORCLUSCluster c_j,
int i,
int j,
int dim)
Computes the projected energy of the specified clusters.
Parameters:
database - the database holding the objects
distFunc - the distance function
c_i - the first cluster
c_j - the second cluster
i - the index of cluster c_i in the cluster list
j - the index of cluster c_j in the cluster list
dim - the dimensionality of the clusters
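As a rough illustration of what a projected energy measures, the sketch below computes the mean squared distance of a set of points to their centroid after projecting onto a given orthonormal basis. This definition follows the referenced ORCLUS paper and is an assumption here; the page itself does not spell it out. The points would be the members of the union of the two clusters, and all names are hypothetical.

```java
/** Hypothetical helper, not part of ELKI: mean squared projected distance to the centroid. */
final class ProjectedEnergySketch {

  /**
   * @param points the objects of the (merged) cluster, one row per object
   * @param basis  orthonormal basis vectors of the subspace, one row per vector
   */
  static double projectedEnergy(double[][] points, double[][] basis) {
    int n = points.length;
    int dim = points[0].length;
    double[] centroid = new double[dim];
    for (double[] p : points) {
      for (int d = 0; d < dim; d++) {
        centroid[d] += p[d] / n;
      }
    }
    double total = 0;
    for (double[] p : points) {
      for (double[] axis : basis) {
        // Coordinate of (p - centroid) along this basis vector.
        double coord = 0;
        for (int d = 0; d < dim; d++) {
          coord += axis[d] * (p[d] - centroid[d]);
        }
        total += coord * coord;
      }
    }
    return total / n;
  }

  public static void main(String[] args) {
    double[][] points = { { 0, 0 }, { 2, 0 } };
    System.out.println(projectedEnergy(points, new double[][] { { 1, 0 } }));
    System.out.println(projectedEnergy(points, new double[][] { { 0, 1 } }));
  }
}
```

A low value means the points lie close together within the chosen subspace, which is why such a quantity is useful for deciding which clusters to merge.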
private ORCLUS.ORCLUSCluster union(Relation<V> database,
DistanceQuery<V,DoubleDistance> distFunc,
ORCLUS.ORCLUSCluster c1,
ORCLUS.ORCLUSCluster c2,
int dim)
Returns the union of the two specified clusters.
Parameters:
database - the database holding the objects
distFunc - the distance function
c1 - the first cluster
c2 - the second cluster
dim - the dimensionality of the union cluster
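A minimal picture of a cluster union, assuming a cluster is just a list of member vectors plus a centroid: the members are concatenated and the centroid is recomputed. The `SimpleCluster` type below is hypothetical and much simpler than ORCLUS.ORCLUSCluster; in particular it carries no subspace basis.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of ELKI: union of two simple clusters. */
final class ClusterUnionSketch {

  /** A cluster reduced to its member vectors and their centroid. */
  static final class SimpleCluster {
    final List<double[]> members;
    final double[] centroid;

    SimpleCluster(List<double[]> members) {
      if (members.isEmpty()) {
        throw new IllegalArgumentException("a cluster needs at least one member");
      }
      this.members = new ArrayList<>(members);
      this.centroid = new double[members.get(0).length];
      for (double[] m : members) {
        for (int d = 0; d < centroid.length; d++) {
          centroid[d] += m[d] / members.size();
        }
      }
    }
  }

  /** Returns a new cluster holding the members of both inputs; the inputs stay unchanged. */
  static SimpleCluster union(SimpleCluster c1, SimpleCluster c2) {
    List<double[]> all = new ArrayList<>(c1.members);
    all.addAll(c2.members);
    return new SimpleCluster(all);
  }

  public static void main(String[] args) {
    SimpleCluster a = new SimpleCluster(List.of(new double[] { 0, 0 }));
    SimpleCluster b = new SimpleCluster(List.of(new double[] { 2, 4 }));
    SimpleCluster u = union(a, b);
    System.out.println(u.members.size() + " members, centroid " + java.util.Arrays.toString(u.centroid));
  }
}
```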
private V projection(ORCLUS.ORCLUSCluster c,
V o,
V factory)
Returns the projection of real vector o in the subspace of cluster c.
Parameters:
c - the cluster
o - the double vector
factory - Factory object / prototype
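Projecting a vector into a subspace amounts to taking its dot product with each basis vector. The sketch below assumes the subspace is given as rows of an orthonormal basis and uses plain arrays instead of ELKI's vector and matrix types; the names are hypothetical.

```java
/** Hypothetical helper, not part of ELKI: projects a vector onto a subspace basis. */
final class ProjectionSketch {

  /**
   * @param basis orthonormal basis vectors of the subspace, one row per vector
   * @param o     the vector to project
   * @return the coordinates of o with respect to the basis
   */
  static double[] project(double[][] basis, double[] o) {
    double[] result = new double[basis.length];
    for (int j = 0; j < basis.length; j++) {
      for (int d = 0; d < o.length; d++) {
        result[j] += basis[j][d] * o[d];
      }
    }
    return result;
  }

  public static void main(String[] args) {
    double[][] basis = { { 1, 0, 0 }, { 0, 0, 1 } };
    System.out.println(java.util.Arrays.toString(project(basis, new double[] { 3, 4, 5 })));
  }
}
```

The result has as many components as there are basis vectors, so the projected vector lives in the lower-dimensional subspace.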
public TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction in interface Algorithm
getInputTypeRestriction in class AbstractAlgorithm<Clustering<Model>>
protected Logging getLogger()
AbstractAlgorithm
getLogger in class AbstractAlgorithm<Clustering<Model>>