Environment for
DeveLoping
KDD-Applications
Supported by Index-Structures

de.lmu.ifi.dbs.elki.math.linearalgebra.pca
Class WeightedCovarianceMatrixBuilder<V extends NumberVector<V,?>,D extends NumberDistance<D,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.math.linearalgebra.pca.CovarianceMatrixBuilder<V,D>
          extended by de.lmu.ifi.dbs.elki.math.linearalgebra.pca.WeightedCovarianceMatrixBuilder<V,D>
Type Parameters:
V - Vector class to use
D - Distance type
All Implemented Interfaces:
Parameterizable

@Title(value="Weighted Covariance Matrix / PCA")
@Description(value="A PCA modification by using weights while building the covariance matrix, to obtain more stable results")
@Reference(authors="H.-P. Kriegel, P. Kr\u00f6ger, E. Schubert, A. Zimek",
           title="A General Framework for Increasing the Robustness of PCA-based Correlation Clustering Algorithms",
           booktitle="Proceedings of the 20th International Conference on Scientific and Statistical Database Management (SSDBM), Hong Kong, China, 2008",
           url="http://dx.doi.org/10.1007/978-3-540-69497-7_27")
public class WeightedCovarianceMatrixBuilder<V extends NumberVector<V,?>,D extends NumberDistance<D,?>>
extends CovarianceMatrixBuilder<V,D>

CovarianceMatrixBuilder with weights. This builder uses a weight function to weight points differently while building the covariance matrix. Covariance can be canonically extended with weights, as shown in the article:

Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek: A General Framework for Increasing the Robustness of PCA-Based Correlation Clustering Algorithms. In: Proc. 20th Int. Conf. on Scientific and Statistical Database Management (SSDBM), Hong Kong, 2008. Lecture Notes in Computer Science 5069, Springer.
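
To illustrate what extending covariance with weights can look like, here is a minimal, self-contained sketch. It is not the ELKI implementation: the class name WeightedCovarianceSketch, the plain double[][] input, and the choice to normalize by the sum of weights are all assumptions made for this example.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and normalization are assumptions, not ELKI code. */
final class WeightedCovarianceSketch {
  /**
   * Weighted covariance of the rows of data.
   * Assumption: the result is normalized by the sum of weights.
   */
  static double[][] weightedCovariance(double[][] data, double[] weights) {
    final int dim = data[0].length;
    double weightsum = 0.0;
    double[] mean = new double[dim];
    // First pass: weighted mean.
    for (int i = 0; i < data.length; i++) {
      weightsum += weights[i];
      for (int d = 0; d < dim; d++) {
        mean[d] += weights[i] * data[i][d];
      }
    }
    for (int d = 0; d < dim; d++) {
      mean[d] /= weightsum;
    }
    // Second pass: weighted sum of outer products of the centered vectors.
    double[][] cov = new double[dim][dim];
    for (int i = 0; i < data.length; i++) {
      for (int d1 = 0; d1 < dim; d1++) {
        for (int d2 = 0; d2 < dim; d2++) {
          cov[d1][d2] += weights[i] * (data[i][d1] - mean[d1]) * (data[i][d2] - mean[d2]);
        }
      }
    }
    for (int d1 = 0; d1 < dim; d1++) {
      for (int d2 = 0; d2 < dim; d2++) {
        cov[d1][d2] /= weightsum;
      }
    }
    return cov;
  }

  public static void main(String[] args) {
    // The third point is an outlier and receives weight 0, so it has no influence.
    double[][] data = { { 0, 0 }, { 2, 2 }, { 10, -10 } };
    double[] weights = { 1, 1, 0 };
    System.out.println(Arrays.deepToString(weightedCovariance(data, weights)));
  }
}
```

Giving distant points a small weight limits how much they can distort the covariance matrix, which is the kind of stabilization the class description refers to.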

Author:
Erich Schubert

Field Summary
static OptionID WEIGHT_ID
          OptionID for WEIGHT_PARAM
private  ObjectParameter<WeightFunction> WEIGHT_PARAM
          Parameter to specify the weight function to use in weighted PCA, must implement WeightFunction.
private  DistanceFunction<V,DoubleDistance> weightDistance
          Holds the distance function used for weight calculation
 WeightFunction weightfunction
          Holds the weight function.
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
WeightedCovarianceMatrixBuilder(Parameterization config)
          Constructor, adhering to Parameterizable
 
Method Summary
private  double[][] finishCovarianceMatrix(double[] sums, double[][] squares, double weightsum)
          Finish the Covariance matrix in array "squares".
 Matrix processIds(Collection<Integer> ids, Database<V> database)
          Weighted Covariance Matrix for a set of IDs.
 Matrix processQueryResults(Collection<DistanceResultPair<D>> results, Database<V> database, int k)
          Compute Covariance Matrix for a QueryResult Collection. By default it will just collect the ids and run processIds.
 
Methods inherited from class de.lmu.ifi.dbs.elki.math.linearalgebra.pca.CovarianceMatrixBuilder
processDatabase, processQueryResults
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

WEIGHT_ID

public static final OptionID WEIGHT_ID
OptionID for WEIGHT_PARAM


WEIGHT_PARAM

private final ObjectParameter<WeightFunction> WEIGHT_PARAM
Parameter to specify the weight function to use in weighted PCA, must implement WeightFunction.

Key: -pca.weight


weightfunction

public WeightFunction weightfunction
Holds the weight function.


weightDistance

private DistanceFunction<V extends NumberVector<V,?>,DoubleDistance> weightDistance
Holds the distance function used for weight calculation

Constructor Detail

WeightedCovarianceMatrixBuilder

public WeightedCovarianceMatrixBuilder(Parameterization config)
Constructor, adhering to Parameterizable

Parameters:
config - Parameterization
Method Detail

processIds

public Matrix processIds(Collection<Integer> ids,
                         Database<V> database)
Weighted Covariance Matrix for a set of IDs. Since no distance information is supplied, the distances are computed here. Covariance is tied to Euclidean distance, so adding support for other distance functions probably would not make much sense.

Specified by:
processIds in class CovarianceMatrixBuilder<V extends NumberVector<V,?>,D extends NumberDistance<D,?>>
Parameters:
ids - a collection of ids
database - the database used
Returns:
Covariance Matrix
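
As a rough illustration of computing distance information when only IDs are given, the sketch below derives one weight per point from its Euclidean distance to a reference point. Several things here are assumptions and not taken from this page: the centroid as the reference point, the hypothetical LINEAR_DECAY weight function, the class name CentroidWeightsSketch, and the use of plain arrays in place of Database and WeightFunction.

```java
import java.util.function.DoubleBinaryOperator;

/** Illustrative sketch only; the centroid reference and weight function are assumptions. */
final class CentroidWeightsSketch {
  /** Hypothetical weight function: 1.0 at the centroid, 0.1 at the maximum distance. */
  static final DoubleBinaryOperator LINEAR_DECAY =
      (distance, max) -> max > 0 ? 1.0 - 0.9 * (distance / max) : 1.0;

  /** Unweighted mean of all rows. */
  static double[] centroid(double[][] data) {
    final int dim = data[0].length;
    double[] c = new double[dim];
    for (double[] row : data) {
      for (int d = 0; d < dim; d++) {
        c[d] += row[d];
      }
    }
    for (int d = 0; d < dim; d++) {
      c[d] /= data.length;
    }
    return c;
  }

  /** Euclidean distance between two vectors of equal length. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /** One weight per row, from its distance to the centroid and the maximum distance. */
  static double[] weights(double[][] data, DoubleBinaryOperator weightFunction) {
    double[] c = centroid(data);
    double[] dist = new double[data.length];
    double max = 0.0;
    for (int i = 0; i < data.length; i++) {
      dist[i] = euclidean(data[i], c);
      max = Math.max(max, dist[i]);
    }
    double[] w = new double[data.length];
    for (int i = 0; i < data.length; i++) {
      w[i] = weightFunction.applyAsDouble(dist[i], max);
    }
    return w;
  }
}
```

A call such as CentroidWeightsSketch.weights(data, CentroidWeightsSketch.LINEAR_DECAY) yields weights that could then be fed into a weighted covariance computation.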

processQueryResults

public Matrix processQueryResults(Collection<DistanceResultPair<D>> results,
                                  Database<V> database,
                                  int k)
Compute Covariance Matrix for a QueryResult Collection. By default it will just collect the ids and run processIds.

Overrides:
processQueryResults in class CovarianceMatrixBuilder<V extends NumberVector<V,?>,D extends NumberDistance<D,?>>
Parameters:
results - a collection of QueryResults
database - the database used
k - number of elements to process
Returns:
Covariance Matrix

finishCovarianceMatrix

private double[][] finishCovarianceMatrix(double[] sums,
                                          double[][] squares,
                                          double weightsum)
Finish the Covariance matrix in array "squares".

Parameters:
sums - Sums of values.
squares - Sums of squares. Contents are destroyed and replaced with Covariance Matrix!
weightsum - Sum of weights.
Returns:
modified squares array
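
The following sketch shows one way a covariance matrix can be finished in place from accumulated weighted sums, using the identity sum(w (x - m)(x - m)^T) = sum(w x x^T) - (sum(w x))(sum(w x))^T / W, where W is the sum of weights. It is an assumption-laden illustration, not the private method documented above: the class name FinishCovarianceSketch and the final division by the sum of weights are choices made for this example, and the actual normalization may differ.

```java
/** Illustrative sketch only; the normalization by weightsum is an assumption. */
final class FinishCovarianceSketch {
  /**
   * Turns accumulated weighted sums into a covariance matrix, in place.
   *
   * @param sums      weighted sums of values, sums[d] = sum of w * x[d]
   * @param squares   weighted sums of products, squares[d1][d2] = sum of w * x[d1] * x[d2];
   *                  overwritten with the result
   * @param weightsum sum of all weights, must be positive
   * @return the modified squares array
   */
  static double[][] finish(double[] sums, double[][] squares, double weightsum) {
    if (!(weightsum > 0)) {
      throw new IllegalArgumentException("weightsum must be positive");
    }
    final int dim = sums.length;
    for (int d1 = 0; d1 < dim; d1++) {
      // Only the upper triangle is read; the lower triangle is filled by symmetry.
      for (int d2 = d1; d2 < dim; d2++) {
        double v = (squares[d1][d2] - sums[d1] * sums[d2] / weightsum) / weightsum;
        squares[d1][d2] = v;
        squares[d2][d1] = v;
      }
    }
    return squares;
  }
}
```

Accumulating sums and products in a single pass and finishing afterwards avoids a second pass over the data, at the cost of possible cancellation errors when the values are large relative to their spread.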

Release 0.3 (2010-03-31_1612)