de.lmu.ifi.dbs.elki.algorithm.outlier
Class AbstractAggarwalYuOutlier<V extends NumberVector<?,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<OutlierResult>
      extended by de.lmu.ifi.dbs.elki.algorithm.outlier.AbstractAggarwalYuOutlier<V>
All Implemented Interfaces:
Algorithm, OutlierAlgorithm, InspectionUtilFrequentlyScanned, Parameterizable
Direct Known Subclasses:
AggarwalYuEvolutionary, AggarwalYuNaive

@Reference(authors="C.C. Aggarwal, P. S. Yu",
           title="Outlier detection for high dimensional data",
           booktitle="Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD 2001), Santa Barbara, CA, 2001",
           url="http://dx.doi.org/10.1145/375663.375668")
public abstract class AbstractAggarwalYuOutlier<V extends NumberVector<?,?>>
extends AbstractAlgorithm<OutlierResult>
implements OutlierAlgorithm

Abstract base class for the sparse-grid-cell based outlier detection of Aggarwal and Yu.

Reference:
Outlier detection for high dimensional data
C.C. Aggarwal, P. S. Yu
Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD 2001), Santa Barbara, CA, 2001


Nested Class Summary
static class AbstractAggarwalYuOutlier.Parameterizer
          Parameterization class.
 
Field Summary
static int DONT_CARE
          Symbolic value for subspaces not in use.
protected  int k
          The target dimensionality.
static OptionID K_ID
          OptionID for the target dimensionality
protected  int phi
          The number of partitions for each dimension
static OptionID PHI_ID
          OptionID for the grid size
 
Constructor Summary
AbstractAggarwalYuOutlier(int k, int phi)
          Constructor.
 
Method Summary
protected  ArrayList<ArrayList<DBIDs>> buildRanges(Relation<V> database)
          Grid discretization of the data:
Each attribute of data is divided into phi equi-depth ranges.
protected  DBIDs computeSubspace(Vector<IntIntPair> subspace, ArrayList<ArrayList<DBIDs>> ranges)
          Method to get the ids in the given subspace
protected  DBIDs computeSubspaceForGene(int[] gene, ArrayList<ArrayList<DBIDs>> ranges)
          Get the DBIDs in the current subspace.
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  double sparsity(int setsize, int dbsize, int k)
          Method to calculate the sparsity coefficient of a subset.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
getLogger, makeParameterDistanceFunction, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.outlier.OutlierAlgorithm
run
 

Field Detail

PHI_ID

public static final OptionID PHI_ID
OptionID for the grid size


K_ID

public static final OptionID K_ID
OptionID for the target dimensionality


DONT_CARE

public static final int DONT_CARE
Symbolic value for subspaces not in use. Note: some parts of the implementations may currently rely on this constant having the value 0.

See Also:
Constant Field Values

phi

protected int phi
The number of partitions for each dimension


k

protected int k
The target dimensionality.

Constructor Detail

AbstractAggarwalYuOutlier

public AbstractAggarwalYuOutlier(int k,
                                 int phi)
Constructor.

Parameters:
k - Target dimensionality
phi - Number of partitions for each dimension
Method Detail

buildRanges

protected ArrayList<ArrayList<DBIDs>> buildRanges(Relation<V> database)
Grid discretization of the data:
Each attribute of data is divided into phi equi-depth ranges.
Each range contains a fraction f=1/phi of the records.

Parameters:
database - Database relation to discretize
Returns:
range map
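
The equi-depth discretization described above can be illustrated with a small stand-alone sketch. It is not the ELKI implementation: the class name EquiDepthSketch, the use of a plain double[][] in place of Relation<V>, and the use of row indices in place of DBIDs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class EquiDepthSketch {
  /**
   * Splits each attribute into phi equi-depth ranges.
   * Result: one list per dimension, each holding phi lists of row indices.
   */
  static List<List<List<Integer>>> buildRanges(double[][] data, int phi) {
    final int n = data.length;
    final int dims = (n == 0) ? 0 : data[0].length;
    List<List<List<Integer>>> ranges = new ArrayList<>();
    for (int d = 0; d < dims; d++) {
      final int dim = d;
      // Sort the row indices by their value in the current dimension.
      List<Integer> order = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        order.add(i);
      }
      order.sort(Comparator.comparingDouble((Integer i) -> data[i][dim]));
      // Cut the sorted order into phi consecutive chunks of (nearly) equal
      // size, so that each range holds about a fraction 1/phi of the records.
      List<List<Integer>> dimRanges = new ArrayList<>();
      for (int r = 0; r < phi; r++) {
        int start = (int) ((long) r * n / phi);
        int end = (int) ((long) (r + 1) * n / phi);
        dimRanges.add(new ArrayList<>(order.subList(start, end)));
      }
      ranges.add(dimRanges);
    }
    return ranges;
  }
}
```

When the number of records is not a multiple of phi, the chunk sizes in this sketch differ by at most one.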

sparsity

protected double sparsity(int setsize,
                          int dbsize,
                          int k)
Method to calculate the sparsity coefficient of a subset.

Parameters:
setsize - Size of subset
dbsize - Size of database
k - Dimensionality
Returns:
sparsity coefficient
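
For orientation, the referenced paper defines the sparsity coefficient of a k-dimensional grid cell as the deviation of the observed count from the expected count N·f^k, in units of the standard deviation sqrt(N·f^k·(1 - f^k)), where f = 1/phi. The sketch below follows that definition. Whether the ELKI method computes exactly this expression is an assumption. The class name SparsitySketch is hypothetical, and phi is passed as an explicit fourth argument here, whereas the documented method has only three parameters and phi is a field of the class.

```java
/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class SparsitySketch {
  /**
   * Sparsity coefficient following the definition in Aggarwal and Yu (2001).
   *
   * @param setsize number of records in the cell
   * @param dbsize number of records in the database (N)
   * @param k dimensionality of the cell
   * @param phi number of equi-depth ranges per dimension (assumed extra argument)
   */
  static double sparsity(int setsize, int dbsize, int k, int phi) {
    final double f = 1.0 / phi;             // fraction of records per range
    final double fK = Math.pow(f, k);       // expected fraction in a k-dim cell
    final double expected = dbsize * fK;    // expected number of records
    return (setsize - expected) / Math.sqrt(expected * (1 - fK));
  }
}
```

Under this definition, a cell with exactly the expected number of records scores 0, and strongly negative values mark cells that are much sparser than expected.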

computeSubspace

protected DBIDs computeSubspace(Vector<IntIntPair> subspace,
                                ArrayList<ArrayList<DBIDs>> ranges)
Method to get the ids in the given subspace

Parameters:
subspace - Subspace to evaluate
ranges - Database ranges
Returns:
ids
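
One plausible reading of this method is that the ids of a subspace are the intersection of the id sets of the selected ranges, one range per chosen dimension. The sketch below illustrates that idea only. It is not the ELKI code: the class name SubspaceSketch is hypothetical, each (dimension, range) pair is modeled as an int[2] instead of IntIntPair, Set<Integer> stands in for DBIDs, and 0-based indices are assumed for both dimension and range.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class SubspaceSketch {
  /**
   * Intersects the id sets of the selected ranges.
   *
   * @param subspace pairs of {dimension, range index}, both 0-based (assumption)
   * @param ranges per dimension, the id set of each range
   */
  static Set<Integer> computeSubspace(int[][] subspace, List<List<Set<Integer>>> ranges) {
    Set<Integer> ids = null;
    for (int[] pair : subspace) {
      Set<Integer> cell = ranges.get(pair[0]).get(pair[1]);
      if (ids == null) {
        ids = new LinkedHashSet<>(cell); // copy, so the range map stays untouched
      } else {
        ids.retainAll(cell);             // keep only ids present in every range
      }
    }
    return (ids == null) ? new LinkedHashSet<Integer>() : ids;
  }
}
```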

computeSubspaceForGene

protected DBIDs computeSubspaceForGene(int[] gene,
                                       ArrayList<ArrayList<DBIDs>> ranges)
Get the DBIDs in the current subspace.

Parameters:
gene - gene data
ranges - Database ranges
Returns:
resulting DBIDs
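
A gene can be read as one entry per dimension, where DONT_CARE (documented above as currently having the value 0) marks a dimension that is not part of the subspace. The sketch below assumes, purely for illustration, that a positive entry g selects range g - 1 of that dimension. The actual encoding used by ELKI may differ. The class name GeneSketch is hypothetical, and Set<Integer> stands in for DBIDs.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-alone sketch; not the ELKI implementation. */
final class GeneSketch {
  /** Marks an unused dimension; the value 0 is taken from the field documentation. */
  static final int DONT_CARE = 0;

  /**
   * Intersects the ranges selected by a gene.
   *
   * @param gene one entry per dimension; 0 = unused, g > 0 = range g - 1 (assumption)
   * @param ranges per dimension, the id set of each range
   */
  static Set<Integer> computeSubspaceForGene(int[] gene, List<List<Set<Integer>>> ranges) {
    Set<Integer> ids = null;
    for (int d = 0; d < gene.length; d++) {
      if (gene[d] == DONT_CARE) {
        continue; // this dimension does not constrain the subspace
      }
      Set<Integer> cell = ranges.get(d).get(gene[d] - 1);
      if (ids == null) {
        ids = new LinkedHashSet<>(cell);
      } else {
        ids.retainAll(cell);
      }
    }
    return (ids == null) ? new LinkedHashSet<Integer>() : ids;
  }
}
```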

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<OutlierResult>
Returns:
Type restriction

Release 0.4.0 (2011-09-20_1324)