de.lmu.ifi.dbs.elki.algorithm.outlier
Class ABOD<V extends NumberVector<V,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<OutlierResult>
      extended by de.lmu.ifi.dbs.elki.algorithm.AbstractDistanceBasedAlgorithm<V,DoubleDistance,OutlierResult>
          extended by de.lmu.ifi.dbs.elki.algorithm.outlier.ABOD<V>
Type Parameters:
V - Vector type
All Implemented Interfaces:
Algorithm, OutlierAlgorithm, InspectionUtilFrequentlyScanned, Parameterizable

@Title(value="ABOD: Angle-Based Outlier Detection")
@Description(value="Outlier detection using variance analysis on angles, especially for high dimensional data sets.")
@Reference(authors="H.-P. Kriegel, M. Schubert, and A. Zimek",
           title="Angle-Based Outlier Detection in High-dimensional Data",
           booktitle="Proc. 14th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD \'08), Las Vegas, NV, 2008",
           url="http://dx.doi.org/10.1145/1401890.1401946")
public class ABOD<V extends NumberVector<V,?>>
extends AbstractDistanceBasedAlgorithm<V,DoubleDistance,OutlierResult>
implements OutlierAlgorithm

Angle-Based Outlier Detection

Outlier detection using variance analysis on angles, especially for high dimensional data sets.

Reference: H.-P. Kriegel, M. Schubert, and A. Zimek: Angle-Based Outlier Detection in High-dimensional Data. In: Proc. 14th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD '08), Las Vegas, NV, 2008.
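
As a rough illustration of the idea (not the ELKI implementation), the angle-based outlier factor in the cited paper is the weighted variance, taken over all pairs of other points B and C, of the scalar product of AB and AC divided by the squared lengths of both difference vectors, with weights 1/(|AB| * |AC|). The sketch below assumes plain double[] points and the ordinary dot product in place of a configurable kernel; AbofSketch and abof are hypothetical names. A small value means the point sees the rest of the data within a narrow range of angles, which makes it a stronger outlier candidate.

```java
final class AbofSketch {
  /**
   * Weighted variance of the distance-scaled angle term for point a,
   * taken over all unordered pairs (b, c) of other points.
   */
  public static double abof(double[][] pts, int a) {
    double sumW = 0.0, sumWF = 0.0, sumWF2 = 0.0;
    for (int b = 0; b < pts.length; b++) {
      if (b == a) continue;
      for (int c = b + 1; c < pts.length; c++) {
        if (c == a) continue;
        double dot = 0.0, ab2 = 0.0, ac2 = 0.0;
        for (int d = 0; d < pts[a].length; d++) {
          double ab = pts[b][d] - pts[a][d];
          double ac = pts[c][d] - pts[a][d];
          dot += ab * ac;   // <AB, AC>
          ab2 += ab * ab;   // |AB|^2
          ac2 += ac * ac;   // |AC|^2
        }
        if (ab2 == 0.0 || ac2 == 0.0) continue; // skip duplicates of a
        double f = dot / (ab2 * ac2);           // distance-scaled angle term
        double w = 1.0 / Math.sqrt(ab2 * ac2);  // weight 1 / (|AB| * |AC|)
        sumW += w;
        sumWF += w * f;
        sumWF2 += w * f * f;
      }
    }
    if (sumW == 0.0) return 0.0; // fewer than two usable neighbors
    double mean = sumWF / sumW;
    return sumWF2 / sumW - mean * mean;
  }

  public static void main(String[] args) {
    // Five clustered points and one far-away point (index 5).
    double[][] pts = { {0, 0}, {1, 0}, {0, 1}, {1, 1}, {0.5, 0.5}, {10, 10} };
    for (int i = 0; i < pts.length; i++) {
      System.out.println(i + ": " + abof(pts, i));
    }
  }
}
```

The loop is cubic in the number of points when applied to every point, which is why a sampled "fast" variant exists alongside the exact one.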


Nested Class Summary
static class ABOD.Parameterizer<V extends NumberVector<V,?>>
          Parameterization class.
 
Field Summary
static OptionID FAST_SAMPLE_ID
          Parameter for sample size to be used in fast mode.
private  int k
          Number of neighbors k used in kNN queries.
static OptionID K_ID
          Parameter for k, the number of neighbors used in kNN queries.
static OptionID KERNEL_FUNCTION_ID
          Parameter for the kernel function.
private static Logging logger
          The logger for this class.
static OptionID PREPROCESSOR_ID
          The preprocessor used to materialize the kNN neighborhoods.
private  PrimitiveSimilarityFunction<? super V,DoubleDistance> primitiveKernelFunction
          The configured kernel function.
(package private)  int sampleSize
          Variable to store fast mode sampling value.
private  ArrayModifiableDBIDs staticids
           
private static boolean useRNDSample
          Flag to use the alternate (random sample) code path.
 
Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractDistanceBasedAlgorithm
DISTANCE_FUNCTION_ID
 
Constructor Summary
ABOD(int k, int sampleSize, PrimitiveSimilarityFunction<? super V,DoubleDistance> primitiveKernelFunction, DistanceFunction<V,DoubleDistance> distanceFunction)
          Actual constructor, with parameters. Fast mode (sampling).
ABOD(int k, PrimitiveSimilarityFunction<? super V,DoubleDistance> primitiveKernelFunction, DistanceFunction<V,DoubleDistance> distanceFunction)
          Actual constructor, with parameters. Slow mode (exact).
 
Method Summary
private  double calcCos(KernelMatrix kernelMatrix, DBID aKey, DBID bKey)
          Compute the cosine value between vectors aKey and bKey.
private  double calcDenominator(KernelMatrix kernelMatrix, DBID aKey, DBID bKey, DBID cKey)
           
private  PriorityQueue<FCPair<Double,DBID>> calcDistsandNN(Relation<V> data, KernelMatrix kernelMatrix, int sampleSize, DBID aKey, HashMap<DBID,Double> dists)
           
private  PriorityQueue<FCPair<Double,DBID>> calcDistsandRNDSample(Relation<V> data, KernelMatrix kernelMatrix, int sampleSize, DBID aKey, HashMap<DBID,Double> dists)
           
private  double[] calcFastNormalization(DBID x, HashMap<DBID,Double> dists)
           
private  double[] calcNormalization(Integer xKey, HashMap<Integer,Double> dists)
           
private  double calcNumerator(KernelMatrix kernelMatrix, DBID aKey, DBID bKey, DBID cKey)
           
private  void generateExplanation(Relation<V> data, DBID key, LinkedList<DBID> expList)
           
private  double getAbofFilter(KernelMatrix kernelMatrix, DBID aKey, HashMap<DBID,Double> dists, double fulCounter, double counter, DBIDs neighbors)
           
 void getExplanations(Relation<V> data)
          Get explanations for points in the database.
 OutlierResult getFastRanking(Relation<V> relation, int k, int sampleSize)
          Main part of the algorithm. Fast version.
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  Logging getLogger()
          Get the (STATIC) logger for this class.
 OutlierResult getRanking(Relation<V> relation, int k)
          Main part of the algorithm. Exact version.
private  int mapDBID(DBID aKey)
           
 OutlierResult run(Database database, Relation<V> relation)
          Run ABOD on the data set
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractDistanceBasedAlgorithm
getDistanceFunction
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
makeParameterDistanceFunction, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.outlier.OutlierAlgorithm
run
 

Field Detail

logger

private static final Logging logger
The logger for this class.


K_ID

public static final OptionID K_ID
Parameter for k, the number of neighbors used in kNN queries.


FAST_SAMPLE_ID

public static final OptionID FAST_SAMPLE_ID
Parameter for sample size to be used in fast mode.


KERNEL_FUNCTION_ID

public static final OptionID KERNEL_FUNCTION_ID
Parameter for the kernel function.


PREPROCESSOR_ID

public static final OptionID PREPROCESSOR_ID
The preprocessor used to materialize the kNN neighborhoods.


useRNDSample

private static final boolean useRNDSample
Flag to use the alternate (random sample) code path.

See Also:
Constant Field Values

k

private int k
Number of neighbors k used in kNN queries.


sampleSize

int sampleSize
Variable to store fast mode sampling value.


primitiveKernelFunction

private PrimitiveSimilarityFunction<? super V,DoubleDistance> primitiveKernelFunction
The configured kernel function.


staticids

private ArrayModifiableDBIDs staticids

Constructor Detail

ABOD

public ABOD(int k,
            int sampleSize,
            PrimitiveSimilarityFunction<? super V,DoubleDistance> primitiveKernelFunction,
            DistanceFunction<V,DoubleDistance> distanceFunction)
Actual constructor, with parameters. Fast mode (sampling).

Parameters:
k - number of neighbors used in kNN queries
sampleSize - sample size
primitiveKernelFunction - Kernel function to use
distanceFunction - Distance function

ABOD

public ABOD(int k,
            PrimitiveSimilarityFunction<? super V,DoubleDistance> primitiveKernelFunction,
            DistanceFunction<V,DoubleDistance> distanceFunction)
Actual constructor, with parameters. Slow mode (exact).

Parameters:
k - number of neighbors used in kNN queries
primitiveKernelFunction - Kernel function to use
distanceFunction - Distance function

Method Detail

getRanking

public OutlierResult getRanking(Relation<V> relation,
                                int k)
Main part of the algorithm. Exact version.

Parameters:
relation - Relation to query
k - k for kNN queries
Returns:
Outlier detection result

getFastRanking

public OutlierResult getFastRanking(Relation<V> relation,
                                    int k,
                                    int sampleSize)
Main part of the algorithm. Fast version.

Parameters:
relation - Relation to use
k - k for kNN queries
sampleSize - Sample size
Returns:
Outlier detection result
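
The cited paper speeds up the exact computation by restricting the pairs of reference points to a point's nearest neighbors. How this class combines k and sampleSize is not documented on this page, so the following is only a sketch of the neighbor-selection step under simple assumptions: plain double[] points, Euclidean distance, and the hypothetical names NearestNeighborSketch and nearest.

```java
import java.util.Arrays;
import java.util.Comparator;

final class NearestNeighborSketch {
  /** Indices of the (at most) k points closest to point a, nearest first. */
  public static int[] nearest(double[][] pts, int a, int k) {
    // Squared Euclidean distance from a to every point.
    final double[] dist2 = new double[pts.length];
    for (int i = 0; i < pts.length; i++) {
      double sum = 0.0;
      for (int d = 0; d < pts[a].length; d++) {
        double diff = pts[i][d] - pts[a][d];
        sum += diff * diff;
      }
      dist2[i] = sum;
    }
    // Candidate indices: every point except a itself.
    Integer[] idx = new Integer[pts.length - 1];
    int n = 0;
    for (int i = 0; i < pts.length; i++) {
      if (i != a) idx[n++] = i;
    }
    Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> dist2[i]));
    // Keep the first k candidates (or fewer if the data set is small).
    int m = Math.max(0, Math.min(k, idx.length));
    int[] out = new int[m];
    for (int j = 0; j < m; j++) out[j] = idx[j];
    return out;
  }

  public static void main(String[] args) {
    double[][] pts = { {0, 0}, {1, 0}, {5, 0}, {2, 0} };
    System.out.println(Arrays.toString(nearest(pts, 0, 2)));
  }
}
```

The angle variance would then be computed over pairs drawn from the returned indices only, which reduces the per-point cost from quadratic in the data set size to quadratic in k.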

calcNormalization

private double[] calcNormalization(Integer xKey,
                                   HashMap<Integer,Double> dists)

calcFastNormalization

private double[] calcFastNormalization(DBID x,
                                       HashMap<DBID,Double> dists)

getAbofFilter

private double getAbofFilter(KernelMatrix kernelMatrix,
                             DBID aKey,
                             HashMap<DBID,Double> dists,
                             double fulCounter,
                             double counter,
                             DBIDs neighbors)

calcCos

private double calcCos(KernelMatrix kernelMatrix,
                       DBID aKey,
                       DBID bKey)
Compute the cosine value between vectors aKey and bKey.

Parameters:
kernelMatrix - kernel matrix to use
aKey - ID of the first vector
bKey - ID of the second vector
Returns:
cosine value
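
The body of this private method is not shown on this page. As a generic sketch of how a cosine can be read off a precomputed kernel matrix, the cosine of the angle between two vectors in feature space is K(a,b) / sqrt(K(a,a) * K(b,b)). The example assumes a plain double[][] matrix built with a linear kernel (dot product); KernelCosineSketch, linearKernelMatrix and cosine are hypothetical names, not ELKI API.

```java
final class KernelCosineSketch {
  /** Linear kernel matrix: entry [i][j] is the dot product of pts[i] and pts[j]. */
  public static double[][] linearKernelMatrix(double[][] pts) {
    int n = pts.length;
    double[][] k = new double[n][n];
    for (int i = 0; i < n; i++) {
      for (int j = 0; j < n; j++) {
        double dot = 0.0;
        for (int d = 0; d < pts[i].length; d++) {
          dot += pts[i][d] * pts[j][d];
        }
        k[i][j] = dot;
      }
    }
    return k;
  }

  /** Cosine between vectors a and b, using only kernel matrix entries. */
  public static double cosine(double[][] k, int a, int b) {
    return k[a][b] / Math.sqrt(k[a][a] * k[b][b]);
  }

  public static void main(String[] args) {
    double[][] k = linearKernelMatrix(new double[][] { {1, 0}, {0, 1}, {1, 1} });
    System.out.println(cosine(k, 0, 1)); // orthogonal vectors
    System.out.println(cosine(k, 0, 2)); // 45 degree angle
  }
}
```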

mapDBID

private int mapDBID(DBID aKey)

calcDenominator

private double calcDenominator(KernelMatrix kernelMatrix,
                               DBID aKey,
                               DBID bKey,
                               DBID cKey)

calcNumerator

private double calcNumerator(KernelMatrix kernelMatrix,
                             DBID aKey,
                             DBID bKey,
                             DBID cKey)
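
Neither calcDenominator nor calcNumerator is documented here. One plausible reading, given the ABOD formula, is that they evaluate the scalar product of the difference vectors (b - a) and (c - a) and the product of their squared lengths using kernel values only. The sketch below shows that expansion under the assumption of a plain double[][] kernel matrix with a linear kernel; KernelAngleTermsSketch and its methods are hypothetical names and may differ from what the class actually computes.

```java
final class KernelAngleTermsSketch {
  /** Linear kernel matrix: entry [i][j] is the dot product of pts[i] and pts[j]. */
  public static double[][] linearKernelMatrix(double[][] pts) {
    int n = pts.length;
    double[][] k = new double[n][n];
    for (int i = 0; i < n; i++) {
      for (int j = 0; j < n; j++) {
        for (int d = 0; d < pts[i].length; d++) {
          k[i][j] += pts[i][d] * pts[j][d];
        }
      }
    }
    return k;
  }

  /** <b - a, c - a> expanded into kernel terms. */
  public static double numerator(double[][] k, int a, int b, int c) {
    return k[b][c] - k[a][b] - k[a][c] + k[a][a];
  }

  /** |b - a|^2 * |c - a|^2 expanded into kernel terms. */
  public static double denominator(double[][] k, int a, int b, int c) {
    double ab2 = k[a][a] + k[b][b] - 2.0 * k[a][b];
    double ac2 = k[a][a] + k[c][c] - 2.0 * k[a][c];
    return ab2 * ac2;
  }

  public static void main(String[] args) {
    // a = (1,1), b = (2,1), c = (1,3): b - a = (1,0), c - a = (0,2)
    double[][] k = linearKernelMatrix(new double[][] { {1, 1}, {2, 1}, {1, 3} });
    System.out.println(numerator(k, 0, 1, 2) + " / " + denominator(k, 0, 1, 2));
  }
}
```

Working from the kernel matrix means the same code applies to any kernel, since only the matrix entries change.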

calcDistsandNN

private PriorityQueue<FCPair<Double,DBID>> calcDistsandNN(Relation<V> data,
                                                          KernelMatrix kernelMatrix,
                                                          int sampleSize,
                                                          DBID aKey,
                                                          HashMap<DBID,Double> dists)

calcDistsandRNDSample

private PriorityQueue<FCPair<Double,DBID>> calcDistsandRNDSample(Relation<V> data,
                                                                 KernelMatrix kernelMatrix,
                                                                 int sampleSize,
                                                                 DBID aKey,
                                                                 HashMap<DBID,Double> dists)

getExplanations

public void getExplanations(Relation<V> data)
Get explanations for points in the database.

Parameters:
data - relation to get explanations for

generateExplanation

private void generateExplanation(Relation<V> data,
                                 DBID key,
                                 LinkedList<DBID> expList)

run

public OutlierResult run(Database database,
                         Relation<V> relation)
Run ABOD on the data set

Parameters:
database - Database to run on
relation - Relation to process
Returns:
Outlier detection result

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<OutlierResult>
Returns:
Type restriction

getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.

Specified by:
getLogger in class AbstractAlgorithm<OutlierResult>
Returns:
the static logger

Release 0.4.0 (2011-09-20_1324)