Environment for DeveLoping KDD-Applications Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.outlier
Class ABOD<V extends RealVector<V,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
          extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<O,R>
              extended by de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm<V,DoubleDistance,MultiResult>
                  extended by de.lmu.ifi.dbs.elki.algorithm.outlier.ABOD<V>
Type Parameters:
V - Vector type
All Implemented Interfaces:
Algorithm<V,MultiResult>, Parameterizable

public class ABOD<V extends RealVector<V,?>>
extends DistanceBasedAlgorithm<V,DoubleDistance,MultiResult>

Angle-Based Outlier Detection.

Outlier detection using variance analysis on angles, especially for high-dimensional data sets.

Reference: H.-P. Kriegel, M. Schubert, and A. Zimek: Angle-Based Outlier Detection in High-dimensional Data. In: Proc. 14th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD '08), Las Vegas, NV, 2008.
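
The idea of "variance analysis on angles" can be illustrated with a small, self-contained sketch. It is not the code of this class: it assumes plain Euclidean dot products instead of a configurable kernel function, applies no weighting, and uses a brute-force loop over all pairs. The names AbofSketch and abof are hypothetical. For a point A, it computes the variance of <AB,AC> / (|AB|^2 * |AC|^2) over all pairs of other points B and C; a low variance suggests that A sees the rest of the data within a narrow range of directions, which is typical for an outlier.

```java
/** Unweighted, brute-force angle-based outlier factor; illustration only. */
final class AbofSketch {

    /**
     * Variance of dot(AB, AC) / (|AB|^2 * |AC|^2) over all pairs (B, C)
     * of points other than A. Smaller values indicate stronger outliers.
     */
    static double abof(double[][] data, int a) {
        double sum = 0.0;
        double sumSq = 0.0;
        int pairs = 0;
        for (int b = 0; b < data.length; b++) {
            if (b == a) continue;
            for (int c = b + 1; c < data.length; c++) {
                if (c == a) continue;
                double dot = 0.0;
                double sqNormAB = 0.0;
                double sqNormAC = 0.0;
                for (int d = 0; d < data[a].length; d++) {
                    double ab = data[b][d] - data[a][d];
                    double ac = data[c][d] - data[a][d];
                    dot += ab * ac;
                    sqNormAB += ab * ab;
                    sqNormAC += ac * ac;
                }
                // Skip exact duplicates of A, for which the angle is undefined.
                if (sqNormAB == 0.0 || sqNormAC == 0.0) continue;
                double term = dot / (sqNormAB * sqNormAC);
                sum += term;
                sumSq += term * term;
                pairs++;
            }
        }
        if (pairs == 0) return 0.0;
        double mean = sum / pairs;
        return sumSq / pairs - mean * mean; // E[x^2] - E[x]^2
    }
}
```

Calling abof for every point and sorting ascending would give an outlier ranking under these assumptions. Each call costs O(n^2), so ranking all n points costs O(n^3).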

Author:
Matthias Schubert (Original Code), Erich Schubert (ELKIfication)

Field Summary
static AssociationID<Double> ABOD_NORM
          Association ID for ABOD Normalization value.
static AssociationID<Double> ABOD_SCORE
          Association ID for the ABOD score.
(package private)  boolean fast
          Variable to store fast mode flag.
private  Flag FAST_FLAG
          Flag for fast mode.
static OptionID FAST_ID
          OptionID for FAST_FLAG
static OptionID FAST_SAMPLE_ID
          OptionID for FAST_SAMPLE_PARAM
private  IntParameter FAST_SAMPLE_PARAM
          Parameter for sample size to be used in fast mode.
private  int k
          k parameter
static OptionID K_ID
          OptionID for K_PARAM
private  IntParameter K_PARAM
          Parameter for k, the number of neighbors used in kNN queries.
static OptionID KERNEL_FUNCTION_ID
          OptionID for KERNEL_FUNCTION_PARAM
private  ClassParameter<KernelFunction<V,DoubleDistance>> KERNEL_FUNCTION_PARAM
          Parameter for Kernel function.
(package private)  KernelFunction<V,DoubleDistance> kernelFunction
          Stores the configured kernel function.
(package private)  MultiResult result
          Result storage.
(package private)  int sampleSize
          Variable to store the sample size used in fast mode.
 
Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
DISTANCE_FUNCTION_ID, DISTANCE_FUNCTION_PARAM
 
Fields inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
optionHandler
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
ABOD()
          Constructor
 
Method Summary
private  double calcCos(KernelMatrix<V> kernelMatrix, Integer aKey, Integer bKey)
          Compute the cosine value between vectors aKey and bKey.
private  double calcDenominator(KernelMatrix<V> kernelMatrix, Integer aKey, Integer bKey, Integer cKey)
           
private  PriorityQueue<FCPair<Double,Integer>> calcDistsandNN(Database<V> data, KernelMatrix<V> kernelMatrix, int sampleSize, Integer aKey, HashMap<Integer,Double> dists)
           
private  PriorityQueue<FCPair<Double,Integer>> calcDistsandRNDSample(Database<V> data, KernelMatrix<V> kernelMatrix, int sampleSize, Integer aKey, HashMap<Integer,Double> dists)
           
private  double[] calcFastNormalization(Integer x, HashMap<Integer,Double> dists)
           
private  double[] calcNormalization(Integer xKey, HashMap<Integer,Double> dists)
           
private  double calcNumerator(KernelMatrix<V> kernelMatrix, Integer aKey, Integer bKey, Integer cKey)
           
private  void generateExplanation(Database<V> data, Integer key, LinkedList<Integer> expList)
           
private  double getAbofFilter(KernelMatrix<V> kernelMatrix, Integer aKey, HashMap<Integer,Double> dists, double fulCounter, double counter, List<Integer> neighbors)
           
 Description getDescription()
          Return a description of the algorithm.
 void getExplanations(Database<V> data)
           
 MultiResult getFastRanking(Database<V> database, int k, int sampleSize)
          Main part of the algorithm.
 MultiResult getRanking(Database<V> database, int k)
          Main part of the algorithm.
 MultiResult getResult()
          Return the results of the last run.
protected  MultiResult runInTime(Database<V> database)
          The run method encapsulated in measure of runtime.
 List<String> setParameters(List<String> args)
          Calls the super method and sets parameters FAST_FLAG, FAST_SAMPLE_PARAM and KERNEL_FUNCTION_PARAM.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.DistanceBasedAlgorithm
getDistanceFunction
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.utilities.optionhandling.AbstractParameterizable
addOption, addParameterizable, addParameterizable, checkGlobalParameterConstraints, collectOptions, getAttributeSettings, getParameters, rememberParametersExcept, removeOption, removeParameterizable, shortDescription
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable
checkGlobalParameterConstraints, collectOptions, getParameters, shortDescription
 

Field Detail

K_ID

public static final OptionID K_ID
OptionID for K_PARAM


K_PARAM

private final IntParameter K_PARAM
Parameter for k, the number of neighbors used in kNN queries.

Key: -abod.k

Default value: 30


k

private int k
k parameter


FAST_ID

public static final OptionID FAST_ID
OptionID for FAST_FLAG


FAST_FLAG

private final Flag FAST_FLAG
Flag for fast mode.

Key: -abod.fast


fast

boolean fast
Variable to store fast mode flag.


FAST_SAMPLE_ID

public static final OptionID FAST_SAMPLE_ID
OptionID for FAST_SAMPLE_PARAM


FAST_SAMPLE_PARAM

private final IntParameter FAST_SAMPLE_PARAM
Parameter for sample size to be used in fast mode.

Key: -abod.samplesize


sampleSize

int sampleSize
Variable to store the sample size used in fast mode.


KERNEL_FUNCTION_ID

public static final OptionID KERNEL_FUNCTION_ID
OptionID for KERNEL_FUNCTION_PARAM


KERNEL_FUNCTION_PARAM

private final ClassParameter<KernelFunction<V extends RealVector<V,?>,DoubleDistance>> KERNEL_FUNCTION_PARAM
Parameter for Kernel function.

Key: -abod.kernelfunction

Default: PolynomialKernelFunction


ABOD_SCORE

public static final AssociationID<Double> ABOD_SCORE
Association ID for the ABOD score.


ABOD_NORM

public static final AssociationID<Double> ABOD_NORM
Association ID for ABOD Normalization value.


kernelFunction

KernelFunction<V extends RealVector<V,?>,DoubleDistance> kernelFunction
Stores the configured kernel function.


result

MultiResult result
Result storage.

Constructor Detail

ABOD

public ABOD()
Constructor

Method Detail

getRanking

public MultiResult getRanking(Database<V> database,
                              int k)
Main part of the algorithm. Exact version.

Parameters:
database - Database to use
k - k for kNN queries
Returns:
result

getFastRanking

public MultiResult getFastRanking(Database<V> database,
                                  int k,
                                  int sampleSize)
Main part of the algorithm. Fast version.

Parameters:
database - Database to use
k - k for kNN queries
sampleSize - Sample size
Returns:
result
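
Both ranking methods take k "for kNN queries", but this page does not show how the neighbors are selected. As a rough sketch of one common approach (an assumption, not the code of this class), the k nearest neighbors of a query point can be collected with a bounded max-heap. The class KnnSketch, its method kNearest, and the use of squared Euclidean distance are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Brute-force k-nearest-neighbor selection; illustration only. */
final class KnnSketch {

    /** Returns the indices of the k points closest to data[query], nearest first. */
    static List<Integer> kNearest(double[][] data, int query, int k) {
        // Max-heap on distance: the head is always the farthest candidate kept so far.
        PriorityQueue<double[]> heap = new PriorityQueue<>(
                Comparator.comparingDouble((double[] e) -> e[0]).reversed());
        for (int i = 0; i < data.length; i++) {
            if (i == query) continue;
            double sqDist = 0.0;
            for (int d = 0; d < data[query].length; d++) {
                double diff = data[i][d] - data[query][d];
                sqDist += diff * diff;
            }
            heap.add(new double[] { sqDist, i });
            if (heap.size() > k) {
                heap.poll(); // drop the farthest candidate
            }
        }
        // Polling yields farthest first, so reverse to get nearest first.
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add((int) heap.poll()[1]);
        }
        Collections.reverse(result);
        return result;
    }
}
```

Restricting the pairs considered for each point to such a neighborhood (or to a random sample of a given size) reduces the number of angle evaluations per point compared with using every pair in the database.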

calcNormalization

private double[] calcNormalization(Integer xKey,
                                   HashMap<Integer,Double> dists)

calcFastNormalization

private double[] calcFastNormalization(Integer x,
                                       HashMap<Integer,Double> dists)

getAbofFilter

private double getAbofFilter(KernelMatrix<V> kernelMatrix,
                             Integer aKey,
                             HashMap<Integer,Double> dists,
                             double fulCounter,
                             double counter,
                             List<Integer> neighbors)

calcCos

private double calcCos(KernelMatrix<V> kernelMatrix,
                       Integer aKey,
                       Integer bKey)
Compute the cosine value between vectors aKey and bKey.

Parameters:
kernelMatrix - the kernel matrix
aKey - key of the first vector
bKey - key of the second vector
Returns:
cosine value
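
The body of this private method is not shown on this page. As a hedged sketch of how a cosine can be read from precomputed kernel values, the standard identity cos(a, b) = K(a, b) / sqrt(K(a, a) * K(b, b)) is used below. The class KernelCosineSketch, the plain double[][] matrix, and the linear kernel are assumptions made for illustration; the real method works on a KernelMatrix<V> with Integer keys.

```java
/** Cosine between two vectors computed from kernel values; illustration only. */
final class KernelCosineSketch {

    /** Builds a linear kernel matrix with K[i][j] = dot(x_i, x_j). */
    static double[][] linearKernel(double[][] data) {
        int n = data.length;
        double[][] kernel = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double dot = 0.0;
                for (int d = 0; d < data[i].length; d++) {
                    dot += data[i][d] * data[j][d];
                }
                kernel[i][j] = dot;
            }
        }
        return kernel;
    }

    /** cos(a, b) = K(a, b) / sqrt(K(a, a) * K(b, b)). */
    static double cosine(double[][] kernel, int a, int b) {
        return kernel[a][b] / Math.sqrt(kernel[a][a] * kernel[b][b]);
    }
}
```

Because only kernel values are needed, the same formula applies to any kernel function, which is one reason to precompute the matrix once and index it by object key.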

calcDenominator

private double calcDenominator(KernelMatrix<V> kernelMatrix,
                               Integer aKey,
                               Integer bKey,
                               Integer cKey)

calcNumerator

private double calcNumerator(KernelMatrix<V> kernelMatrix,
                             Integer aKey,
                             Integer bKey,
                             Integer cKey)

calcDistsandNN

private PriorityQueue<FCPair<Double,Integer>> calcDistsandNN(Database<V> data,
                                                             KernelMatrix<V> kernelMatrix,
                                                             int sampleSize,
                                                             Integer aKey,
                                                             HashMap<Integer,Double> dists)

calcDistsandRNDSample

private PriorityQueue<FCPair<Double,Integer>> calcDistsandRNDSample(Database<V> data,
                                                                    KernelMatrix<V> kernelMatrix,
                                                                    int sampleSize,
                                                                    Integer aKey,
                                                                    HashMap<Integer,Double> dists)

getExplanations

public void getExplanations(Database<V> data)
Parameters:
data - the database

generateExplanation

private void generateExplanation(Database<V> data,
                                 Integer key,
                                 LinkedList<Integer> expList)

runInTime

protected MultiResult runInTime(Database<V> database)
                         throws IllegalStateException
Description copied from class: AbstractAlgorithm
The run method encapsulated in a measurement of runtime. An extending class need not take care of runtime measurement itself.

Specified by:
runInTime in class AbstractAlgorithm<V extends RealVector<V,?>,MultiResult>
Parameters:
database - the database to run the algorithm on
Returns:
the Result computed by this algorithm
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. if setParameters(List<String>) was never called).
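
The split between a public run method and a protected runInTime method follows the template method pattern: the base class measures the runtime, and subclasses implement only the algorithm. The sketch below shows the pattern in isolation. TimedAlgorithmSketch and getLastRuntimeNanos are hypothetical names, and the actual AbstractAlgorithm may record or report the time differently.

```java
/** Template method pattern for runtime measurement; illustration only. */
abstract class TimedAlgorithmSketch<D, R> {

    /** Runtime of the most recent run in nanoseconds, or -1 if never run. */
    private long lastRuntimeNanos = -1L;

    /** Measures the runtime around the subclass-specific algorithm. */
    final R run(D database) {
        long start = System.nanoTime();
        try {
            return runInTime(database);
        } finally {
            // Recorded even if the algorithm throws.
            lastRuntimeNanos = System.nanoTime() - start;
        }
    }

    /** The actual algorithm; subclasses do not deal with timing. */
    protected abstract R runInTime(D database);

    long getLastRuntimeNanos() {
        return lastRuntimeNanos;
    }
}
```

A subclass only overrides runInTime, and callers always go through run, so the timing logic lives in a single place.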

getDescription

public Description getDescription()
Return a description of the algorithm.

Returns:
a description of the algorithm

getResult

public MultiResult getResult()
Return the results of the last run.

Returns:
the result of the algorithm

setParameters

public List<String> setParameters(List<String> args)
                           throws ParameterException
Calls the super method and sets parameters FAST_FLAG, FAST_SAMPLE_PARAM and KERNEL_FUNCTION_PARAM. The remaining parameters are then passed to the kernelFunction.

Specified by:
setParameters in interface Parameterizable
Overrides:
setParameters in class DistanceBasedAlgorithm<V extends RealVector<V,?>,DoubleDistance,MultiResult>
Parameters:
args - the parameters used to set the attributes
Returns:
a list containing the unused parameters
Throws:
ParameterException - if a parameter setting is invalid
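
The contract "consume the options you know, return the rest" can be sketched without any ELKI classes. The example below is a simplified stand-in, not the implementation of this method: AbodOptionsSketch is a hypothetical class, the option -kernel.degree in the usage note is made up to represent a leftover parameter, and the sample size has no default documented on this page, so -1 marks it as unset. The option keys and the defaults for k and the kernel function are the ones listed under Field Detail. Where the real method throws ParameterException, this sketch throws IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified option handling that returns unused parameters; illustration only. */
final class AbodOptionsSketch {

    int k = 30;                                          // documented default of -abod.k
    boolean fast = false;                                // set by the flag -abod.fast
    int sampleSize = -1;                                 // no documented default; -1 = unset
    String kernelFunction = "PolynomialKernelFunction";  // documented default

    /** Consumes the known options and returns everything that was not recognized. */
    List<String> setParameters(List<String> args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.size(); i++) {
            String arg = args.get(i);
            switch (arg) {
                case "-abod.k":
                    k = Integer.parseInt(value(args, ++i, arg));
                    break;
                case "-abod.fast":
                    fast = true; // a flag takes no value
                    break;
                case "-abod.samplesize":
                    sampleSize = Integer.parseInt(value(args, ++i, arg));
                    break;
                case "-abod.kernelfunction":
                    kernelFunction = value(args, ++i, arg);
                    break;
                default:
                    remaining.add(arg); // left for other components
            }
        }
        return remaining;
    }

    private static String value(List<String> args, int index, String key) {
        if (index >= args.size()) {
            throw new IllegalArgumentException("Missing value for " + key);
        }
        return args.get(index);
    }
}
```

Given the arguments -abod.k 10 -abod.fast -abod.samplesize 500 -kernel.degree 2, this sketch sets k, fast and sampleSize and returns the list [-kernel.degree, 2], which a caller could then hand on to the kernel function.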

Release 0.2.1 (2009-07-13_1605)