Package | Description |
---|---|
de.lmu.ifi.dbs.elki.algorithm |
Algorithms suitable as a task for the
KDDTask main routine. |
de.lmu.ifi.dbs.elki.algorithm.clustering |
Clustering algorithms.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.correlation |
Correlation clustering algorithms
|
de.lmu.ifi.dbs.elki.algorithm.clustering.kmeans |
K-means clustering and variations.
|
de.lmu.ifi.dbs.elki.algorithm.clustering.subspace |
Axis-parallel subspace clustering algorithms.
The clustering algorithms in this package include both projected clustering algorithms and
subspace clustering algorithms, according to the classical but somewhat obsolete classification schema
of clustering algorithms for axis-parallel subspaces.
|
de.lmu.ifi.dbs.elki.algorithm.outlier |
Outlier detection algorithms
|
de.lmu.ifi.dbs.elki.algorithm.outlier.meta |
Meta outlier detection algorithms: external scores, score rescaling.
|
de.lmu.ifi.dbs.elki.algorithm.outlier.spatial |
Spatial outlier detection algorithms
|
de.lmu.ifi.dbs.elki.application.greedyensemble |
Greedy ensembles for outlier detection.
|
de.lmu.ifi.dbs.elki.application.internal |
Internal utilities for development.
|
de.lmu.ifi.dbs.elki.application.visualization |
Visualization applications in ELKI.
|
de.lmu.ifi.dbs.elki.distance.distancefunction |
Distance functions for use within ELKI.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.colorhistogram |
Distance functions for color histograms.
|
de.lmu.ifi.dbs.elki.distance.distancefunction.timeseries |
Distance functions designed for time series.
|
de.lmu.ifi.dbs.elki.evaluation.clustering |
Evaluation of clustering results.
|
de.lmu.ifi.dbs.elki.evaluation.clustering.pairsegments |
Pair-segment analysis of multiple clusterings.
|
de.lmu.ifi.dbs.elki.index.tree.metrical.mtreevariants.mtree | |
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.query |
Queries on the R-Tree family of indexes: kNN and range queries.
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar | |
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.bulk |
Packages for bulk-loading R*-Trees.
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.insert |
Insertion strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.overflow |
Overflow treatment strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.reinsert |
Reinsertion strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.strategies.split |
Splitting strategies for R-Trees
|
de.lmu.ifi.dbs.elki.index.vafile |
Vector Approximation File
|
de.lmu.ifi.dbs.elki.math |
Mathematical operations and utilities used throughout the framework.
|
de.lmu.ifi.dbs.elki.math.geometry |
Algorithms from computational geometry.
|
de.lmu.ifi.dbs.elki.math.linearalgebra.pca |
Principal Component Analysis (PCA) and Eigenvector processing.
|
de.lmu.ifi.dbs.elki.math.spacefillingcurves |
Space filling curves.
|
de.lmu.ifi.dbs.elki.result |
Result types, representation and handling
|
de.lmu.ifi.dbs.elki.utilities.documentation |
Documentation utilities: Annotations for Title, Description, Reference
|
de.lmu.ifi.dbs.elki.utilities.scaling.outlier |
Scaling of outlier scores that requires a statistical analysis of the occurring values.
|
de.lmu.ifi.dbs.elki.visualization.visualizers.pairsegments |
Visualizers for inspecting cluster differences using pair counting segments.
|
de.lmu.ifi.dbs.elki.visualization.visualizers.scatterplot.density |
Visualizers for data set density in a scatterplot projection.
|
de.lmu.ifi.dbs.elki.visualization.visualizers.scatterplot.outlier |
Visualizers for outlier scores based on 2D projections.
|
Modifier and Type | Class and Description |
---|---|
class |
APRIORI
Provides the APRIORI algorithm for Mining Association Rules.
|
class |
DependencyDerivator<V extends NumberVector<V,?>,D extends Distance<D>>
Dependency derivator computes quantitatively linear dependencies among
attributes of a given dataset based on a linear correlation PCA.
|
Modifier and Type | Class and Description |
---|---|
class |
DBSCAN<O,D extends Distance<D>>
DBSCAN provides the DBSCAN algorithm, an algorithm to find density-connected
sets in a database.
|
class |
DeLiClu<NV extends NumberVector<NV,?>,D extends Distance<D>>
DeLiClu provides the DeLiClu algorithm, a hierarchical algorithm to find
density-connected sets in a database.
|
class |
EM<V extends NumberVector<V,?>>
Provides the EM algorithm (clustering by expectation maximization).
|
class |
OPTICS<O,D extends Distance<D>>
OPTICS provides the OPTICS algorithm.
|
class |
SLINK<O,D extends Distance<D>>
Efficient implementation of the Single-Link Algorithm SLINK of R.
|
class |
SNNClustering<O>
Shared nearest neighbor clustering.
|
Modifier and Type | Class and Description |
---|---|
class |
CASH
Provides the CASH algorithm, a subspace clustering algorithm based on the
Hough transform.
|
class |
COPAC<V extends NumberVector<V,?>,D extends Distance<D>>
Provides the COPAC algorithm, an algorithm to partition a database according
to the correlation dimension of its objects and to then perform an arbitrary
clustering algorithm over the partitions.
|
class |
ERiC<V extends NumberVector<V,?>>
Performs correlation clustering on the data partitioned according to local
correlation dimensionality and builds a hierarchy of correlation clusters
that allows multiple inheritance from the clustering result.
|
class |
FourC<V extends NumberVector<V,?>>
4C identifies local subgroups of data objects sharing a uniform correlation.
|
class |
HiCO<V extends NumberVector<V,?>>
Implementation of the HiCO algorithm, an algorithm for detecting hierarchies
of correlation clusters.
|
class |
LMCLUS
Linear manifold clustering in high dimensional spaces by stochastic search.
|
class |
ORCLUS<V extends NumberVector<V,?>>
ORCLUS provides the ORCLUS algorithm, an algorithm to find clusters in high
dimensional spaces.
|
Modifier and Type | Class and Description |
---|---|
class |
KMeansLloyd<V extends NumberVector<V,?>,D extends Distance<D>>
Provides the k-means algorithm, using Lloyd-style bulk iterations.
|
class |
KMeansMacQueen<V extends NumberVector<V,?>,D extends Distance<D>>
Provides the k-means algorithm, using MacQueen style incremental updates.
|
class |
KMeansPlusPlusInitialMeans<V extends NumberVector<V,?>,D extends NumberDistance<D,?>>
K-Means++ initialization for k-means.
|
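As a rough illustration of what "Lloyd-style bulk iterations" means, the sketch below alternates an assignment pass over all points with a bulk recomputation of all means. It is a simplified stand-in on one-dimensional data, not ELKI's `KMeansLloyd`: the class name `LloydSketch`, the method signature, and the use of absolute difference as the distance are assumptions made for this example.

```java
import java.util.Arrays;

/** Illustrative sketch only; this is not ELKI's KMeansLloyd implementation. */
final class LloydSketch {
  /** Runs Lloyd-style bulk iterations on 1D data and returns the final means. */
  static double[] lloyd(double[] data, double[] initialMeans, int maxIterations) {
    double[] means = initialMeans.clone();
    int[] assignment = new int[data.length];
    Arrays.fill(assignment, -1); // -1 marks "not yet assigned"
    for (int iter = 0; iter < maxIterations; iter++) {
      boolean changed = false;
      // Assignment step: attach every point to its nearest current mean.
      for (int i = 0; i < data.length; i++) {
        int best = 0;
        for (int c = 1; c < means.length; c++) {
          if (Math.abs(data[i] - means[c]) < Math.abs(data[i] - means[best])) {
            best = c;
          }
        }
        if (assignment[i] != best) {
          assignment[i] = best;
          changed = true;
        }
      }
      if (!changed) {
        break; // no point changed its cluster: converged
      }
      // Update step: recompute all means in one bulk pass.
      double[] sum = new double[means.length];
      int[] count = new int[means.length];
      for (int i = 0; i < data.length; i++) {
        sum[assignment[i]] += data[i];
        count[assignment[i]]++;
      }
      for (int c = 0; c < means.length; c++) {
        if (count[c] > 0) {
          means[c] = sum[c] / count[c]; // empty clusters keep their old mean
        }
      }
    }
    return means;
  }
}
```

A MacQueen-style variant would instead adjust the affected means immediately after each individual reassignment rather than waiting for the end of the pass.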
Modifier and Type | Class and Description |
---|---|
class |
CLIQUE<V extends NumberVector<V,?>>
Implementation of the CLIQUE algorithm, a grid-based algorithm to identify
dense clusters in subspaces of maximum dimensionality.
|
class |
DiSH<V extends NumberVector<V,?>>
Algorithm for detecting subspace hierarchies.
|
class |
HiSC<V extends NumberVector<V,?>>
Implementation of the HiSC algorithm, an algorithm for detecting hierarchies
of subspace clusters.
|
class |
PreDeCon<V extends NumberVector<V,?>>
PreDeCon computes clusters of subspace preference weighted connected points.
|
class |
PROCLUS<V extends NumberVector<V,?>>
Provides the PROCLUS algorithm, an algorithm to find subspace clusters in
high dimensional spaces.
|
class |
SUBCLU<V extends NumberVector<V,?>>
Implementation of the SUBCLU algorithm, an algorithm to detect arbitrarily
shaped and positioned clusters in subspaces.
|
Modifier and Type | Class and Description |
---|---|
class |
ABOD<V extends NumberVector<V,?>>
Angle-Based Outlier Detection
Outlier detection using variance analysis on angles, especially for high
dimensional data sets.
|
class |
AbstractAggarwalYuOutlier<V extends NumberVector<?,?>>
Abstract base class for the sparse-grid-cell based outlier detection of
Aggarwal and Yu.
|
class |
AggarwalYuEvolutionary<V extends NumberVector<?,?>>
EAFOD provides the evolutionary outlier detection algorithm, an algorithm to
detect outliers for high dimensional data.
|
class |
AggarwalYuNaive<V extends NumberVector<?,?>>
BruteForce provides a naive brute force algorithm in which all k-subsets of
dimensions are examined and the sparsity coefficient is calculated to find
outliers.
|
class |
DBOutlierDetection<O,D extends Distance<D>>
Simple distance-based outlier detection algorithm.
|
class |
DBOutlierScore<O,D extends Distance<D>>
Compute percentage of neighbors in the given neighborhood with size d.
|
class |
GaussianUniformMixture<V extends NumberVector<V,?>>
Outlier detection algorithm using a mixture model approach.
|
class |
INFLO<O,D extends NumberDistance<D,?>>
INFLO provides the Mining Algorithms (Two-way Search Method) for Influence
Outliers using Symmetric Relationship
Reference:
Jin, W., Tung, A., Han, J., and Wang, W. 2006 Ranking outliers using symmetric neighborhood relationship In Proc. |
class |
KNNOutlier<O,D extends NumberDistance<D,?>>
Outlier detection based on the distance of an object to its k-th nearest
neighbor.
|
class |
KNNWeightOutlier<O,D extends NumberDistance<D,?>>
Outlier Detection based on the accumulated distances of a point to its k
nearest neighbors.
|
class |
LDOF<O,D extends NumberDistance<D,?>>
Computes the LDOF (Local Distance-Based Outlier Factor) for all objects of a
Database.
|
class |
LOCI<O,D extends NumberDistance<D,?>>
Fast Outlier Detection Using the "Local Correlation Integral".
|
class |
LOF<O,D extends NumberDistance<D,?>>
Algorithm to compute density-based local outlier factors in a database based
on a specified parameter LOF.K_ID (-lof.k). |
class |
LoOP<O,D extends NumberDistance<D,?>>
LoOP: Local Outlier Probabilities
Distance/density based algorithm similar to LOF to detect outliers, but with
statistical methods to achieve better result stability.
|
class |
OPTICSOF<O,D extends NumberDistance<D,?>>
OPTICSOF provides the Optics-of algorithm, an algorithm to find Local
Outliers in a database.
|
class |
OUTRES<V extends NumberVector<V,?>>
Adaptive outlierness for subspace outlier ranking (OUTRES).
|
class |
ReferenceBasedOutlierDetection<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>
Provides the Reference-Based Outlier Detection algorithm, an algorithm that
computes kNN distances approximately, using reference points.
|
class |
SOD<V extends NumberVector<V,?>,D extends NumberDistance<D,?>> |
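To make the distance-to-neighbor idea behind entries such as KNNOutlier concrete, the following sketch scores each point by the Euclidean distance to its k-th nearest other point. It is a brute-force illustration under stated assumptions, not the ELKI implementation: the class and method names are hypothetical, the query point is excluded from its own neighbor list, and no index structure is used.

```java
import java.util.Arrays;

/** Illustrative brute-force sketch; names and conventions are assumptions. */
final class KnnDistanceSketch {
  /** Returns, for every point, the distance to its k-th nearest other point. */
  static double[] kthNeighborDistances(double[][] points, int k) {
    int n = points.length;
    if (k < 1 || k > n - 1) {
      throw new IllegalArgumentException("k must be between 1 and n - 1");
    }
    double[] scores = new double[n];
    for (int i = 0; i < n; i++) {
      // Collect distances from point i to every other point.
      double[] dists = new double[n - 1];
      int pos = 0;
      for (int j = 0; j < n; j++) {
        if (j != i) {
          dists[pos++] = euclidean(points[i], points[j]);
        }
      }
      Arrays.sort(dists);
      scores[i] = dists[k - 1]; // larger score = more isolated point
    }
    return scores;
  }

  /** Plain Euclidean distance between two vectors of equal length. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }
}
```

Summing the first k sorted distances instead of taking only the k-th would give an accumulated score in the spirit of the KNNWeightOutlier description.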
Modifier and Type | Class and Description |
---|---|
class |
FeatureBagging
A simple ensemble method called "Feature bagging" for outlier detection.
|
Modifier and Type | Class and Description |
---|---|
class |
CTLuGLSBackwardSearchAlgorithm<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>
GLS-Backward Search is a statistical approach to detecting spatial outliers.
|
class |
CTLuMeanMultipleAttributes<N,O extends NumberVector<?,?>>
Mean Approach is used to discover spatial outliers with multiple attributes.
|
class |
CTLuMedianAlgorithm<N>
Median Algorithm of C.
|
class |
CTLuMedianMultipleAttributes<N,O extends NumberVector<?,?>>
Median Approach is used to discover spatial outliers with multiple
attributes.
|
class |
CTLuMoranScatterplotOutlier<N>
Moran scatterplot outliers, based on the standardized deviation from the
local and global means.
|
class |
CTLuRandomWalkEC<N,D extends NumberDistance<D,?>>
Spatial outlier detection based on random walks.
|
class |
CTLuScatterplotOutlier<N>
Scatterplot-outlier is a spatial outlier detection method that performs a
linear regression of object attributes and their neighbors' average value.
|
class |
CTLuZTestOutlier<N>
Detect outliers by comparing their attribute value to the mean and standard
deviation of their neighborhood.
|
class |
SLOM<N,O,D extends NumberDistance<D,?>>
SLOM: a new measure for local spatial outliers
Reference:
Sanjay Chawla and Pei Sun: SLOM: a new measure for local spatial outliers. In: Knowledge and Information Systems, 2005.
This implementation works around some corner cases in SLOM, in particular when an object has no neighbors or only a single neighbor, which would otherwise result in divisions by zero (although the results will still not be very useful in those cases). |
class |
SOF<N,O,D extends NumberDistance<D,?>>
The Spatial Outlier Factor (SOF) is a spatial LOF variation. |
class |
TrimmedMeanApproach<N>
A Trimmed Mean Approach to Finding Spatial Outliers.
|
Modifier and Type | Class and Description |
---|---|
class |
ComputeKNNOutlierScores<O,D extends NumberDistance<D,?>>
Application that runs a series of kNN-based algorithms on a data set, for
building an ensemble in a second step.
|
class |
GreedyEnsembleExperiment
Class to load an outlier detection summary file, as produced by
ComputeKNNOutlierScores, and compute a naive ensemble for it. |
class |
VisualizePairwiseGainMatrix
Class to load an outlier detection summary file, as produced by
ComputeKNNOutlierScores, and compute a matrix with the pairwise
gains. |
Modifier and Type | Method and Description |
---|---|
private static List<Pair<Reference,List<Class<?>>>> |
DocumentReferences.sortedReferences() |
Modifier and Type | Method and Description |
---|---|
private static Document |
DocumentReferences.documentReferences(List<Pair<Reference,List<Class<?>>>> refs) |
private static void |
DocumentReferences.documentReferencesWiki(List<Pair<Reference,List<Class<?>>>> refs,
PrintStream refstreamW) |
private static void |
DocumentReferences.inspectClass(Class<?> cls,
List<Pair<Reference,List<Class<?>>>> refs,
Map<Reference,List<Class<?>>> map) |
Modifier and Type | Class and Description |
---|---|
class |
KNNExplorer<O extends NumberVector<?,?>,D extends NumberDistance<D,?>>
User application to explore the k Nearest Neighbors for a given data set and
distance function.
|
Modifier and Type | Class and Description |
---|---|
class |
CanberraDistanceFunction
Canberra distance function, a variation of Manhattan distance.
|
class |
JeffreyDivergenceDistanceFunction
Provides the Jeffrey Divergence Distance for FeatureVectors.
|
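The Canberra distance can be read as a Manhattan distance in which each coordinate difference is divided by the sum of the absolute coordinate values. The sketch below shows that textbook form. The class name and the convention of skipping coordinates where both values are zero are assumptions for this example, and the code is not taken from `CanberraDistanceFunction`.

```java
/** Illustrative sketch of the textbook Canberra distance; not ELKI code. */
final class CanberraSketch {
  /** Sums |x_i - y_i| / (|x_i| + |y_i|) over all coordinates. */
  static double canberra(double[] x, double[] y) {
    if (x.length != y.length) {
      throw new IllegalArgumentException("vectors must have equal length");
    }
    double sum = 0.0;
    for (int i = 0; i < x.length; i++) {
      double denominator = Math.abs(x[i]) + Math.abs(y[i]);
      // Assumption: a 0/0 term (both coordinates zero) contributes nothing.
      if (denominator > 0.0) {
        sum += Math.abs(x[i] - y[i]) / denominator;
      }
    }
    return sum;
  }
}
```

Because every term is normalized, each coordinate contributes at most 1, so differences near the origin weigh far more than the same absolute difference between large values.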
Modifier and Type | Class and Description |
---|---|
class |
HistogramIntersectionDistanceFunction
Intersection distance for color histograms.
|
class |
HSBHistogramQuadraticDistanceFunction
Distance function for HSB color histograms based on a quadratic form and
color similarity.
|
class |
RGBHistogramQuadraticDistanceFunction
Distance function for RGB color histograms based on a quadratic form and
color similarity.
|
Modifier and Type | Class and Description |
---|---|
class |
DTWDistanceFunction
Provides the Dynamic Time Warping distance for FeatureVectors.
|
class |
EDRDistanceFunction
Provides the Edit Distance on Real Sequence distance for FeatureVectors.
|
class |
ERPDistanceFunction
Provides the Edit Distance With Real Penalty distance for FeatureVectors.
|
class |
LCSSDistanceFunction
Provides the Longest Common Subsequence distance for FeatureVectors.
|
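For readers unfamiliar with the time series distances listed above, here is a minimal dynamic-programming sketch of Dynamic Time Warping. It assumes the plain textbook formulation with absolute difference as the local cost and no warping-window constraint, so it may differ from what `DTWDistanceFunction` actually computes. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Textbook DTW sketch; local cost and lack of a band are assumptions. */
final class DtwSketch {
  /** Returns the minimal accumulated cost of aligning series a with series b. */
  static double dtw(double[] a, double[] b) {
    int n = a.length;
    int m = b.length;
    // cost[i][j] = best cost of aligning the first i values of a with the first j of b.
    double[][] cost = new double[n + 1][m + 1];
    for (double[] row : cost) {
      Arrays.fill(row, Double.POSITIVE_INFINITY);
    }
    cost[0][0] = 0.0;
    for (int i = 1; i <= n; i++) {
      for (int j = 1; j <= m; j++) {
        double local = Math.abs(a[i - 1] - b[j - 1]);
        // Extend the cheapest of: diagonal match, or repeating an element of either series.
        double best = Math.min(cost[i - 1][j - 1], Math.min(cost[i - 1][j], cost[i][j - 1]));
        cost[i][j] = local + best;
      }
    }
    return cost[n][m];
  }
}
```

The table uses O(n·m) memory. Keeping only the previous row would suffice if just the final distance is needed.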
Modifier and Type | Class and Description |
---|---|
class |
BCubed
BCubed measures.
|
class |
EditDistance
Edit distance measures
Pantel, P. and Lin, D.
|
class |
Entropy
Entropy based measures
References:
Meilă, M.
|
class |
SetMatchingPurity
Set matching purity measures
References:
Zhao, Y. and Karypis, G.
|
Modifier and Type | Method and Description |
---|---|
double |
SetMatchingPurity.f1Measure()
Get the set matching F1-Measure
Steinbach, M. and Karypis, G. and Kumar, V. and others
A comparison of document clustering techniques KDD workshop on text mining, 2000 |
double |
PairCounting.fowlkesMallows()
Computes the pair-counting Fowlkes-Mallows index (flat only, non-hierarchical!)
|
double |
Entropy.normalizedVariationOfInformation()
Get the normalized variation of information (normalized, 0 = equal)
NVI = 1 - NMI_Joint
Vinh, N.X. and Epps, J. and Bailey, J.
|
double |
SetMatchingPurity.purity()
Get the set matching purity (first:second clustering) (normalized, 1 =
equal)
|
double |
PairCounting.randIndex()
Computes the Rand index (RI).
|
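The pair-counting measures named above can be illustrated with a small sketch that counts, for every unordered pair of objects, whether two flat clusterings agree on putting the pair together. The label-array input, the class name, and the choice to count each pair of distinct objects exactly once are assumptions for this example. ELKI's `PairCounting` class may count pairs differently.

```java
/** Illustrative pair-counting sketch for two flat clusterings; not ELKI code. */
final class PairCountingSketch {
  /** Returns {inBoth, inFirstOnly, inSecondOnly, inNeither} over all pairs i < j. */
  static long[] pairCounts(int[] first, int[] second) {
    long inBoth = 0, inFirstOnly = 0, inSecondOnly = 0, inNeither = 0;
    for (int i = 0; i < first.length; i++) {
      for (int j = i + 1; j < first.length; j++) {
        boolean sameInFirst = first[i] == first[j];
        boolean sameInSecond = second[i] == second[j];
        if (sameInFirst && sameInSecond) {
          inBoth++;
        } else if (sameInFirst) {
          inFirstOnly++;
        } else if (sameInSecond) {
          inSecondOnly++;
        } else {
          inNeither++;
        }
      }
    }
    return new long[] {inBoth, inFirstOnly, inSecondOnly, inNeither};
  }

  /** Rand index: fraction of pairs on which both clusterings agree. */
  static double randIndex(long[] c) {
    return (c[0] + c[3]) / (double) (c[0] + c[1] + c[2] + c[3]);
  }

  /** Fowlkes-Mallows: geometric mean of pair precision and pair recall (NaN if undefined). */
  static double fowlkesMallows(long[] c) {
    return c[0] / Math.sqrt((double) (c[0] + c[1]) * (double) (c[0] + c[2]));
  }
}
```

For example, the labelings `{0, 0, 1, 1}` and `{0, 0, 1, 2}` agree on five of the six pairs, which gives a Rand index of 5/6.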
Modifier and Type | Class and Description |
---|---|
class |
Segments
Creates segments of two or more clusterings.
|
Modifier and Type | Class and Description |
---|---|
class |
MTree<O,D extends Distance<D>>
MTree is a metrical index structure based on the concepts of the M-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
DoubleDistanceRStarTreeKNNQuery<O extends SpatialComparable>
Instance of a KNN query for a particular spatial index.
|
class |
DoubleDistanceRStarTreeRangeQuery<O extends SpatialComparable>
Instance of a range query for a particular spatial index.
|
class |
GenericRStarTreeKNNQuery<O extends SpatialComparable,D extends Distance<D>>
Instance of a KNN query for a particular spatial index.
|
class |
GenericRStarTreeRangeQuery<O extends SpatialComparable,D extends Distance<D>>
Instance of a range query for a particular spatial index.
|
Modifier and Type | Class and Description |
---|---|
class |
RStarTree
RStarTree is a spatial index structure based on the concepts of the R*-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
OneDimSortBulkSplit
Simple bulk loading strategy by sorting the data along the first dimension.
|
class |
SortTileRecursiveBulkSplit
Sort-Tile-Recursive aims at tiling the data space with a grid-like structure
for partitioning the dataset into the required number of buckets.
|
class |
SpatialSortBulkSplit
Bulk loading by spatially sorting the objects, then partitioning the sorted
list appropriately.
|
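The simplest of the bulk loading strategies above, sorting along the first dimension and cutting the sorted list into pages, can be sketched as follows. This is a simplified illustration with a fixed page capacity. The class name, the `capacity` parameter, and the lack of any balancing of the last page are assumptions, so `OneDimSortBulkSplit` itself may partition differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch of a one-dimensional sort-and-partition bulk split. */
final class OneDimSortSketch {
  /** Sorts points by their first coordinate and cuts them into pages of at most capacity entries. */
  static List<List<double[]>> bulkSplit(List<double[]> points, int capacity) {
    if (capacity < 1) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    // Work on a copy so the caller's list is left untouched.
    List<double[]> sorted = new ArrayList<>(points);
    sorted.sort(Comparator.comparingDouble(p -> p[0]));
    List<List<double[]>> pages = new ArrayList<>();
    for (int start = 0; start < sorted.size(); start += capacity) {
      int end = Math.min(start + capacity, sorted.size());
      pages.add(new ArrayList<>(sorted.subList(start, end)));
    }
    return pages;
  }
}
```

Sort-Tile-Recursive applies the same idea recursively across dimensions, so that the resulting pages tile the data space like a grid instead of forming slabs along a single axis.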
Modifier and Type | Class and Description |
---|---|
class |
ApproximativeLeastOverlapInsertionStrategy
The choose subtree method proposed by the R*-Tree with slightly better
performance for large leaf sizes (linear approximation).
|
class |
CombinedInsertionStrategy
Use two different insertion strategies for directory and leaf nodes.
|
class |
LeastEnlargementInsertionStrategy
The default R-Tree insertion strategy: find rectangle with least volume
enlargement.
|
class |
LeastEnlargementWithAreaInsertionStrategy
A slight modification of the default R-Tree insertion strategy: find
rectangle with least volume enlargement, but choose least area on ties.
|
class |
LeastOverlapInsertionStrategy
The choose subtree method proposed by the R*-Tree for leaf nodes.
|
Modifier and Type | Class and Description |
---|---|
class |
LimitedReinsertOverflowTreatment
Limited reinsertions, as proposed by the R*-Tree: For each real insert, allow
reinsertions to happen only once per level.
|
Modifier and Type | Class and Description |
---|---|
class |
CloseReinsert
Reinsert objects on page overflow, starting with close objects first (even
when they will likely be inserted into the same page again!)
|
class |
FarReinsert
Reinsert objects on page overflow, starting with farther objects first (even
when they will likely be inserted into the same page again!)
|
Modifier and Type | Class and Description |
---|---|
class |
AngTanLinearSplit
Linear-time complexity split proposed by Ang and Tan.
|
class |
GreeneSplit
Quadratic-time complexity split as used by Diane Greene for the R-Tree.
|
class |
RTreeLinearSplit
Linear-time complexity greedy split as used by the original R-Tree.
|
class |
RTreeQuadraticSplit
Quadratic-time complexity greedy split as used by the original R-Tree.
|
class |
TopologicalSplitter
Encapsulates the required parameters for a topological split of an R*-Tree.
|
Modifier and Type | Class and Description |
---|---|
class |
VAFile<V extends NumberVector<?,?>>
Vector-approximation file (VAFile)
Reference:
Weber, R. and Blott, S.
|
Modifier and Type | Class and Description |
---|---|
class |
Mean
Compute the mean using a numerically stable online algorithm.
|
class |
MeanVariance
Do some simple statistics (mean, variance) using a numerically stable online
algorithm.
|
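A common way to compute mean and variance in a numerically stable, online fashion is Welford's update rule, sketched below. Whether the `Mean` and `MeanVariance` classes use exactly this recurrence is an assumption. The class and method names in the sketch are hypothetical.

```java
/** Illustrative Welford-style accumulator; not the ELKI MeanVariance class. */
final class OnlineMeanVariance {
  private long count;
  private double mean;
  private double sumSquaredDeviations; // running sum of (x - mean)^2

  /** Adds one value, updating mean and squared deviations in O(1). */
  void put(double x) {
    count++;
    double delta = x - mean;          // deviation from the old mean
    mean += delta / count;            // shift the mean towards x
    sumSquaredDeviations += delta * (x - mean); // uses old and new mean
  }

  double mean() {
    return mean;
  }

  /** Variance dividing by n (population variance). */
  double populationVariance() {
    return sumSquaredDeviations / count;
  }

  /** Variance dividing by n - 1; NaN when fewer than two values were added. */
  double sampleVariance() {
    return sumSquaredDeviations / (count - 1);
  }
}
```

Unlike the "sum of squares minus square of sum" shortcut, this update never subtracts two large, nearly equal numbers, which is what makes it robust for data with a large mean and small spread.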
Modifier and Type | Class and Description |
---|---|
class |
GrahamScanConvexHull2D
Classes to compute the convex hull of a set of points in 2D, using the
classic Graham scan.
|
class |
SweepHullDelaunay2D
Compute the Convex Hull and/or Delaunay Triangulation, using the sweep-hull
approach of David Sinclair.
|
Modifier and Type | Class and Description |
---|---|
class |
PCAFilteredAutotuningRunner<V extends NumberVector<? extends V,?>>
Performs a self-tuning local PCA based on the covariance matrices of given
objects.
|
class |
WeightedCovarianceMatrixBuilder<V extends NumberVector<? extends V,?>>
CovarianceMatrixBuilder with weights. |
Modifier and Type | Class and Description |
---|---|
class |
BinarySplitSpatialSorter
Spatially sort the data set by repetitive binary splitting, circulating
through the dimensions.
|
class |
HilbertSpatialSorter
Sort objects along the Hilbert space-filling curve by mapping them to their
Hilbert numbers and sorting by these numbers.
|
class |
PeanoSpatialSorter
Bulk-load an R-tree index by presorting the objects with their position on
the Peano curve.
|
Modifier and Type | Class and Description |
---|---|
class |
KMLOutputHandler
Class to handle KML output.
|
Modifier and Type | Method and Description |
---|---|
static Reference |
DocumentationUtil.getReference(Class<?> c)
Get the reference annotation of a class, or null. |
Modifier and Type | Class and Description |
---|---|
class |
HeDESNormalizationOutlierScaling
Normalization used by HeDES
|
class |
MinusLogGammaScaling
Scaling that can map arbitrary values to a probability in the range of [0:1],
by assuming a Gamma distribution on the data and evaluating the Gamma CDF.
|
class |
MinusLogStandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
class |
MixtureModelOutlierScalingFunction
Tries to fit a mixture model (exponential for inliers and Gaussian for
outliers) to the outlier score distribution.
|
class |
MultiplicativeInverseScaling
Scaling function to invert values, essentially by computing 1/x, but in a variation
that maps the values to the [0:1] interval and avoids division by 0.
|
class |
OutlierGammaScaling
Scaling that can map arbitrary values to a probability in the range of [0:1]
by assuming a Gamma distribution on the values.
|
class |
OutlierMinusLogScaling
Scaling function to invert values by computing -1 * Math.log(x).
Useful, for example, for scaling ABOD, but see
MinusLogStandardDeviationScaling and MinusLogGammaScaling for
more advanced scalings for this algorithm. |
class |
SigmoidOutlierScalingFunction
Tries to fit a sigmoid to the outlier scores and use it to convert the values
to probability estimates in the range of 0.0 to 1.0
|
class |
SqrtStandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
class |
StandardDeviationScaling
Scaling that can map arbitrary values to a probability in the range of [0:1].
|
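The simplest scaling in the table above, inverting scores with `-1 * Math.log(x)`, can be sketched in a few lines. The class and method names are hypothetical, and the sketch assumes strictly positive input scores. It only illustrates the formula quoted in the OutlierMinusLogScaling description and does not reproduce that class.

```java
/** Illustrative sketch of minus-log score inversion; names are assumptions. */
final class MinusLogScalingSketch {
  /** Maps a positive score x to -log(x), so that small inputs become large outputs. */
  static double scale(double score) {
    if (!(score > 0.0)) {
      throw new IllegalArgumentException("score must be strictly positive");
    }
    return -1 * Math.log(score);
  }

  /** Applies the scaling to a whole array of scores. */
  static double[] scaleAll(double[] scores) {
    double[] scaled = new double[scores.length];
    for (int i = 0; i < scores.length; i++) {
      scaled[i] = scale(scores[i]);
    }
    return scaled;
  }
}
```

The transformation reverses the ordering: for a method where low raw scores indicate outliers, the scaled values are high for outliers. The result is not bounded to [0:1], which is one reason the more advanced scalings listed in the table exist.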
Modifier and Type | Class and Description |
---|---|
class |
CircleSegmentsVisualizer
Visualizer to draw circle segments of clusterings and enable interactive
selection of segments.
|
Modifier and Type | Method and Description |
---|---|
private double[] |
DensityEstimationOverlay.initializeBandwidth(double[][] data) |
Modifier and Type | Class and Description |
---|---|
class |
BubbleVisualization
Generates an SVG element containing bubbles.
|