de.lmu.ifi.dbs.elki.evaluation.paircounting
Class PairCountingFMeasure

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.evaluation.paircounting.PairCountingFMeasure

public class PairCountingFMeasure
extends Object

Compare two clustering results using a pair-counting F-Measure. A pair is any two objects that belong to the same cluster. Two clusterings are compared by comparing their pairs: if two clusterings agree completely, they also agree on every pair, even when the clusters and points are ordered differently. An empty clustering has no pairs, while the trivial all-in-one clustering has on the order of n^2 pairs; the former contains no wrong pairs and the latter misses none. Therefore neither recall nor precision is useful on its own, but their combination, the F-Measure, is.
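
The pair-counting idea can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class name PairCountingSketch, the use of plain int label arrays instead of Clustering objects, and the choice to count unordered pairs of distinct objects are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of ELKI.
class PairCountingSketch {
  /** Collect all unordered pairs (i < j) of object indices that share a cluster label. */
  static Set<Long> pairs(int[] labels) {
    Set<Long> result = new HashSet<>();
    for (int i = 0; i < labels.length; i++) {
      for (int j = i + 1; j < labels.length; j++) {
        if (labels[i] == labels[j]) {
          // Encode the pair (i, j) in a single long: i in the high bits, j in the low bits.
          result.add(((long) i << 32) | j);
        }
      }
    }
    return result;
  }

  /** Returns {inBoth, onlyInFirst, onlyInSecond}. */
  static int[] countPairs(int[] first, int[] second) {
    Set<Long> a = pairs(first);
    Set<Long> b = pairs(second);
    int inBoth = 0;
    for (Long p : a) {
      if (b.contains(p)) {
        inBoth++;
      }
    }
    return new int[] { inBoth, a.size() - inBoth, b.size() - inBoth };
  }

  public static void main(String[] args) {
    int[] c1 = { 0, 0, 0, 1, 1 }; // clusters {0,1,2} and {3,4}
    int[] c2 = { 5, 5, 7, 7, 7 }; // clusters {0,1} and {2,3,4}; label values are arbitrary
    int[] counts = countPairs(c1, c2);
    // prints "inBoth=2 onlyFirst=2 onlySecond=2"
    System.out.println("inBoth=" + counts[0] + " onlyFirst=" + counts[1] + " onlySecond=" + counts[2]);
  }
}
```

Because only co-membership is compared, renaming the cluster labels or reordering the clusters does not change the counts.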


Constructor Summary
PairCountingFMeasure()
           
 
Method Summary
static
<R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model>
double
compareClusterings(R result1, S result2)
          Compare two clustering results.
static
<R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model>
double
compareClusterings(R result1, S result2, boolean noiseSpecial, boolean hierarchicalSpecial)
          Compare two clustering results.
static
<R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model>
double
compareClusterings(R result1, S result2, double beta)
          Compare two clustering results.
static
<R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model>
double
compareClusterings(R result1, S result2, double beta, boolean noiseSpecial, boolean hierarchicalSpecial)
          Compare two clustering results.
static Triple<Integer,Integer,Integer> countPairs(PairSortedGeneratorInterface first, PairSortedGeneratorInterface second)
          Compare two sets of generated pairs.
static
<R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model>
Triple<Integer,Integer,Integer>
countPairs(R result1, S result2)
          Compare two sets of generated pairs.
static double fMeasure(int inBoth, int inFirst, int inSecond, double beta)
          Computes the F-measure of the given parameters.
static
<R extends Clustering<M>,M extends Model>
PairSortedGeneratorInterface
getPairGenerator(R clusters, boolean noiseSpecial, boolean hierarchicalSpecial)
          Get a pair generator for the given Clustering.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PairCountingFMeasure

public PairCountingFMeasure()
Method Detail

getPairGenerator

public static <R extends Clustering<M>,M extends Model> PairSortedGeneratorInterface getPairGenerator(R clusters,
                                                                                                      boolean noiseSpecial,
                                                                                                      boolean hierarchicalSpecial)
Get a pair generator for the given Clustering.

Type Parameters:
R - Clustering result class
M - Model type
Parameters:
clusters - Clustering result
noiseSpecial - Special handling for "noise clusters"
hierarchicalSpecial - Special handling for hierarchical clusters
Returns:
Sorted pair generator

compareClusterings

public static <R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model> double compareClusterings(R result1,
                                                                                                                          S result2,
                                                                                                                          double beta,
                                                                                                                          boolean noiseSpecial,
                                                                                                                          boolean hierarchicalSpecial)
Compare two clustering results.

Type Parameters:
R - Result type
M - Model type
S - Result type
N - Model type
Parameters:
result1 - first result
result2 - second result
beta - Beta value for the F-Measure
noiseSpecial - Noise receives special treatment
hierarchicalSpecial - Special handling for hierarchical clusters
Returns:
Pair counting F-Measure result.

compareClusterings

public static <R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model> double compareClusterings(R result1,
                                                                                                                          S result2,
                                                                                                                          double beta)
Compare two clustering results.

Type Parameters:
R - Result type
M - Model type
S - Result type
N - Model type
Parameters:
result1 - first result
result2 - second result
beta - Beta value for the F-Measure
Returns:
Pair counting F-Measure result.

compareClusterings

public static <R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model> double compareClusterings(R result1,
                                                                                                                          S result2,
                                                                                                                          boolean noiseSpecial,
                                                                                                                          boolean hierarchicalSpecial)
Compare two clustering results.

Type Parameters:
R - Result type
M - Model type
S - Result type
N - Model type
Parameters:
result1 - first result
result2 - second result
noiseSpecial - Noise receives special treatment
hierarchicalSpecial - Special handling for hierarchical clusters
Returns:
Pair counting F-1-Measure result.

compareClusterings

public static <R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model> double compareClusterings(R result1,
                                                                                                                          S result2)
Compare two clustering results.

Type Parameters:
R - Result type
M - Model type
S - Result type
N - Model type
Parameters:
result1 - first result
result2 - second result
Returns:
Pair counting F-1-Measure result.

countPairs

public static <R extends Clustering<M>,M extends Model,S extends Clustering<N>,N extends Model> Triple<Integer,Integer,Integer> countPairs(R result1,
                                                                                                                                           S result2)
Compare two sets of generated pairs. Determines how many pairs are in both sets, only in the first set, or only in the second set.

Type Parameters:
R - Result type
M - Model type
S - Result type
N - Model type
Parameters:
result1 - first result
result2 - second result
Returns:
A Triple that contains the number of pairs that are in both sets (FIRST), the number of pairs that are only in the first set (SECOND) and the number of pairs that are only in the second set (THIRD).

countPairs

public static Triple<Integer,Integer,Integer> countPairs(PairSortedGeneratorInterface first,
                                                         PairSortedGeneratorInterface second)
Compare two sets of generated pairs. Determines how many pairs are in both sets, only in the first set, or only in the second set.

Parameters:
first - first set
second - second set
Returns:
A Triple that contains the number of pairs that are in both sets (FIRST), the number of pairs that are only in the first set (SECOND) and the number of pairs that are only in the second set (THIRD).
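
When both inputs are delivered in sorted order, as the name PairSortedGeneratorInterface suggests, the three counts can be obtained in one merge-style pass. The sketch below is an assumption about how such a comparison might work, not the ELKI code: the class SortedMergeCountSketch is hypothetical, and pairs are represented as already-encoded, ascending, duplicate-free long values rather than generator objects.

```java
// Hypothetical illustration class; not part of ELKI.
class SortedMergeCountSketch {
  /** Returns {inBoth, onlyInFirst, onlyInSecond} for two ascending, duplicate-free arrays. */
  static int[] countSorted(long[] first, long[] second) {
    int i = 0, j = 0;
    int inBoth = 0, onlyFirst = 0, onlySecond = 0;
    while (i < first.length && j < second.length) {
      if (first[i] == second[j]) {
        // Same element in both sequences: advance both cursors.
        inBoth++;
        i++;
        j++;
      } else if (first[i] < second[j]) {
        // Element of the first sequence cannot appear later in the second one.
        onlyFirst++;
        i++;
      } else {
        onlySecond++;
        j++;
      }
    }
    // Whatever remains in either sequence has no counterpart in the other.
    onlyFirst += first.length - i;
    onlySecond += second.length - j;
    return new int[] { inBoth, onlyFirst, onlySecond };
  }

  public static void main(String[] args) {
    int[] c = countSorted(new long[] { 1, 3, 5, 7 }, new long[] { 3, 4, 7, 9, 10 });
    // prints "2 2 3"
    System.out.println(c[0] + " " + c[1] + " " + c[2]);
  }
}
```

A single pass like this needs no set of all pairs in memory, which matters because the number of pairs can grow quadratically with the number of objects.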

fMeasure

public static double fMeasure(int inBoth,
                              int inFirst,
                              int inSecond,
                              double beta)
Computes the F-measure of the given parameters.

Returns ((1+beta*beta) * inBoth) / ((1+beta*beta) * inBoth + (beta*beta)*inFirst + inSecond)

Parameters:
inBoth - The number of objects that are in both sets.
inFirst - The number of objects that are only in the first set.
inSecond - The number of objects that are only in the second set.
beta - The beta value for the F-measure.
Returns:
The F-measure.
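
As a worked example, the formula stated above can be transcribed directly into a small standalone method. The class name FMeasureSketch is hypothetical and the code is a sketch of the documented formula, not a copy of the ELKI source.

```java
// Hypothetical illustration class; not part of ELKI.
class FMeasureSketch {
  /** Direct transcription of the documented formula. */
  static double fMeasure(int inBoth, int inFirst, int inSecond, double beta) {
    final double beta2 = beta * beta;
    return ((1 + beta2) * inBoth) / ((1 + beta2) * inBoth + beta2 * inFirst + inSecond);
  }

  public static void main(String[] args) {
    // beta = 1: (2 * 2) / (2 * 2 + 2 + 2) = 4 / 8
    System.out.println(fMeasure(2, 2, 2, 1.0)); // prints 0.5
    // Complete agreement: nothing is in only one of the sets.
    System.out.println(fMeasure(3, 0, 0, 1.0)); // prints 1.0
  }
}
```

In this sketch, calling the method with all three counts equal to zero divides 0.0 by 0.0 and yields NaN; whether the ELKI method treats that case specially is not stated here.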

Release 0.4.0 (2011-09-20_1324)