
O - Object type
D - Distance type

@Reference(authors="A. McCallum, K. Nigam, L.H. Ungar", title="Efficient Clustering of High Dimensional Data Sets with Application to Reference Matching", booktitle="Proc. 6th ACM SIGKDD international conference on Knowledge discovery and data mining", url="http://dx.doi.org/10.1145%2F347090.347123")
public class CanopyPreClustering<O,D extends Distance<D>>
extends AbstractDistanceBasedAlgorithm<O,D,Clustering<ClusterModel>>
implements ClusteringAlgorithm<Clustering<ClusterModel>>

 Reference:
 A. McCallum, K. Nigam, L.H. Ungar
 Efficient Clustering of High Dimensional Data Sets with Application to
 Reference Matching
 Proc. 6th ACM SIGKDD international conference on Knowledge discovery and data
 mining
 
| Modifier and Type | Class and Description |
|---|---|
| static class | CanopyPreClustering.Parameterizer<O,D extends Distance<D>> - Parameterization class |

| Modifier and Type | Field and Description |
|---|---|
| private static Logging | LOG - Class logger. |
| private D | t1 - Threshold for inclusion |
| private D | t2 - Threshold for removal |

Inherited field: DISTANCE_FUNCTION_ID

| Constructor and Description |
|---|
| CanopyPreClustering(DistanceFunction<? super O,D> distanceFunction, D t1, D t2) - Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| TypeInformation[] | getInputTypeRestriction() - Get the input type restriction used for negotiating the data query. |
| protected Logging | getLogger() - Get the (STATIC) logger for this class. |
| Clustering<ClusterModel> | run(Database database, Relation<O> relation) - Run the algorithm. |

Inherited methods, grouped by declaring type:
getDistanceFunction
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run

private static final Logging LOG

public CanopyPreClustering(DistanceFunction<? super O,D> distanceFunction, D t1, D t2)
distanceFunction - Distance function
t1 - Inclusion threshold
t2 - Exclusion threshold

public Clustering<ClusterModel> run(Database database, Relation<O> relation)
database - Database
relation - Relation to process

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
getInputTypeRestriction in interface Algorithm
getInputTypeRestriction in class AbstractAlgorithm<Clustering<ClusterModel>>

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
getLogger in class AbstractAlgorithm<Clustering<ClusterModel>>
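
The roles of the two thresholds, t1 (inclusion) and t2 (removal), may be easier to see in a small standalone sketch of the canopy idea from the referenced paper. This is not the code of this class: the class name CanopySketch, the method canopies, the use of plain double[] points with Euclidean distance, the choice of the first remaining candidate as the next center, and the inclusive comparisons are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative canopy sketch; hypothetical helper, not part of the library. */
public class CanopySketch {

  /**
   * Builds canopies over the given points.
   *
   * @param points input points (assumed to share one dimensionality)
   * @param t1 inclusion threshold (loose): points within t1 join the canopy
   * @param t2 removal threshold (tight): points within t2 stop being candidates
   * @return canopies as lists of point indices; canopies may overlap
   */
  public static List<List<Integer>> canopies(double[][] points, double t1, double t2) {
    if (t2 > t1) {
      throw new IllegalArgumentException("t2 must not exceed t1");
    }
    List<Integer> candidates = new ArrayList<>();
    for (int i = 0; i < points.length; i++) {
      candidates.add(i);
    }
    List<List<Integer>> result = new ArrayList<>();
    while (!candidates.isEmpty()) {
      // Assumption: take the first remaining candidate as the canopy center.
      int center = candidates.remove(0);
      List<Integer> canopy = new ArrayList<>();
      canopy.add(center);
      List<Integer> remaining = new ArrayList<>();
      for (int idx : candidates) {
        double d = euclidean(points[center], points[idx]);
        if (d <= t1) {
          canopy.add(idx); // close enough to be included in this canopy
        }
        if (d > t2) {
          remaining.add(idx); // not tightly covered, stays a candidate
        }
      }
      candidates = remaining;
      result.add(canopy);
    }
    return result;
  }

  /** Euclidean distance between two points of equal length. */
  private static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  public static void main(String[] args) {
    double[][] pts = { { 0, 0 }, { 1, 0 }, { 5, 0 }, { 6, 0 } };
    // Prints [[0, 1], [2, 3]]
    System.out.println(canopies(pts, 3.0, 1.5));
  }
}
```

A point that lies within t1 but farther than t2 from a center is added to that canopy and still remains a candidate, so it can appear in several canopies. This is why the result is typically used as a cheap pre-clustering step rather than as a final partition.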