Visualizing the result of OPTICS

OPTICS is a new algorithm for cluster analysis that does not produce a clustering of a data set explicitly. Instead, it creates an augmented ordering of the database that represents its density-based clustering structure. This cluster-ordering contains information equivalent to the density-based clusterings corresponding to a broad range of parameter settings, which makes it a versatile basis for both automatic and interactive cluster analysis. Not only can ‘traditional’ clustering information be extracted automatically and efficiently, but so can the intrinsic clustering structure. For medium-sized data sets, the cluster-ordering can be represented graphically. This representation is suitable for interactive exploration of the intrinsic clustering structure, giving the user additional insight into the distribution and correlation of the data.
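To make the idea of a cluster-ordering concrete, the following is a minimal Python sketch of how such an ordering can be computed and how a flat, DBSCAN-style clustering can be read off it for a smaller radius. It is not the code behind the applet: the function names (`optics`, `extract_dbscan`), the brute-force neighbor search, and the convention that a point counts as its own neighbor are assumptions made for illustration.

```python
import math

UNDEFINED = math.inf  # stands for an undefined reachability or core distance


def optics(points, eps, min_pts):
    """Return (order, reach, core) for a list of coordinate tuples."""
    n = len(points)
    reach = [UNDEFINED] * n   # reachability distance of each point
    core = [UNDEFINED] * n    # core distance ("corelevel") of each point
    processed = [False] * n
    order = []                # indices of the points in cluster order

    def neighbors(i):
        # Brute-force eps-neighborhood of point i; assumption: i counts itself.
        result = []
        for j in range(n):
            d = math.dist(points[i], points[j])
            if d <= eps:
                result.append((d, j))
        return result

    for start in range(n):
        if processed[start]:
            continue
        seeds = {start}
        while seeds:
            # Always continue with the seed of smallest reachability.
            i = min(seeds, key=lambda j: (reach[j], j))
            seeds.remove(i)
            processed[i] = True
            order.append(i)
            nbrs = neighbors(i)
            if len(nbrs) >= min_pts:
                # Core distance: distance to the min_pts-th nearest neighbor.
                core[i] = sorted(d for d, _ in nbrs)[min_pts - 1]
                for d, j in nbrs:
                    if not processed[j]:
                        new_reach = max(core[i], d)
                        if new_reach < reach[j]:
                            reach[j] = new_reach
                            seeds.add(j)
    return order, reach, core


def extract_dbscan(order, reach, core, eps_prime):
    """Read a flat clustering for radius eps_prime <= eps off the ordering."""
    labels = [-1] * len(order)  # -1 marks noise
    cluster = -1
    for i in order:
        if reach[i] > eps_prime:
            if core[i] <= eps_prime:
                cluster += 1        # point i starts a new cluster
                labels[i] = cluster
        else:
            labels[i] = cluster     # point i continues the current cluster
    return labels


# Two small, well-separated groups of four points each (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
order, reach, core = optics(points, eps=2, min_pts=3)
print(order)
print(extract_dbscan(order, reach, core, eps_prime=1.5))
```

One ordering computed with a generous `eps` can be cut at several smaller `eps_prime` values without re-running the algorithm, which is the sense in which the ordering stands for a range of parameter settings.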

The corelevel-reachability viewer is the tool we use to visualize the result of OPTICS. The x-axis shows the points in the order generated by OPTICS; the y-axis shows the reachability and/or corelevel values. A detailed description of the applet follows:

We provide five different 2-dimensional datasets for demonstration purposes, which can be chosen in the "Source" dropdown list. The dataset is shown in the upper left-hand corner (the size and color of the points can be changed with the "Points Radius" and "Points Color" controls). Below it, the reachability-plot (standard color orange) and the corelevel-plot (standard color black, inactive by default) can be switched on and off individually by ticking the corresponding "Active" check-boxes. Their colors can be changed ("Color" dropdown lists), and so can the width of the bars ("Bar width" dropdown list).
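As a rough illustration of what the reachability-plot encodes, the sketch below draws one bar per point, in cluster order, with a length proportional to the point's reachability value. It is a text-only stand-in, not the applet's drawing code: the plot is transposed (one row per point instead of one column), and the function name `reachability_rows`, the `width` parameter, and the sample values are made up for this example.

```python
import math


def reachability_rows(order, reach, width=40):
    """Return one text row per point, in cluster order, with a bar for its reachability."""
    finite = [reach[i] for i in order if math.isfinite(reach[i])]
    top = max(finite, default=1.0) or 1.0  # longest finite bar fills the full width
    rows = []
    for i in order:
        r = reach[i]
        if math.isfinite(r):
            bar = "#" * round(width * r / top)
        else:
            bar = "?" * width  # undefined reachability, drawn at full width
        rows.append(f"{i:>4} {bar}")
    return rows


# Hypothetical ordering and reachability values, indexed by point id.
order = [0, 1, 2, 3, 4, 5]
reach = [math.inf, 1.0, 1.2, 6.0, 1.1, 0.9]
for row in reachability_rows(order, reach, width=30):
    print(row)
```

Runs of short bars correspond to dense regions (clusters); a long bar marks the jump to the next region.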

Below these plots we see the automatically generated clusterings (in blue), which can be computed for different ρ values (by entering the ρ value in the field "roh" and pressing "Re-Cluster"). Below the automatically generated clusterings, the attribute-plot is shown, i.e. the coordinate values of the points encoded as gray-scale values: the higher the value, the lighter the color. As we are dealing with 2-dimensional datasets, we see two rows: the first shows the x-value of each point (the farther right, the lighter the color), the second shows the y-value (the farther up, the lighter).
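The gray-scale encoding of the attribute-plot can be sketched as follows. The text only states that higher values are drawn lighter; the linear scaling between each coordinate's minimum and maximum, the 0 to 255 gray range, and the function name `attribute_plot` are assumptions made for this example.

```python
def attribute_plot(points, order):
    """Return one row of gray levels (0 = black, 255 = white) per coordinate.

    Each row lists the points in cluster order; higher coordinate values
    map to lighter gray levels.
    """
    rows = []
    for d in range(len(points[0])):
        values = [p[d] for p in points]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant coordinates
        rows.append([round(255 * (points[i][d] - lo) / span) for i in order])
    return rows


# Made-up 2-dimensional points and a hypothetical cluster order.
points = [(0, 20), (10, 0), (4, 8)]
order = [1, 2, 0]
x_row, y_row = attribute_plot(points, order)
print(x_row)
print(y_row)
```

Because both rows follow the same order as the reachability-plot, a cluster that shows up as a run of short bars also appears as a run of similar gray levels.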

To show even more clearly the order in which OPTICS visits the points, this "walk" through the dataset can be animated by pressing the "Animate" button. The speed of the animation and its color can be controlled through "Delay" and "Animation Color". After the animation has finished, press the "Reset" button to get back to the original state.

The clustering parameters used for the five example datasets were:

Dataset | db1 | db2 | db3 | db4 | db5
ε       | 10  | 30  | 10  | 20  | 50
MinPts  | 10  | 10  | 10  | 20  | 10
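For anyone who wants to re-run the examples outside the applet, the same settings can be kept in a small lookup table. The values are those listed above; the names `OPTICS_PARAMS`, `eps`, and `min_pts` are hypothetical and chosen only for this sketch.

```python
# Parameters per dataset, copied from the table; key names are illustrative.
OPTICS_PARAMS = {
    "db1": {"eps": 10, "min_pts": 10},
    "db2": {"eps": 30, "min_pts": 10},
    "db3": {"eps": 10, "min_pts": 10},
    "db4": {"eps": 20, "min_pts": 20},
    "db5": {"eps": 50, "min_pts": 10},
}

for name, params in OPTICS_PARAMS.items():
    print(f"{name}: eps={params['eps']}, MinPts={params['min_pts']}")
```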

Markus Breunig