Next: Hierarchical Clustering
Up: No Title
Previous: Clustering Methodology
A clustering is a type of classification imposed on a finite set of objects.
The relationship between objects is represented in a proximity matrix in which
rows and columns correspond to objects.
The proximity matrix is the one and only input to a clustering algorithm.
The following figure shows a tree of classification problems
Figure 2: Tree of classification types
- Exclusive versus nonexclusive
- An exclusive classification is a partition
of the set of objects. Each object belongs to exactly one subset, or
cluster. Nonexclusive, or overlapping, classification can assign an
object to several classes.
- Intrinsic versus extrinsic
- An intrinsic classification uses only the
proximity matrix to perform the classification. Intrinsic classification
is called "unsupervised learning".
Extrinsic classification uses category labels on the objects as well as
the proximity matrix. The problem is then to establish a discriminant
surface that separates the objects according to category.
- Hierarchical versus partitional
-
A hierarchical clustering method is a procedure for transforming a proximity matrix into a sequence of nested partitions.
Algorithms that express exclusive, intrinsic classification.
- Agglomerative versus divisive
- An agglomerative, hierarchical classificat
ion places each object in its own cluster and gradually merges these atomic
clusters into larger and larger clusters until all objects are in a single
cluster. Divisive, hierarchical classification reverses the process by starting with all objects in one cluster and subdividing into smaller pieces.
- Serial versus simultaneous
- Serial procedures handle the patterns one by one whereas simultaneous classification works with the entire set of patterns at the same time.
- Monothetic versus polythetic
- This option is most applicable to problems
in taxonomy, where the objects to be clustered are represented as patterns,
or points in a space. A monothetic clustering algorithm uses the features
one by one, whereas, a polythetic procedure uses all the features at once.
- Graph theory versus matrix algebra
-
Some algorithms in terms of graph theory use properties such as
connectedness and completeness to define classifications, and express other
algorithms in terms of algebraic constructs, such as mean-square-error.
When implementing an algorithm on a computer, attention must be paid to
questions of computational efficiency.
Next: Hierarchical Clustering
Up: No Title
Previous: Clustering Methodology
Miranda Maria Irene
Thu Apr 1 15:43:18 IST 1999