The major steps to be considered in a data mining problem are shown in
Figure 1.
Figure 1: Clustering methodology
- Data collection: This step requires careful recording of data.
- Initial screening: Raw data usually needs some massaging before
it is ready for analysis. For example, if all the values of a
particular feature are the same, that feature can be eliminated.
- Representation: Putting the data into a form suitable for further
analysis.
- Clustering tendency: Finding out whether there is some
justification for clustering. If the data cannot be shown to have a
tendency to cluster, then other analysis techniques should be applied
rather than cluster analysis.
- Clustering Strategy: This involves choosing the appropriate
clustering algorithm. Thought must be given to details such as
matching the algorithm to the data, the presentation of results and
the choice of parameters.
- Validation: This step turns the results of the analysis into hard
evidence. Stability is one basis for comparing clustering methods.
- Interpretation: Drawing conclusions from the analysis. This depends
on the application.
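
The initial screening step listed above can be illustrated with a small
sketch. It assumes the data is held as a list of equal-length rows of
feature values; the function name drop_constant_features and the toy
data are hypothetical, chosen only for this illustration, and this is
one possible way to screen out a constant feature rather than a
prescribed method.

```python
def drop_constant_features(rows):
    """Return (kept_column_indices, screened_rows) with constant columns removed.

    Assumes every row has the same number of feature values.
    """
    if not rows:
        return [], []
    n_cols = len(rows[0])
    # A column is kept only if it takes at least two distinct values.
    kept = [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]
    screened = [[row[j] for j in kept] for row in rows]
    return kept, screened


# Hypothetical toy data: the middle feature is 1.0 for every pattern.
data = [
    [5.1, 1.0, 0.2],
    [4.9, 1.0, 0.4],
    [6.3, 1.0, 0.2],
]
kept, screened = drop_constant_features(data)
print(kept)      # [0, 2]
print(screened)  # [[5.1, 0.2], [4.9, 0.4], [6.3, 0.2]]
```

Returning the indices of the retained features alongside the screened
rows keeps the mapping back to the original features available for the
later interpretation step.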
Miranda Maria Irene
Thu Apr 1 15:43:18 IST 1999