In these methods the desired number of clusters is specified in advance and the ‘best’ solution with that number of clusters is chosen.
The steps in such a method are as follows:
- Choose initial cluster centres. In essence these are a set of observations that are far apart: each chosen subject forms a cluster of one, and its centre is the value of the variables for that subject.
- Assign each subject to its ‘nearest’ cluster, defined in terms of the distance to the centroid.
- Find the centroids of the clusters that have been formed.
- Re-calculate the distance from each subject to each centroid and move any subject that is not in its closest cluster into that cluster.
- Continue until the centroids remain relatively stable.
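
The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation, and it makes several assumptions that the text does not specify: distance is Euclidean, the initial centres are picked with a simple farthest-first rule starting from the first observation, and "relatively stable" means that no centroid moves by more than a small tolerance. The names `k_means`, `initial_centres`, `tol` and `max_iter` are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two observations (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def initial_centres(points, k):
    """Pick k observations that are far apart (farthest-first heuristic)."""
    centres = [points[0]]  # assumption: start from the first observation
    while len(centres) < k:
        # Next centre: the observation farthest from its nearest chosen centre.
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centres))
        centres.append(nxt)
    return centres

def centroid(members):
    """Mean of each variable over the members of one cluster."""
    n = len(members)
    return tuple(sum(col) / n for col in zip(*members))

def k_means(points, k, tol=1e-9, max_iter=100):
    centres = initial_centres(points, k)
    labels = []
    for _ in range(max_iter):
        # Assign (or move) each subject to the cluster with the nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centres[j]))
                  for p in points]
        # Find the centroids of the clusters that have been formed.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centre (an assumption).
            new_centres.append(centroid(members) if members else centres[j])
        # Stop once the centroids are stable to within tol.
        shift = max(dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift <= tol:
            break
    return labels, centres

# Six made-up observations on two variables, forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centres = k_means(points, k=2)
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centres)
```

On this toy data the assignment settles after the second pass, because no subject changes cluster and the centroids therefore stop moving. With less clearly separated data the result can depend on the choice of initial centres, which is why the first step matters.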