Clustering groups the rows of a data set by their similarities, without using any knowledge of the rows' true class labels. It is useful for market segmentation, anomaly detection, and other unsupervised learning tasks, and it can also serve as a data preprocessing step for supervised learning tasks.
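To make the idea of grouping rows by similarity without labels concrete, here is a minimal sketch using k-means (Lloyd's algorithm), one common clustering method chosen here only for illustration. The function name `kmeans`, its parameters, and the toy data are assumptions made for this example, not something taken from the readings.

```python
import math
import random


def kmeans(rows, k, n_iter=100, seed=0):
    """Group rows into k clusters with Lloyd's algorithm (no labels needed)."""
    rng = random.Random(seed)
    # Start from k distinct rows chosen at random as the initial centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]
    assignments = None
    for _ in range(n_iter):
        # Assign each row to its nearest centroid (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda j: math.dist(row, centroids[j]))
            for row in rows
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so we have converged
        assignments = new_assignments
        # Move each centroid to the mean of the rows assigned to it.
        for j in range(k):
            members = [row for row, a in zip(rows, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Toy data (made up): two well-separated groups of 2-D points.
rows = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
labels, centers = kmeans(rows, k=2)
print(labels)  # rows in the same group share a cluster index
```

The cluster indices themselves are arbitrary; only the grouping matters. Note that the number of clusters `k` has to be supplied by the user in this sketch.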
- Overview of clustering techniques - Blackboard electronic reserves
- Introduction to Statistical Learning, Section 10.3
- Introduction to Data Mining, Chapter 8
- Estimating the Number of Clusters in a Data Set via the Gap Statistic, by Robert Tibshirani, Guenther Walther, and Trevor Hastie