As part of the Udacity Machine Learning Engineer Nanodegree, this project aimed to identify customer segments hidden in the data. It was designed to apply feature scaling, remove unwanted outliers, and use a PCA transformation before clustering the dataset into distinct segments, which can then be used to predict the segment of unseen data.
The key takeaways from this project are:
- How to apply preprocessing techniques such as feature scaling and outlier detection.
- How to interpret data points that have been scaled, transformed, or reduced by PCA.
- How to analyze PCA dimensions and construct a new feature space.
- How to optimally cluster a dataset to find hidden patterns.
- How to assess information given by cluster data and use it in a meaningful way.
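
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes NumPy and scikit-learn are available, uses synthetic stand-in data instead of the real customer dataset, and picks Tukey's 1.5 × IQR rule, K-means, and the silhouette score as example techniques for outlier detection, clustering, and choosing the number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the project's dataset):
# two customer groups with six spending features, plus one extreme row.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(100, 6))
group_b = rng.normal(loc=6.0, scale=1.0, size=(100, 6))
extreme = np.full((1, 6), 60.0)
data = np.vstack([group_a, group_b, extreme])

# 1. Feature scaling: zero mean and unit variance per feature.
scaler = StandardScaler()
scaled = scaler.fit_transform(data)

# 2. Outlier removal (Tukey's rule, an illustrative choice): drop rows
#    with any feature more than 1.5 * IQR outside the quartiles.
q1, q3 = np.percentile(scaled, [25, 75], axis=0)
step = 1.5 * (q3 - q1)
keep = np.all((scaled >= q1 - step) & (scaled <= q3 + step), axis=1)
cleaned = scaled[keep]

# 3. PCA: project onto two components and inspect the variance they explain.
pca = PCA(n_components=2, random_state=0)
reduced = pca.fit_transform(cleaned)
explained = pca.explained_variance_ratio_

# 4. Clustering: try several cluster counts and keep the one with the
#    highest silhouette score.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(reduced)
    scores[k] = silhouette_score(reduced, labels)
best_k = max(scores, key=scores.get)
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(reduced)

# 5. Prediction: unseen points go through the same scaler and PCA
#    before being assigned to a cluster.
new_points = np.array([[0.2] * 6, [5.8] * 6])
new_labels = model.predict(pca.transform(scaler.transform(new_points)))

print("rows kept:", cleaned.shape[0], "of", data.shape[0])
print("explained variance ratio:", explained)
print("silhouette scores:", scores)
print("predicted clusters for new points:", new_labels)
```

The important design point is that unseen data reuses the already fitted `scaler` and `pca` objects rather than refitting them, so new points land in the same feature space the clusters were learned in.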