Feature Reduction

Why? #

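Too many features in a dataset complicate the model’s prediction strategy. Most clustering models rely on some sort of distance measure, and in high-dimensional spaces distances become less informative: points tend to look roughly equally far apart, so the data appears sparse and fragments into many isolated clusters.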
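The effect on distance measures can be illustrated with a small experiment. The sketch below is only an illustration under assumed conditions (uniformly random points, Euclidean distance); the helper name `relative_contrast` is hypothetical and not taken from any library. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance. As the number of dimensions grows, this contrast tends to shrink, which is one reason distance-based clustering struggles.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (max - min) / min over distances from one query point to the rest.

    Points are drawn uniformly from the unit hypercube (an assumption made
    purely for illustration).
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query, others = points[0], points[1:]
    dists = [math.dist(query, p) for p in others]
    return (max(dists) - min(dists)) / min(dists)


# Compare the contrast for a few dimensionalities.
for n_dims in (2, 10, 100, 1000):
    print(n_dims, round(relative_contrast(200, n_dims), 3))
```

A low contrast means the nearest and farthest neighbours are almost equally distant, so "closeness" carries little information for grouping points.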

How many is too many? #

One indication is when the dataset has more features than observations.

How? #

  • Principal Component Analysis
  • Non-Negative Matrix Factorization
  • Linear Discriminant Analysis
  • t-SNE
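As a concrete instance of the first technique in the list, here is a minimal sketch of Principal Component Analysis using only NumPy's singular value decomposition. The function name `pca_reduce` and the synthetic data are assumptions made for illustration. The data deliberately has more features (200) than observations (50), matching the warning sign described earlier.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    # PCA requires centered data: subtract the per-feature mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Share of total variance captured by each kept component.
    explained_variance_ratio = (S ** 2)[:n_components] / np.sum(S ** 2)
    return X_centered @ components.T, explained_variance_ratio


rng = np.random.default_rng(0)
# Synthetic data: 50 observations, 200 features driven by 3 latent factors
# plus a small amount of noise.
latent = rng.normal(size=(50, 3))
mixing = rng.normal(size=(3, 200))
X = latent @ mixing + 0.01 * rng.normal(size=(50, 200))

X_reduced, ratio = pca_reduce(X, n_components=3)
print(X_reduced.shape)  # (50, 3)
print(ratio.sum())      # fraction of variance kept by the 3 components
```

In practice a library implementation is usually preferable. scikit-learn, for example, provides `PCA`, `NMF`, `LinearDiscriminantAnalysis`, and `TSNE`. Note that Linear Discriminant Analysis is supervised and needs class labels, while t-SNE is mainly used for visualization rather than as a preprocessing step.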
