Feature Selection For Unsupervised Learning

This is my presentation for the IBM data science day, July 24.
Abstract
After reviewing popular techniques used in supervised, unsupervised, and semi-supervised machine learning, we focus on feature selection methods in these different contexts, especially the metrics used to assess the value of a feature or set of features, whether the variables are binary, continuous, or categorical. We then review in more detail modern feature selection techniques for unsupervised learning, which typically rely on entropy-like criteria. While these criteria are usually model-dependent or scale-dependent, we introduce a new model-free, data-driven methodology in this context, with an application to an interesting number theory problem (simulated data set) in which each feature has a known theoretical entropy. We also briefly discuss high-precision computing, as it is relevant to this peculiar data set, as well as units of information smaller than the bit.
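As a rough illustration of what an entropy-like criterion can look like, here is a minimal Python sketch that scores features by their empirical Shannon entropy, in bits. It is not the model-free methodology introduced in the presentation. The function names, the equal-width binning, and the toy features are all assumptions made for this example.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Empirical Shannon entropy, in bits, of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    # Sum of p * log2(1/p) over the observed values, with p = count / n.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def bin_continuous(values, n_bins):
    """Map continuous values to equal-width bin indices.

    Equal-width binning is an assumption of this sketch; the resulting
    entropy depends on n_bins, which is one form of scale dependence.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical toy features, not taken from the presentation.
features = {
    "constant": [1] * 8,
    "binary_balanced": [0, 1] * 4,
    "four_levels": [0, 1, 2, 3] * 2,
}

# Score each feature, then rank from highest to lowest entropy.
scores = {name: shannon_entropy(col) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

A continuous feature can be passed through `bin_continuous` before scoring. Changing `n_bins` generally changes the score, which shows why such criteria are described as scale-dependent.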
To download the presentation, click here (PowerPoint document).