Overview of Unsupervised Learning in Machine Learning

 
Clustering
 
Jia-Bin Huang
Virginia Tech
 
Spring 2019
 
ECE-5424G / CS-5824
 
Administrative
 
HW 2 grades out.
 
HW 3 March 27.
 
Started grading Midterm. Results expected to be out by March 27.

Final project questions?

Feedback please! https://goo.gl/forms/A6NSq5yMpkuWEMR43
 
 
 
 
Where are we? (Supervised Learning)
 
K-Nearest Neighbor
Linear Regression
Naive Bayes
Logistic Regression
Regularization
Support Vector Machines
Neural Networks
Diagnosing ML systems
 
What’s next? (Unsupervised Learning)
 
Clustering
Expectation Maximization (EM) and Gaussian Mixture Models (GMM)
Dimensionality reduction
Anomaly Detection
Recommender systems
 
What’s next? (Advanced topics)
 
Supervised Learning
Ensemble methods
Structured prediction
Semi-supervised learning
Generative Models
Probabilistic Graphical Models
Sequence Prediction Models
Reinforcement Learning
Markov Decision Process, Q-learning, Policy gradient
 
Clustering
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
 
Clustering
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
Supervised learning

Training set: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$

Slide credit: Andrew Ng
 
Unsupervised learning

Training set: $\{x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(m)}\}$ (no labels)

Slide credit: Andrew Ng
 
 
 
 
Clustering
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
 
 
 
[Figures: K-means illustrated step by step on example data: centroids are initialized, each point is assigned to its nearest centroid, centroids move to the mean of their assigned points, and the process repeats until convergence. Slide credit: Andrew Ng]
 
K-means algorithm

Input:
$K$ (number of clusters)
Training set $\{x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(m)}\}$, with $x^{(i)} \in \mathbb{R}^n$ (note: drop the $x_0 = 1$ convention)

Slide credit: Andrew Ng

K-means algorithm

Randomly initialize $K$ cluster centroids $\mu_1, \mu_2, \ldots, \mu_K \in \mathbb{R}^n$
Repeat {
Cluster assignment step: for $i = 1$ to $m$, set $c^{(i)} :=$ index (from 1 to $K$) of the cluster centroid closest to $x^{(i)}$
Centroid update step: for $k = 1$ to $K$, set $\mu_k :=$ average (mean) of the points assigned to cluster $k$
}

Slide credit: Andrew Ng
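
To make the two alternating steps concrete, here is a minimal NumPy sketch (not from the slides; the function name `kmeans`, the convergence test, and the empty-cluster handling are my own choices):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Plain K-means. X: (m, n) data matrix. Returns assignments c and centroids mu."""
    rng = np.random.default_rng(seed)
    # Random initialization: pick K distinct training examples as initial centroids
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Cluster assignment step: c[i] = index of the centroid closest to x^(i)
        dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)  # shape (m, K)
        c = dists.argmin(axis=1)
        # Centroid update step: mu_k = mean of the points assigned to cluster k
        # (a cluster that lost all its points keeps its old centroid)
        new_mu = np.array([X[c == k].mean(axis=0) if np.any(c == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # centroids stopped moving: converged
            break
        mu = new_mu
    return c, mu
```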
 
Clustering
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
K-means optimization objective

$c^{(i)}$ = index of the cluster ($1, 2, \ldots, K$) to which example $x^{(i)}$ is currently assigned
$\mu_k$ = cluster centroid $k$ ($\mu_k \in \mathbb{R}^n$)
$\mu_{c^{(i)}}$ = cluster centroid of the cluster to which example $x^{(i)}$ has been assigned
Example: if $c^{(i)} = 5$, then $\mu_{c^{(i)}} = \mu_5$

Optimization objective (distortion):
$J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K) = \frac{1}{m} \sum_{i=1}^{m} \lVert x^{(i)} - \mu_{c^{(i)}} \rVert^2$
$\min_{c^{(1)}, \ldots, c^{(m)},\, \mu_1, \ldots, \mu_K} J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$

Slide credit: Andrew Ng

K-means algorithm

The cluster assignment step minimizes $J$ with respect to $c^{(1)}, \ldots, c^{(m)}$ (holding $\mu_1, \ldots, \mu_K$ fixed); the centroid update step minimizes $J$ with respect to $\mu_1, \ldots, \mu_K$ (holding the assignments fixed). Each iteration therefore never increases $J$.

Slide credit: Andrew Ng
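
Given assignments and centroids, the distortion is cheap to evaluate; a small helper under the same assumed setup as the `kmeans` sketch above:

```python
import numpy as np

def distortion(X, c, mu):
    # J = (1/m) * sum_i ||x^(i) - mu_{c^(i)}||^2
    return np.mean(np.sum((X - mu[c]) ** 2, axis=1))
```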
 
Objective-based clustering

Input: $K$ (number of clusters), training set $\{x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(m)}\}$, and a distance/dissimilarity measure $d(x_i, x_j)$
Goal: output a partition of the data
K-means: Euclidean distance
K-median: absolute distance
K-center: minimize the maximum radius
 
Clustering
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
 
Random initialization

Randomly pick $K$ training examples and set $\mu_1, \mu_2, \ldots, \mu_K$ equal to those $K$ examples.

Slide credit: Andrew Ng
 
Local optima

K-means always converges, but only to a local optimum of the distortion $J$, which depends on the initialization; an unlucky initialization can split one true cluster across several centroids and merge others, yielding a visibly poor clustering.

Slide credit: Andrew Ng
Slide credit: Maria-Florina Balcan
 
1) Multiple random initialization

For $i = 1$ to 100 {
Randomly initialize K-means
Run K-means; get $c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K$
Compute the cost function (distortion) $J(c^{(1)}, \ldots, c^{(m)}, \mu_1, \ldots, \mu_K)$
}
Pick the clustering that gave the lowest cost $J$

Slide credit: Andrew Ng
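
scikit-learn's `KMeans` runs this restart loop internally through its `n_init` parameter; a brief example (the data and parameter values are placeholders, not from the slides):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.default_rng(0).normal(size=(300, 2))  # placeholder data

# 100 random restarts; scikit-learn keeps the run with the lowest cost
km = KMeans(n_clusters=3, init="random", n_init=100).fit(X)
print(km.inertia_)  # sum of squared distances to the closest centroids (= m * J)
```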
 
2) Furthest Point Heuristic

Choose $\mu_1$ arbitrarily (or at random).
For $j = 2$ to $K$: pick $\mu_j$ among the data points $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$ that is farthest from the previously chosen $\mu_1, \mu_2, \ldots, \mu_{j-1}$.
This spreads the centers out, but it is sensitive to outliers: a single far-away point will always be chosen as a center.

Slide credit: Maria-Florina Balcan
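
A short sketch of the heuristic (the name `furthest_point_init` is mine):

```python
import numpy as np

def furthest_point_init(X, K, seed=0):
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]  # mu_1: a random data point
    for _ in range(2, K + 1):
        # distance from each point to its nearest already-chosen center
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])  # next center: the farthest point
    return np.array(centers)
```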
 
3) K-means++ initialization

Interpolates between random and furthest-point initialization.
Idea: let $D(x)$ be the distance between a point $x$ and its nearest already-chosen center. Choose the next center with probability proportional to $D^2(x)$.

Choose $\mu_1$ arbitrarily (or at random).
For $j = 2$ to $K$: pick $\mu_j$ among the data points $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$ according to the distribution
$\Pr(\mu_j = x^{(i)}) \propto \min_{j' < j} \lVert x^{(i)} - \mu_{j'} \rVert^2$

More generally, sampling proportional to $D^{\alpha}(x)$ unifies all three schemes:
$\alpha = 0$: random sampling
$\alpha = \infty$: furthest point
$\alpha = 2$: K-means++

scikit-learn uses K-means++ as its default initialization.

[Arthur and Vassilvitskii 2007]
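
A sketch of the $D^2$ sampling loop (the name `kmeans_pp_init` is mine; in practice you would rely on scikit-learn's built-in `KMeans(init="k-means++")`):

```python
import numpy as np

def kmeans_pp_init(X, K, seed=0):
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]  # mu_1: a random data point
    for _ in range(2, K + 1):
        # D(x)^2: squared distance from each point to its nearest chosen center
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        idx = rng.choice(len(X), p=d2 / d2.sum())  # sample proportional to D^2
        centers.append(X[idx])
    return np.array(centers)
```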
How to choose K?

1) Elbow method: plot the distortion $J$ as a function of the number of clusters $K$ and pick the $K$ at the "elbow" of the curve, where adding more clusters stops reducing $J$ rapidly (see the sketch after this list; in practice the elbow is often ambiguous).

2) Try multiple values of $K$ and use performance on a downstream task for selection.
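
A sketch of the elbow curve using scikit-learn (placeholder data; the range of $K$ values is an arbitrary choice):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

X = np.random.default_rng(0).normal(size=(300, 2))  # placeholder data
Ks = range(1, 11)
J = [KMeans(n_clusters=k, n_init=10).fit(X).inertia_ for k in Ks]

plt.plot(list(Ks), J, marker="o")
plt.xlabel("number of clusters K")
plt.ylabel("distortion (inertia)")
plt.show()
```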
 
Clustering
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
 
Hierarchical Clustering

[Figure: a cluster tree (dendrogram) offering clusterings at several levels of granularity.]

A hierarchy might be more natural.
Different users might care about different levels of granularity, or even different prunings of the cluster tree.

Slide credit: Maria-Florina Balcan
 
Hierarchical Clustering
 
Top-down (divisive)
Partition the data into two groups (e.g., with 2-means)
Recursively cluster each group

Bottom-up (agglomerative)
Start with every point in its own cluster.
Repeatedly merge the "closest" two clusters.
Different definitions of "closest" give different algorithms.
 
Slide credit: Maria-Florina Balcan
 
Bottom-up (agglomerative)

Start from a distance measure on pairs of objects, $d(x, x')$ = distance between $x$ and $x'$. Common cluster-level distances:
Single linkage: $\mathrm{dist}(A, B) = \min_{x \in A,\, x' \in B} d(x, x')$
Complete linkage: $\mathrm{dist}(A, B) = \max_{x \in A,\, x' \in B} d(x, x')$
Average linkage: $\mathrm{dist}(A, B) = \mathrm{average}_{x \in A,\, x' \in B}\, d(x, x')$
Ward's method: $\mathrm{dist}(A, B) = \frac{|A||B|}{|A| + |B|} \lVert \mathrm{mean}(A) - \mathrm{mean}(B) \rVert^2$

Slide credit: Maria-Florina Balcan

Bottom-up (agglomerative)

Single linkage: at any time, any two points in the same connected component are joined by a chain of links, each shorter than the current merge radius $r$.
Complete linkage: keeps the maximum cluster diameter as small as possible at every level.
Ward's method: merges the two clusters for which the increase in the k-means cost is as small as possible; works well in practice.

Slide credit: Maria-Florina Balcan
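
A small example with SciPy's agglomerative tools (the linkage method and the cut into 3 clusters are illustrative choices):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.default_rng(0).normal(size=(50, 2))  # placeholder data
Z = linkage(X, method="ward")  # also: "single", "complete", "average"
labels = fcluster(Z, t=3, criterion="maxclust")  # cut the dendrogram into 3 clusters
```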
 
Things to remember
 
Intro to unsupervised learning
K-means algorithm
Optimization objective
Initialization and the number of clusters
Hierarchical clustering
