Overview of Unsupervised Learning in Machine Learning


This presentation covers various topics in unsupervised learning, including clustering, expectation maximization, Gaussian mixture models, dimensionality reduction, anomaly detection, and recommender systems. It also previews advanced topics such as ensemble methods, structured prediction, and reinforcement learning, providing a broad overview of the field of machine learning.





Presentation Transcript


  1. Clustering. Jia-Bin Huang, Virginia Tech, Spring 2019, ECE-5424G / CS-5824

  2. Administrative. HW 2 grades out. HW 3: March 27. Started grading the midterm; results expected to be out by March 27. Final project questions? Feedback please! https://goo.gl/forms/A6NSq5yMpkuWEMR43

  3. Where are we? (Supervised Learning) K-Nearest Neighbor, Linear Regression, Naive Bayes, Logistic Regression, Regularization, Support Vector Machines, Neural Networks, Diagnosing ML systems

  4. What's next? (Unsupervised Learning) Clustering, Expectation Maximization (EM) and Gaussian Mixture Models (GMM), Dimensionality reduction, Anomaly Detection, Recommender systems

  5. What's next? (Advanced topics) Supervised Learning: Ensemble methods, Structured prediction, Semi-supervised learning. Generative Models: Probabilistic Graphical Models, Sequence Prediction Models. Reinforcement Learning: Markov Decision Process, Q-learning, Policy gradient.

  6. Clustering Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering

  7. Clustering Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering

  8. Supervised learning. (Figure: labeled training examples plotted on axes x_1 and x_2.) Training set: {(x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))}. Slide credit: Andrew Ng

  9. Unsupervised learning. (Figure: unlabeled training examples plotted on axes x_1 and x_2.) Training set: {x^(1), x^(2), x^(3), ..., x^(m)}. Slide credit: Andrew Ng

  10. Slide credit: Andrew Ng

  11. Clustering Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering

  12. Slide credit: Andrew Ng

  13. Slide credit: Andrew Ng

  14. Slide credit: Andrew Ng

  15. Slide credit: Andrew Ng

  16. Slide credit: Andrew Ng

  17. K-means algorithm. Input: K (number of clusters); training set {x^(1), x^(2), x^(3), ..., x^(m)}, with x^(i) ∈ ℝ^n (note: drop the x_0 = 1 convention). Slide credit: Andrew Ng

  18. K-means algorithm. Randomly initialize K cluster centroids μ_1, μ_2, ..., μ_K ∈ ℝ^n. Repeat { Cluster assignment step: for i = 1 to m, set c^(i) := index (from 1 to K) of the cluster centroid closest to x^(i). Centroid update step: for k = 1 to K, set μ_k := average (mean) of the points assigned to cluster k. } Slide credit: Andrew Ng
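
For illustration, a minimal NumPy sketch of this loop, assuming a data matrix X of shape (m, n); the function and variable names are my own, not the course's reference code:

    import numpy as np

    def kmeans(X, K, max_iters=100, seed=None):
        rng = np.random.default_rng(seed)
        # Random initialization: pick K training examples as the initial centroids
        mu = X[rng.choice(len(X), size=K, replace=False)].copy()
        for _ in range(max_iters):
            # Cluster assignment step: c[i] = index of the centroid closest to x(i)
            dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
            c = dists.argmin(axis=1)
            # Centroid update step: mu[k] = mean of the points assigned to cluster k
            mu_new = np.array([X[c == k].mean(axis=0) if np.any(c == k) else mu[k]
                               for k in range(K)])
            if np.allclose(mu_new, mu):   # stop once the centroids no longer move
                break
            mu = mu_new
        return c, mu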

  19. Clustering Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering

  20. K-means optimization objective. c^(i) = index of the cluster (1, 2, ..., K) to which example x^(i) is currently assigned. μ_k = cluster centroid k (μ_k ∈ ℝ^n). μ_{c^(i)} = cluster centroid of the cluster to which example x^(i) has been assigned. Example: if c^(i) = 5, then μ_{c^(i)} = μ_5. Optimization objective: J(c^(1), ..., c^(m), μ_1, ..., μ_K) = (1/m) Σ_{i=1}^m ||x^(i) - μ_{c^(i)}||², minimized over c^(1), ..., c^(m) and μ_1, ..., μ_K. Slide credit: Andrew Ng
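
A short sketch of this distortion, reusing the naming from the kmeans sketch above (X, c, mu are my assumptions, not the slide's own code):

    import numpy as np

    def distortion(X, c, mu):
        # Mean squared distance from each example to its assigned centroid:
        # J = (1/m) * sum_i ||x(i) - mu_{c(i)}||^2
        return np.mean(np.sum((X - mu[c]) ** 2, axis=1))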

  21. K-means algorithm, viewed as minimizing J. Randomly initialize K cluster centroids μ_1, μ_2, ..., μ_K ∈ ℝ^n. Repeat { Cluster assignment step: for i = 1 to m, set c^(i) := index (from 1 to K) of the cluster centroid closest to x^(i); this minimizes J(c^(1), ..., c^(m), μ_1, ..., μ_K) with respect to c^(1), ..., c^(m) while holding μ_1, ..., μ_K fixed. Centroid update step: for k = 1 to K, set μ_k := average (mean) of the points assigned to cluster k; this minimizes J with respect to μ_1, ..., μ_K while holding c^(1), ..., c^(m) fixed. } Slide credit: Andrew Ng

  22. Objective-based clustering. Input: K (number of clusters); training set {x^(1), x^(2), x^(3), ..., x^(m)}; a distance/dissimilarity measure d(x^(i), x^(j)). Goal: output a partition of the data. K-means: Euclidean distance. K-median: absolute distance. K-center: minimize the maximum radius.

  23. Clustering Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering

  24. Random initialization. Randomly pick K training examples; set μ_1, μ_2, ..., μ_K equal to those K examples. (Figure: example data on axes x_1 and x_2.) Slide credit: Andrew Ng

  25. Local optima Slide credit: Andrew Ng

  26. Local optima Slide credit: Maria-Florina Balcan

  27. 1) Multiple random initialization. For i = 1 to 100 { Randomly initialize K-means. Run K-means to get c^(1), ..., c^(m), μ_1, ..., μ_K. Compute the cost function (distortion) J(c^(1), ..., c^(m), μ_1, ..., μ_K). } Pick the clustering that gave the lowest cost J(c^(1), ..., c^(m), μ_1, ..., μ_K). Slide credit: Andrew Ng
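
A sketch of this restart loop, reusing the kmeans and distortion sketches above on hypothetical toy data (the data and K = 3 are arbitrary choices for illustration):

    import numpy as np

    # Toy data: three roughly separated blobs in 2-D
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(c, 1.0, size=(50, 2)) for c in ([0, 0], [5, 5], [0, 5])])

    best_J, best = np.inf, None
    for _ in range(100):
        c, mu = kmeans(X, K=3)          # one random initialization per run
        J = distortion(X, c, mu)
        if J < best_J:                  # keep the run with the lowest distortion
            best_J, best = J, (c, mu)

scikit-learn's KMeans does the same thing internally through its n_init parameter.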

  28. 2) Furthest-point heuristic. Choose μ_1 arbitrarily (or at random). For j = 2 to K, pick μ_j among the data points x^(1), x^(2), ..., x^(m) that is farthest from the previously chosen μ_1, μ_2, ..., μ_{j-1}. Slide credit: Maria-Florina Balcan
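
A minimal sketch of the furthest-point rule (the function name and arguments are my own):

    import numpy as np

    def furthest_point_init(X, K, seed=None):
        rng = np.random.default_rng(seed)
        centers = [X[rng.integers(len(X))]]            # choose mu_1 at random
        for _ in range(2, K + 1):
            C = np.array(centers)
            # Distance from every point to its nearest already-chosen center
            d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2).min(axis=1)
            centers.append(X[d.argmax()])              # the farthest point becomes mu_j
        return np.array(centers)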

  29. 3) K-means++ initialization. Interpolate between random and furthest-point initialization. Idea: let D(x) be the distance between a point x and its nearest already-chosen center; choose the next center with probability proportional to D²(x). Choose μ_1 arbitrarily (or at random). For j = 2 to K, pick μ_j among the data points x^(1), x^(2), ..., x^(m) according to the distribution Pr(μ_j = x^(i)) ∝ min_{j' < j} ||x^(i) - μ_{j'}||². [Arthur and Vassilvitskii 2007]

  30. 3) K-means++ initialization, generalized. Interpolate between random and furthest-point initialization. Idea: let D(x) be the distance between a point x and its nearest already-chosen center; choose the next center with probability proportional to D^α(x). α = 0: random sampling. α = ∞: furthest point. α = 2: K-means++. scikit-learn uses K-means++ as the default. [Arthur and Vassilvitskii 2007]
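
A sketch of this D^α seeding rule (names are my own; α = 0 reproduces random sampling, α = 2 is K-means++, and a very large α approximates the furthest-point heuristic):

    import numpy as np

    def d_alpha_init(X, K, alpha=2.0, seed=None):
        rng = np.random.default_rng(seed)
        centers = [X[rng.integers(len(X))]]            # choose mu_1 at random
        for _ in range(2, K + 1):
            C = np.array(centers)
            D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2).min(axis=1)
            p = D ** alpha
            p = p / p.sum()        # Pr(mu_j = x(i)) proportional to D(x(i))^alpha
            centers.append(X[rng.choice(len(X), p=p)])
        return np.array(centers)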

  31. How to choose K? 1) Elbow method. 2) Try multiple values of K and use performance on a downstream task to select among them.
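
A sketch of the elbow method using scikit-learn (the data here are hypothetical; in practice you would plot the distortion of your own training set and look for the bend in the curve):

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(c, 1.0, size=(100, 2)) for c in ([0, 0], [5, 5], [0, 5])])

    Ks = range(1, 11)
    # inertia_ is scikit-learn's name for the K-means distortion J
    inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in Ks]

    plt.plot(list(Ks), inertias, marker="o")
    plt.xlabel("K (number of clusters)")
    plt.ylabel("Distortion (inertia)")
    plt.show()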

  32. Clustering Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering

  33. Hierarchical Clustering. A hierarchy might be more natural. Different users might care about different levels of granularity, or even different prunings of the tree. Slide credit: Maria-Florina Balcan

  34. Hierarchical Clustering. Top-down (divisive): partition the data into 2 groups (e.g., with 2-means), then recursively cluster each group. Bottom-up (agglomerative): start with every point in its own cluster, then repeatedly merge the closest two clusters; different definitions of "closest" give different algorithms. Slide credit: Maria-Florina Balcan
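
A rough sketch of the top-down variant via recursive 2-means (assumes scikit-learn; the depth-based stopping rule is an arbitrary choice for illustration):

    import numpy as np
    from sklearn.cluster import KMeans

    def divisive_2means(X, depth):
        # Returns a flat list of leaf clusters after `depth` rounds of splitting
        if depth == 0 or len(X) < 2:
            return [X]
        labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
        left, right = X[labels == 0], X[labels == 1]
        return divisive_2means(left, depth - 1) + divisive_2means(right, depth - 1)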

  35. Bottom-up (agglomerative). Have a distance measure on pairs of objects: d(x, x') = distance between x and x'. Single linkage: dist(A, B) = min_{x ∈ A, x' ∈ B} d(x, x'). Complete linkage: dist(A, B) = max_{x ∈ A, x' ∈ B} d(x, x'). Average linkage: dist(A, B) = average_{x ∈ A, x' ∈ B} d(x, x'). Ward's method: dist(A, B) = (|A| |B| / (|A| + |B|)) ||mean(A) - mean(B)||². Slide credit: Maria-Florina Balcan
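
These linkage rules are available off the shelf; a sketch using SciPy's hierarchical clustering (the data and the choice of 3 clusters are hypothetical):

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 2))                        # toy data

    # method can be "single", "complete", "average", or "ward"
    Z = linkage(X, method="ward")                       # bottom-up merge tree
    labels = fcluster(Z, t=3, criterion="maxclust")     # cut the tree into 3 clusters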

  36. Bottom-up (agglomerative). Single linkage: dist(A, B) = min_{x ∈ A, x' ∈ B} d(x, x'); after all merges at distance ≤ r, any two points in the same cluster are connected by a chain of points whose consecutive distances are ≤ r. Complete linkage: dist(A, B) = max_{x ∈ A, x' ∈ B} d(x, x'); keeps the maximum cluster diameter as small as possible at each level. Ward's method: dist(A, B) = (|A| |B| / (|A| + |B|)) ||mean(A) - mean(B)||²; merges the two clusters for which the increase in the k-means cost is as small as possible. Works well in practice. Slide credit: Maria-Florina Balcan

  37. Things to remember Intro to unsupervised learning K-means algorithm Optimization objective Initialization and the number of clusters Hierarchical clustering
