Matrix Factorization for Latent Factor Recovery

 

These slides cover matrix factorization for recovering latent factors in a matrix, using user ratings of movies as the running example. The technique approximates a ratings matrix by the product of two low-rank factor matrices, exposing hidden patterns and relationships, and it also applies to tasks such as image denoising. The factors can be learned with stochastic gradient descent, including a blocked, stratified variant that parallelizes well, or with alternating least squares.

  • Matrix Factorization
  • Latent Factors
  • Stochastic Gradient Descent
  • Movie Ratings
  • Image Denoising


Presentation Transcript


  1. Matrix Factorization

  2. Recovering latent factors in a matrix: V is an n x m matrix, with n users as rows and m movies as columns, and V[i,j] = user i's rating of movie j. [The slide shows the matrix with entries v11 … vij … vnm.]
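
Since each user typically rates only a small fraction of the movies, V is usually stored sparsely, as just the set of observed entries. A minimal sketch (the variable names are illustrative, not from the slides):

    # Sparse representation of the ratings matrix V:
    # keep only the observed (user i, movie j, rating) triples.
    ratings = [
        (0, 3, 5.0),   # user 0 gave movie 3 five stars
        (0, 7, 3.0),
        (2, 3, 4.0),
    ]
    V = {(i, j): r for (i, j, r) in ratings}   # V[i, j] = user i's rating of movie j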

  3. Recovering latent factors in a matrix: approximate V by a product of two narrow factor matrices, where each user i gets a factor vector (x_i, y_i, …) and each movie j gets a factor vector (a_j, b_j, …), so that V[i,j] = user i's rating of movie j is approximated by the dot product of the two. [The slide shows V ~ user factors times movie factors.]

  4. Talk pilfered from … (KDD 2011).

  5. Recovering latent factors in a matrix: V ~ W H, where W is the n x r matrix of user factors, H is the r x m matrix of movie factors, and r is the number of latent factors; as before, V[i,j] = user i's rating of movie j.
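
A minimal NumPy sketch of the shapes involved: W holds one r-dimensional factor vector per user, H one per movie, and a predicted rating is their dot product (the sizes below are illustrative):

    import numpy as np

    n, m, r = 1000, 500, 10          # users, movies, latent factors (illustrative sizes)
    W = 0.1 * np.random.randn(n, r)  # user factors: one row per user
    H = 0.1 * np.random.randn(r, m)  # movie factors: one column per movie

    def predict(i, j):
        """Approximate V[i, j] by the dot product of user i's and movie j's factors."""
        return W[i] @ H[:, j]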

  6. Matrix factorization for image denoising (example application).

  7. Matrix factorization as SGD. [The slide shows the SGD update equations, with the step size labeled.]
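
The update equations themselves are an image on the slide and are not captured in this transcript; for plain squared error they are commonly written as below, with step size $\eta$ (a standard reconstruction, not copied from the slide):

    \epsilon_{ij} = V_{ij} - \mathbf{w}_i^\top \mathbf{h}_j, \qquad
    \mathbf{w}_i \leftarrow \mathbf{w}_i + \eta\,\epsilon_{ij}\,\mathbf{h}_j, \qquad
    \mathbf{h}_j \leftarrow \mathbf{h}_j + \eta\,\epsilon_{ij}\,\mathbf{w}_i

Here $\mathbf{w}_i$ is row $i$ of $W$ and $\mathbf{h}_j$ is column $j$ of $H$; each observed rating touches only one row of W and one column of H, which is what makes the blocked parallel scheme later in the deck possible.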

  8. Matrix factorization as SGD - why does this work? [The slide shows the same update, with the step size labeled.]

  9. Matrix factorization as SGD - why does this work? Here's the key claim: [stated as an equation on the slide, not captured in this transcript].

  10. Checking the claim: think of SGD for logistic regression, where the LR loss compares y and ŷ = dot(w,x). Matrix factorization is similar, except that now we update both w (the user weights) and x (the movie weights).
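
A short code sketch of that analogy: the same one-example-at-a-time update as SGD for logistic regression, except that both the user factors and the movie factors get adjusted. (A sketch under squared-error loss; the names and the learning rate are illustrative.)

    import numpy as np

    def sgd_epoch(ratings, W, H, lr=0.01):
        """One SGD pass over the observed ratings: for each (i, j, v),
        update both the user factors W[i] and the movie factors H[:, j]."""
        for i, j, v in ratings:
            err = v - W[i] @ H[:, j]       # compare y and y_hat = dot(w, x)
            w_old = W[i].copy()            # snapshot so both updates use the old values
            W[i]    += lr * err * H[:, j]  # "user weight" update
            H[:, j] += lr * err * w_old    # "movie weight" update
        return W, H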

  11. What loss functions are possible? N1, N2 are diagonal matrices, sort of like IDF factors for the users/movies; another option is the 'generalized' KL-divergence.
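
For reference, the 'generalized' KL-divergence between V and its reconstruction WH is usually written as (standard form, not transcribed from the slide):

    D(V \,\|\, WH) = \sum_{i,j} \Big( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \Big)

It reduces to the ordinary KL-divergence when both V and WH sum to one.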

  12. What loss functions are possible? (continued; the loss formulas are shown on the slide)

  13. What loss functions are possible? (continued)

  14. ALS = alternating least squares
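
A minimal dense sketch of the idea, for contrast with SGD: fix H and solve a regularized least-squares problem for W, then fix W and solve for H, and repeat. (An illustrative fully-observed version with a small ridge term; the variant in the slides may differ, e.g. in how it handles missing entries.)

    import numpy as np

    def als(V, r, iters=20, reg=0.1):
        """Alternating least squares for V ~ W @ H (dense V for simplicity)."""
        n, m = V.shape
        W = 0.1 * np.random.randn(n, r)
        H = 0.1 * np.random.randn(r, m)
        I = reg * np.eye(r)
        for _ in range(iters):
            # Fix H, solve for W: each row W[i] minimizes ||V[i] - W[i] @ H||^2 + reg*||W[i]||^2
            W = np.linalg.solve(H @ H.T + I, H @ V.T).T
            # Fix W, solve for H: each column H[:, j] minimizes ||V[:, j] - W @ H[:, j]||^2 + reg*||H[:, j]||^2
            H = np.linalg.solve(W.T @ W + I, W.T @ V)
        return W, H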

  15. Talk pilfered from … (KDD 2011).

  16. Similar to McDonnell et al., with perceptron learning.

  17. Slow convergence…

  18. More detail: randomly permute the rows/cols of the matrix; chop V, W, H into blocks of size d x d (n/d row-blocks in W, m/d column-blocks in H); group the data by picking a set of blocks with no overlapping rows or columns (a stratum), and repeat until all blocks in V are covered; then train the SGD, processing strata in series and the blocks within a stratum in parallel.
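
A sketch of the stratum idea in code, using a simple diagonal-shift schedule over the grid of blocks and assuming the same number of row-blocks and column-blocks (the schedule the slides actually use is driven by a permutation matrix M, described below):

    def strata(num_blocks):
        """Yield strata: each stratum is a list of (row_block, col_block) pairs
        that share no rows and no columns, so its blocks can be trained in
        parallel; successive strata together cover every block of V."""
        for shift in range(num_blocks):
            yield [(b, (b + shift) % num_blocks) for b in range(num_blocks)]

    # Example with a 4 x 4 grid of blocks: 4 strata of 4 disjoint blocks each.
    for stratum in strata(4):
        print(stratum)   # process these blocks in parallel; strata run in series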

  19. More detail: Z was V.

  20. More detail: initialize W, H randomly, not at zero; choose a random ordering (random sort) of the points within a stratum in each 'sub-epoch'; pick the strata sequence by permuting the rows and columns of M, and using M'[k,i] as the column index of row i in sub-epoch k; use 'bold driver' to set the step size, increasing it when the loss decreases over an epoch and decreasing it when the loss increases. Implemented in Hadoop and R/Snowfall. [The slide shows the permutation matrix M.]
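
A sketch of the 'bold driver' heuristic described here; the growth and shrink factors are illustrative, since the slides do not give specific values:

    def bold_driver(step, prev_loss, curr_loss, grow=1.05, shrink=0.5):
        """Adapt the SGD step size once per epoch: grow it slightly when the
        loss decreased, cut it sharply when the loss increased."""
        return step * grow if curr_loss < prev_loss else step * shrink

    # e.g. at the end of each epoch:  step = bold_driver(step, prev_loss, curr_loss)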

  21. Wall Clock Time: 8 nodes, 64 cores, R/snow. [Plot on slide.]

  22. Number of Epochs. [Plot on slide.]

  23. Varying rank (100 epochs for all). [Plot on slide.]

  24. Hadoop scalability: Hadoop process setup time starts to dominate.

  25. Hadoop scalability.
