Non-Negative Tensor Factorization with RESCAL

Denis Krompaß¹, Maximilian Nickel¹, Xueyan Jiang¹ and Volker Tresp¹,²

¹ Department of Computer Science, Ludwig Maximilian University, Oettingenstraße 67, 80538 Munich, Germany
² Corporate Technology, Siemens AG, Otto-Hahn-Ring 6, 81739 Munich, Germany
 
Overview

1. Non-Negative Matrix Factorization
2. Multiplicative Updates
3. RESCAL
4. Non-Negativity for RESCAL
5. Experiments
6. Benefits and Drawbacks
 
Non-Negative Matrix Factorization

Factorizes a matrix/tensor into non-negative factors, e.g. X ≈ AV
Allows interpretation of the latent factors
Can be used directly for clustering
Enforces sparse factors (a usage sketch follows below)
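As a quick illustration, a minimal usage sketch with scikit-learn, assuming a toy count matrix X and rank 5 (the data and rank are made up); scikit-learn names the factors W and H rather than A and V, and it also offers the NNDSVD initialization [9] used in the experiments later in the deck:

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy non-negative data matrix X (e.g. document-term counts).
rng = np.random.RandomState(0)
X = rng.poisson(lam=1.0, size=(100, 40)).astype(float)

# Factorize X ~ A V with 5 non-negative latent factors,
# initialized with NNDSVD (Boutsidis & Gallopoulos [9]).
model = NMF(n_components=5, init="nndsvd", max_iter=500)
A = model.fit_transform(X)   # 100 x 5, non-negative
V = model.components_        # 5 x 40, non-negative

print("reconstruction error:", np.linalg.norm(X - A @ V))
```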
 
Multiplicative Updates

Introduced by Lee & Seung in 2000 [4]
Used by Mørup and Hansen to infer a non-negative Tucker decomposition [12]
Define a cost function $C(\theta)$
Derive the partial derivative with respect to $\theta_i$
Identify the negative and positive parts of the derivative and construct the update rule (sketched in code below):

$\theta_i \leftarrow \theta_i \cdot \frac{[\nabla_{\theta_i} C(\theta)]^-}{[\nabla_{\theta_i} C(\theta)]^+}$

where $[\cdot]^-$ denotes the negative part of the derivative and $[\cdot]^+$ the positive part.
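For the least-squares NMF cost $C(A,V) = \|X - AV\|_F^2$, the negative part of the gradient with respect to A is $XV^T$ and the positive part is $AVV^T$ (analogously for V), which yields the classic Lee & Seung rules. A minimal NumPy sketch, assuming dense data and a small additive epsilon for numerical safety:

```python
import numpy as np

def nmf_multiplicative(X, r, n_iter=200, eps=1e-9):
    """Lee & Seung multiplicative updates for X ~ A V with X >= 0."""
    n, m = X.shape
    rng = np.random.RandomState(42)
    A = rng.rand(n, r)
    V = rng.rand(r, m)
    for _ in range(n_iter):
        # theta <- theta * (negative part / positive part) of the gradient
        A *= (X @ V.T) / (A @ V @ V.T + eps)
        V *= (A.T @ X) / (A.T @ A @ V + eps)
    return A, V

X = np.abs(np.random.RandomState(0).randn(50, 30))
A, V = nmf_multiplicative(X, r=5)
print("residual:", np.linalg.norm(X - A @ V))
```

Because both factors stay element-wise non-negative and the updates are multiplicative, entries that reach zero stay zero, which is what drives the sparsity of the factors.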
 
RESCAL Tensor Factorization for Relational Learning

Three-way tensor factorization model:

$X_k \approx A R_k A^T \quad \text{for } k = 1, \dots, m$

with the regularized least-squares cost function

$C_{RESCAL}(A, R) = \sum_k \|X_k - A R_k A^T\|_F^2 + \lambda_A \|A\|_F^2 + \lambda_R \sum_k \|R_k\|_F^2$

Showed very good results in various relational learning tasks [5,8] (a scoring sketch follows below)
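To make the model concrete, a small sketch under assumed toy factors (not from the slides): the score of a triple (entity i, relation k, entity j) is the bilinear form $a_i^T R_k a_j$, i.e. entry (i, j) of $A R_k A^T$:

```python
import numpy as np

def rescal_score(A, R_k, i, j):
    """Score of triple (entity i, relation k, entity j): (A R_k A^T)_{ij}."""
    return A[i] @ R_k @ A[j]

# Toy factors: 4 entities with rank-2 embeddings, one relation.
A = np.array([[1.0, 0.1], [0.9, 0.2], [0.0, 1.0], [0.1, 0.8]])
R_k = np.array([[0.5, 0.0], [0.0, 0.3]])
print(rescal_score(A, R_k, 0, 1))  # high: entities 0 and 1 have similar embeddings
print(rescal_score(A, R_k, 0, 2))  # low: dissimilar embeddings
```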
 
Non-Negative Constraint for RESCAL

Regularized least-squares cost function:

$C_{LS}(A, R) = f_{LS}(A, R) + f_{L2}(A, R)$
with $f_{LS}(A, R) = \sum_k \|X_k - A R_k A^T\|_F^2$ and $f_{L2}(A, R) = \lambda_A \|A\|_F^2 + \lambda_R \sum_k \|R_k\|_F^2$

Regularized Kullback-Leibler-divergence cost function:

$C_{KL}(A, R) = f_{KL}(A, R) + f_{L1}(A, R)$
with $f_{KL}(A, R) = \sum_{i,j,k} \Big( X_{ijk} \log \frac{X_{ijk}}{(A R_k A^T)_{ij}} + (A R_k A^T)_{ij} \Big)$ and $f_{L1}(A, R) = \lambda_A \|A\|_1 + \lambda_R \sum_k \|R_k\|_1$

(Multiplicative updates for the least-squares cost are sketched below.)
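Following the negative/positive-part recipe above, multiplicative updates for the regularized least-squares cost can be sketched as follows. This is a reconstruction under the stated cost, not the authors' reference implementation, and it omits the data-sparsity optimizations mentioned in the conclusion:

```python
import numpy as np

def nn_rescal_ls(X, r, lambda_A=0.1, lambda_R=0.1, n_iter=100, eps=1e-9):
    """Multiplicative updates for non-negative RESCAL, X[k] ~ A R[k] A^T.

    X: list of m non-negative n x n relation matrices (frontal slices).
    """
    n = X[0].shape[0]
    rng = np.random.RandomState(0)
    A = rng.rand(n, r)
    R = [rng.rand(r, r) for _ in X]
    for _ in range(n_iter):
        # Update A: negative part of the gradient over positive part.
        num = sum(Xk @ A @ Rk.T + Xk.T @ A @ Rk for Xk, Rk in zip(X, R))
        den = sum(A @ Rk @ (A.T @ A) @ Rk.T + A @ Rk.T @ (A.T @ A) @ Rk
                  for Rk in R) + lambda_A * A
        A *= num / (den + eps)
        # Update each relation matrix R_k.
        AtA = A.T @ A
        for k, Xk in enumerate(X):
            R[k] *= (A.T @ Xk @ A) / (AtA @ R[k] @ AtA + lambda_R * R[k] + eps)
    return A, R
```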
 
Normalization and Integrating Entity Attribute Information

Normalization of the factor matrix A [13]:

$C^*_{LS}(A, R) = f_{LS}(\tilde{A}, R) + f_{L1}(R)$, with $\tilde{A}_{ir} = \frac{A_{ir}}{\|a_r\|}$

i.e. the columns $a_r$ of A are scaled to unit norm (see the sketch below).

Adding attribute information to the model [8]:

$C_{LS}(A, R, V) = C_{LS}(A, R) + f_{LS}^{attr}(D, A, V) + f_{L2}(V)$
with $f_{LS}^{attr}(D, A, V) = \|D - AV\|_F^2$ and $f_{L2}(V) = \lambda_V \|V\|_F^2$, where D is the entity-attribute matrix

Both extensions are also derived for the KL-divergence cost function.
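A small sketch of the column normalization of A (the helper name is mine), which fixes the scale of the entity factors so the sparsity penalty cannot be circumvented by shrinking A while inflating the R_k:

```python
import numpy as np

def normalize_columns(A, eps=1e-9):
    """Return A with each latent-factor column scaled to unit norm."""
    return A / (np.linalg.norm(A, axis=0, keepdims=True) + eps)

A = np.abs(np.random.RandomState(1).randn(6, 3))
A_tilde = normalize_columns(A)
print(np.linalg.norm(A_tilde, axis=0))  # ~ [1. 1. 1.]
```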
 
Experiments

Nations
14 x 14 x 56 multi-relational dataset consisting of relations between nations (treaties, immigration, etc.). Additionally, the dataset contains attribute information for each entity.
Kinship
104 x 104 x 26 multi-relational dataset consisting of several kinship relations within the Alyawarra tribe.
UMLS
135 x 135 x 49 multi-relational dataset consisting of biomedical relationships between categorized concepts of the Unified Medical Language System (UMLS).

10-fold cross-validation
Non-negative factor matrices initialized with the Non-negative Double Singular Value Decomposition method (NNDSVD) [9]
Nonzero entries were defined as entries smaller than -1.0e-9 or greater than 1.0e-9 (see the sketch below).
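The sparsity statistics reported in the results can then be computed with a threshold check like the following (hypothetical helper, not from the slides):

```python
import numpy as np

def sparsity(M, tol=1.0e-9):
    """Fraction of entries treated as zero: |entry| <= tol."""
    return np.mean(np.abs(M) <= tol)

A = np.array([[0.0, 2e-10, 0.5], [1e-8, 0.0, -3e-10]])
print(sparsity(A))  # 4 of 6 entries count as zero -> ~0.67
```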
 
Results
 
In Kinships and UMLS case,
the performance is similar
to the original RESCAL
Worse performance on the
nations dataset
Sparsity of latent factors
significantly lower for
UMLS and Kinships
Minimizing KL-Divergence
Cost functions leads to
more sparse factors
 
(Figure: per-dataset result plots for the least-squares (LS) and KL-divergence cost functions.)
 
Conclusion

Extended non-negative matrix factorization to relational learning tasks with the RESCAL model, employing multiplicative update rules.
Derived update rules for least-squares and KL-divergence based cost functions, including regularization, normalization, and attribute information.

(+) Benefits:
+ Updates also exploit data sparsity
+ Little loss in predictive performance
+ Significant gain in sparsity of the latent factor representation

(-) Drawbacks:
- Slower convergence, even after using the non-random initialization of the factor matrices proposed by [9]
 
References

1. Harshman, R.A.: Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics, 16, 1-84 (1970)
2. Carroll, J.D., Chang, J.J.: Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart-Young" decomposition. Psychometrika, 35, 283-319 (1970)
3. Tucker, L.R.: Some mathematical notes on three-mode factor analysis. Psychometrika, 31, 279-311 (1966)
4. Lee, D.D., Seung, H.S.: Algorithms for Non-negative Matrix Factorization. In NIPS, 556-562 (2000)
5. Nickel, M., Tresp, V., Kriegel, H.P.: A Three-Way Model for Collective Learning on Multi-Relational Data. In Proceedings of the 28th International Conference on Machine Learning (2011)
6. Wang, F., Li, P., König, A.C.: Efficient Document Clustering via Online Nonnegative Matrix Factorizations. In Proceedings of SDM'11, 908-919 (2011)
7. Koren, Y.: The BellKor Solution to the Netflix Grand Prize (2009)
8. Nickel, M., Tresp, V., Kriegel, H.P.: Factorizing YAGO: Scalable Machine Learning for Linked Data. In Proceedings of the 21st International World Wide Web Conference (WWW2012) (2012)
9. Boutsidis, C., Gallopoulos, E.: SVD-based initialization: A head start for nonnegative matrix factorization. Pattern Recognition, 41(4), 1350-1362 (2008)
10. Langville, A.N., Meyer, C.D., Albright, R.: Initializations for the Nonnegative Matrix Factorization. KDD (2006)
11. Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature, 401, 788-791 (1999)
12. Mørup, M., Hansen, L.K.: Algorithms for Sparse Non-negative Tucker decomposition. Neural Computation, 20(8), 2112-2131 (2008)
13. Eggert, J., Körner, E.: Sparse coding and NMF. In Neural Networks, volume 4, pages 2529-2533 (2004)