Advancements in Word Embeddings through Dependency-Based Techniques

"Explore the evolution of word embeddings with a focus on dependency-based methods, showcasing innovations like Skip-Gram with Negative Sampling. Learn about Generalizing Skip-Gram and the shift towards analyzing linguistically rich embeddings using various contexts such as bag-of-words and syntactic dependencies."



Presentation Transcript


  1. Dependency-Based Word Embeddings. Omer Levy, Yoav Goldberg. Bar-Ilan University, Israel.

  2. Neural Embeddings. Dense vectors; each dimension is a latent feature. word2vec (Mikolov et al., 2013). State-of-the-art: Skip-Gram with Negative Sampling. Linguistic regularities: king - man + woman = queen. See also: Linguistic Regularities in Sparse and Explicit Word Representations, Friday, 2:00 PM, CoNLL 2014.
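As a quick illustration of that regularity, here is a minimal sketch of analogy arithmetic over word vectors. The vectors below are random stand-ins, so only real trained embeddings would actually recover "queen":

```python
import numpy as np

# Toy stand-in vectors; real embeddings come from a trained SGNS model.
rng = np.random.default_rng(0)
vecs = {w: rng.normal(size=100) for w in ["king", "man", "woman", "queen"]}
vecs = {w: v / np.linalg.norm(v) for w, v in vecs.items()}

# king - man + woman, then find the nearest remaining word by cosine.
target = vecs["king"] - vecs["man"] + vecs["woman"]
target /= np.linalg.norm(target)
best = max((w for w in vecs if w not in {"king", "man", "woman"}),
           key=lambda w: float(target @ vecs[w]))
print(best)  # "queen" (trivially so in this toy vocabulary)
```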

  3. Our Main Contribution: Generalizing Skip-Gram with Negative Sampling

  4. Skip-Gram with Negative Sampling v2.0. The original implementation assumes bag-of-words contexts; we generalize to arbitrary contexts. Dependency contexts create qualitatively different word embeddings and provide a new tool for linguistically analyzing embeddings.

  5. Context Types

  6. Example Australian scientist discovers star with telescope

  7. Target Word Australian scientist discovers star with telescope

  8. Bag of Words (BoW) Context Australian scientist discovers star with telescope

  9. Bag of Words (BoW) Context Australian scientist discovers star with telescope

  10. Bag of Words (BoW) Context Australian scientist discovers star with telescope

  11. Syntactic Dependency Context Australian scientist discovers star with telescope

  12. Syntactic Dependency Context. Arcs: nsubj(discovers, scientist), dobj(discovers, star), prep_with(discovers, telescope). Australian scientist discovers star with telescope

  13. Syntactic Dependency Context. Arcs: nsubj(discovers, scientist), dobj(discovers, star), prep_with(discovers, telescope). Australian scientist discovers star with telescope
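To make the two context types concrete, here is a minimal sketch of pair extraction for this sentence. The dependency arcs are hard-coded from the slide's example parse; a real pipeline would obtain them from a syntactic parser:

```python
sentence = "Australian scientist discovers star with telescope".split()

def bow_contexts(words, window=2):
    """Bag-of-words contexts: all neighbors within a fixed window."""
    pairs = []
    for i, w in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        pairs += [(w, c) for j, c in enumerate(words[lo:hi], lo) if j != i]
    return pairs

# Collapsed Stanford-style arcs for the example: (head, label, modifier).
arcs = [
    ("scientist", "amod", "Australian"),
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
]

def dep_contexts(arcs):
    """Dependency contexts: modifier/label for the head,
    head/label-1 (inverse relation) for the modifier."""
    pairs = []
    for head, label, mod in arcs:
        pairs.append((head, f"{mod}/{label}"))
        pairs.append((mod, f"{head}/{label}-1"))
    return pairs

print(bow_contexts(sentence)[:4])
print(dep_contexts(arcs)[:4])
```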

  14. Generalizing Skip-Gram with Negative Sampling

  15. How does Skip-Gram work? Skip-gram represents each word w as a vector v_w. Skip-gram represents each context word c as a different vector v_c. The same word has two different embeddings (one as a word, one as a context).
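A minimal numpy sketch of this setup, with illustrative names (W for the word table, C for the separate context table); SGNS scores a pair by the sigmoid of the dot product and nudges both tables toward observed pairs and away from sampled negatives:

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, n_contexts, dim = 1000, 1000, 100

W = rng.normal(scale=0.1, size=(n_words, dim))     # word embeddings
C = rng.normal(scale=0.1, size=(n_contexts, dim))  # separate context embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(w_id, c_id, neg_ids, lr=0.025):
    """One SGD step: push sigmoid(w . c) toward 1 for the observed pair
    and toward 0 for the sampled negative contexts."""
    w = W[w_id].copy()
    grad_w = np.zeros(dim)
    for cid, label in [(c_id, 1.0)] + [(n, 0.0) for n in neg_ids]:
        g = label - sigmoid(w @ C[cid])  # gradient of the log-likelihood
        grad_w += g * C[cid]
        C[cid] += lr * g * w
    W[w_id] += lr * grad_w
```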

  16. How does Skip-Gram work? Text (w_{i-2}, w_{i-1}, w_i, w_{i+1}, w_{i+2}) -> Bag-of-Words Contexts -> Word-Context Pairs (w, c) -> Learning

  17. How does Skip-Gram work? Text (w_{i-2}, w_{i-1}, w_i, w_{i+1}, w_{i+2}) -> Bag-of-Words Contexts -> Word-Context Pairs (w, c) -> Learning

  18. Our Modification. Text -> Arbitrary Contexts -> Word-Context Pairs (w, c) -> Learning

  19. Our Modification. Text -> Arbitrary Contexts -> Word-Context Pairs (w, c) -> Learning. Modified word2vec publicly available!
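Decoupling pair extraction from training means the trainer just consumes pre-extracted pairs. A sketch of writing such a pairs file follows; the one-pair-per-line "word context" format is an assumption here, so check the released tool's documentation for its exact input format:

```python
# Hypothetical pre-extracted pairs (word, context); dependency contexts
# use the modifier/label and head/label-1 scheme from the earlier slides.
pairs = [
    ("discovers", "scientist/nsubj"),
    ("scientist", "discovers/nsubj-1"),
    ("discovers", "star/dobj"),
    ("star", "discovers/dobj-1"),
]

# Assumed format: one whitespace-separated "word context" pair per line.
with open("dep.contexts", "w", encoding="utf-8") as f:
    for word, ctx in pairs:
        f.write(f"{word} {ctx}\n")
```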

  20. Our Modification: Example. Text -> Syntactic Contexts -> Word-Context Pairs (w, c), e.g., (discovers, scientist/nsubj) -> Learning

  21. Our Modification: Example. Text (Wikipedia) -> Syntactic Contexts -> Word-Context Pairs (w, c) -> Learning

  22. Our Modification: Example. Text (Wikipedia) -> Syntactic Contexts (Stanford Dependencies) -> Word-Context Pairs (w, c) -> Learning

  23. What is the effect of different context types?

  24. What is the effect of different context types? Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others. General conclusion: bag-of-words contexts induce topical similarities, while dependency contexts induce functional similarities (words that share the same semantic type; cohyponyms). Does this hold for embeddings as well?

  25. Embedding Similarity with Different Contexts. Target word: Hogwarts (Harry Potter's school). Bag of Words (k=5): Dumbledore, hallows, half-blood, Malfoy, Snape (related to Harry Potter). Dependencies: Sunnydale, Collinwood, Calarts, Greendale, Millfield (schools).

  26. Embedding Similarity with Different Contexts. Target word: Turing (computer scientist). Bag of Words (k=5): nondeterministic, non-deterministic, computability, deterministic, finite-state (related to computability). Dependencies: Pauling, Hotelling, Heting, Lessing, Hamming (scientists).

  27. Embedding Similarity with Different Contexts. Target word: dancing (gerund of dance). Bag of Words (k=5): singing, dance, dances, dancers, tap-dancing (related to dance). Dependencies: singing, rapping, breakdancing, miming, busking (gerunds). Online Demo!
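Neighbor lists like these are easy to reproduce with any vector toolkit. A sketch using gensim, with hypothetical file names standing in for vector files saved in word2vec text format:

```python
from gensim.models import KeyedVectors

# Hypothetical file names; any vectors in word2vec text format load this way.
bow = KeyedVectors.load_word2vec_format("bow5.words.txt", binary=False)
deps = KeyedVectors.load_word2vec_format("deps.words.txt", binary=False)

print(bow.most_similar("hogwarts", topn=5))   # expect topical neighbors
print(deps.most_similar("hogwarts", topn=5))  # expect functional neighbors (schools)
```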

  28. Embedding Similarity with Different Contexts. Dependency-based embeddings have more functional similarities, and this phenomenon goes beyond these examples; see the quantitative analysis in the paper.

  29. Quantitative Analysis. [Precision-recall curve comparing Dependencies, BoW (k=2), and BoW (k=5).] Dependency-based embeddings have more functional similarities.
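The curve ranks candidate word pairs by embedding similarity and measures precision against gold similarity judgments. A generic sketch of tracing such a precision-recall curve from scored, binary-labeled pairs (the data below is hypothetical):

```python
def precision_recall_curve(scores, labels):
    """Rank pairs by score (descending); at each cutoff report
    (recall, precision) for retrieving the positively-labeled pairs."""
    ranked = sorted(zip(scores, labels), key=lambda x: -x[0])
    total_pos = sum(labels)
    tp, curve = 0, []
    for k, (_, label) in enumerate(ranked, 1):
        tp += label
        curve.append((tp / total_pos, tp / k))
    return curve

# Toy usage: cosine similarities; 1 = functionally similar, 0 = not.
print(precision_recall_curve([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```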

  30. Why do dependencies induce functional similarities?

  31. Dependency Contexts & Functional Similarity. Thoroughly studied in explicit (distributional) representations: Lin (1998), Padó and Lapata (2007), and many others. In explicit representations, we can look at the features and analyze them. But embeddings are a black box! Dimensions are latent and don't necessarily have any meaning.

  32. Analyzing Embeddings

  33. Peeking into Skip-Gram's Black Box. Skip-Gram allows a peek: contexts are embedded in the same space! Given a word w, find the contexts c it activates most: argmax_c (v_w · v_c)
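A minimal sketch of that peek, assuming the context vectors and their vocabulary were kept after training (illustrative names; standard word2vec tools usually discard the context table, so it must be saved explicitly):

```python
import numpy as np

def top_contexts(word_vec, context_matrix, context_vocab, k=5):
    """Contexts c maximizing w . c: since words and contexts share one
    space, the highest-scoring contexts show what the word activates."""
    scores = context_matrix @ word_vec
    top = np.argsort(-scores)[:k]
    return [(context_vocab[i], float(scores[i])) for i in top]
```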

  34. Associated Contexts. Target word: Hogwarts. Dependencies: students/prep_at-1, educated/prep_at-1, student/prep_at-1, stay/prep_at-1, learned/prep_at-1

  35. Associated Contexts. Target word: Turing. Dependencies: machine/nn-1, test/nn-1, theorem/poss-1, machines/nn-1, tests/nn-1

  36. Associated Contexts. Target word: dancing. Dependencies: dancing/conj, dancing/conj-1, singing/conj-1, singing/conj, ballroom/nn

  37. Analyzing Embeddings. We found a way to linguistically analyze embeddings. Together with the ability to engineer contexts, we now have the tools to create task-tailored embeddings!

  38. Conclusion

  39. Conclusion. Generalized Skip-Gram with Negative Sampling to arbitrary contexts. Different contexts induce different similarities. Suggested a way to peek inside the black box of embeddings. Code, demo, and word vectors are available from our websites. Make linguistically-motivated, task-tailored embeddings today! Thank you for listening :)
