LSTMs for Deep Learning: A Visual Overview

 
ECE 6504: Deep Learning for Perception
 
 
Dhruv Batra
Virginia Tech
 
Topics:
LSTMs (intuition and variants)
[Abhishek:] Lua / Torch Tutorial
 
Administrivia
 
HW3
Out today
Due in 2 weeks
Please please please please please start early
https://computing.ece.vt.edu/~f15ece6504/homework3/
 
(C) Dhruv Batra
 
 
RNN
 
Basic block diagram
 
 
Image Credit: Christopher Olah (http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
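
A minimal sketch of the recurrence behind the block diagram, assuming the common tanh formulation h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). The names rnn_step, rnn_forward, W_xh, W_hh, and b are illustrative and do not come from the slides.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One step of a vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    from_input = matvec(W_xh, x)
    from_state = matvec(W_hh, h_prev)
    return [math.tanh(a + r + bk) for a, r, bk in zip(from_input, from_state, b)]


def rnn_forward(xs, h0, W_xh, W_hh, b):
    """Unroll the same cell over a sequence, reusing the weights at every step."""
    h = h0
    states = []
    for x in xs:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states
```

The same weights are applied at every time step; only the hidden state h changes as the sequence is read.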
 
Key Problem
 
Learning long-term dependencies is hard
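
One way to see why this is hard: in a scalar RNN h_t = tanh(w h_{t-1} + x_t), the gradient of a late state with respect to an early one is a product of per-step factors w (1 - h_t^2). The sketch below tracks that product; the scalar setup and the name gradient_through_time are assumptions made for illustration.

```python
import math


def gradient_through_time(w, xs, h0=0.0):
    """Return d h_t / d h_0 after each step of h_t = tanh(w * h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    history = []
    for x in xs:
        h = math.tanh(w * h + x)
        # Chain rule: each step multiplies the gradient by w * tanh'(pre-activation).
        grad *= w * (1.0 - h * h)
        history.append(grad)
    return history


# With w = 0.5 and zero inputs, every factor is 0.5, so after 50 steps
# the gradient has shrunk to 0.5 ** 50, which is about 8.9e-16.
history = gradient_through_time(0.5, [0.0] * 50)
```

When the per-step factor stays below 1 in magnitude, the signal from early inputs shrinks geometrically with the length of the sequence.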
 
 
Meet LSTMs
 
How about we explicitly encode memory?
 
 
LSTMs Intuition: Memory
 
Cell State / Memory
 
 
LSTMs Intuition: Forget Gate
 
Should we continue to remember this “bit” of
information or not?
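
A common way to write the forget gate, following the notation of the Olah post credited for the figures. The symbols W_f, b_f, and the concatenation [h_{t-1}, x_t] are not on the slide and are assumed here.

```latex
f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)
```

Each entry of f_t lies between 0 and 1: values near 1 keep the corresponding entry of the cell state, values near 0 erase it.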
 
 
LSTMs Intuition: Input Gate
 
Should we update this “bit” of information or not?
If so, with what?
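
In the same assumed notation, the two questions on this slide map to two quantities: the input gate i_t (whether to update) and the candidate memory \tilde{C}_t (what to write).

```latex
\begin{aligned}
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)
\end{aligned}
```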
 
 
LSTMs Intuition: Memory Update
 
Forget that + memorize this
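
As a small worked example, assuming the usual update C_t = f_t * C_{t-1} + i_t * C̃_t applied entry by entry. The gate values below are hand-picked to make the effect visible, and update_memory is an illustrative name.

```python
def update_memory(c_prev, f, i, c_tilde):
    """Elementwise memory update: keep f of the old memory, add i of the candidate."""
    return [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]


c_prev = [2.0, -1.0]   # old cell state
f = [1.0, 0.0]         # keep entry 0, forget entry 1
i = [0.0, 1.0]         # leave entry 0 alone, write to entry 1
c_tilde = [0.5, 0.5]   # candidate values

c_new = update_memory(c_prev, f, i, c_tilde)
print(c_new)  # [2.0, 0.5]
```

Entry 0 is carried over unchanged, while entry 1 is erased and replaced by the candidate.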
 
 
LSTMs Intuition: Output Gate
 
Should we output this “bit” of information to “deeper”
layers?
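
In the same assumed notation, the output gate o_t decides how much of the squashed cell state is exposed as the hidden state h_t (\odot denotes the elementwise product):

```latex
\begin{aligned}
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```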
 
 
LSTMs
 
A pretty sophisticated cell
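
Putting the pieces together, here is a minimal sketch of one step of a standard LSTM cell in plain Python. The parameter names (W_f, b_f, and so on) and the dictionary layout are assumptions for illustration; real implementations batch these operations as matrix multiplies.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W v + b for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bk for row, bk in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; returns the new hidden state and the new cell state."""
    v = h_prev + x  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], v)]    # forget gate
    i = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], v)]    # input gate
    g = [math.tanh(z) for z in affine(params["W_C"], params["b_C"], v)]  # candidate memory
    o = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], v)]    # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]   # memory update
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]                     # exposed output
    return h, c
```

With a hidden size of n and an input size of m, each weight matrix has n rows and n + m columns, and each bias has n entries.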
 
 
LSTM Variants #1: Peephole Connections
 
Let gates see the cell state / memory
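
In equations, peephole connections add the cell state to each gate's input. A sketch in the same assumed notation as the Olah post credited for the figures (some published variants add peepholes to only a subset of the gates):

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [C_{t-1}, h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i \cdot [C_{t-1}, h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o \cdot [C_t, h_{t-1}, x_t] + b_o\right)
\end{aligned}
```

Note that the output gate looks at the updated cell state C_t, while the forget and input gates look at C_{t-1}.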
 
 
LSTM Variants #2: Coupled Gates
 
Only memorize new if forgetting old
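
A minimal sketch of the coupled update, assuming the input gate is tied to the forget gate as i_t = 1 - f_t, so that C_t = f_t * C_{t-1} + (1 - f_t) * C̃_t. The function name coupled_update is illustrative.

```python
def coupled_update(c_prev, f, c_tilde):
    """Coupled gates: whatever fraction is forgotten is replaced by the candidate."""
    return [fk * ck + (1.0 - fk) * gk for fk, ck, gk in zip(f, c_prev, c_tilde)]


# Entry 0 is fully kept (f = 1), so nothing new is written there.
# Entry 1 is fully forgotten (f = 0), so the candidate takes its place.
c_new = coupled_update([2.0, -1.0], [1.0, 0.0], [0.5, 0.5])
print(c_new)  # [2.0, 0.5]
```

Compared with the standard cell, this removes one gate's worth of parameters.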
 
 
LSTM Variants #3: Gated Recurrent Units
 
Changes:
No explicit memory; memory = hidden output
z (update gate) = memorize new and forget old
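
A minimal sketch of one GRU step, assuming the bias-free formulation used in the Olah post credited for the figures, where z = 1 means "take the new candidate". The names gru_step, W_z, W_r, and W_h are illustrative; some libraries use the opposite convention for z.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h_prev, W_z, W_r, W_h):
    """One GRU step; the hidden state doubles as the memory."""
    v = h_prev + x  # list concatenation, i.e. [h_{t-1}, x_t]
    z = [sigmoid(a) for a in matvec(W_z, v)]  # update gate
    r = [sigmoid(a) for a in matvec(W_r, v)]  # reset gate
    # The candidate sees a reset-scaled copy of the previous state.
    v_reset = [rk * hk for rk, hk in zip(r, h_prev)] + x
    h_tilde = [math.tanh(a) for a in matvec(W_h, v_reset)]
    # A single gate z both forgets the old state and memorizes the new one.
    return [(1.0 - zk) * hk + zk * gk for zk, hk, gk in zip(z, h_prev, h_tilde)]
```

There is no separate cell state and no output gate, so the whole state is exposed at every step.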
 
 
RMSProp Intuition
 
Gradients ≠ direction to the optimum
Gradients point in the direction of steepest ascent locally (we step along the negative gradient)
Not where we want to go long term
 
Mismatched gradient magnitudes
Large magnitude = we should travel a small distance
Small magnitude = we should travel a large distance
 
 
Image Credit: Geoffrey Hinton
 
RMSProp Intuition
 
Keep a running average of previous squared gradients to estimate
gradient magnitudes across mini-batches
 
Divide the gradient by the square root of this accumulated average
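
A minimal sketch of one RMSProp update, assuming the usual formulation: a decayed running mean of squared gradients, with each gradient entry divided by the square root of that mean. The function name and the default values for lr, decay, and eps are assumptions, not taken from the slides.

```python
import math


def rmsprop_step(theta, grad, mean_sq, lr=0.01, decay=0.9, eps=1e-8):
    """Return updated parameters and the updated running mean of squared gradients."""
    new_mean_sq = [decay * m + (1.0 - decay) * g * g for m, g in zip(mean_sq, grad)]
    new_theta = [
        t - lr * g / (math.sqrt(m) + eps)
        for t, g, m in zip(theta, grad, new_mean_sq)
    ]
    return new_theta, new_mean_sq


# Two coordinates whose gradients differ by six orders of magnitude.
theta = [1.0, 1.0]
grad = [1000.0, 0.001]
new_theta, mean_sq = rmsprop_step(theta, grad, [0.0, 0.0])

# Distance actually travelled along each coordinate.
steps = [t - n for t, n in zip(theta, new_theta)]
```

After normalization both coordinates move by nearly the same distance, which addresses the magnitude mismatch: the large gradient no longer causes a large jump, and the small one no longer stalls.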
 







