LSTMs for Deep Learning: A Visual Overview

ECE 6504: Deep Learning for Perception

Dhruv Batra

Virginia Tech

Topics:

– LSTMs (intuition and variants)

– [Abhishek:] Lua / Torch Tutorial

Administrivia

• HW3

– Out today

– Due in 2 weeks

– Please please please please please start early

– https://computing.ece.vt.edu/~f15ece6504/homework3/

(C) Dhruv Batra

RNN

• Basic block diagram

Image Credit: Christopher Olah (http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
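
The block diagram is not reproduced in this transcript, so here is a minimal NumPy sketch of what a basic RNN block computes, assuming the common tanh formulation. The names `rnn_step`, `rnn_forward`, `W_xh`, `W_hh`, and `b_h` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Unroll the same block over a sequence, reusing the weights at every step."""
    h = h0
    hs = []
    for x_t in xs:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        hs.append(h)
    return np.stack(hs)

# Toy sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)
xs = rng.normal(size=(seq_len, input_size))
hs = rnn_forward(xs, np.zeros(hidden_size), W_xh, W_hh, b_h)  # one hidden state per time step
```

The only thing carried from one step to the next is the hidden state `h`, which is why everything the network remembers has to pass through the same tanh layer at every step.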

Key Problem

• Learning long-term dependencies is hard
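
One common way to see why this is hard is the vanishing gradient: the influence of an early state on a late one is a product of per-step derivatives, and that product can shrink geometrically. A toy sketch, assuming a scalar RNN h_t = tanh(w * h_{t-1}) with no input; the function name and the values of `w` and `h0` are made up for illustration.

```python
import numpy as np

def gradient_through_time(w, h0, steps):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = np.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= w * (1.0 - h ** 2)
    return abs(grad)

near = gradient_through_time(w=0.5, h0=0.1, steps=5)   # dependency 5 steps back
far = gradient_through_time(w=0.5, h0=0.1, steps=50)   # dependency 50 steps back
```

With |w| < 1 every factor is at most |w|, so the 50-step gradient is bounded by 0.5^50 and is far smaller than the 5-step one. With |w| > 1 the product can instead blow up, which is the exploding-gradient side of the same problem.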

Meet LSTMs

• How about we explicitly encode memory?

LSTMs Intuition: Memory

• Cell State / Memory

LSTMs Intuition: Forget Gate

• Should we continue to remember this “bit” of information or not?
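
In the standard LSTM formulation (the one illustrated in the Olah post credited on these slides), this decision is made per memory entry by a sigmoid layer. A sketch of the equation follows; the symbols W_f, b_f, h_{t-1}, and x_t are assumed notation, since the slide image is not reproduced here.

```latex
f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)
```

Each component of f_t lies between 0 (forget this entry) and 1 (keep it in full) and multiplies the matching entry of the previous cell state C_{t-1}.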

LSTMs Intuition: Input Gate

• Should we update this “bit” of information or not?

– If so, with what?
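
The two questions map onto two layers in the standard formulation: a sigmoid input gate decides whether to update, and a tanh layer proposes the candidate values to write. A sketch with assumed notation (W_i, b_i, W_C, b_C are not named on the slide):

```latex
\begin{aligned}
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)
\end{aligned}
```

Here i_t answers "update or not?" and \tilde{C}_t answers "with what?".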

LSTMs Intuition: Memory Update

• Forget that + memorize this
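
A small worked example of the update, assuming the standard rule C_t = f_t * C_{t-1} + i_t * C̃_t with element-wise products. The gate values below are hand-picked extremes for illustration; in a trained network they come out of sigmoid layers and are rarely exactly 0 or 1.

```python
import numpy as np

c_prev = np.array([5.0, 5.0])    # old memory, two "bits"
c_tilde = np.array([0.3, 0.3])   # candidate values proposed for each bit
f = np.array([1.0, 0.0])         # forget gate: keep bit 0, forget bit 1
i = np.array([0.0, 1.0])         # input gate: leave bit 0 alone, write bit 1

# "Forget that + memorize this"
c = f * c_prev + i * c_tilde
print(c)  # [5.  0.3]
```

Bit 0 carries its old value forward untouched, while bit 1 is overwritten by the candidate.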

LSTMs Intuition: Output Gate

• Should we output this “bit” of information to “deeper” layers?
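
In the standard formulation the output gate filters a squashed copy of the cell state to produce the hidden output. A sketch with assumed notation (W_o and b_o are not named on the slide; \odot denotes the element-wise product):

```latex
\begin{aligned}
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```

The memory C_t can therefore hold information that is not yet exposed in h_t; the gate decides when it becomes visible to the next time step and to layers above.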

LSTMs

• A pretty sophisticated cell
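
Putting the gates together, one step of the cell can be sketched in a few lines of NumPy. This assumes the standard (no-peephole) formulation; stacking all four layers into a single weight matrix `W`, the row order, and the names `lstm_step` and `sigmoid` are choices made for this sketch, not something the slides specify.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*H, H+D) and b has shape (4*H,);
    rows are stacked in the order forget, input, candidate, output."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0:H])               # forget gate
    i = sigmoid(z[H:2 * H])           # input gate
    c_tilde = np.tanh(z[2 * H:3 * H]) # candidate memory
    o = sigmoid(z[3 * H:4 * H])       # output gate
    c = f * c_prev + i * c_tilde      # memory update
    h = o * np.tanh(c)                # gated output
    return h, c

# Toy sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4 * H, H + D))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(6, D)):
    h, c = lstm_step(x_t, h, c, W, b)
```

Note that `c` is only ever scaled and added to, never pushed through a weight matrix between steps, which is what lets information survive over many time steps.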

LSTM Variants #1: Peephole Connections

• Let gates see the cell state / memory
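
In equation form, the gates take the cell state as an extra input. A sketch following the version shown in the Olah post credited on these slides, with assumed notation; some implementations add peepholes to only a subset of the gates.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [C_{t-1}, h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i \cdot [C_{t-1}, h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o \cdot [C_t, h_{t-1}, x_t] + b_o\right)
\end{aligned}
```

The forget and input gates look at the old memory C_{t-1}, while the output gate looks at the freshly updated C_t.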

LSTM Variants #2: Coupled Gates

• Only memorize new if forgetting old
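
Coupling means the input gate is no longer learned separately but tied to the forget gate. A sketch of the resulting memory update, with assumed notation (\odot is the element-wise product, \tilde{C}_t the candidate memory):

```latex
C_t = f_t \odot C_{t-1} + (1 - f_t) \odot \tilde{C}_t
```

Where f_t is close to 1 the old value is kept and almost nothing new is written; where f_t is close to 0 the old value is dropped and the candidate takes its place.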

LSTM Variants #3: Gated Recurrent Units

• Changes:

– No explicit memory; memory = hidden output

– Z = memorize new and forget old
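
A minimal NumPy sketch of one GRU step, assuming the formulation in the Olah post credited on these slides (no bias terms, update gate `z` playing the role of the slide's Z, plus a reset gate `r`). The function and weight names are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU step; each weight matrix has shape (H, H+D)."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(W_z @ hx)   # update gate: how much new to memorize (and old to forget)
    r = sigmoid(W_r @ hx)   # reset gate: how much of the old state feeds the candidate
    h_tilde = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))
    # The hidden state doubles as the memory; z couples forgetting and memorizing.
    return (1.0 - z) * h_prev + z * h_tilde

# Toy sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
D, H = 3, 4
W_z, W_r, W_h = (rng.normal(scale=0.1, size=(H, H + D)) for _ in range(3))
h = np.zeros(H)
for x_t in rng.normal(size=(6, D)):
    h = gru_step(x_t, h, W_z, W_r, W_h)
```

Compared with the LSTM there is no separate cell state and no output gate: whatever is stored in `h` is also what the next layer sees.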

RMSProp Intuition

• Gradients ≠ direction to the optimum

– Gradients point in the direction of steepest ascent locally

– Not where we want to go long term

• Mismatched gradient magnitudes

– Large magnitude = we should travel a small distance

– Small magnitude = we should travel a large distance

Image Credit: Geoffrey Hinton

RMSProp Intuition

• Keep a running average of previous squared gradients to get an idea of their magnitudes across mini-batches

• Divide the gradient by the square root of this accumulated value
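
A minimal sketch of the update rule, assuming the usual RMSProp form (exponential moving average of squared gradients, then division by its square root). The hyperparameter values, the name `rmsprop_update`, and the toy quadratic objective are assumptions for illustration only.

```python
import numpy as np

def rmsprop_update(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp step: track a running average of squared gradients and divide by its root."""
    cache = decay * cache + (1.0 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Badly scaled toy objective: f(w) = 0.5 * (100 * w[0]**2 + 0.01 * w[1]**2),
# so the two coordinates have gradients that differ by a factor of 10,000.
scales = np.array([100.0, 0.01])
w0 = np.array([1.0, 1.0])

# A single step from w0: both coordinates move by nearly the same amount.
first_step, _ = rmsprop_update(w0, scales * w0, np.zeros_like(w0))

# A short run of updates on the same objective.
w, cache = w0.copy(), np.zeros_like(w0)
for _ in range(200):
    w, cache = rmsprop_update(w, scales * w, cache)
```

Because each coordinate is divided by its own gradient scale, the step size no longer depends on how steep that direction is. Plain gradient descent with one learning rate would either crawl along `w[1]` or overshoot along `w[0]`.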
