Introduction to Keras for Deep Learning

 
Notes from using Keras. Thanks to classmate 沈昇勳 for providing the figures.

Introduction to the world of deep learning with Keras, a popular deep learning library developed by François Chollet. Learn about Keras, Theano, TensorFlow, and how to train neural networks for tasks like handwriting digit recognition using the MNIST dataset. Explore different activation functions, network configurations, optimization algorithms, and training strategies in Keras. Get started with practical examples and resources to enhance your deep learning skills.

  • Deep Learning
  • Keras
  • Theano
  • TensorFlow
  • Neural Networks


Presentation Transcript


  1. Hello world of deep learning

  2. Keras is an interface to TensorFlow or Theano. TensorFlow and Theano are very flexible but need some effort to learn; Keras is easy to learn and use (and still has some flexibility), and you can modify it if you can write TensorFlow or Theano. If you want to learn Theano: http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Theano%20DNN.ecm.mp4/index.html and http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/RNN%20training%20(v6).ecm.mp4/index.html

  3. Keras: François Chollet is the author of Keras. He currently works for Google as a deep learning engineer and researcher. Keras means “horn” in Greek. Documentation: http://keras.io/ Examples: https://github.com/fchollet/keras/tree/master/examples

  4. Keras

  5. Hello world: handwriting digit recognition. The machine takes a 28 x 28 pixel image and outputs the digit, e.g. “1”. MNIST data: http://yann.lecun.com/exdb/mnist/ Keras provides a dataset loading function: http://keras.io/datasets/ (see the loading sketch below).
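
A minimal sketch of loading MNIST through the Keras dataset helper mentioned above; the shapes in the comments assume the standard 60000/10000 MNIST split:

```python
# Sketch: load MNIST through Keras' built-in dataset loader.
from keras.datasets import mnist

# x_*: uint8 images of shape (num_examples, 28, 28); y_*: integer digit labels 0-9.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
```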

  6. Keras: the network takes the 28x28 = 784 input pixels, passes them through two fully connected hidden layers of 500 units each (available activations include softplus, softsign, relu, tanh, hard_sigmoid, linear), and ends with a 10-unit Softmax output layer producing y1, y2, …, y10.
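
A minimal sketch of that architecture in Keras code; the layer sizes come from the slide, relu is just one of the listed activation choices, and the call signatures assume a reasonably recent (Keras 2-style) API:

```python
# Sketch: 784 -> 500 -> 500 -> 10 fully connected network.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(500, input_dim=28 * 28, activation='relu'))  # hidden layer 1
model.add(Dense(500, activation='relu'))                      # hidden layer 2
model.add(Dense(10, activation='softmax'))                    # outputs y1 ... y10
```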

  7. Keras: define the loss (objective) function. Several alternatives: https://keras.io/objectives/ (the compile sketch after slide 8 shows where the loss goes).

  8. Keras Step 3.1: Configuration: choose an optimizer (SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam). Step 3.2: Find the optimal network parameters from the training data (images) and labels (digits); this is explained in the following slides.
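
A minimal compile sketch tying together the loss from slide 7 and one of the optimizers listed here; categorical_crossentropy and adam are example choices, not the only valid ones:

```python
# Step 3.1 (configuration): choose a loss and an optimizer.
model.compile(loss='categorical_crossentropy',  # one of the objectives from slide 7
              optimizer='adam',                 # or 'sgd', 'rmsprop', 'adagrad', ...
              metrics=['accuracy'])
```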

  9. Keras Step 3.2: Find the optimal network parameters. The training images are a numpy array of shape (number of training examples, 28 x 28 = 784) and the labels are a numpy array of shape (number of training examples, 10), i.e. one-hot digit vectors. See https://www.tensorflow.org/versions/r0.8/tutorials/mnist/beginners/index.html
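
A sketch of preparing those numpy arrays and running training. The flattening to 784, the batch size of 100 and the 20 repetitions follow the slides; the scaling to [0, 1], the `to_categorical` helper for one-hot labels and the Keras 2-style `epochs` argument are assumptions:

```python
from keras.utils import to_categorical

# Flatten 28x28 images to 784-dimensional vectors and scale pixels to [0, 1].
x_train = x_train.reshape(-1, 28 * 28).astype('float32') / 255
x_test = x_test.reshape(-1, 28 * 28).astype('float32') / 255

# One-hot encode the digit labels: shape (num_examples, 10).
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Step 3.2: mini-batch training, 100 examples per batch, repeated for 20 epochs.
model.fit(x_train, y_train, batch_size=100, epochs=20)
```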

  10. We do not really minimize total loss! Mini-batch: randomly initialize the network parameters. Pick the 1st batch and compute its loss $L' = l^1 + l^{31} + \cdots$, where $l^1, l^{31}, \ldots$ are the per-example losses of the NN on the examples $(x^1, y^1), (x^{31}, y^{31}), \ldots$ in that batch; update the parameters once. Pick the 2nd batch, compute $L'' = l^2 + l^{16} + \cdots$, and update the parameters once again. Continue until all mini-batches have been picked: that is one epoch. Repeat the above process.

  11. Batch size influences both speed and performance; you have to tune it. In the example above there are 100 examples in a mini-batch, and the whole process (one epoch per pass over all mini-batches) is repeated 20 times. Batch size = 1 is stochastic gradient descent.
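
For comparison, stochastic gradient descent in this setting is simply the same hypothetical fit call from the previous sketch with a batch size of 1:

```python
# Batch size = 1: stochastic gradient descent (one parameter update per example).
model.fit(x_train, y_train, batch_size=1, epochs=20)
```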

  12. Speed: a smaller batch size means more updates in one epoch. E.g. with 50000 examples, batch size = 1 gives 50000 updates per epoch while batch size = 10 gives 5000. On a GTX 980 with the 50000 MNIST training examples, one epoch takes 166s with batch size = 1 but only 17s with batch size = 10, so batch size = 1 and 10 perform the same number of updates in the same amount of time; batch size = 10 is more stable and converges faster. On the other hand, a very large batch size can yield worse performance.

  13. Speed - Matrix Operation. The forward pass of the network is a chain of matrix operations: with weight matrices $W^1, W^2, \ldots, W^L$, bias vectors $b^1, b^2, \ldots, b^L$ and activation function $\sigma$, the layer outputs are $a^1 = \sigma(W^1 x + b^1)$, $a^2 = \sigma(W^2 a^1 + b^2)$, and so on, so that $y = f(x) = \sigma(W^L \cdots \sigma(W^2\,\sigma(W^1 x + b^1) + b^2) \cdots + b^L)$. (The backward pass is similar.)
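
A small numpy sketch of that forward pass written as matrix operations; the layer sizes mirror the earlier network, sigmoid stands in for the activation, and the random weights are purely illustrative:

```python
import numpy as np

def sigma(z):
    # Elementwise sigmoid as an example activation.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [784, 500, 500, 10]                  # input, two hidden layers, output
Ws = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros((m, 1)) for m in sizes[1:]]

x = rng.standard_normal((784, 1))            # one input column vector
a = x
for W, b in zip(Ws, bs):                     # a^k = sigma(W^k a^{k-1} + b^k)
    a = sigma(W @ a + b)
y = a                                        # network output y = f(x)
```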

  14. Speed - Matrix Operation. Why is mini-batch faster than stochastic gradient descent? With stochastic gradient descent the first-layer products $z^1 = W^1 x$ are computed one example at a time, one matrix-vector product each. With a mini-batch the examples are stacked as the columns of a matrix, so a single matrix-matrix product $[\,z^1 \;\; z^{1\prime} \;\; \cdots\,] = W^1 [\,x \;\; x' \;\; \cdots\,]$ handles the whole batch. Practically, which one is faster? The matrix version, because it makes much better use of parallel hardware such as GPUs.
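
A tiny numpy illustration of the two options; both produce the same numbers, but the single matrix-matrix product is the one that exploits parallel hardware (sizes here are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((500, 784))
batch = rng.standard_normal((784, 100))      # 100 examples stacked as columns

# Stochastic-gradient-descent style: one matrix-vector product per example.
zs = [W1 @ batch[:, i:i + 1] for i in range(batch.shape[1])]

# Mini-batch style: one matrix-matrix product for all 100 examples at once.
Z = W1 @ batch

assert np.allclose(np.hstack(zs), Z)         # identical results, computed in one shot
```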

  15. Keras: save and load models, see http://keras.io/getting-started/faq/#how-can-i-save-a-keras-model. How to use the trained neural network (testing): case 1: evaluate it on a labeled test set; case 2: predict outputs for new inputs.
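
A sketch of both testing cases plus saving and reloading, assuming the model and arrays from the earlier sketches and a Keras version that provides `model.save` / `load_model` (the FAQ link above lists the alternatives):

```python
from keras.models import load_model

# Case 1: labeled test data -> overall loss and accuracy.
score = model.evaluate(x_test, y_test)
print('test loss and accuracy:', score)

# Case 2: unlabeled inputs -> per-class probabilities; argmax gives the digit.
probs = model.predict(x_test)

# Save the whole model (architecture + weights) and load it back later.
model.save('mnist_model.h5')
model = load_model('mnist_model.h5')
```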

  16. Keras: using the GPU to speed up training (with the Theano backend). Way 1: launch your script as THEANO_FLAGS=device=gpu0 python YourCode.py. Way 2 (in your code, before Theano/Keras is imported): import os; os.environ["THEANO_FLAGS"] = "device=gpu0"

  17. Live Demo
