Introduction to Keras for Deep Learning

 
Notes from using Keras. Thanks to classmate 沈昇勳 for providing the figures.

Introduction to the world of deep learning with Keras, a popular deep learning library developed by François Chollet. Learn about Keras, Theano, TensorFlow, and how to train neural networks for tasks like handwriting digit recognition using the MNIST dataset. Explore different activation functions, network configurations, optimization algorithms, and training strategies in Keras. Get started with practical examples and resources to enhance your deep learning skills.

  • Deep Learning
  • Keras
  • Theano
  • TensorFlow
  • Neural Networks


Presentation Transcript


  1. Hello world of deep learning

  2. Keras is an interface to TensorFlow or Theano. TensorFlow and Theano are very flexible but need some effort to learn; Keras is easy to learn and use (and still has some flexibility), and you can modify it if you can write TensorFlow or Theano. If you want to learn Theano: http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Theano%20DNN.ecm.mp4/index.html and http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/RNN%20training%20(v6).ecm.mp4/index.html

  3. Keras: François Chollet is the author of Keras. He currently works for Google as a deep learning engineer and researcher. Keras means “horn” in Greek. Documentation: http://keras.io/ Examples: https://github.com/fchollet/keras/tree/master/examples

  4. Keras

  5. Hello world: handwriting digit recognition. The machine takes a 28 x 28 pixel image and outputs the digit, e.g. “1”. MNIST data: http://yann.lecun.com/exdb/mnist/ Keras provides a dataset loading function: http://keras.io/datasets/ (see the loading sketch below).
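
A minimal sketch of loading MNIST through the Keras dataset helper mentioned above; the shapes in the comments assume the standard 60000/10000 MNIST split:

```python
# Sketch: load MNIST through Keras' built-in dataset loader.
from keras.datasets import mnist

# x_*: uint8 images of shape (num_examples, 28, 28); y_*: integer digit labels 0-9.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
```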

  6. Keras: the network takes the 28x28 = 784 input pixels, passes them through two fully connected hidden layers of 500 units each (available activations include softplus, softsign, relu, tanh, hard_sigmoid, linear), and ends with a 10-unit Softmax output layer producing y1, y2, …, y10.
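
A minimal sketch of that architecture in Keras code; the layer sizes come from the slide, relu is just one of the listed activation choices, and the call signatures assume a reasonably recent (Keras 2-style) API:

```python
# Sketch: 784 -> 500 -> 500 -> 10 fully connected network.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(500, input_dim=28 * 28, activation='relu'))  # hidden layer 1
model.add(Dense(500, activation='relu'))                      # hidden layer 2
model.add(Dense(10, activation='softmax'))                    # outputs y1 ... y10
```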

  7. Keras: define the loss (objective) function. Several alternatives: https://keras.io/objectives/ (the compile sketch after slide 8 shows where the loss goes).

  8. Keras Step 3.1: Configuration: choose an optimizer (SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam). Step 3.2: Find the optimal network parameters from the training data (images) and labels (digits); this is explained in the following slides.
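
A minimal compile sketch tying together the loss from slide 7 and one of the optimizers listed here; categorical_crossentropy and adam are example choices, not the only valid ones:

```python
# Step 3.1 (configuration): choose a loss and an optimizer.
model.compile(loss='categorical_crossentropy',  # one of the objectives from slide 7
              optimizer='adam',                 # or 'sgd', 'rmsprop', 'adagrad', ...
              metrics=['accuracy'])
```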

  9. Keras Step 3.2: Find the optimal network parameters. The training images are a numpy array of shape (number of training examples, 28 x 28 = 784) and the labels are a numpy array of shape (number of training examples, 10), i.e. one-hot digit vectors. See https://www.tensorflow.org/versions/r0.8/tutorials/mnist/beginners/index.html
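
A sketch of preparing those numpy arrays and running training. The flattening to 784, the batch size of 100 and the 20 repetitions follow the slides; the scaling to [0, 1], the `to_categorical` helper for one-hot labels and the Keras 2-style `epochs` argument are assumptions:

```python
from keras.utils import to_categorical

# Flatten 28x28 images to 784-dimensional vectors and scale pixels to [0, 1].
x_train = x_train.reshape(-1, 28 * 28).astype('float32') / 255
x_test = x_test.reshape(-1, 28 * 28).astype('float32') / 255

# One-hot encode the digit labels: shape (num_examples, 10).
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Step 3.2: mini-batch training, 100 examples per batch, repeated for 20 epochs.
model.fit(x_train, y_train, batch_size=100, epochs=20)
```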

  10. We do not really minimize total loss! Mini-batch: randomly initialize the network parameters. Pick the 1st batch and compute its loss $L' = l^1 + l^{31} + \cdots$, where $l^1, l^{31}, \ldots$ are the per-example losses of the NN on the examples $(x^1, y^1), (x^{31}, y^{31}), \ldots$ in that batch; update the parameters once. Pick the 2nd batch, compute $L'' = l^2 + l^{16} + \cdots$, and update the parameters once again. Continue until all mini-batches have been picked: that is one epoch. Repeat the above process.

  11. Batch size influences both speed and performance; you have to tune it. In the example above there are 100 examples in a mini-batch, and the whole process (one epoch per pass over all mini-batches) is repeated 20 times. Batch size = 1 is stochastic gradient descent.
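
For comparison, stochastic gradient descent in this setting is simply the same hypothetical fit call from the previous sketch with a batch size of 1:

```python
# Batch size = 1: stochastic gradient descent (one parameter update per example).
model.fit(x_train, y_train, batch_size=1, epochs=20)
```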

  12. Speed: a smaller batch size means more updates in one epoch. E.g. with 50000 examples, batch size = 1 gives 50000 updates per epoch while batch size = 10 gives 5000. On a GTX 980 with the 50000 MNIST training examples, one epoch takes 166s with batch size = 1 but only 17s with batch size = 10, so batch size = 1 and 10 perform the same number of updates in the same amount of time; batch size = 10 is more stable and converges faster. On the other hand, a very large batch size can yield worse performance.

  13. Speed - Matrix Operation. The forward pass of the network is a chain of matrix operations: with weight matrices $W^1, W^2, \ldots, W^L$, bias vectors $b^1, b^2, \ldots, b^L$ and activation function $\sigma$, the layer outputs are $a^1 = \sigma(W^1 x + b^1)$, $a^2 = \sigma(W^2 a^1 + b^2)$, and so on, so that $y = f(x) = \sigma(W^L \cdots \sigma(W^2\,\sigma(W^1 x + b^1) + b^2) \cdots + b^L)$. (The backward pass is similar.)
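
A small numpy sketch of that forward pass written as matrix operations; the layer sizes mirror the earlier network, sigmoid stands in for the activation, and the random weights are purely illustrative:

```python
import numpy as np

def sigma(z):
    # Elementwise sigmoid as an example activation.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [784, 500, 500, 10]                  # input, two hidden layers, output
Ws = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros((m, 1)) for m in sizes[1:]]

x = rng.standard_normal((784, 1))            # one input column vector
a = x
for W, b in zip(Ws, bs):                     # a^k = sigma(W^k a^{k-1} + b^k)
    a = sigma(W @ a + b)
y = a                                        # network output y = f(x)
```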

  14. Speed - Matrix Operation. Why is mini-batch faster than stochastic gradient descent? With stochastic gradient descent the first-layer products $z^1 = W^1 x$ are computed one example at a time, one matrix-vector product each. With a mini-batch the examples are stacked as the columns of a matrix, so a single matrix-matrix product $[\,z^1 \;\; z^{1\prime} \;\; \cdots\,] = W^1 [\,x \;\; x' \;\; \cdots\,]$ handles the whole batch. Practically, which one is faster? The matrix version, because it makes much better use of parallel hardware such as GPUs.
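
A tiny numpy illustration of the two options; both produce the same numbers, but the single matrix-matrix product is the one that exploits parallel hardware (sizes here are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((500, 784))
batch = rng.standard_normal((784, 100))      # 100 examples stacked as columns

# Stochastic-gradient-descent style: one matrix-vector product per example.
zs = [W1 @ batch[:, i:i + 1] for i in range(batch.shape[1])]

# Mini-batch style: one matrix-matrix product for all 100 examples at once.
Z = W1 @ batch

assert np.allclose(np.hstack(zs), Z)         # identical results, computed in one shot
```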

  15. Keras: save and load models, see http://keras.io/getting-started/faq/#how-can-i-save-a-keras-model. How to use the trained neural network (testing): case 1: evaluate it on a labeled test set; case 2: predict outputs for new inputs.
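
A sketch of both testing cases plus saving and reloading, assuming the model and arrays from the earlier sketches and a Keras version that provides `model.save` / `load_model` (the FAQ link above lists the alternatives):

```python
from keras.models import load_model

# Case 1: labeled test data -> overall loss and accuracy.
score = model.evaluate(x_test, y_test)
print('test loss and accuracy:', score)

# Case 2: unlabeled inputs -> per-class probabilities; argmax gives the digit.
probs = model.predict(x_test)

# Save the whole model (architecture + weights) and load it back later.
model.save('mnist_model.h5')
model = load_model('mnist_model.h5')
```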

  16. Keras: using the GPU to speed up training (with the Theano backend). Way 1: launch your script as THEANO_FLAGS=device=gpu0 python YourCode.py. Way 2 (in your code, before Theano/Keras is imported): import os; os.environ["THEANO_FLAGS"] = "device=gpu0"

  17. Live Demo
