Regularization

CS-EJ3311 Deep Learning with Python
Alexander Jung
22.2.2025

What I want to teach you today:
basic idea of regularization
regularization via data augmentation
regularization via transfer learning

What is ML?
informal: learn a hypothesis out of a hypothesis space or “model”
that incurs minimum loss when predicting the labels of datapoints
based on their features
see Ch. 4.1 of mlbook.cs.aalto.fi

Example: find the hypothesis h(x) that minimizes the loss $L(h(x), y)$.
(figure: three training points and one validation point in the feature x / label y plane, with a fitted hypothesis curve)
training error: $E_{t} = \frac{1}{3}\sum_{i=1}^{3}\big(y^{(i)} - h(x^{(i)})\big)^2$
validation error: $E_{v} = \big(y^{(4)} - h(x^{(4)})\big)^2$

Data and Model Size
(figure: a training set of m datapoints, each with n features, and a hypothesis space (“hypospace”/model) of effective size d)
the crucial parameter is the ratio d/m

(figure: training error and validation error as functions of the ratio d/m)
adjust model and/or data to reach the “critical value” (d/m = 1)

bring d/m below the critical value 1:
increase m by using more training data
decrease d by using a smaller hypothesis space

Data Augmentation

add a bit of noise to the features
(figure: an original datapoint and two augmented copies in the feature x / label y plane)
we have enlarged the dataset by a factor of 3!

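A minimal NumPy sketch of this idea (the function name, noise level, and toy data are illustrative assumptions, not from the slides):

```python
import numpy as np

def augment_with_noise(X, y, copies=2, noise_std=0.1, seed=0):
    """Enlarge a dataset by adding Gaussian noise to the features.

    Each datapoint is duplicated `copies` times with perturbed
    features; the labels are kept unchanged. With copies=2 the
    dataset grows by a factor of 3 (original + 2 noisy copies).
    """
    rng = np.random.default_rng(seed)
    X_aug, y_aug = [X], [y]
    for _ in range(copies):
        X_aug.append(X + rng.normal(0.0, noise_std, size=X.shape))
        y_aug.append(y)
    return np.concatenate(X_aug), np.concatenate(y_aug)

# toy example: 3 datapoints with a single feature
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([1.5, 2.5, 3.5])
X_big, y_big = augment_with_noise(X, y)
print(X_big.shape)  # (9, 1) -- dataset enlarged by factor 3
```
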
rotated cat image is still a cat image
flipped cat image is still a cat image
shifted cat image is still a cat image

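In Keras, such label-preserving image transformations are available as preprocessing layers; a minimal sketch, with illustrative parameter values and random stand-in images:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# label-preserving image transformations as Keras preprocessing layers
data_augmentation = keras.Sequential([
    layers.RandomRotation(0.1),         # rotated cat image is still a cat image
    layers.RandomFlip("horizontal"),    # flipped cat image is still a cat image
    layers.RandomTranslation(0.1, 0.1), # shifted cat image is still a cat image
])

# apply to a batch of images, e.g. inside a model or a tf.data pipeline
images = tf.random.uniform((8, 224, 224, 3))  # stand-in for cat photos
augmented = data_augmentation(images, training=True)
```
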
bring d/m below the critical value 1:
increase m by using more training data
decrease d by using a smaller hypothesis space

replace the original ERM

$\min_{h \in \mathcal{H}} \frac{1}{m}\sum_{i=1}^{m} L\big((x^{(i)}, y^{(i)}), h\big)$

with ERM over a smaller hypothesis space $\mathcal{H}'$:

$\min_{h \in \mathcal{H}'} \frac{1}{m}\sum_{i=1}^{m} L\big((x^{(i)}, y^{(i)}), h\big)$

Prune Network Architecture
(figure: nested hypothesis spaces: one hidden layer ⊂ two hidden layers ⊂ three hidden layers)

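One way to realize this pruning in Keras is simply to build the network with fewer hidden layers; a sketch, where the input dimension, layer width, and output head are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def make_mlp(num_hidden_layers, width=64):
    """Build an MLP; fewer hidden layers give a smaller hypothesis space."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(10,)))  # hypothetical feature dimension
    for _ in range(num_hidden_layers):
        model.add(layers.Dense(width, activation="relu"))
    model.add(layers.Dense(1))  # e.g. a single regression output
    return model

small_model = make_mlp(1)  # one hidden layer
large_model = make_mlp(3)  # three hidden layers: larger effective d
```
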
Prune Hypospace by Early Stopping
(figure: nested sets of hypotheses reachable after 10, 100, and 10000 training iterations)

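In Keras, early stopping is available as a callback; a sketch, where the monitored metric and patience value are illustrative choices and the model and data are assumed to be defined elsewhere:

```python
from tensorflow import keras

# stop training once the validation loss stops improving; running fewer
# iterations restricts the set of hypotheses that training can reach
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,                 # illustrative: tolerate 5 epochs without improvement
    restore_best_weights=True,  # roll back to the best weights seen
)

# usage (model, training and validation data assumed to be defined):
# model.fit(X_train, y_train,
#           validation_data=(X_val, y_val),
#           epochs=10000, callbacks=[early_stopping])
```
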
Transfer Learning
(figure: the set of all possible maps h(.), with a reference hypothesis (a pretrained net) marked inside it)

Problem I: classify an image as “shows a border collie” vs. not
Problem II: classify an image as “shows a dog” vs. not
Problem I is our main interest, but we have only little training data $\mathcal{D}^{(1)}$ for it
much more labeled data $\mathcal{D}^{(2)}$ is available for Problem II
pre-train a hypothesis on $\mathcal{D}^{(2)}$, fine-tune on $\mathcal{D}^{(1)}$

(figure: pre-train a hypothesis on $\mathcal{D}^{(2)}$, then learn h by fine-tuning on $\mathcal{D}^{(1)}$)

fine-tuning on $\mathcal{D}^{(1)}$ solves

$\min_{h} \frac{1}{m}\sum_{i=1}^{m} L\big((x^{(i)}, y^{(i)}), h\big) + \mathcal{R}\big(h, \hat{h}\big)$

where the regularizer $\mathcal{R}(h, \hat{h})$ measures the distance of $h$ to the hypothesis $\hat{h}$ that was pre-trained on $\mathcal{D}^{(2)}$

Fine Tuning a Pretrained Net
(figure: the set of all possible maps h(.))
the learning rate/step size used during fine-tuning determines the effective model size

https://www.tensorflow.org/api_docs/python/tf/keras/applications/vgg16/VGG16

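A sketch of loading this pretrained net via the API linked above (the input shape is an illustrative choice):

```python
from tensorflow import keras

# load VGG16 pre-trained on ImageNet, without its classification head
base_model = keras.applications.VGG16(
    weights="imagenet",
    include_top=False,          # drop the ImageNet-specific output layers
    input_shape=(224, 224, 3),  # illustrative input size
)
base_model.summary()
```
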
Layer-Wise Fine Tuning
“freeze” the input layers, fine-tune the deeper layers
(figure: VGG network classifying an image as “cat”; source: https://www.quora.com/What-is-the-VGG-neural-network)

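Continuing the sketch above, layer-wise fine-tuning amounts to toggling the trainable flag per layer; the cut-off of four layers is an illustrative assumption:

```python
from tensorflow import keras

base_model = keras.applications.VGG16(weights="imagenet", include_top=False)

# “freeze” the input-side layers ...
for layer in base_model.layers[:-4]:
    layer.trainable = False
# ... and fine-tune only the deeper layers
for layer in base_model.layers[-4:]:
    layer.trainable = True
```
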
Feature Extraction
“frozen” input layers perform feature extraction
the frozen part is the “feature extractor” or “base” model; the trainable layers on top form the “head”
(figure: VGG network classifying an image as “cat”; source: https://www.quora.com/What-is-the-VGG-neural-network)

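A sketch of the frozen-base-plus-head pattern, in the spirit of the Keras transfer-learning guide linked below; the head architecture and the small learning rate (cf. the earlier slide on step size and effective model size) are illustrative choices:

```python
from tensorflow import keras
from tensorflow.keras import layers

# the frozen “base” model performs feature extraction
base_model = keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False

# a small trainable “head” on top of the extracted features
inputs = keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)  # run the base in inference mode
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)  # e.g. “cat” vs. not
model = keras.Model(inputs, outputs)

# a small learning rate keeps the fine-tuned net close to the pretrained one
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
```
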
https://keras.io/guides/transfer_learning/

Questions?