Optimizing Deep Learning: Methods and Insights

Explores gradient-free and derivative-free optimization methods for deep learning, with insights into the search space of deep networks and alternative approaches such as ant colony optimization and simulated annealing. Emphasizes the importance of architecture and suggests that even simpler training methods may perform well on large datasets.


Presentation Transcript


  1. Gradient-free optimization for deep learning. Usman Roshan, NJIT

  2. Derivative-free optimization
     Pros:
     - Can handle any activation function (for example, sign)
     - Free from vanishing and exploding gradient problems
     Cons:
     - May take longer than gradient search
     Does it work for deep learning, and what do we know there?
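To make the idea concrete, here is a minimal random-search sketch of derivative-free training (hypothetical, not code from the slides): a tiny one-hidden-layer network with a sign activation, whose weights are randomly perturbed and kept only when accuracy improves. The data, sizes, and step scale are all assumptions for illustration.

```python
# Minimal sketch (hypothetical): derivative-free training by random search.
# The sign activation is non-differentiable, so gradient descent does not
# apply directly, but random search never needs a gradient.
import numpy as np

def forward(W1, W2, X):
    # one hidden layer with a sign activation, sign output in {-1, 0, +1}
    return np.sign(np.sign(X @ W1) @ W2)

def accuracy(W1, W2, X, y):
    return np.mean(forward(W1, W2, X) == y)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))      # toy inputs
y = np.sign(rng.standard_normal(200))   # toy +/-1 labels

W1 = rng.standard_normal((10, 16))
W2 = rng.standard_normal(16)
best = accuracy(W1, W2, X, y)

for _ in range(1000):
    # propose a random perturbation; keep it only if accuracy improves
    C1 = W1 + 0.1 * rng.standard_normal(W1.shape)
    C2 = W2 + 0.1 * rng.standard_normal(W2.shape)
    acc = accuracy(C1, C2, X, y)
    if acc > best:
        W1, W2, best = C1, C2, acc

print(f"training accuracy after random search: {best:.2f}")
```

The trade-off the slide names shows up directly: the loop makes no use of gradient information, so it tolerates any activation, but each step is an undirected guess and convergence is typically much slower than gradient search.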

  3. What do we know about the search space of deep networks? From "The Loss Surfaces of Multilayer Networks" (Choromanska et al., AISTATS 2015). The paper argues that for large networks most local minima have loss values close to the global minimum, so poor local minima become rare as network size grows.

  4. Other methods for deep learning optimization
     - Ant colony optimization
     - Simulated annealing
     - Both report minor improvements
     - Previous studies show the importance of architecture
     - Even gradient descent may be overkill: random weights go a long way in deep learning, and dropout already zeros out many nodes
     - Perhaps even simpler training methods may do better on large datasets (see the simulated-annealing sketch below)
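As a concrete illustration of one of these alternatives, here is a minimal simulated-annealing sketch (hypothetical, not the method from the cited studies): it perturbs the weights of a linear sign classifier, always accepts improvements, and accepts worsening moves with a probability that shrinks as the temperature cools. The objective, data, and cooling schedule are assumptions for illustration.

```python
# Minimal sketch (hypothetical): simulated annealing over classifier weights.
# Worsening moves are accepted with probability exp(-delta / T), which lets
# the search escape local minima early on while T is still high.
import numpy as np

def loss(w, X, y):
    # misclassification rate of a linear sign classifier
    return np.mean(np.sign(X @ w) != y)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = np.sign(X @ rng.standard_normal(10))   # linearly separable toy labels

w = rng.standard_normal(10)
cur = loss(w, X, y)
T = 1.0
for _ in range(2000):
    cand_w = w + 0.1 * rng.standard_normal(10)
    cand = loss(cand_w, X, y)
    # always accept improvements; accept worse moves with prob exp(-delta/T)
    if cand < cur or rng.random() < np.exp(-(cand - cur) / T):
        w, cur = cand_w, cand
    T *= 0.999                             # geometric cooling schedule

print(f"final misclassification rate: {cur:.2f}")
```

The same loop applies unchanged to deeper networks: only the loss function changes, which is what makes annealing-style methods attractive when activations are non-differentiable.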
