Understanding Deep Generative Bayesian Networks in Machine Learning
An overview of the differences between regular neural networks and Bayesian neural networks, the advantages of the latter (robustness to small datasets, awareness of uncertainty, easy adaptation of existing architectures), the Bayesian theory behind these networks and how it compares with regular neural network training, and preliminary results on fitting noise distributions.
Presentation Transcript
Deep Generative Bayesian Network
Summary
1. Neural Networks vs Bayesian Neural Networks
2. Advantages of Bayesian Neural Networks
3. Bayesian Theory
4. Preliminary Results
1. Neural Networks vs Bayesian Neural Networks
Regular Neural Networks
- Fixed weights and biases (kinda boring)
- One set of features: one result
Bayesian Neural Network
- Weights and biases are Gaussian distributions (awesome)
- Monte Carlo sampling: the weights and biases are drawn from these distributions
- One set of features: different results possible
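To make this concrete, here is a minimal sketch (assuming PyTorch; this is not the presentation's actual code) of a Bayesian linear layer: every weight and bias has a learnable mean and standard deviation, and a fresh set of weights is drawn by Monte Carlo sampling on each forward pass, so the same input can produce different outputs.

```python
# Minimal sketch of a Bayesian linear layer (assumption: PyTorch; not the
# presenter's code). Each weight/bias is a Gaussian with learnable parameters.
import torch
import torch.nn as nn

class BayesianLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))

    def forward(self, x):
        # Monte Carlo sampling: draw one set of weights and biases per call
        # using the reparameterisation w = mu + softplus(rho) * eps.
        w_sigma = nn.functional.softplus(self.w_rho)
        b_sigma = nn.functional.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return nn.functional.linear(x, w, b)
```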
2. Advantages of Bayesian Neural Networks
Advantages vs Disadvantages

Advantages:
- Robust to small datasets (less prone to overfitting)
- Aware of its own uncertainties
- Gives a probability distribution as an output (see the sketch after this list)
- Can adapt easily to regular neural network architectures

Disadvantages:
- More computationally demanding
- More difficult to implement
- More complex theory
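The "probability distribution as an output" point follows directly from the sampling: running the same input through the network several times gives a spread of predictions whose mean and standard deviation estimate the prediction and its uncertainty. A rough sketch, reusing the BayesianLinear layer above (the architecture and sample count are arbitrary illustrations):

```python
# Illustrative only: predictive distribution from repeated stochastic
# forward passes (architecture and numbers are arbitrary choices).
model = nn.Sequential(BayesianLinear(2, 32), nn.ReLU(), BayesianLinear(32, 1))

x = torch.randn(8, 2)                                   # one batch of features
samples = torch.stack([model(x) for _ in range(100)])   # 100 Monte Carlo draws
pred_mean = samples.mean(dim=0)                         # point prediction
pred_std = samples.std(dim=0)                           # uncertainty estimate
```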
3. Bayesian Theory

Comparison to Regular Neural Network Theory
- Regular theory: minimize the loss, which is equivalent to maximizing the likelihood (see the derivation below).
- Bayesian theory: calculate the posterior distribution of the weights with Bayes' rule (exact inference).
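The slide's equations were not preserved in the transcript; the standard forms they refer to are reproduced here. For the usual losses (mean squared error, cross-entropy), minimizing the loss is maximizing the likelihood because the loss is, up to constants, the negative log-likelihood, while exact Bayesian inference keeps the whole posterior via Bayes' rule:

```latex
% Regular theory: minimizing the loss = maximizing the likelihood
\theta^{*} \;=\; \arg\min_{\theta} \mathcal{L}(\theta)
          \;=\; \arg\min_{\theta} \big(-\log p(D \mid \theta)\big)
          \;=\; \arg\max_{\theta} p(D \mid \theta)

% Bayesian theory: exact inference with Bayes' rule
p(\theta \mid D) \;=\; \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
\qquad
p(D) \;=\; \int p(D \mid \theta)\, p(\theta)\, d\theta
```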
Approximation
- The posterior distribution is approximated by a parametrized distribution.
- The quality of the approximation is measured with the Kullback-Leibler divergence (written out below).
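In standard variational-inference notation (the transcript lost the slide's own formula), with q_phi the parametrized distribution over the weights theta, the divergence to be minimized is:

```latex
\mathrm{KL}\big(q_{\phi}(\theta) \,\|\, p(\theta \mid D)\big)
  \;=\; \mathbb{E}_{q_{\phi}(\theta)}\!\left[
          \log q_{\phi}(\theta) - \log p(\theta \mid D)
        \right]
```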
Loss function
- The Evidence Lower BOund (ELBO): the network is trained by minimizing the negative ELBO (sketched below).
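The ELBO formula itself is missing from the transcript; in its standard form it is ELBO(phi) = E_{q_phi}[log p(D | theta)] - KL(q_phi(theta) || p(theta)), which differs from -KL(q_phi(theta) || p(theta | D)) only by the constant log p(D), so maximizing it minimizes the divergence above. A hedged sketch of the corresponding loss for the Gaussian weight posteriors introduced earlier, assuming a standard-normal prior and a Gaussian likelihood (neither is stated in the slides):

```python
# Sketch of a negative-ELBO loss for the Bayesian layers above (assumes a
# standard-normal prior on the weights and a Gaussian likelihood).
import torch
import torch.nn as nn

def gaussian_kl(mu, sigma):
    # Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over all entries.
    return 0.5 * (sigma.pow(2) + mu.pow(2) - 1.0 - 2.0 * sigma.log()).sum()

def negative_elbo(model, x, y, kl_factor=1.0):
    # Likelihood term: one Monte Carlo draw of the weights per call.
    pred = model(x)
    nll = nn.functional.mse_loss(pred, y, reduction="sum")  # Gaussian likelihood up to constants
    # KL term: closed-form KL summed over every Bayesian layer.
    kl = sum(
        gaussian_kl(m.w_mu, nn.functional.softplus(m.w_rho))
        + gaussian_kl(m.b_mu, nn.functional.softplus(m.b_rho))
        for m in model.modules() if isinstance(m, BayesianLinear)
    )
    return nll + kl_factor * kl
```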
4. Preliminary Results

Building custom rings
- Adding noise to the dataset; the goal is to reproduce this noise.
- Figure: true ring vs. noised ring.
The noise distribution
- Depends on the features: mean, width, position, angles.
- Make the model fit this distribution (an illustrative data generator is sketched below).
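The slides do not give the actual ring parameters or noise model, so the generator below is only a plausible stand-in: the angle is drawn uniformly and the radius from a Gaussian centred on the ring radius, with mean, width and position as free parameters.

```python
# Hypothetical noisy-ring generator; radius mean, width and centre are
# placeholder parameters, not values from the presentation.
import numpy as np

def noisy_ring(n_points, radius=1.0, width=0.05, center=(0.0, 0.0), rng=None):
    rng = np.random.default_rng() if rng is None else rng
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n_points)
    r = rng.normal(loc=radius, scale=width, size=n_points)   # noise on the radius
    x = center[0] + r * np.cos(angle)
    y = center[1] + r * np.sin(angle)
    return np.stack([x, y], axis=1)
```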
Fitting the noise distribution
- Working first on a simpler distribution.
- By minimizing the ELBO loss we can fit the distribution (an illustrative training loop is sketched below).
- Figure: blue = predicted distribution, orange = true distribution.
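A hedged sketch of what minimizing the ELBO loss can look like in practice, reusing BayesianLinear and negative_elbo from above; the toy dataset, optimizer, learning rate and number of steps are assumptions, since the "simpler distribution" is not specified in the slides.

```python
# Illustrative training loop: fit the Bayesian model by minimizing the
# negative ELBO (toy data and hyperparameters are guesses).
x_train = torch.rand(256, 1) * 2 - 1                       # placeholder 1-D inputs
y_train = torch.sin(3 * x_train) + 0.1 * torch.randn_like(x_train)  # placeholder targets

model = nn.Sequential(BayesianLinear(1, 64), nn.ReLU(), BayesianLinear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5000):
    optimizer.zero_grad()
    loss = negative_elbo(model, x_train, y_train)
    loss.backward()
    optimizer.step()
```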
Updates
No convergence (figure: blue = predicted distribution, orange = true distribution)
Partial convergence (figure: blue = predicted distribution, orange = true distribution)
Discrete probability distribution vs. real distribution (figure: blue = predicted distribution, orange = true distribution)
The CODE
Image loss and KL loss (figure: blue = factor switch, orange = kl_factor adjust)
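The "factor switch" and "kl_factor adjust" curves suggest two ways of weighting the KL loss against the image (reconstruction) loss during training; the exact schedules are not given in the transcript, so the two functions below are only plausible readings of those names.

```python
# Two illustrative KL-weighting schedules (the presentation's actual
# schedules are unknown; these are plausible stand-ins).
def kl_factor_switch(step, warmup_steps=1000):
    # "factor switch": drop the KL term during warm-up, then enable it.
    return 0.0 if step < warmup_steps else 1.0

def kl_factor_adjust(step, ramp_steps=5000):
    # "kl_factor adjust": ramp the KL weight up linearly to 1.
    return min(step / ramp_steps, 1.0)

# total_loss = image_loss + kl_factor(step) * kl_loss
```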