Understanding InfoGAN and BiGAN in Feature Extraction

Explore how InfoGAN and BiGAN are used for feature extraction: InfoGAN adds a classifier that predicts the code c that generated x, while BiGAN trains a discriminator to tell encoder pairs (x, En(x)) from decoder pairs (De(z), z), with z drawn from a prior distribution. The slides also cover their training algorithms, the interplay between discriminators, generators, and encoders in image generation, and related methods (VAE-GAN, domain-adversarial training, feature disentanglement).

  • InfoGAN
  • BiGAN
  • Feature Extraction
  • Generative Models
  • Image Generation


Presentation Transcript


  1. Feature Extraction

  2. InfoGAN — Motivation. In a regular GAN, each dimension of the input vector ideally controls one characteristic of the output (shown as colors in the slide's figure). What we expect: modifying a specific dimension changes one clear attribute. Actually: modifying a specific dimension has no clear meaning in the output.

  3. What is InfoGAN? — The generator's input is split into a code c and the remaining noise z; from them it generates x. The discriminator outputs a scalar judging whether x is real, and a classifier predicts the code c that generated x. Viewed as an autoencoder, the generator plays the role of the decoder and the classifier the role of the encoder (c → x → c). The classifier shares parameters with the discriminator; only the last layer is different.

  4. What is InfoGAN? — For the classifier to recover c from x, c must have a clear influence on x. Training the classifier to predict the code c that generated x therefore forces the generator to encode c visibly in its output. (A parameter-sharing sketch follows below.)
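A minimal PyTorch sketch of the parameter-sharing setup just described, assuming a flattened-image input; the module name, layer sizes, and activations are illustrative choices, not the paper's architecture:

```python
import torch.nn as nn

class DiscriminatorWithClassifier(nn.Module):
    """Discriminator and classifier share all layers except the last."""
    def __init__(self, x_dim=784, c_dim=10, hidden=256):
        super().__init__()
        self.body = nn.Sequential(                 # shared parameters
            nn.Linear(x_dim, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, hidden), nn.LeakyReLU(0.2),
        )
        self.dis_head = nn.Linear(hidden, 1)       # scalar: real vs. generated
        self.q_head = nn.Linear(hidden, c_dim)     # predicts the code c from x

    def forward(self, x):
        h = self.body(x)
        return self.dis_head(h), self.q_head(h)
```

The generator is then trained both to fool dis_head and to make q_head recover c (e.g. with a cross-entropy term on the predicted code), which is exactly what forces c to have a clear influence on x.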

  5. Reference: Xi Chen et al., "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets", https://arxiv.org/abs/1606.03657

  6. VAE-GAN — Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther, "Autoencoding beyond pixels using a learned similarity metric", ICML 2016. Pipeline: x → Encoder → z → Generator (Decoder) → x̃ → Discriminator → scalar. The encoder minimizes the reconstruction error and keeps z close to normal; the generator (decoder) minimizes the reconstruction error (x̃ as close as possible to x) and tries to cheat the discriminator; the discriminator discriminates among real, generated, and reconstructed images. The combination is VAE + GAN, where the discriminator provides the reconstruction loss for the VAE.

  7. Algorithm (VAE-GAN) — Initialize En, De, Dis. In each iteration:
     • Sample M images x_1, x_2, ..., x_M from the database
     • Generate M codes z̃_1, ..., z̃_M from the encoder: z̃_i = En(x_i)
     • Generate M reconstructed images x̃_1, ..., x̃_M from the decoder: x̃_i = De(z̃_i)
     • Sample M codes z_1, ..., z_M from the prior P(z)
     • Generate M generated images x̂_1, ..., x̂_M from the decoder: x̂_i = De(z_i)
     • Update En to decrease ||x̃_i − x_i|| and to decrease KL(P(z̃_i|x_i) || P(z))
     • Update De to decrease ||x̃_i − x_i|| and to increase Dis(x̃_i) and Dis(x̂_i)
     • Update Dis to increase Dis(x_i) and to decrease Dis(x̃_i) and Dis(x̂_i)
     Another kind of discriminator instead classifies an input x into three classes: real, generated (gen), or reconstructed (recon). (A code sketch of the iteration follows below.)
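A hedged PyTorch sketch of one such iteration, assuming En returns the mean and log-variance of a Gaussian, De maps codes to images, Dis outputs a single logit, and opt_g covers the En and De parameters (all names are ours, not the paper's):

```python
import torch
import torch.nn.functional as F

def vaegan_step(x, En, De, Dis, opt_g, opt_d, z_dim=64):
    M = x.size(0)
    mu, logvar = En(x)                               # assumed Gaussian encoder
    z_tilde = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
    x_tilde = De(z_tilde)                            # reconstructed images
    z = torch.randn(M, z_dim, device=x.device)       # codes from the prior P(z)
    x_hat = De(z)                                    # generated images

    ones = torch.ones(M, 1, device=x.device)
    zeros = torch.zeros(M, 1, device=x.device)
    bce = F.binary_cross_entropy_with_logits

    # Update Dis: increase Dis(x_i), decrease Dis(x~_i) and Dis(x^_i)
    d_loss = (bce(Dis(x), ones)
              + bce(Dis(x_tilde.detach()), zeros)
              + bce(Dis(x_hat.detach()), zeros))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Update En and De: reconstruction + KL + cheat the discriminator.
    # (The slide splits this: En gets recon + KL, De gets recon + the two
    # GAN terms; we combine them into one step for brevity.)
    recon = F.mse_loss(x_tilde, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    g_loss = recon + kl + bce(Dis(x_tilde), ones) + bce(Dis(x_hat), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```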

  8. BiGAN — The encoder maps a real image x to a code z; the decoder maps a code z sampled from the prior distribution to a generated image x. The discriminator sees an (image x, code z) pair and must decide: did it come from the encoder, or from the decoder?

  9. Algorithm (BiGAN) — Initialize encoder En, decoder De, discriminator Dis. In each iteration:
     • Sample M images x_1, x_2, ..., x_M from the database
     • Generate M codes z̃_1, ..., z̃_M from the encoder: z̃_i = En(x_i)
     • Sample M codes z_1, ..., z_M from the prior P(z)
     • Generate M images x̃_1, ..., x̃_M from the decoder: x̃_i = De(z_i)
     • Update Dis to increase Dis(x_i, z̃_i) and to decrease Dis(x̃_i, z_i)
     • Update En and De to decrease Dis(x_i, z̃_i) and to increase Dis(x̃_i, z_i)
     (A code sketch of the iteration follows below.)
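A minimal sketch of one such iteration, assuming Dis takes an (image, code) pair and returns a single logit; module and optimizer names are illustrative, with opt_g covering En and De:

```python
import torch
import torch.nn.functional as F

def bigan_step(x, En, De, Dis, opt_g, opt_d, z_dim=64):
    M = x.size(0)
    z_tilde = En(x)                                  # codes from the encoder
    z = torch.randn(M, z_dim, device=x.device)       # codes from the prior P(z)
    x_tilde = De(z)                                  # images from the decoder

    ones = torch.ones(M, 1, device=x.device)
    zeros = torch.zeros(M, 1, device=x.device)
    bce = F.binary_cross_entropy_with_logits

    # Update Dis: score encoder pairs (x, z~) up, decoder pairs (x~, z) down
    d_loss = (bce(Dis(x, z_tilde.detach()), ones)
              + bce(Dis(x_tilde.detach(), z), zeros))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Update En and De with the targets flipped, so they fool Dis
    g_loss = bce(Dis(x, En(x)), zeros) + bce(Dis(De(z), z), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```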

  10. BiGAN — Encoder pairs (real image x, code z̃ = En(x)) follow a joint distribution P(x, z); decoder pairs (generated image x̃ = De(z), code z from the prior) follow Q(x, z). The discriminator evaluates the difference between P and Q, and at the optimum P and Q would be the same. The optimal encoder and decoder then invert each other: for all x, if En(x) = z then De(z) = x; for all z, if De(z) = x then En(x) = z. (See the math sketch below.)
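Written out as math, this is a sketch of the slide's claim, assuming deterministic En and De:

```latex
% The two joint distributions the BiGAN discriminator compares:
P(x,z) = p_{\mathrm{data}}(x)\,\delta\big(z - \mathrm{En}(x)\big), \qquad
Q(x,z) = p_{\mathrm{prior}}(z)\,\delta\big(x - \mathrm{De}(z)\big)
% At the global optimum P = Q, which forces the two maps to invert
% each other almost everywhere:
\mathrm{De}(\mathrm{En}(x)) = x \;\;\forall x, \qquad
\mathrm{En}(\mathrm{De}(z)) = z \;\;\forall z
```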

  11. BiGAN — Given that the optimal En and De satisfy De(En(x)) = x for all x and En(De(z)) = z for all z, how does BiGAN compare with simply training the compositions as autoencoders, x → En → De → x̃ and z → De → En → z̃? The optimal solution is the same, but away from the optimum the two approaches can behave differently.

  12. Triple GAN — Chongxuan Li, Kun Xu, Jun Zhu, Bo Zhang, "Triple Generative Adversarial Nets", arXiv 2017.

  13. Domain-adversarial training — Training and testing data are in different domains. The generator (feature extractor) maps both into a feature space and is trained so that training-data features and testing-data features follow the same distribution. Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, "Domain-Adversarial Training of Neural Networks", JMLR, 2016.

  14. Domain-adversarial training — This is a big network, but different parts have different goals: the label predictor maximizes label classification accuracy; the domain classifier maximizes domain classification accuracy; and the feature extractor maximizes label classification accuracy while minimizing domain classification accuracy. The feature extractor must not only cheat the domain classifier but satisfy the label predictor at the same time. (A gradient-reversal sketch follows below.)
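The standard way to implement these opposing goals in a single backward pass is a gradient reversal layer, as in the cited paper. A PyTorch sketch; the class name and the λ value are our choices:

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)                    # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None   # flipped (scaled) gradient

# Usage inside the big network (hypothetical module names):
#   feat = feature_extractor(x)
#   label_loss  = ce(label_predictor(feat), y)                     # normal gradient
#   domain_loss = ce(domain_classifier(GradReverse.apply(feat, 1.0)), d)
#   (label_loss + domain_loss).backward()
# The domain classifier still learns to classify domains, while the
# reversed gradient pushes the feature extractor to confuse it.
```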

  15. Feature disentangle — The original seq2seq autoencoder (RNN encoder → RNN decoder) reconstructs the input audio segment, so its code includes phonetic information, speaker information, etc. To disentangle them, replace the single encoder with two: a phonetic encoder and a speaker encoder, whose outputs together drive the RNN decoder to reconstruct the input segment. (A sketch of the plain autoencoder follows below.)
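A minimal sketch of the original seq2seq autoencoder, before disentangling; the feature dimension (39, MFCC-like) and hidden size are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Seq2SeqAE(nn.Module):
    def __init__(self, feat_dim=39, hidden=128):
        super().__init__()
        self.enc = nn.GRU(feat_dim, hidden, batch_first=True)
        self.dec = nn.GRU(feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat_dim)

    def forward(self, x):                      # x: (batch, time, feat_dim)
        _, h = self.enc(x)                     # h: the segment embedding
        steps = torch.zeros_like(x)            # decoder driven by h alone
        y, _ = self.dec(steps, h)
        return self.out(y)                     # reconstructed segment

# Trained to minimize reconstruction error, e.g. F.mse_loss(model(x), x).
# The disentangled variant would split self.enc into a phonetic encoder
# and a speaker encoder and feed both embeddings to the decoder.
```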

  16. Feature disentangle — Assume we know the speaker ID of the segments. Constrain the speaker encoder: embeddings of segments x_i, x_j from the same speaker should be as close as possible, while embeddings of segments from different speakers should have a distance larger than a threshold. (A contrastive-loss sketch follows below.)
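This constraint can be written as a standard contrastive (hinge) loss; the margin value and the squared-distance form below are our assumptions, not necessarily the lecture's exact formulation:

```python
import torch
import torch.nn.functional as F

def speaker_pair_loss(e1, e2, same_speaker, margin=1.0):
    """e1, e2: (B, D) speaker embeddings; same_speaker: (B,) bool tensor."""
    d = F.pairwise_distance(e1, e2)       # (B,) Euclidean distances
    close = d.pow(2)                      # same speaker: pull together
    apart = F.relu(margin - d).pow(2)     # different: push past the margin
    return torch.where(same_speaker, close, apart).mean()
```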

  17. Feature disentangle — Inspired by domain-adversarial training: a speaker classifier takes the phonetic encoder's embeddings of two segments x_i, x_j and outputs a score for whether they come from the same speaker. The phonetic encoder learns to confuse the speaker classifier, pushing speaker information out of the phonetic embedding. (A training-step sketch follows below.)
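A hedged sketch of the resulting adversarial game, assuming a speaker classifier D_spk that takes two phonetic embeddings and returns one same-speaker logit per pair; all module and optimizer names are illustrative:

```python
import torch
import torch.nn.functional as F

def disentangle_step(seg_a, seg_b, same_spk, PhEnc, D_spk, opt_enc, opt_cls):
    """seg_a, seg_b: batches of segments; same_spk: (B,) bool labels."""
    bce = F.binary_cross_entropy_with_logits
    pa, pb = PhEnc(seg_a), PhEnc(seg_b)        # phonetic embeddings

    # 1) Train the speaker classifier to tell same vs. different speaker
    cls_loss = bce(D_spk(pa.detach(), pb.detach()), same_spk.float())
    opt_cls.zero_grad(); cls_loss.backward(); opt_cls.step()

    # 2) Train the phonetic encoder to confuse it: push scores toward 0.5
    score = torch.sigmoid(D_spk(pa, pb))
    enc_loss = ((score - 0.5) ** 2).mean()
    opt_enc.zero_grad(); enc_loss.backward(); opt_enc.step()
```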

  18. Audio segments of two different speakers

  19. Acknowledgement
