Auto-Encoding Variational Bayes (@ ICLR 2014)
Diederik P Kingma, Max Welling
Variational Autoencoders (VAEs) are a family of latent-variable deep generative models.
They work by optimizing the Evidence Lower Bound (ELBO) with an encoder-decoder
architecture. In the ELBO, the approximate posterior $q_\phi(z \mid x)$ is defined as an amortized
model: the encoder is a neural network that maps a data point $x$ to the variational parameters (e.g. a mean $\mu_\phi(x)$ and variance $\sigma^2_\phi(x)$),
and $p_\theta(x \mid z)$ is the decoder, mapping a latent sample $z$ back to $x$.
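Written out for a single data point $x$, the ELBO takes its standard form:

$$\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)$$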
The process is basically to (1) sample a data point $x$ from our dataset, which approximates
$p_{\text{data}}(x)$, (2) compute the variational parameters $\mu_\phi(x), \sigma_\phi(x)$ using the encoder, (3) sample
from the variational distribution $q_\phi(z \mid x)$, obtaining a latent $z$, (4) reconstruct $x$ from $z$
using the decoder $p_\theta(x \mid z)$, and (5) optimize the ELBO with respect to $\theta$ and $\phi$ using gradient updates (see the sketch below).
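As a concrete illustration, here is a minimal PyTorch-style sketch of one such training step. The layer sizes, the Bernoulli decoder, the single-sample ELBO estimate, and the toy batch are my assumptions for illustration, not taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=20, h_dim=400):
        super().__init__()
        # Encoder q_phi(z|x): maps x to variational parameters (mu, log sigma^2)
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Decoder p_theta(x|z): maps z back to Bernoulli logits over x
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparametrization trick: z = mu + sigma * eps, with eps ~ N(0, I)
        eps = torch.randn_like(mu)
        z = mu + torch.exp(0.5 * logvar) * eps
        return self.dec(z), mu, logvar

def neg_elbo(x, x_logits, mu, logvar):
    # Reconstruction term: single-sample Monte Carlo estimate of E_q[log p(x|z)]
    recon = -F.binary_cross_entropy_with_logits(x_logits, x, reduction='sum')
    # KL(q(z|x) || N(0, I)), available in closed form for a diagonal Gaussian
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return -(recon - kl)  # negative ELBO, to be minimized

# One training step on a toy batch (random binary data stands in for a dataset)
model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784).round()        # (1) sample a batch of data points
x_logits, mu, logvar = model(x)        # (2)-(4) encode, sample z, decode
loss = neg_elbo(x, x_logits, mu, logvar)
opt.zero_grad()
loss.backward()                        # (5) gradient update on the ELBO
opt.step()
```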
The sampling of $z$ is done via a reparametrization trick, where noise $\epsilon$ is taken
from a standard Gaussian and then deterministically rescaled and shifted by the variational
parameters, $z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon$. This gradient estimator has lower variance than REINFORCE (which you could always use instead).
There are many variants of the traditional VAE. The $\beta$-VAE puts more weight
on the KL term to encourage disentangled representations. The IWAE gets a tighter bound
by taking multiple importance-weighted Monte Carlo samples. The GMVAE uses a more flexible prior:
a Gaussian mixture instead of a single Gaussian. One can also use VAEs
in a semi-supervised setting, where classification labels are available for only a small subset of the data.
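For reference, the modified objectives of the first two variants in their standard forms, with $\beta > 1$ the KL weight and $K$ the number of importance samples:

$$\mathcal{L}_\beta = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] - \beta\,\mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)$$

$$\mathcal{L}_K = \mathbb{E}_{z_1, \dots, z_K \sim q_\phi(z \mid x)}\!\left[\log \frac{1}{K} \sum_{k=1}^{K} \frac{p_\theta(x, z_k)}{q_\phi(z_k \mid x)}\right]$$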