Variational Auto-Encoder

VAE

The first term of the VAE objective is also known as the reconstruction loss: it says that the input data should be likely under the distribution that the decoder finally outputs.
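
A sketch of this term in standard VAE notation (the symbols $q_\phi$ for the encoder and $p_\theta$ for the decoder are assumed here, not defined in these notes):

$$
\mathcal{L}_{\text{rec}} = \mathbb{E}_{z \sim q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]
$$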

The second term is a KL penalty: the encoder output (the approximate posterior) should stay close to the Gaussian prior that we defined.
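
Assuming the usual parameterization $q_\phi(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))$ with prior $p(z) = \mathcal{N}(0, I)$ (a common choice, not stated explicitly in these notes), this KL term has a closed form:

$$
D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, \mathcal{N}(0, I)\right) = \frac{1}{2}\sum_{j=1}^{d}\left(\mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1\right)
$$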

In a regular (deterministic) autoencoder the encoder produces a single latent code; the VAE encoder instead outputs the parameters of a distribution over the latent code.
The true data (image) likelihood is what we would like to maximize, but its decomposition is intractable because of the final term, which involves the true posterior.
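
A sketch of the decomposition being referred to (the standard VAE derivation):

$$
\log p_\theta(x) = \underbrace{\mathbb{E}_{z \sim q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)}_{\text{ELBO}} + D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right)
$$

The final KL term contains the true posterior $p_\theta(z \mid x)$ and is intractable, but it is always non-negative, so the first two terms form a lower bound on $\log p_\theta(x)$.
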
The variational lower bound (ELBO) is the tractable part of that decomposition and is the term that the VAE maximizes.
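
A minimal training-objective sketch in PyTorch (the MLP encoder/decoder and the layer sizes for flattened 28x28 images are illustrative assumptions, not taken from these notes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=400, z_dim=20):
        super().__init__()
        # Encoder: outputs the mean and log-variance of q(z|x).
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        # Decoder: maps a latent sample back to pixel probabilities.
        self.dec1 = nn.Linear(z_dim, h_dim)
        self.dec2 = nn.Linear(h_dim, x_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps the sampling step differentiable.
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def decode(self, z):
        h = F.relu(self.dec1(z))
        return torch.sigmoid(self.dec2(h))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term: the input should be likely under the decoder output.
    rec = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # KL term: closed-form divergence between q(z|x) and the N(0, I) prior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl  # minimizing this maximizes the ELBO
```

Minimizing `vae_loss` with any optimizer trains both networks end to end; at test time, decoding a sample $z \sim \mathcal{N}(0, I)$ generates new images.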