
Neural network basics

This page covers the fundamentals of neural networks: parameter initialization, weight decay, batch normalization, and activation functions.


Model parameter initialization

Xavier initialization:

Given that the variance of the input is $\gamma^2$ and the variance of each weight is $\sigma^2$, the output of a linear layer $h_i = \sum_{j=1}^{n_{in}} W_{ij} x_j$ has mean and variance

$$\mathbb{E}[h_i] = 0, \qquad \mathrm{Var}[h_i] = n_{in}\,\sigma^2\gamma^2.$$

Keeping unit variance requires $n_{in}\sigma^2 = 1$ in the forward pass and $n_{out}\sigma^2 = 1$ in the backward pass. Since both cannot hold at once unless $n_{in} = n_{out}$, we compromise by satisfying

$$\frac{n_{in} + n_{out}}{2}\,\sigma^2 = 1, \qquad \text{i.e.}\quad \sigma = \sqrt{\frac{2}{n_{in} + n_{out}}}.$$
Weight Decay

Penalize large weight vectors by adding their squared L2 norm to the overall loss:

$$L'(w) = L(w) + \frac{\lambda}{2}\,\lVert w \rVert_2^2.$$

For vanilla SGD this is equivalent to shrinking each weight by a factor of $(1 - \eta\lambda)$ before every gradient step, hence the name "weight decay".

Batch normalization

  1. Use the batch mean and batch variance to normalize each layer during training.
  2. Only use the moving mean and moving variance while doing inference.
  3. Update the moving mean and variance with the batch mean and variance, weighted by momentum.
  4. Update $\gamma$ and $\beta$ through backpropagation.

Activation functions

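Common choices (assuming the standard ReLU, sigmoid, and tanh as representative examples) behave as in the sketch below:

```python
# The usual activation functions; ReLU, sigmoid, and tanh are assumed here
# as the representative choices.
import torch

x = torch.linspace(-3.0, 3.0, steps=7)
relu = torch.relu(x)        # max(0, x): cheap, non-saturating for x > 0
sigmoid = torch.sigmoid(x)  # 1 / (1 + exp(-x)): squashes to (0, 1), saturates at both ends
tanh = torch.tanh(x)        # zero-centered, squashes to (-1, 1)
print(relu, sigmoid, tanh, sep="\n")
```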