The R-CNN family
R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN
Region proposal generator: Selective Search, Edge Boxes
Image descriptors: histogram of oriented gradients (HOG)
Classifier: Support Vector Machine
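As a rough illustration of the HOG descriptor listed above, the sketch below builds the gradient-orientation histogram for a single cell. It is a simplified version, assuming unsigned gradients, 9 bins, hard bin assignment, and no block normalization; the function name `cell_orientation_histogram` is made up for this example.

```python
import math

def cell_orientation_histogram(cell, n_bins=9):
    """Unsigned gradient-orientation histogram for one grayscale cell.

    `cell` is a list of rows of pixel intensities. Border pixels are
    skipped so that central differences are always defined.
    """
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            # Central differences approximate the image gradient.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist
```

A full HOG descriptor concatenates such histograms over many cells and normalizes them in overlapping blocks before passing the vector to the classifier.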
R-CNN extracts features from each region proposal using a CNN instead of hand-crafted descriptors such as HOG
Fast R-CNN builds a network that has only a single stage:
The input image is fed into a pretrained CNN to get a feature map
The input image is also used to generate region proposals
Projecting the region proposals onto the feature map yields regions of interest (RoIs)
RoI pooling layers extract a fixed-length feature vector from each RoI
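The projection and pooling steps above can be sketched in plain Python. This is a minimal sketch on a single-channel feature map, assuming a backbone stride of 16 and a 2x2 output grid; `roi_max_pool` and its parameters are illustrative names, not taken from any library.

```python
def roi_max_pool(feature_map, box, stride=16, out_size=2):
    """Max-pool one image-space box into an out_size x out_size grid.

    feature_map: 2D list (single channel) produced by the backbone.
    box: (x1, y1, x2, y2) in image pixels.
    stride: assumed total downsampling factor of the backbone.
    """
    fh, fw = len(feature_map), len(feature_map[0])
    # Project the box onto the feature map: divide by the stride and round.
    x1, y1, x2, y2 = (int(round(c / stride)) for c in box)
    x1, y1 = min(max(x1, 0), fw - 1), min(max(y1, 0), fh - 1)
    x2, y2 = min(max(x2, x1), fw - 1), min(max(y2, y1), fh - 1)
    roi_w, roi_h = x2 - x1 + 1, y2 - y1 + 1

    pooled = []
    for i in range(out_size):
        # Integer bin boundaries (floor for the start, ceil for the end);
        # every bin covers at least one feature-map cell.
        ys = y1 + (i * roi_h) // out_size
        ye = max(y1 + -(-((i + 1) * roi_h) // out_size), ys + 1)
        row = []
        for j in range(out_size):
            xs = x1 + (j * roi_w) // out_size
            xe = max(x1 + -(-((j + 1) * roi_w) // out_size), xs + 1)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled
```

Whatever the size of the input box, the output is always `out_size x out_size`, which is what lets the following fully connected layers accept RoIs of any shape.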
Fast R-CNN is faster than R-CNN because:
Fast R-CNN shares computations (i.e. convolutional layer calculations) across all proposals (i.e. RoIs) rather than doing the calculations for each proposal independently.
Fast R-CNN does not cache the extracted features to disk, so it needs far less storage than R-CNN, which needs hundreds of gigabytes.
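The effect of sharing computation can be illustrated by counting backbone forward passes. The sketch below uses a stand-in "backbone" that only counts its calls; `CountingBackbone`, `crop`, `rcnn_style`, and `fast_rcnn_style` are hypothetical names, and the identity feature map is a placeholder, not a real CNN.

```python
class CountingBackbone:
    """Stand-in for a CNN: returns its input and counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, pixels):
        self.calls += 1
        return pixels  # placeholder "feature map"


def crop(grid, box):
    """Cut the (x1, y1, x2, y2) window out of a 2D list."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in grid[y1:y2]]


def rcnn_style(backbone, image, proposals):
    # One backbone forward pass per cropped proposal.
    return [backbone(crop(image, p)) for p in proposals]


def fast_rcnn_style(backbone, image, proposals):
    # One backbone forward pass for the whole image,
    # then cheap per-proposal crops of the shared feature map.
    feature_map = backbone(image)
    return [crop(feature_map, p) for p in proposals]


image = [[x + 8 * y for x in range(8)] for y in range(8)]
proposals = [(0, 0, 4, 4), (2, 2, 6, 6), (1, 1, 3, 3)]

per_region, shared = CountingBackbone(), CountingBackbone()
rcnn_style(per_region, image, proposals)
fast_rcnn_style(shared, image, proposals)
print(per_region.calls, shared.calls)
```

With a real CNN the two approaches would not produce identical features, but the call counts show why the per-proposal cost drops: the expensive convolutional pass runs once per image instead of once per proposal.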
Faster R-CNN replaces the Selective Search step of Fast R-CNN with a Region Proposal Network (RPN).
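A Region Proposal Network scores and refines a fixed set of reference boxes (anchors) placed at every feature-map location. The sketch below only enumerates those anchors; the scoring and box regression heads are omitted. The stride of 16 and the three scales and three aspect ratios are assumed defaults for illustration, and `generate_anchors` is a made-up helper name.

```python
import math

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centred on every feature-map cell."""
    anchors = []
    for fy in range(feat_h):
        for fx in range(feat_w):
            # Centre of this feature-map cell in image coordinates.
            cx, cy = (fx + 0.5) * stride, (fy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:  # ratio = height / width
                    # Keep the area close to scale**2 while changing the shape.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors
```

For a feature map of size `feat_h x feat_w` this yields `feat_h * feat_w * len(scales) * len(ratios)` anchors; the network then predicts an objectness score and box offsets for each one.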
Mask R-CNN replaces the region of interest (RoI) pooling layer with an RoI alignment (RoIAlign) layer
RoIAlign avoids rounding RoI coordinates, which makes it more suitable for pixel-level prediction such as instance segmentation
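The difference from RoI pooling is that box coordinates are kept fractional and feature values are read with bilinear interpolation. Below is a simplified sketch that takes a single sample at the centre of each output bin (practical implementations usually take several samples per bin and aggregate them). The stride, the coordinate convention (cell values sit at integer coordinates), and the function names are assumptions for illustration.

```python
import math

def bilinear_sample(feature_map, y, x):
    """Bilinearly interpolate a 2D list at fractional coordinates (y, x)."""
    h, w = len(feature_map), len(feature_map[0])
    # Clamp to the valid range so the four neighbours always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature_map, box, stride=16, out_size=2):
    """Take one bilinear sample at the centre of each output bin."""
    # Keep fractional coordinates: no rounding when projecting the box.
    x1, y1, x2, y2 = (c / stride for c in box)
    bin_w, bin_h = (x2 - x1) / out_size, (y2 - y1) / out_size
    return [[bilinear_sample(feature_map,
                             y1 + (i + 0.5) * bin_h,
                             x1 + (j + 0.5) * bin_w)
             for j in range(out_size)]
            for i in range(out_size)]
```

Because nothing is snapped to the integer grid, a box shifted by a few pixels produces slightly different features, which is the property that matters when predicting masks.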