Commit b23eee0c authored by Stzpz's avatar Stzpz Committed by Francisco Massa

Supported FBNet architecture. (#463)

* Supported any feature map size for average pooling.
* Different models may have different feature map sizes.

* Used registry to register keypoint and mask heads.

* Passing in/out channels between modules when creating the model.

This simplifies the code that computes the input channels for feature extractors and makes the predictors independent of the backbone architectures.
* Passed in_channels to rpn and head builders.
* Set out_channels to model modules including backbone and feature extractors.
* Moved cfg.MODEL.BACKBONE.OUT_CHANNELS to cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS as it is not used by all architectures. Updated config files accordingly.

For new architecture modules, the returned module needs to contain a field named `out_channels` indicating its output channel size.
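The `out_channels` contract can be sketched as follows (hypothetical stand-in classes, not the real torch `nn.Module`s):

```python
# Sketch of the `out_channels` contract: a builder attaches the attribute
# to the module it returns, and downstream builders read it instead of
# consulting the config. DummyBackbone/build_rpn are illustrative names.

class DummyBackbone:
    """Stand-in for the nn.Sequential returned by a backbone builder."""
    def __init__(self, out_channels):
        self.out_channels = out_channels

def build_rpn(backbone):
    # the RPN builder consumes the advertised channel count
    return {"in_channels": backbone.out_channels}

backbone = DummyBackbone(out_channels=256)
rpn = build_rpn(backbone)  # rpn["in_channels"] == 256
```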

* Added unit test for box_coder and nms.

* Added FBNet architecture.

* FBNet is a general architecture definition to support efficient architecture search and MaskRCNN2GO.
* Included various efficient building blocks (inverted residual, shuffle, separable dw conv, dw upsampling, etc.).
* Supported building backbone, rpn, detection, keypoint and mask heads using efficient building blocks.
* Architectures can be defined in `fbnet_modeldef.py` or directly in `cfg.MODEL.FBNET.ARCH_DEF`.
* A few baseline architectures are included.
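As a sketch of the inline option, `cfg.MODEL.FBNET.ARCH_DEF` holds a JSON string that parses into the same dict shape as the entries in `fbnet_modeldef.py` (the numbers below are placeholders, not a tuned architecture):

```python
import json

# Illustrative ARCH_DEF payload; the dict shape mirrors fbnet_modeldef.py
# entries, but the values here are placeholders only.
arch_def_json = json.dumps({
    "block_op_type": [["ir_k3"], ["ir_k3"] * 2],
    "block_cfg": {
        "first": [32, 2],                              # [channels, stride]
        "stages": [[[1, 16, 1, 1]], [[6, 24, 2, 2]]],  # [t, c, n, s]
        "backbone": [0, 1],
    },
})
arch_def = json.loads(arch_def_json)  # what create_builder would do
```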

* Added various unit tests.

* build and run backbones.
* build and run feature extractors.
* build and run predictors.

* Added a unit test to verify all config files are loadable.
parent 192261db
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/35857890/e2e_faster_rcnn_R-101-FPN_1x"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/35857345/e2e_faster_rcnn_R-50-FPN_1x"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/36761737/e2e_faster_rcnn_X-101-32x8d-FPN_1x"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/37697547/e2e_keypoint_rcnn_R-50-FPN_1x"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/35861795/e2e_mask_rcnn_R-101-FPN_1x"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/35858933/e2e_mask_rcnn_R-50-FPN_1x"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/37129812/e2e_mask_rcnn_X-152-32x8d-FPN-IN5k_1.44x"
BACKBONE:
CONV_BODY: "R-152-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,11 @@ MODEL:
WEIGHT: "catalog://Caffe2Detectron/COCO/36761843/e2e_mask_rcnn_X-101-32x8d-FPN_1x"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
NUM_GROUPS: 32
WIDTH_PER_GROUP: 8
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......@@ -27,10 +31,6 @@ MODEL:
POOLER_SAMPLING_RATIO: 2
RESOLUTION: 28
SHARE_BOX_FEATURE_EXTRACTOR: False
RESNETS:
STRIDE_IN_1X1: False
NUM_GROUPS: 32
WIDTH_PER_GROUP: 8
MASK_ON: True
DATASETS:
TEST: ("coco_2014_minival",)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,6 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......@@ -20,6 +19,7 @@ MODEL:
FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
PREDICTOR: "FPNPredictor"
RESNETS:
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
NUM_GROUPS: 32
WIDTH_PER_GROUP: 8
......
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "default"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
RPN:
ANCHOR_SIZES: (16, 32, 64, 128, 256)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (320, )
MAX_SIZE_TRAIN: 640
MIN_SIZE_TEST: 320
MAX_SIZE_TEST: 640
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "default"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
RPN:
ANCHOR_SIZES: (32, 64, 128, 256, 512)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 256
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (600, )
MAX_SIZE_TRAIN: 1000
MIN_SIZE_TEST: 600
MAX_SIZE_TEST: 1000
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,11 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
NUM_GROUPS: 32
WIDTH_PER_GROUP: 8
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......@@ -27,10 +31,6 @@ MODEL:
POOLER_SAMPLING_RATIO: 2
RESOLUTION: 28
SHARE_BOX_FEATURE_EXTRACTOR: False
RESNETS:
STRIDE_IN_1X1: False
NUM_GROUPS: 32
WIDTH_PER_GROUP: 8
MASK_ON: True
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
......
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "default"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
DET_HEAD_LAST_SCALE: -1.0
RPN:
ANCHOR_SIZES: (16, 32, 64, 128, 256)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 256
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
ROI_MASK_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head_mask
PREDICTOR: "MaskRCNNConv1x1Predictor"
RESOLUTION: 12
SHARE_BOX_FEATURE_EXTRACTOR: False
MASK_ON: True
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (320, )
MAX_SIZE_TRAIN: 640
MIN_SIZE_TEST: 320
MAX_SIZE_TEST: 640
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "xirb16d_dsmask"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
DET_HEAD_LAST_SCALE: -1.0
RPN:
ANCHOR_SIZES: (16, 32, 64, 128, 256)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
ROI_MASK_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head_mask
PREDICTOR: "MaskRCNNConv1x1Predictor"
RESOLUTION: 12
SHARE_BOX_FEATURE_EXTRACTOR: False
MASK_ON: True
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (320, )
MAX_SIZE_TRAIN: 640
MIN_SIZE_TEST: 320
MAX_SIZE_TEST: 640
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,8 +8,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,8 +8,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,8 +8,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,8 +8,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,9 +8,9 @@ MODEL:
WEIGHT: "" # no pretrained model
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
FREEZE_CONV_BODY_AT: 0 # finetune all layers
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,9 +8,9 @@ MODEL:
WEIGHT: "" # no pretrained model
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
FREEZE_CONV_BODY_AT: 0 # finetune all layers
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,9 +8,9 @@ MODEL:
WEIGHT: "" # no pretrained model
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
FREEZE_CONV_BODY_AT: 0 # finetune all layers
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
INPUT:
MIN_SIZE_TRAIN: 800
MIN_SIZE_TRAIN: (800,)
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MAX_SIZE_TEST: 1333
......@@ -8,9 +8,9 @@ MODEL:
WEIGHT: "" # no pretrained model
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
FREEZE_CONV_BODY_AT: 0 # finetune all layers
RESNETS: # use GN for backbone
BACKBONE_OUT_CHANNELS: 256
STRIDE_IN_1X1: False
TRANS_FUNC: "BottleneckWithGN"
STEM_FUNC: "StemWithGN"
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -3,7 +3,8 @@ MODEL:
WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -4,7 +4,8 @@ MODEL:
RPN_ONLY: True
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -5,7 +5,8 @@ MODEL:
RETINANET_ON: True
BACKBONE:
CONV_BODY: "R-101-FPN-RETINANET"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
FG_IOU_THRESHOLD: 0.5
......
......@@ -5,7 +5,8 @@ MODEL:
RETINANET_ON: True
BACKBONE:
CONV_BODY: "R-101-FPN-RETINANET"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
FG_IOU_THRESHOLD: 0.5
......
......@@ -5,7 +5,8 @@ MODEL:
RETINANET_ON: True
BACKBONE:
CONV_BODY: "R-50-FPN-RETINANET"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
FG_IOU_THRESHOLD: 0.5
......
......@@ -5,7 +5,8 @@ MODEL:
RETINANET_ON: True
BACKBONE:
CONV_BODY: "R-50-FPN-RETINANET"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
FG_IOU_THRESHOLD: 0.5
......
......@@ -5,7 +5,8 @@ MODEL:
RETINANET_ON: True
BACKBONE:
CONV_BODY: "R-50-FPN-RETINANET"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
FG_IOU_THRESHOLD: 0.5
......
......@@ -5,7 +5,8 @@ MODEL:
RETINANET_ON: True
BACKBONE:
CONV_BODY: "R-101-FPN-RETINANET"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
FG_IOU_THRESHOLD: 0.5
......
......@@ -4,7 +4,8 @@ MODEL:
RPN_ONLY: True
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -4,7 +4,8 @@ MODEL:
RPN_ONLY: True
BACKBONE:
CONV_BODY: "R-50-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -4,7 +4,8 @@ MODEL:
RPN_ONLY: True
BACKBONE:
CONV_BODY: "R-101-FPN"
OUT_CHANNELS: 256
RESNETS:
BACKBONE_OUT_CHANNELS: 256
RPN:
USE_FPN: True
ANCHOR_STRIDE: (4, 8, 16, 32, 64)
......
......@@ -92,7 +92,6 @@ _C.MODEL.BACKBONE.CONV_BODY = "R-50-C4"
# Add StopGrad at a specified stage so the bottom layers are frozen
_C.MODEL.BACKBONE.FREEZE_CONV_BODY_AT = 2
_C.MODEL.BACKBONE.OUT_CHANNELS = 256 * 4
# GN for backbone
_C.MODEL.BACKBONE.USE_GN = False
......@@ -271,6 +270,7 @@ _C.MODEL.RESNETS.STEM_FUNC = "StemWithFixedBatchNorm"
# Apply dilation in stage "res5"
_C.MODEL.RESNETS.RES5_DILATION = 1
_C.MODEL.RESNETS.BACKBONE_OUT_CHANNELS = 256 * 4
_C.MODEL.RESNETS.RES2_OUT_CHANNELS = 256
_C.MODEL.RESNETS.STEM_OUT_CHANNELS = 64
......@@ -335,6 +335,44 @@ _C.MODEL.RETINANET.INFERENCE_TH = 0.05
# NMS threshold used in RetinaNet
_C.MODEL.RETINANET.NMS_TH = 0.4
# ---------------------------------------------------------------------------- #
# FBNet options
# ---------------------------------------------------------------------------- #
_C.MODEL.FBNET = CN()
_C.MODEL.FBNET.ARCH = "default"
# custom arch
_C.MODEL.FBNET.ARCH_DEF = ""
_C.MODEL.FBNET.BN_TYPE = "bn"
_C.MODEL.FBNET.SCALE_FACTOR = 1.0
# the output channels will be divisible by WIDTH_DIVISOR
_C.MODEL.FBNET.WIDTH_DIVISOR = 1
_C.MODEL.FBNET.DW_CONV_SKIP_BN = True
_C.MODEL.FBNET.DW_CONV_SKIP_RELU = True
# > 0 scale, == 0 skip, < 0 same dimension
_C.MODEL.FBNET.DET_HEAD_LAST_SCALE = 1.0
_C.MODEL.FBNET.DET_HEAD_BLOCKS = []
# overwrite the stride for the head, 0 to use original value
_C.MODEL.FBNET.DET_HEAD_STRIDE = 0
# > 0 scale, == 0 skip, < 0 same dimension
_C.MODEL.FBNET.KPTS_HEAD_LAST_SCALE = 0.0
_C.MODEL.FBNET.KPTS_HEAD_BLOCKS = []
# overwrite the stride for the head, 0 to use original value
_C.MODEL.FBNET.KPTS_HEAD_STRIDE = 0
# > 0 scale, == 0 skip, < 0 same dimension
_C.MODEL.FBNET.MASK_HEAD_LAST_SCALE = 0.0
_C.MODEL.FBNET.MASK_HEAD_BLOCKS = []
# overwrite the stride for the head, 0 to use original value
_C.MODEL.FBNET.MASK_HEAD_STRIDE = 0
# 0 to use all blocks defined in arch_def
_C.MODEL.FBNET.RPN_HEAD_BLOCKS = 0
_C.MODEL.FBNET.RPN_BN_TYPE = ""
# ---------------------------------------------------------------------------- #
# Solver
# ---------------------------------------------------------------------------- #
......
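The `WIDTH_DIVISOR` option above rounds scaled channel counts so every layer width stays divisible by a fixed value. One common rounding scheme looks like this (an assumption for illustration; the exact rule lives in `fbnet_builder.py`):

```python
def round_channels(channels, divisor=8):
    """Round a channel count to a multiple of `divisor` (round-half-up,
    never below `divisor`). A common width-multiplier scheme; the exact
    FBNet rule may differ."""
    return max(divisor, int(channels + divisor / 2) // divisor * divisor)
```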
......@@ -4,6 +4,7 @@ import torch
from .batch_norm import FrozenBatchNorm2d
from .misc import Conv2d
from .misc import ConvTranspose2d
from .misc import BatchNorm2d
from .misc import interpolate
from .nms import nms
from .roi_align import ROIAlign
......@@ -15,6 +16,6 @@ from .sigmoid_focal_loss import SigmoidFocalLoss
__all__ = ["nms", "roi_align", "ROIAlign", "roi_pool", "ROIPool",
"smooth_l1_loss", "Conv2d", "ConvTranspose2d", "interpolate",
"FrozenBatchNorm2d", "SigmoidFocalLoss"
"BatchNorm2d", "FrozenBatchNorm2d", "SigmoidFocalLoss"
]
......@@ -26,7 +26,6 @@ class _NewEmptyTensorOp(torch.autograd.Function):
return _NewEmptyTensorOp.apply(grad, shape), None
class Conv2d(torch.nn.Conv2d):
def forward(self, x):
if x.numel() > 0:
......@@ -64,6 +63,15 @@ class ConvTranspose2d(torch.nn.ConvTranspose2d):
return _NewEmptyTensorOp.apply(x, output_shape)
class BatchNorm2d(torch.nn.BatchNorm2d):
def forward(self, x):
if x.numel() > 0:
return super(BatchNorm2d, self).forward(x)
# get output shape
output_shape = x.shape
return _NewEmptyTensorOp.apply(x, output_shape)
def interpolate(
input, size=None, scale_factor=None, mode="nearest", align_corners=None
):
......
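The new `BatchNorm2d` follows the same empty-tensor guard as `Conv2d` and `ConvTranspose2d`: apply the op on non-empty input, otherwise return an empty output of the same shape. The pattern can be sketched without torch as (list-based stand-in, not the real tensor op):

```python
class EmptyGuard:
    """List-based stand-in for the empty-tensor guard: apply the wrapped
    op on non-empty input, otherwise skip it and keep the (empty) shape,
    mirroring _NewEmptyTensorOp.apply(x, x.shape)."""
    def __init__(self, op):
        self.op = op

    def __call__(self, batch):
        if len(batch) > 0:
            return self.op(batch)
        return batch  # empty input: skip the op

double = EmptyGuard(lambda b: [v * 2 for v in b])
```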
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
from .backbone import build_backbone
from . import fbnet
......@@ -16,6 +16,7 @@ from . import resnet
def build_resnet_backbone(cfg):
body = resnet.ResNet(cfg)
model = nn.Sequential(OrderedDict([("body", body)]))
model.out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
return model
......@@ -25,7 +26,7 @@ def build_resnet_backbone(cfg):
def build_resnet_fpn_backbone(cfg):
body = resnet.ResNet(cfg)
in_channels_stage2 = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
out_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
fpn = fpn_module.FPN(
in_channels_list=[
in_channels_stage2,
......@@ -40,14 +41,16 @@ def build_resnet_fpn_backbone(cfg):
top_blocks=fpn_module.LastLevelMaxPool(),
)
model = nn.Sequential(OrderedDict([("body", body), ("fpn", fpn)]))
model.out_channels = out_channels
return model
@registry.BACKBONES.register("R-50-FPN-RETINANET")
@registry.BACKBONES.register("R-101-FPN-RETINANET")
def build_resnet_fpn_p3p7_backbone(cfg):
body = resnet.ResNet(cfg)
in_channels_stage2 = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
out_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
in_channels_p6p7 = in_channels_stage2 * 8 if cfg.MODEL.RETINANET.USE_C5 \
else out_channels
fpn = fpn_module.FPN(
......@@ -64,8 +67,10 @@ def build_resnet_fpn_p3p7_backbone(cfg):
top_blocks=fpn_module.LastLevelP6P7(in_channels_p6p7, out_channels),
)
model = nn.Sequential(OrderedDict([("body", body), ("fpn", fpn)]))
model.out_channels = out_channels
return model
def build_backbone(cfg):
assert cfg.MODEL.BACKBONE.CONV_BODY in registry.BACKBONES, \
"cfg.MODEL.BACKBONE.CONV_BODY: {} is not registered in registry".format(
......
from __future__ import absolute_import, division, print_function, unicode_literals
import copy
import json
import logging
from collections import OrderedDict
from . import (
fbnet_builder as mbuilder,
fbnet_modeldef as modeldef,
)
import torch.nn as nn
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.rpn import rpn
from maskrcnn_benchmark.modeling import poolers
logger = logging.getLogger(__name__)
def create_builder(cfg):
bn_type = cfg.MODEL.FBNET.BN_TYPE
if bn_type == "gn":
bn_type = (bn_type, cfg.GROUP_NORM.NUM_GROUPS)
factor = cfg.MODEL.FBNET.SCALE_FACTOR
arch = cfg.MODEL.FBNET.ARCH
arch_def = cfg.MODEL.FBNET.ARCH_DEF
if len(arch_def) > 0:
arch_def = json.loads(arch_def)
if arch in modeldef.MODEL_ARCH:
if len(arch_def) > 0:
assert (
arch_def == modeldef.MODEL_ARCH[arch]
), "Two architectures with the same name {},\n{},\n{}".format(
arch, arch_def, modeldef.MODEL_ARCH[arch]
)
arch_def = modeldef.MODEL_ARCH[arch]
else:
assert arch_def is not None and len(arch_def) > 0
arch_def = mbuilder.unify_arch_def(arch_def)
rpn_stride = arch_def.get("rpn_stride", None)
if rpn_stride is not None:
assert (
cfg.MODEL.RPN.ANCHOR_STRIDE[0] == rpn_stride
), "Needs to set cfg.MODEL.RPN.ANCHOR_STRIDE to {}, got {}".format(
rpn_stride, cfg.MODEL.RPN.ANCHOR_STRIDE
)
width_divisor = cfg.MODEL.FBNET.WIDTH_DIVISOR
dw_skip_bn = cfg.MODEL.FBNET.DW_CONV_SKIP_BN
dw_skip_relu = cfg.MODEL.FBNET.DW_CONV_SKIP_RELU
logger.info(
"Building fbnet model with arch {} (without scaling):\n{}".format(
arch, arch_def
)
)
builder = mbuilder.FBNetBuilder(
width_ratio=factor,
bn_type=bn_type,
width_divisor=width_divisor,
dw_skip_bn=dw_skip_bn,
dw_skip_relu=dw_skip_relu,
)
return builder, arch_def
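The resolution order in `create_builder` (registered `ARCH` name vs. inline `ARCH_DEF` JSON) can be condensed into a standalone sketch (hypothetical helper; the real function also unifies the arch def and checks `rpn_stride`):

```python
import json

def resolve_arch(arch, arch_def_json, model_arch):
    """Pick the architecture definition: a registered name wins, but an
    inline JSON definition must then agree with it; otherwise the inline
    definition is required. Standalone sketch of create_builder's logic."""
    arch_def = json.loads(arch_def_json) if arch_def_json else None
    if arch in model_arch:
        if arch_def is not None:
            assert arch_def == model_arch[arch], \
                "Two architectures with the same name {}".format(arch)
        return model_arch[arch]
    assert arch_def, "unknown arch {} and no ARCH_DEF given".format(arch)
    return arch_def

MODEL_ARCH = {"default": {"block_op_type": [["ir_k3"]]}}
```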
def _get_trunk_cfg(arch_def):
""" Get all stages except the last one """
num_stages = mbuilder.get_num_stages(arch_def)
trunk_stages = arch_def.get("backbone", range(num_stages - 1))
ret = mbuilder.get_blocks(arch_def, stage_indices=trunk_stages)
return ret
class FBNetTrunk(nn.Module):
def __init__(
self, builder, arch_def, dim_in,
):
super(FBNetTrunk, self).__init__()
self.first = builder.add_first(arch_def["first"], dim_in=dim_in)
trunk_cfg = _get_trunk_cfg(arch_def)
self.stages = builder.add_blocks(trunk_cfg["stages"])
# return features for each stage
def forward(self, x):
y = self.first(x)
y = self.stages(y)
ret = [y]
return ret
@registry.BACKBONES.register("FBNet")
def add_conv_body(cfg, dim_in=3):
builder, arch_def = create_builder(cfg)
body = FBNetTrunk(builder, arch_def, dim_in)
model = nn.Sequential(OrderedDict([("body", body)]))
model.out_channels = builder.last_depth
return model
def _get_rpn_stage(arch_def, num_blocks):
rpn_stage = arch_def.get("rpn")
ret = mbuilder.get_blocks(arch_def, stage_indices=rpn_stage)
if num_blocks > 0:
logger.warning('Using last {} blocks in {} as rpn'.format(num_blocks, ret))
block_count = len(ret["stages"])
assert num_blocks <= block_count, "use block {}, block count {}".format(
num_blocks, block_count
)
blocks = range(block_count - num_blocks, block_count)
ret = mbuilder.get_blocks(ret, block_indices=blocks)
return ret["stages"]
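The `num_blocks > 0` branch above keeps only the tail of the RPN stage; the index arithmetic is just:

```python
def last_n_indices(block_count, num_blocks):
    """Indices of the last `num_blocks` blocks in a stage, as used when
    cfg.MODEL.FBNET.RPN_HEAD_BLOCKS > 0 (standalone sketch)."""
    assert 0 < num_blocks <= block_count
    return list(range(block_count - num_blocks, block_count))
```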
class FBNetRPNHead(nn.Module):
def __init__(
self, cfg, in_channels, builder, arch_def,
):
super(FBNetRPNHead, self).__init__()
assert in_channels == builder.last_depth
rpn_bn_type = cfg.MODEL.FBNET.RPN_BN_TYPE
if len(rpn_bn_type) > 0:
builder.bn_type = rpn_bn_type
use_blocks = cfg.MODEL.FBNET.RPN_HEAD_BLOCKS
stages = _get_rpn_stage(arch_def, use_blocks)
self.head = builder.add_blocks(stages)
self.out_channels = builder.last_depth
def forward(self, x):
x = [self.head(y) for y in x]
return x
@registry.RPN_HEADS.register("FBNet.rpn_head")
def add_rpn_head(cfg, in_channels, num_anchors):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
assert in_channels == builder.last_depth
# builder.name_prefix = "[rpn]"
rpn_feature = FBNetRPNHead(cfg, in_channels, builder, model_arch)
rpn_regressor = rpn.RPNHeadConvRegressor(
cfg, rpn_feature.out_channels, num_anchors)
return nn.Sequential(rpn_feature, rpn_regressor)
def _get_head_stage(arch, head_name, blocks):
# use the default name 'head' if the specific 'head_name' does not exist
if head_name not in arch:
head_name = "head"
head_stage = arch.get(head_name)
ret = mbuilder.get_blocks(arch, stage_indices=head_stage, block_indices=blocks)
return ret["stages"]
# name mapping for head names in arch def and cfg
ARCH_CFG_NAME_MAPPING = {
"bbox": "ROI_BOX_HEAD",
"kpts": "ROI_KEYPOINT_HEAD",
"mask": "ROI_MASK_HEAD",
}
class FBNetROIHead(nn.Module):
def __init__(
self, cfg, in_channels, builder, arch_def,
head_name, use_blocks, stride_init, last_layer_scale,
):
super(FBNetROIHead, self).__init__()
assert in_channels == builder.last_depth
assert isinstance(use_blocks, list)
head_cfg_name = ARCH_CFG_NAME_MAPPING[head_name]
self.pooler = poolers.make_pooler(cfg, head_cfg_name)
stage = _get_head_stage(arch_def, head_name, use_blocks)
assert stride_init in [0, 1, 2]
if stride_init != 0:
stage[0]["block"][3] = stride_init
blocks = builder.add_blocks(stage)
last_info = copy.deepcopy(arch_def["last"])
last_info[1] = last_layer_scale
last = builder.add_last(last_info)
self.head = nn.Sequential(OrderedDict([
("blocks", blocks),
("last", last)
]))
# output_blob = builder.add_final_pool(
# # model, output_blob, kernel_size=cfg.FAST_RCNN.ROI_XFORM_RESOLUTION)
# model,
# output_blob,
# kernel_size=int(cfg.FAST_RCNN.ROI_XFORM_RESOLUTION / stride_init),
# )
self.out_channels = builder.last_depth
def forward(self, x, proposals):
x = self.pooler(x, proposals)
x = self.head(x)
return x
@registry.ROI_BOX_FEATURE_EXTRACTORS.register("FBNet.roi_head")
def add_roi_head(cfg, in_channels):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
# builder.name_prefix = "_[bbox]_"
return FBNetROIHead(
cfg, in_channels, builder, model_arch,
head_name="bbox",
use_blocks=cfg.MODEL.FBNET.DET_HEAD_BLOCKS,
stride_init=cfg.MODEL.FBNET.DET_HEAD_STRIDE,
last_layer_scale=cfg.MODEL.FBNET.DET_HEAD_LAST_SCALE,
)
@registry.ROI_KEYPOINT_FEATURE_EXTRACTORS.register("FBNet.roi_head_keypoints")
def add_roi_head_keypoints(cfg, in_channels):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
# builder.name_prefix = "_[kpts]_"
return FBNetROIHead(
cfg, in_channels, builder, model_arch,
head_name="kpts",
use_blocks=cfg.MODEL.FBNET.KPTS_HEAD_BLOCKS,
stride_init=cfg.MODEL.FBNET.KPTS_HEAD_STRIDE,
last_layer_scale=cfg.MODEL.FBNET.KPTS_HEAD_LAST_SCALE,
)
@registry.ROI_MASK_FEATURE_EXTRACTORS.register("FBNet.roi_head_mask")
def add_roi_head_mask(cfg, in_channels):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
# builder.name_prefix = "_[mask]_"
return FBNetROIHead(
cfg, in_channels, builder, model_arch,
head_name="mask",
use_blocks=cfg.MODEL.FBNET.MASK_HEAD_BLOCKS,
stride_init=cfg.MODEL.FBNET.MASK_HEAD_STRIDE,
last_layer_scale=cfg.MODEL.FBNET.MASK_HEAD_LAST_SCALE,
)
from __future__ import absolute_import, division, print_function, unicode_literals
def add_archs(archs):
global MODEL_ARCH
for x in archs:
assert x not in MODEL_ARCH, "Duplicated model name {}".format(x)
MODEL_ARCH[x] = archs[x]
MODEL_ARCH = {
"default": {
"block_op_type": [
# stage 0
["ir_k3"],
# stage 1
["ir_k3"] * 2,
# stage 2
["ir_k3"] * 3,
# stage 3
["ir_k3"] * 7,
# stage 4, bbox head
["ir_k3"] * 4,
# stage 5, rpn
["ir_k3"] * 3,
# stage 5, mask head
["ir_k3"] * 5,
],
"block_cfg": {
"first": [32, 2],
"stages": [
# [t, c, n, s]
# stage 0
[[1, 16, 1, 1]],
# stage 1
[[6, 24, 2, 2]],
# stage 2
[[6, 32, 3, 2]],
# stage 3
[[6, 64, 4, 2], [6, 96, 3, 1]],
# stage 4, bbox head
[[4, 160, 1, 2], [6, 160, 2, 1], [6, 240, 1, 1]],
# [[6, 160, 3, 2], [6, 320, 1, 1]],
# stage 5, rpn head
[[6, 96, 3, 1]],
# stage 6, mask head
[[4, 160, 1, 1], [6, 160, 3, 1], [3, 80, 1, -2]],
],
# [c, channel_scale]
"last": [1280, 0.0],
"backbone": [0, 1, 2, 3],
"rpn": [5],
"bbox": [4],
"mask": [6],
},
},
"xirb16d_dsmask": {
"block_op_type": [
# stage 0
["ir_k3"],
# stage 1
["ir_k3"] * 2,
# stage 2
["ir_k3"] * 3,
# stage 3
["ir_k3"] * 7,
# stage 4, bbox head
["ir_k3"] * 4,
# stage 5, mask head
["ir_k3"] * 5,
# stage 6, rpn
["ir_k3"] * 3,
],
"block_cfg": {
"first": [16, 2],
"stages": [
# [t, c, n, s]
# stage 0
[[1, 16, 1, 1]],
# stage 1
[[6, 32, 2, 2]],
# stage 2
[[6, 48, 3, 2]],
# stage 3
[[6, 96, 4, 2], [6, 128, 3, 1]],
# stage 4, bbox head
[[4, 128, 1, 2], [6, 128, 2, 1], [6, 160, 1, 1]],
# stage 5, mask head
[[4, 128, 1, 2], [6, 128, 2, 1], [6, 128, 1, -2], [3, 64, 1, -2]],
# stage 6, rpn head
[[6, 128, 3, 1]],
],
# [c, channel_scale]
"last": [1280, 0.0],
"backbone": [0, 1, 2, 3],
"rpn": [6],
"bbox": [4],
"mask": [5],
},
},
"mobilenet_v2": {
"block_op_type": [
# stage 0
["ir_k3"],
# stage 1
["ir_k3"] * 2,
# stage 2
["ir_k3"] * 3,
# stage 3
["ir_k3"] * 7,
# stage 4
["ir_k3"] * 4,
],
"block_cfg": {
"first": [32, 2],
"stages": [
# [t, c, n, s]
# stage 0
[[1, 16, 1, 1]],
# stage 1
[[6, 24, 2, 2]],
# stage 2
[[6, 32, 3, 2]],
# stage 3
[[6, 64, 4, 2], [6, 96, 3, 1]],
# stage 4
[[6, 160, 3, 1], [6, 320, 1, 1]],
],
# [c, channel_scale]
"last": [1280, 0.0],
"backbone": [0, 1, 2, 3],
"bbox": [4],
},
},
}
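Each `[t, c, n, s]` entry above expands into `n` inverted-residual blocks where the stride `s` applies only to the first repeat, following the MobileNetV2 convention. A hypothetical standalone expansion (the real one lives in `fbnet_builder.unify_arch_def`):

```python
def expand_stage(t, c, n, s):
    """Expand one [t, c, n, s] entry (expansion ratio, out channels,
    repeats, stride) into per-block (t, c, stride) tuples; the stride
    applies to the first repeat only (MobileNetV2-style convention)."""
    return [(t, c, s if i == 0 else 1) for i in range(n)]
```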
......@@ -187,6 +187,7 @@ class ResNetHead(nn.Module):
stride = None
self.add_module(name, module)
self.stages.append(name)
self.out_channels = out_channels
def forward(self, x):
for stage in self.stages:
......
......@@ -27,8 +27,8 @@ class GeneralizedRCNN(nn.Module):
super(GeneralizedRCNN, self).__init__()
self.backbone = build_backbone(cfg)
self.rpn = build_rpn(cfg)
self.roi_heads = build_roi_heads(cfg)
self.rpn = build_rpn(cfg, self.backbone.out_channels)
self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)
def forward(self, images, targets=None):
"""
......
......@@ -119,3 +119,15 @@ class Pooler(nn.Module):
result[idx_in_level] = pooler(per_level_feature, rois_per_level)
return result
def make_pooler(cfg, head_name):
resolution = cfg.MODEL[head_name].POOLER_RESOLUTION
scales = cfg.MODEL[head_name].POOLER_SCALES
sampling_ratio = cfg.MODEL[head_name].POOLER_SAMPLING_RATIO
pooler = Pooler(
output_size=(resolution, resolution),
scales=scales,
sampling_ratio=sampling_ratio,
)
return pooler
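`make_pooler` keys into `cfg.MODEL` with the head name, which works because a yacs `CfgNode` supports dict-style access alongside attribute access, so one helper serves `ROI_BOX_HEAD`, `ROI_KEYPOINT_HEAD`, and `ROI_MASK_HEAD` alike. A plain-dict sketch of that lookup (values here are made up for illustration):

```python
# Plain dicts standing in for a yacs CfgNode; keys mirror the
# POOLER_* settings read by make_pooler, values are hypothetical.
MODEL = {
    "ROI_BOX_HEAD": {"POOLER_RESOLUTION": 7, "POOLER_SCALES": (0.25, 0.125), "POOLER_SAMPLING_RATIO": 2},
    "ROI_MASK_HEAD": {"POOLER_RESOLUTION": 14, "POOLER_SCALES": (0.25, 0.125), "POOLER_SAMPLING_RATIO": 2},
}

def pooler_kwargs(head_name):
    head = MODEL[head_name]
    return dict(
        output_size=(head["POOLER_RESOLUTION"],) * 2,
        scales=head["POOLER_SCALES"],
        sampling_ratio=head["POOLER_SAMPLING_RATIO"],
    )

assert pooler_kwargs("ROI_MASK_HEAD")["output_size"] == (14, 14)
```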
......@@ -3,6 +3,10 @@
from maskrcnn_benchmark.utils.registry import Registry
BACKBONES = Registry()
RPN_HEADS = Registry()
ROI_BOX_FEATURE_EXTRACTORS = Registry()
ROI_BOX_PREDICTOR = Registry()
RPN_HEADS = Registry()
ROI_KEYPOINT_FEATURE_EXTRACTORS = Registry()
ROI_KEYPOINT_PREDICTOR = Registry()
ROI_MASK_FEATURE_EXTRACTORS = Registry()
ROI_MASK_PREDICTOR = Registry()
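The new head registries reuse the decorator pattern the file already used for `BACKBONES`: register an implementation under a string name, then look it up via a config key. A minimal sketch of that pattern (the real `Registry` lives in `maskrcnn_benchmark.utils.registry`; the details below are assumed):

```python
# Minimal Registry sketch: a dict whose register() works both as a
# decorator and as a plain call, mirroring the usage seen in this PR.
class Registry(dict):
    def register(self, name, module=None):
        if module is not None:      # plain call: R.register(name, fn)
            self[name] = module
            return module
        def deco(fn):               # decorator form: @R.register(name)
            self[name] = fn
            return fn
        return deco

ROI_MASK_PREDICTOR = Registry()

@ROI_MASK_PREDICTOR.register("MaskRCNNConv1x1Predictor")
def build(cfg, in_channels):
    return ("predictor", in_channels)

# lookup by name, as make_roi_mask_predictor does with the cfg string
builder = ROI_MASK_PREDICTOR["MaskRCNNConv1x1Predictor"]
```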
......@@ -13,10 +13,11 @@ class ROIBoxHead(torch.nn.Module):
Generic Box Head class.
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(ROIBoxHead, self).__init__()
self.feature_extractor = make_roi_box_feature_extractor(cfg)
self.predictor = make_roi_box_predictor(cfg)
self.feature_extractor = make_roi_box_feature_extractor(cfg, in_channels)
self.predictor = make_roi_box_predictor(
cfg, self.feature_extractor.out_channels)
self.post_processor = make_roi_box_post_processor(cfg)
self.loss_evaluator = make_roi_box_loss_evaluator(cfg)
......@@ -61,10 +62,10 @@ class ROIBoxHead(torch.nn.Module):
)
def build_roi_box_head(cfg):
def build_roi_box_head(cfg, in_channels):
"""
Constructs a new box head.
By default, uses ROIBoxHead; if that proves insufficient, register a new class
and select it via the config
"""
return ROIBoxHead(cfg)
return ROIBoxHead(cfg, in_channels)
......@@ -12,7 +12,7 @@ from maskrcnn_benchmark.modeling.make_layers import make_fc
@registry.ROI_BOX_FEATURE_EXTRACTORS.register("ResNet50Conv5ROIFeatureExtractor")
class ResNet50Conv5ROIFeatureExtractor(nn.Module):
def __init__(self, config):
def __init__(self, config, in_channels):
super(ResNet50Conv5ROIFeatureExtractor, self).__init__()
resolution = config.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION
......@@ -38,6 +38,7 @@ class ResNet50Conv5ROIFeatureExtractor(nn.Module):
self.pooler = pooler
self.head = head
self.out_channels = head.out_channels
def forward(self, x, proposals):
x = self.pooler(x, proposals)
......@@ -51,7 +52,7 @@ class FPN2MLPFeatureExtractor(nn.Module):
Heads for FPN for classification
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(FPN2MLPFeatureExtractor, self).__init__()
resolution = cfg.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION
......@@ -62,12 +63,13 @@ class FPN2MLPFeatureExtractor(nn.Module):
scales=scales,
sampling_ratio=sampling_ratio,
)
input_size = cfg.MODEL.BACKBONE.OUT_CHANNELS * resolution ** 2
input_size = in_channels * resolution ** 2
representation_size = cfg.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM
use_gn = cfg.MODEL.ROI_BOX_HEAD.USE_GN
self.pooler = pooler
self.fc6 = make_fc(input_size, representation_size, use_gn)
self.fc7 = make_fc(representation_size, representation_size, use_gn)
self.out_channels = representation_size
def forward(self, x, proposals):
x = self.pooler(x, proposals)
......@@ -85,7 +87,7 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
Heads for FPN for classification
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(FPNXconv1fcFeatureExtractor, self).__init__()
resolution = cfg.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION
......@@ -97,9 +99,8 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
sampling_ratio=sampling_ratio,
)
self.pooler = pooler
use_gn = cfg.MODEL.ROI_BOX_HEAD.USE_GN
in_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
conv_head_dim = cfg.MODEL.ROI_BOX_HEAD.CONV_HEAD_DIM
num_stacked_convs = cfg.MODEL.ROI_BOX_HEAD.NUM_STACKED_CONVS
dilation = cfg.MODEL.ROI_BOX_HEAD.DILATION
......@@ -133,6 +134,7 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
input_size = conv_head_dim * resolution ** 2
representation_size = cfg.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM
self.fc6 = make_fc(input_size, representation_size, use_gn=False)
self.out_channels = representation_size
def forward(self, x, proposals):
x = self.pooler(x, proposals)
......@@ -142,8 +144,8 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
return x
def make_roi_box_feature_extractor(cfg):
def make_roi_box_feature_extractor(cfg, in_channels):
func = registry.ROI_BOX_FEATURE_EXTRACTORS[
cfg.MODEL.ROI_BOX_HEAD.FEATURE_EXTRACTOR
]
return func(cfg)
return func(cfg, in_channels)
......@@ -5,16 +5,14 @@ from torch import nn
@registry.ROI_BOX_PREDICTOR.register("FastRCNNPredictor")
class FastRCNNPredictor(nn.Module):
def __init__(self, config, pretrained=None):
def __init__(self, config, in_channels):
super(FastRCNNPredictor, self).__init__()
assert in_channels is not None
stage_index = 4
stage2_relative_factor = 2 ** (stage_index - 1)
res2_out_channels = config.MODEL.RESNETS.RES2_OUT_CHANNELS
num_inputs = res2_out_channels * stage2_relative_factor
num_inputs = in_channels
num_classes = config.MODEL.ROI_BOX_HEAD.NUM_CLASSES
self.avgpool = nn.AvgPool2d(kernel_size=7, stride=7)
self.avgpool = nn.AdaptiveAvgPool2d(1)
self.cls_score = nn.Linear(num_inputs, num_classes)
num_bbox_reg_classes = 2 if config.MODEL.CLS_AGNOSTIC_BBOX_REG else num_classes
self.bbox_pred = nn.Linear(num_inputs, num_bbox_reg_classes * 4)
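Swapping the fixed `nn.AvgPool2d(kernel_size=7, stride=7)` for `nn.AdaptiveAvgPool2d(1)` is what lets this predictor accept any pooled feature-map size: the output is always 1x1 per channel regardless of the incoming H and W, which different backbones (e.g. FBNet variants) may vary. A dependency-free sketch of the same reduction over nested lists:

```python
def adaptive_avg_pool_to_1x1(feat):
    """Average a C x H x W feature map (nested lists) down to C scalars,
    mimicking what nn.AdaptiveAvgPool2d(1) does per channel: the result
    is independent of the incoming spatial size."""
    return [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feat
    ]

feat = [[[1.0, 3.0], [5.0, 7.0]]]   # C=1, H=W=2
assert adaptive_avg_pool_to_1x1(feat) == [4.0]
```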
......@@ -35,10 +33,10 @@ class FastRCNNPredictor(nn.Module):
@registry.ROI_BOX_PREDICTOR.register("FPNPredictor")
class FPNPredictor(nn.Module):
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(FPNPredictor, self).__init__()
num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES
representation_size = cfg.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM
representation_size = in_channels
self.cls_score = nn.Linear(representation_size, num_classes)
num_bbox_reg_classes = 2 if cfg.MODEL.CLS_AGNOSTIC_BBOX_REG else num_classes
......@@ -50,12 +48,15 @@ class FPNPredictor(nn.Module):
nn.init.constant_(l.bias, 0)
def forward(self, x):
if x.ndimension() == 4:
assert list(x.shape[2:]) == [1, 1]
x = x.view(x.size(0), -1)
scores = self.cls_score(x)
bbox_deltas = self.bbox_pred(x)
return scores, bbox_deltas
def make_roi_box_predictor(cfg):
def make_roi_box_predictor(cfg, in_channels):
func = registry.ROI_BOX_PREDICTOR[cfg.MODEL.ROI_BOX_HEAD.PREDICTOR]
return func(cfg)
return func(cfg, in_channels)
......@@ -7,11 +7,12 @@ from .loss import make_roi_keypoint_loss_evaluator
class ROIKeypointHead(torch.nn.Module):
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(ROIKeypointHead, self).__init__()
self.cfg = cfg.clone()
self.feature_extractor = make_roi_keypoint_feature_extractor(cfg)
self.predictor = make_roi_keypoint_predictor(cfg)
self.feature_extractor = make_roi_keypoint_feature_extractor(cfg, in_channels)
self.predictor = make_roi_keypoint_predictor(
cfg, self.feature_extractor.out_channels)
self.post_processor = make_roi_keypoint_post_processor(cfg)
self.loss_evaluator = make_roi_keypoint_loss_evaluator(cfg)
......@@ -46,5 +47,5 @@ class ROIKeypointHead(torch.nn.Module):
return x, proposals, dict(loss_kp=loss_kp)
def build_roi_keypoint_head(cfg):
return ROIKeypointHead(cfg)
def build_roi_keypoint_head(cfg, in_channels):
return ROIKeypointHead(cfg, in_channels)
from torch import nn
from torch.nn import functional as F
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.poolers import Pooler
from maskrcnn_benchmark.layers import Conv2d
@registry.ROI_KEYPOINT_FEATURE_EXTRACTORS.register("KeypointRCNNFeatureExtractor")
class KeypointRCNNFeatureExtractor(nn.Module):
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(KeypointRCNNFeatureExtractor, self).__init__()
resolution = cfg.MODEL.ROI_KEYPOINT_HEAD.POOLER_RESOLUTION
......@@ -20,7 +22,7 @@ class KeypointRCNNFeatureExtractor(nn.Module):
)
self.pooler = pooler
input_features = cfg.MODEL.BACKBONE.OUT_CHANNELS
input_features = in_channels
layers = cfg.MODEL.ROI_KEYPOINT_HEAD.CONV_LAYERS
next_feature = input_features
self.blocks = []
......@@ -32,6 +34,7 @@ class KeypointRCNNFeatureExtractor(nn.Module):
self.add_module(layer_name, module)
next_feature = layer_features
self.blocks.append(layer_name)
self.out_channels = layer_features
def forward(self, x, proposals):
x = self.pooler(x, proposals)
......@@ -40,13 +43,8 @@ class KeypointRCNNFeatureExtractor(nn.Module):
return x
_ROI_KEYPOINT_FEATURE_EXTRACTORS = {
"KeypointRCNNFeatureExtractor": KeypointRCNNFeatureExtractor
}
def make_roi_keypoint_feature_extractor(cfg):
func = _ROI_KEYPOINT_FEATURE_EXTRACTORS[
def make_roi_keypoint_feature_extractor(cfg, in_channels):
func = registry.ROI_KEYPOINT_FEATURE_EXTRACTORS[
cfg.MODEL.ROI_KEYPOINT_HEAD.FEATURE_EXTRACTOR
]
return func(cfg)
return func(cfg, in_channels)
from torch import nn
from torch.nn import functional as F
from maskrcnn_benchmark import layers
from maskrcnn_benchmark.modeling import registry
@registry.ROI_KEYPOINT_PREDICTOR.register("KeypointRCNNPredictor")
class KeypointRCNNPredictor(nn.Module):
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(KeypointRCNNPredictor, self).__init__()
input_features = cfg.MODEL.ROI_KEYPOINT_HEAD.CONV_LAYERS[-1]
input_features = in_channels
num_keypoints = cfg.MODEL.ROI_KEYPOINT_HEAD.NUM_CLASSES
deconv_kernel = 4
self.kps_score_lowres = layers.ConvTranspose2d(
......@@ -22,6 +23,7 @@ class KeypointRCNNPredictor(nn.Module):
)
nn.init.constant_(self.kps_score_lowres.bias, 0)
self.up_scale = 2
self.out_channels = num_keypoints
def forward(self, x):
x = self.kps_score_lowres(x)
......@@ -31,9 +33,6 @@ class KeypointRCNNPredictor(nn.Module):
return x
_ROI_KEYPOINT_PREDICTOR = {"KeypointRCNNPredictor": KeypointRCNNPredictor}
def make_roi_keypoint_predictor(cfg):
func = _ROI_KEYPOINT_PREDICTOR[cfg.MODEL.ROI_KEYPOINT_HEAD.PREDICTOR]
return func(cfg)
def make_roi_keypoint_predictor(cfg, in_channels):
func = registry.ROI_KEYPOINT_PREDICTOR[cfg.MODEL.ROI_KEYPOINT_HEAD.PREDICTOR]
return func(cfg, in_channels)
......@@ -34,11 +34,12 @@ def keep_only_positive_boxes(boxes):
class ROIMaskHead(torch.nn.Module):
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(ROIMaskHead, self).__init__()
self.cfg = cfg.clone()
self.feature_extractor = make_roi_mask_feature_extractor(cfg)
self.predictor = make_roi_mask_predictor(cfg)
self.feature_extractor = make_roi_mask_feature_extractor(cfg, in_channels)
self.predictor = make_roi_mask_predictor(
cfg, self.feature_extractor.out_channels)
self.post_processor = make_roi_mask_post_processor(cfg)
self.loss_evaluator = make_roi_mask_loss_evaluator(cfg)
......@@ -78,5 +79,5 @@ class ROIMaskHead(torch.nn.Module):
return x, all_proposals, dict(loss_mask=loss_mask)
def build_roi_mask_head(cfg):
return ROIMaskHead(cfg)
def build_roi_mask_head(cfg, in_channels):
return ROIMaskHead(cfg, in_channels)
......@@ -3,18 +3,23 @@ from torch import nn
from torch.nn import functional as F
from ..box_head.roi_box_feature_extractors import ResNet50Conv5ROIFeatureExtractor
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.poolers import Pooler
from maskrcnn_benchmark.layers import Conv2d
from maskrcnn_benchmark.modeling.make_layers import make_conv3x3
registry.ROI_MASK_FEATURE_EXTRACTORS.register(
"ResNet50Conv5ROIFeatureExtractor", ResNet50Conv5ROIFeatureExtractor
)
@registry.ROI_MASK_FEATURE_EXTRACTORS.register("MaskRCNNFPNFeatureExtractor")
class MaskRCNNFPNFeatureExtractor(nn.Module):
"""
Feature extractor for FPN mask heads
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
"""
Arguments:
num_classes (int): number of output classes
......@@ -31,7 +36,7 @@ class MaskRCNNFPNFeatureExtractor(nn.Module):
scales=scales,
sampling_ratio=sampling_ratio,
)
input_size = cfg.MODEL.BACKBONE.OUT_CHANNELS
input_size = in_channels
self.pooler = pooler
use_gn = cfg.MODEL.ROI_MASK_HEAD.USE_GN
......@@ -42,12 +47,14 @@ class MaskRCNNFPNFeatureExtractor(nn.Module):
self.blocks = []
for layer_idx, layer_features in enumerate(layers, 1):
layer_name = "mask_fcn{}".format(layer_idx)
module = make_conv3x3(next_feature, layer_features,
module = make_conv3x3(
next_feature, layer_features,
dilation=dilation, stride=1, use_gn=use_gn
)
self.add_module(layer_name, module)
next_feature = layer_features
self.blocks.append(layer_name)
self.out_channels = layer_features
def forward(self, x, proposals):
x = self.pooler(x, proposals)
......@@ -58,12 +65,8 @@ class MaskRCNNFPNFeatureExtractor(nn.Module):
return x
_ROI_MASK_FEATURE_EXTRACTORS = {
"ResNet50Conv5ROIFeatureExtractor": ResNet50Conv5ROIFeatureExtractor,
"MaskRCNNFPNFeatureExtractor": MaskRCNNFPNFeatureExtractor,
}
def make_roi_mask_feature_extractor(cfg):
func = _ROI_MASK_FEATURE_EXTRACTORS[cfg.MODEL.ROI_MASK_HEAD.FEATURE_EXTRACTOR]
return func(cfg)
def make_roi_mask_feature_extractor(cfg, in_channels):
func = registry.ROI_MASK_FEATURE_EXTRACTORS[
cfg.MODEL.ROI_MASK_HEAD.FEATURE_EXTRACTOR
]
return func(cfg, in_channels)
......@@ -4,21 +4,16 @@ from torch.nn import functional as F
from maskrcnn_benchmark.layers import Conv2d
from maskrcnn_benchmark.layers import ConvTranspose2d
from maskrcnn_benchmark.modeling import registry
@registry.ROI_MASK_PREDICTOR.register("MaskRCNNC4Predictor")
class MaskRCNNC4Predictor(nn.Module):
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(MaskRCNNC4Predictor, self).__init__()
num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES
dim_reduced = cfg.MODEL.ROI_MASK_HEAD.CONV_LAYERS[-1]
if cfg.MODEL.ROI_HEADS.USE_FPN:
num_inputs = dim_reduced
else:
stage_index = 4
stage2_relative_factor = 2 ** (stage_index - 1)
res2_out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
num_inputs = res2_out_channels * stage2_relative_factor
num_inputs = in_channels
self.conv5_mask = ConvTranspose2d(num_inputs, dim_reduced, 2, 2, 0)
self.mask_fcn_logits = Conv2d(dim_reduced, num_classes, 1, 1, 0)
......@@ -36,9 +31,27 @@ class MaskRCNNC4Predictor(nn.Module):
return self.mask_fcn_logits(x)
_ROI_MASK_PREDICTOR = {"MaskRCNNC4Predictor": MaskRCNNC4Predictor}
@registry.ROI_MASK_PREDICTOR.register("MaskRCNNConv1x1Predictor")
class MaskRCNNConv1x1Predictor(nn.Module):
def __init__(self, cfg, in_channels):
super(MaskRCNNConv1x1Predictor, self).__init__()
num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES
num_inputs = in_channels
self.mask_fcn_logits = Conv2d(num_inputs, num_classes, 1, 1, 0)
for name, param in self.named_parameters():
if "bias" in name:
nn.init.constant_(param, 0)
elif "weight" in name:
# Caffe2 implementation uses MSRAFill, which in fact
# corresponds to kaiming_normal_ in PyTorch
nn.init.kaiming_normal_(param, mode="fan_out", nonlinearity="relu")
def forward(self, x):
return self.mask_fcn_logits(x)
def make_roi_mask_predictor(cfg):
func = _ROI_MASK_PREDICTOR[cfg.MODEL.ROI_MASK_HEAD.PREDICTOR]
return func(cfg)
def make_roi_mask_predictor(cfg, in_channels):
func = registry.ROI_MASK_PREDICTOR[cfg.MODEL.ROI_MASK_HEAD.PREDICTOR]
return func(cfg, in_channels)
......@@ -55,7 +55,7 @@ class CombinedROIHeads(torch.nn.ModuleDict):
return x, detections, losses
def build_roi_heads(cfg):
def build_roi_heads(cfg, in_channels):
# individually create the heads, that will be combined together
# afterwards
roi_heads = []
......@@ -63,11 +63,11 @@ def build_roi_heads(cfg):
return []
if not cfg.MODEL.RPN_ONLY:
roi_heads.append(("box", build_roi_box_head(cfg)))
roi_heads.append(("box", build_roi_box_head(cfg, in_channels)))
if cfg.MODEL.MASK_ON:
roi_heads.append(("mask", build_roi_mask_head(cfg)))
roi_heads.append(("mask", build_roi_mask_head(cfg, in_channels)))
if cfg.MODEL.KEYPOINT_ON:
roi_heads.append(("keypoint", build_roi_keypoint_head(cfg)))
roi_heads.append(("keypoint", build_roi_keypoint_head(cfg, in_channels)))
# combine individual heads in a single module
if roi_heads:
......
......@@ -15,7 +15,7 @@ class RetinaNetHead(torch.nn.Module):
Adds a RetinaNet head with classification and regression heads
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
"""
Arguments:
in_channels (int): number of channels of the input feature
......@@ -24,7 +24,6 @@ class RetinaNetHead(torch.nn.Module):
super(RetinaNetHead, self).__init__()
# TODO: Implement the sigmoid version first.
num_classes = cfg.MODEL.RETINANET.NUM_CLASSES - 1
in_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
num_anchors = len(cfg.MODEL.RETINANET.ASPECT_RATIOS) \
* cfg.MODEL.RETINANET.SCALES_PER_OCTAVE
......@@ -92,13 +91,13 @@ class RetinaNetModule(torch.nn.Module):
RetinaNet outputs and losses. Only Test on FPN now.
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(RetinaNetModule, self).__init__()
self.cfg = cfg.clone()
anchor_generator = make_anchor_generator_retinanet(cfg)
head = RetinaNetHead(cfg)
head = RetinaNetHead(cfg, in_channels)
box_coder = BoxCoder(weights=(10., 10., 5., 5.))
box_selector_test = make_retinanet_postprocessor(cfg, box_coder, is_train=False)
......@@ -149,5 +148,5 @@ class RetinaNetModule(torch.nn.Module):
return boxes, {}
def build_retinanet(cfg):
return RetinaNetModule(cfg)
def build_retinanet(cfg, in_channels):
return RetinaNetModule(cfg, in_channels)
......@@ -10,6 +10,66 @@ from .loss import make_rpn_loss_evaluator
from .anchor_generator import make_anchor_generator
from .inference import make_rpn_postprocessor
class RPNHeadConvRegressor(nn.Module):
"""
A simple RPN Head for classification and bbox regression
"""
def __init__(self, cfg, in_channels, num_anchors):
"""
Arguments:
cfg : config
in_channels (int): number of channels of the input feature
num_anchors (int): number of anchors to be predicted
"""
super(RPNHeadConvRegressor, self).__init__()
self.cls_logits = nn.Conv2d(in_channels, num_anchors, kernel_size=1, stride=1)
self.bbox_pred = nn.Conv2d(
in_channels, num_anchors * 4, kernel_size=1, stride=1
)
for l in [self.cls_logits, self.bbox_pred]:
torch.nn.init.normal_(l.weight, std=0.01)
torch.nn.init.constant_(l.bias, 0)
def forward(self, x):
assert isinstance(x, (list, tuple))
logits = [self.cls_logits(y) for y in x]
bbox_reg = [self.bbox_pred(y) for y in x]
return logits, bbox_reg
class RPNHeadFeatureSingleConv(nn.Module):
"""
Adds a simple RPN Head with one conv to extract the feature
"""
def __init__(self, cfg, in_channels):
"""
Arguments:
cfg : config
in_channels (int): number of channels of the input feature
"""
super(RPNHeadFeatureSingleConv, self).__init__()
self.conv = nn.Conv2d(
in_channels, in_channels, kernel_size=3, stride=1, padding=1
)
for l in [self.conv]:
torch.nn.init.normal_(l.weight, std=0.01)
torch.nn.init.constant_(l.bias, 0)
self.out_channels = in_channels
def forward(self, x):
assert isinstance(x, (list, tuple))
x = [F.relu(self.conv(z)) for z in x]
return x
@registry.RPN_HEADS.register("SingleConvRPNHead")
class RPNHead(nn.Module):
"""
......@@ -52,14 +112,13 @@ class RPNModule(torch.nn.Module):
proposals and losses. Works for both FPN and non-FPN.
"""
def __init__(self, cfg):
def __init__(self, cfg, in_channels):
super(RPNModule, self).__init__()
self.cfg = cfg.clone()
anchor_generator = make_anchor_generator(cfg)
in_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
rpn_head = registry.RPN_HEADS[cfg.MODEL.RPN.RPN_HEAD]
head = rpn_head(
cfg, in_channels, anchor_generator.num_anchors_per_location()[0]
......@@ -138,11 +197,11 @@ class RPNModule(torch.nn.Module):
return boxes, {}
def build_rpn(cfg):
def build_rpn(cfg, in_channels):
"""
Builds the region proposal network: RetinaNet when cfg.MODEL.RETINANET_ON is set, otherwise the standard RPNModule.
"""
if cfg.MODEL.RETINANET_ON:
return build_retinanet(cfg)
return build_retinanet(cfg, in_channels)
return RPNModule(cfg)
return RPNModule(cfg, in_channels)
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import os
def get_config_root_path():
''' Path to configs for unit tests '''
# cur_file_dir is root/tests/env_tests
cur_file_dir = os.path.dirname(os.path.abspath(os.path.realpath(__file__)))
ret = os.path.dirname(os.path.dirname(cur_file_dir))
ret = os.path.join(ret, "configs")
return ret
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register backbones
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
BACKBONE_CFGS = {
"R-50-FPN": "e2e_faster_rcnn_R_50_FPN_1x.yaml",
"R-101-FPN": "e2e_faster_rcnn_R_101_FPN_1x.yaml",
"R-152-FPN": "e2e_faster_rcnn_R_101_FPN_1x.yaml",
"R-50-FPN-RETINANET": "retinanet/retinanet_R-50-FPN_1x.yaml",
"R-101-FPN-RETINANET": "retinanet/retinanet_R-101-FPN_1x.yaml",
}
class TestBackbones(unittest.TestCase):
def test_build_backbones(self):
''' Make sure backbones run '''
self.assertGreater(len(registry.BACKBONES), 0)
for name, backbone_builder in registry.BACKBONES.items():
print('Testing {}...'.format(name))
if name in BACKBONE_CFGS:
cfg = load_config(BACKBONE_CFGS[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
backbone = backbone_builder(cfg)
# make sure the backbone has `out_channels`
self.assertIsNotNone(
getattr(backbone, 'out_channels', None),
'Need to provide out_channels for backbone {}'.format(name)
)
N, C_in, H, W = 2, 3, 224, 256
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
out = backbone(input)
for cur_out in out:
self.assertEqual(
cur_out.shape[:2],
torch.Size([N, backbone.out_channels])
)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import numpy as np
import torch
from maskrcnn_benchmark.modeling.box_coder import BoxCoder
class TestBoxCoder(unittest.TestCase):
def test_box_decoder(self):
""" Match unit test UtilsBoxesTest.TestBboxTransformRandom in
caffe2/operators/generate_proposals_op_util_boxes_test.cc
"""
box_coder = BoxCoder(weights=(1.0, 1.0, 1.0, 1.0))
bbox = torch.from_numpy(
np.array(
[
175.62031555,
20.91103172,
253.352005,
155.0145874,
169.24636841,
4.85241556,
228.8605957,
105.02092743,
181.77426147,
199.82876587,
192.88427734,
214.0255127,
174.36262512,
186.75761414,
296.19091797,
231.27906799,
22.73153877,
92.02596283,
135.5695343,
208.80291748,
]
)
.astype(np.float32)
.reshape(-1, 4)
)
deltas = torch.from_numpy(
np.array(
[
0.47861834,
0.13992102,
0.14961673,
0.71495209,
0.29915856,
-0.35664671,
0.89018666,
0.70815367,
-0.03852064,
0.44466892,
0.49492538,
0.71409376,
0.28052918,
0.02184832,
0.65289006,
1.05060139,
-0.38172557,
-0.08533806,
-0.60335309,
0.79052375,
]
)
.astype(np.float32)
.reshape(-1, 4)
)
gt_bbox = (
np.array(
[
206.949539,
-30.715202,
297.387665,
244.448486,
143.871216,
-83.342888,
290.502289,
121.053398,
177.430283,
198.666245,
196.295273,
228.703079,
152.251892,
145.431564,
387.215454,
274.594238,
5.062420,
11.040955,
66.328903,
269.686218,
]
)
.astype(np.float32)
.reshape(-1, 4)
)
results = box_coder.decode(deltas, bbox)
np.testing.assert_allclose(results.detach().numpy(), gt_bbox, atol=1e-4)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import glob
import os
import utils
class TestConfigs(unittest.TestCase):
def test_configs_load(self):
''' Make sure configs are loadable '''
cfg_root_path = utils.get_config_root_path()
files = glob.glob(
os.path.join(cfg_root_path, "./**/*.yaml"), recursive=True)
self.assertGreater(len(files), 0)
for fn in files:
print('Loading {}...'.format(fn))
utils.load_config_from_file(fn)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import numpy as np
import torch
import maskrcnn_benchmark.modeling.backbone.fbnet_builder as fbnet_builder
TEST_CUDA = torch.cuda.is_available()
def _test_primitive(self, device, op_name, op_func, N, C_in, C_out, expand, stride):
op = op_func(C_in, C_out, expand, stride).to(device)
input = torch.rand([N, C_in, 7, 7], dtype=torch.float32).to(device)
output = op(input)
self.assertEqual(
output.shape[:2], torch.Size([N, C_out]),
'Primitive {} failed for shape {}.'.format(op_name, input.shape)
)
class TestFBNetBuilder(unittest.TestCase):
def test_identity(self):
id_op = fbnet_builder.Identity(20, 20, 1)
input = torch.rand([10, 20, 7, 7], dtype=torch.float32)
output = id_op(input)
np.testing.assert_array_equal(np.array(input), np.array(output))
id_op = fbnet_builder.Identity(20, 40, 2)
input = torch.rand([10, 20, 7, 7], dtype=torch.float32)
output = id_op(input)
np.testing.assert_array_equal(output.shape, [10, 40, 4, 4])
def test_primitives(self):
''' Make sure the primitives run '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
_test_primitive(
self, "cpu",
op_name, op_func,
N=20, C_in=16, C_out=32, expand=4, stride=1
)
@unittest.skipIf(not TEST_CUDA, "no CUDA detected")
def test_primitives_cuda(self):
''' Make sure the primitives run on cuda '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
_test_primitive(
self, "cuda",
op_name, op_func,
N=20, C_in=16, C_out=32, expand=4, stride=1
)
def test_primitives_empty_batch(self):
''' Make sure the primitives run with an empty batch '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
# test empty batch size
_test_primitive(
self, "cpu",
op_name, op_func,
N=0, C_in=16, C_out=32, expand=4, stride=1
)
@unittest.skipIf(not TEST_CUDA, "no CUDA detected")
def test_primitives_cuda_empty_batch(self):
''' Make sure the primitives run on cuda with an empty batch '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
# test empty batch size
_test_primitive(
self, "cuda",
op_name, op_func,
N=0, C_in=16, C_out=32, expand=4, stride=1
)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register feature extractors
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling.roi_heads.roi_heads import build_roi_heads # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.structures.bounding_box import BoxList
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
FEATURE_EXTRACTORS_CFGS = {
}
# overwrite configs if specified, otherwise default config is used
FEATURE_EXTRACTORS_INPUT_CHANNELS = {
# in_channels is not used by this extractor; it is derived from the config
"ResNet50Conv5ROIFeatureExtractor": 1024,
}
def _test_feature_extractors(
self, extractors, overwrite_cfgs, overwrite_in_channels
):
''' Make sure roi feature extractors run '''
self.assertGreater(len(extractors), 0)
in_channels_default = 64
for name, builder in extractors.items():
print('Testing {}...'.format(name))
if name in overwrite_cfgs:
cfg = load_config(overwrite_cfgs[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
in_channels = overwrite_in_channels.get(
name, in_channels_default)
fe = builder(cfg, in_channels)
self.assertIsNotNone(
getattr(fe, 'out_channels', None),
'Need to provide out_channels for feature extractor {}'.format(name)
)
N, C_in, H, W = 2, in_channels, 24, 32
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
bboxes = [[1, 1, 10, 10], [5, 5, 8, 8], [2, 2, 3, 4]]
img_size = [384, 512]
box_list = BoxList(bboxes, img_size, "xyxy")
out = fe([input], [box_list] * N)
self.assertEqual(
out.shape[:2],
torch.Size([N * len(bboxes), fe.out_channels])
)
class TestFeatureExtractors(unittest.TestCase):
def test_roi_box_feature_extractors(self):
''' Make sure roi box feature extractors run '''
_test_feature_extractors(
self,
registry.ROI_BOX_FEATURE_EXTRACTORS,
FEATURE_EXTRACTORS_CFGS,
FEATURE_EXTRACTORS_INPUT_CHANNELS,
)
def test_roi_keypoints_feature_extractors(self):
''' Make sure roi keypoint feature extractors run '''
_test_feature_extractors(
self,
registry.ROI_KEYPOINT_FEATURE_EXTRACTORS,
FEATURE_EXTRACTORS_CFGS,
FEATURE_EXTRACTORS_INPUT_CHANNELS,
)
def test_roi_mask_feature_extractors(self):
''' Make sure roi mask feature extractors run '''
_test_feature_extractors(
self,
registry.ROI_MASK_FEATURE_EXTRACTORS,
FEATURE_EXTRACTORS_CFGS,
FEATURE_EXTRACTORS_INPUT_CHANNELS,
)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import numpy as np
import torch
from maskrcnn_benchmark.layers import nms as box_nms
class TestNMS(unittest.TestCase):
def test_nms_cpu(self):
""" Match unit test UtilsNMSTest.TestNMS in
caffe2/operators/generate_proposals_op_util_nms_test.cc
"""
inputs = (
np.array(
[
10,
10,
50,
60,
0.5,
11,
12,
48,
60,
0.7,
8,
9,
40,
50,
0.6,
100,
100,
150,
140,
0.9,
99,
110,
155,
139,
0.8,
]
)
.astype(np.float32)
.reshape(-1, 5)
)
boxes = torch.from_numpy(inputs[:, :4])
scores = torch.from_numpy(inputs[:, 4])
test_thresh = [0.1, 0.3, 0.5, 0.8, 0.9]
gt_indices = [[1, 3], [1, 3], [1, 3], [1, 2, 3, 4], [0, 1, 2, 3, 4]]
for thresh, gt_index in zip(test_thresh, gt_indices):
keep_indices = box_nms(boxes, scores, thresh)
keep_indices = np.sort(keep_indices)
np.testing.assert_array_equal(keep_indices, np.array(gt_index))
def test_nms1_cpu(self):
""" Match unit test UtilsNMSTest.TestNMS1 in
caffe2/operators/generate_proposals_op_util_nms_test.cc
"""
boxes = torch.from_numpy(
np.array(
[
[350.9821, 161.8200, 369.9685, 205.2372],
[250.5236, 154.2844, 274.1773, 204.9810],
[471.4920, 160.4118, 496.0094, 213.4244],
[352.0421, 164.5933, 366.4458, 205.9624],
[166.0765, 169.7707, 183.0102, 232.6606],
[252.3000, 183.1449, 269.6541, 210.6747],
[469.7862, 162.0192, 482.1673, 187.0053],
[168.4862, 174.2567, 181.7437, 232.9379],
[470.3290, 162.3442, 496.4272, 214.6296],
[251.0450, 155.5911, 272.2693, 203.3675],
[252.0326, 154.7950, 273.7404, 195.3671],
[351.7479, 161.9567, 370.6432, 204.3047],
[496.3306, 161.7157, 515.0573, 210.7200],
[471.0749, 162.6143, 485.3374, 207.3448],
[250.9745, 160.7633, 264.1924, 206.8350],
[470.4792, 169.0351, 487.1934, 220.2984],
[474.4227, 161.9546, 513.1018, 215.5193],
[251.9428, 184.1950, 262.6937, 207.6416],
[252.6623, 175.0252, 269.8806, 213.7584],
[260.9884, 157.0351, 288.3554, 206.6027],
[251.3629, 164.5101, 263.2179, 202.4203],
[471.8361, 190.8142, 485.6812, 220.8586],
[248.6243, 156.9628, 264.3355, 199.2767],
[495.1643, 158.0483, 512.6261, 184.4192],
[376.8718, 168.0144, 387.3584, 201.3210],
[122.9191, 160.7433, 172.5612, 231.3837],
[350.3857, 175.8806, 366.2500, 205.4329],
[115.2958, 162.7822, 161.9776, 229.6147],
[168.4375, 177.4041, 180.8028, 232.4551],
[169.7939, 184.4330, 181.4767, 232.1220],
[347.7536, 175.9356, 355.8637, 197.5586],
[495.5434, 164.6059, 516.4031, 207.7053],
[172.1216, 194.6033, 183.1217, 235.2653],
[264.2654, 181.5540, 288.4626, 214.0170],
[111.7971, 183.7748, 137.3745, 225.9724],
[253.4919, 186.3945, 280.8694, 210.0731],
[165.5334, 169.7344, 185.9159, 232.8514],
[348.3662, 184.5187, 354.9081, 201.4038],
[164.6562, 162.5724, 186.3108, 233.5010],
[113.2999, 186.8410, 135.8841, 219.7642],
[117.0282, 179.8009, 142.5375, 221.0736],
[462.1312, 161.1004, 495.3576, 217.2208],
[462.5800, 159.9310, 501.2937, 224.1655],
[503.5242, 170.0733, 518.3792, 209.0113],
[250.3658, 195.5925, 260.6523, 212.4679],
[108.8287, 163.6994, 146.3642, 229.7261],
[256.7617, 187.3123, 288.8407, 211.2013],
[161.2781, 167.4801, 186.3751, 232.7133],
[115.3760, 177.5859, 163.3512, 236.9660],
[248.9077, 188.0919, 264.8579, 207.9718],
[108.1349, 160.7851, 143.6370, 229.6243],
[465.0900, 156.7555, 490.3561, 213.5704],
[107.5338, 173.4323, 141.0704, 235.2910],
]
).astype(np.float32)
)
scores = torch.from_numpy(
np.array(
[
0.1919,
0.3293,
0.0860,
0.1600,
0.1885,
0.4297,
0.0974,
0.2711,
0.1483,
0.1173,
0.1034,
0.2915,
0.1993,
0.0677,
0.3217,
0.0966,
0.0526,
0.5675,
0.3130,
0.1592,
0.1353,
0.0634,
0.1557,
0.1512,
0.0699,
0.0545,
0.2692,
0.1143,
0.0572,
0.1990,
0.0558,
0.1500,
0.2214,
0.1878,
0.2501,
0.1343,
0.0809,
0.1266,
0.0743,
0.0896,
0.0781,
0.0983,
0.0557,
0.0623,
0.5808,
0.3090,
0.1050,
0.0524,
0.0513,
0.4501,
0.4167,
0.0623,
0.1749,
]
).astype(np.float32)
)
gt_indices = np.array(
[
1,
6,
7,
8,
11,
12,
13,
14,
17,
18,
19,
21,
23,
24,
25,
26,
30,
32,
33,
34,
35,
37,
43,
44,
47,
50,
]
)
keep_indices = box_nms(boxes, scores, 0.5)
keep_indices = np.sort(keep_indices)
np.testing.assert_array_equal(keep_indices, gt_indices)
if __name__ == "__main__":
unittest.main()
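The test above checks `box_nms` against hand-verified keep indices. For reference, the greedy algorithm it exercises can be sketched in pure NumPy (a hypothetical standalone version for illustration; the library's actual kernel is implemented in C++/CUDA and may differ in details such as +1 coordinate offsets):

```python
import numpy as np

def nms_numpy(boxes, scores, iou_threshold):
    """Greedy NMS sketch: keep the highest-scoring box, suppress all
    remaining boxes whose IoU with it exceeds the threshold, repeat."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the kept box with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop suppressed boxes, continue with the survivors
        order = order[1:][iou <= iou_threshold]
    return np.array(keep)
```

With two heavily overlapping boxes and one disjoint box, only the higher-scoring overlapping box and the disjoint box survive.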
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register predictors
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling.roi_heads.roi_heads import build_roi_heads # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
PREDICTOR_CFGS = {
}
# overwrite configs if specified, otherwise default config is used
PREDICTOR_INPUT_CHANNELS = {
}
def _test_predictors(
self, predictors, overwrite_cfgs, overwrite_in_channels,
hwsize,
):
''' Make sure predictors run '''
self.assertGreater(len(predictors), 0)
in_channels_default = 64
for name, builder in predictors.items():
print('Testing {}...'.format(name))
if name in overwrite_cfgs:
cfg = load_config(overwrite_cfgs[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
in_channels = overwrite_in_channels.get(
name, in_channels_default)
fe = builder(cfg, in_channels)
N, C_in, H, W = 2, in_channels, hwsize, hwsize
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
out = fe(input)
yield input, out, cfg
class TestPredictors(unittest.TestCase):
def test_roi_box_predictors(self):
''' Make sure roi box predictors run '''
for cur_in, cur_out, cur_cfg in _test_predictors(
self,
registry.ROI_BOX_PREDICTOR,
PREDICTOR_CFGS,
PREDICTOR_INPUT_CHANNELS,
hwsize=1,
):
self.assertEqual(len(cur_out), 2)
scores, bbox_deltas = cur_out[0], cur_out[1]
self.assertEqual(
scores.shape[1], cur_cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES)
self.assertEqual(scores.shape[0], cur_in.shape[0])
self.assertEqual(scores.shape[0], bbox_deltas.shape[0])
self.assertEqual(scores.shape[1] * 4, bbox_deltas.shape[1])
def test_roi_keypoints_predictors(self):
''' Make sure roi keypoint predictors run '''
for cur_in, cur_out, cur_cfg in _test_predictors(
self,
registry.ROI_KEYPOINT_PREDICTOR,
PREDICTOR_CFGS,
PREDICTOR_INPUT_CHANNELS,
hwsize=14,
):
self.assertEqual(cur_out.shape[0], cur_in.shape[0])
self.assertEqual(
cur_out.shape[1], cur_cfg.MODEL.ROI_KEYPOINT_HEAD.NUM_CLASSES)
def test_roi_mask_predictors(self):
''' Make sure roi mask predictors run '''
for cur_in, cur_out, cur_cfg in _test_predictors(
self,
registry.ROI_MASK_PREDICTOR,
PREDICTOR_CFGS,
PREDICTOR_INPUT_CHANNELS,
hwsize=14,
):
self.assertEqual(cur_out.shape[0], cur_in.shape[0])
self.assertEqual(
cur_out.shape[1], cur_cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES)
if __name__ == "__main__":
unittest.main()
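The shape contract that `test_roi_box_predictors` verifies (per-class scores of width `NUM_CLASSES`, box deltas of width `4 * NUM_CLASSES`) can be illustrated with a minimal hypothetical predictor; the names here are assumptions for illustration, not a registered head. Note the adaptive average pool, matching this commit's support for arbitrary feature map sizes:

```python
import torch
import torch.nn as nn

class TinyBoxPredictor(nn.Module):
    """Sketch of the box-predictor contract: given pooled ROI features
    [N, C_in, H, W], return (scores [N, num_classes],
    bbox_deltas [N, num_classes * 4])."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        # Adaptive pool handles any input feature map size
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.cls_score = nn.Linear(in_channels, num_classes)
        self.bbox_pred = nn.Linear(in_channels, num_classes * 4)

    def forward(self, x):
        x = self.avgpool(x).flatten(1)  # [N, C_in]
        return self.cls_score(x), self.bbox_pred(x)
```

This is exactly the relationship the assertions above encode: `scores.shape[1] * 4 == bbox_deltas.shape[1]`, with the batch dimension preserved end to end.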
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register rpn heads

from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling.rpn.rpn import build_rpn # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
RPN_CFGS = {
}
class TestRPNHeads(unittest.TestCase):
def test_build_rpn_heads(self):
''' Make sure rpn heads run '''
self.assertGreater(len(registry.RPN_HEADS), 0)
in_channels = 64
num_anchors = 10
for name, builder in registry.RPN_HEADS.items():
print('Testing {}...'.format(name))
if name in RPN_CFGS:
cfg = load_config(RPN_CFGS[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
rpn = builder(cfg, in_channels, num_anchors)
N, C_in, H, W = 2, in_channels, 24, 32
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
LAYERS = 3
out = rpn([input] * LAYERS)
self.assertEqual(len(out), 2)
logits, bbox_reg = out
for idx in range(LAYERS):
self.assertEqual(
logits[idx].shape,
torch.Size([
input.shape[0], num_anchors,
input.shape[2], input.shape[3],
])
)
self.assertEqual(
bbox_reg[idx].shape,
torch.Size([
logits[idx].shape[0], num_anchors * 4,
logits[idx].shape[2], logits[idx].shape[3],
]),
)
if __name__ == "__main__":
unittest.main()
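The per-level shapes asserted above (objectness logits `[N, A, H, W]` and box regression `[N, 4A, H, W]` for each feature map) can be reproduced with a minimal head in the style of the registered RPN heads; this is a hypothetical sketch for illustration, not one of the entries in `registry.RPN_HEADS`:

```python
import torch
import torch.nn as nn

class TinyRPNHead(nn.Module):
    """Minimal RPN head sketch: a shared 3x3 conv followed by two 1x1
    convs producing per-anchor objectness logits and 4 box deltas."""
    def __init__(self, in_channels, num_anchors):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, 3, padding=1)
        self.cls_logits = nn.Conv2d(in_channels, num_anchors, 1)
        self.bbox_pred = nn.Conv2d(in_channels, num_anchors * 4, 1)

    def forward(self, features):
        # One (logits, bbox_reg) pair per input feature level
        logits, bbox_reg = [], []
        for x in features:
            t = torch.relu(self.conv(x))
            logits.append(self.cls_logits(t))
            bbox_reg.append(self.bbox_pred(t))
        return logits, bbox_reg
```

Because the 3x3 conv uses padding 1 and the 1x1 convs preserve spatial size, each output level keeps the H and W of its input level, which is what the shape assertions in the test rely on.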
from __future__ import absolute_import, division, print_function, unicode_literals
# Set up custom environment before nearly anything else is imported
# NOTE: this should be the first import (do not reorder)
from maskrcnn_benchmark.utils.env import setup_environment # noqa F401 isort:skip
import env_tests.env as env_tests
import os
import copy
from maskrcnn_benchmark.config import cfg as g_cfg
def get_config_root_path():
return env_tests.get_config_root_path()
def load_config(rel_path):
''' Load config from file path specified as path relative to config_root '''
cfg_path = os.path.join(env_tests.get_config_root_path(), rel_path)
return load_config_from_file(cfg_path)
def load_config_from_file(file_path):
''' Load config from file path specified as absolute path '''
ret = copy.deepcopy(g_cfg)
ret.merge_from_file(file_path)
return ret
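Both helpers above follow the same deep-copy-then-merge pattern so that no test mutates the shared global config. The pattern can be sketched self-contained with a plain dict standing in for the yacs `CfgNode` (the dict and `merge_into` helper are assumptions for illustration):

```python
import copy

# Stand-in for the global default config (g_cfg in the real code)
G_CFG = {"MODEL": {"RPN": {"USE_FPN": False}, "DEVICE": "cuda"}}

def merge_into(base, overrides):
    """Recursively merge overrides into base, mirroring the spirit of
    CfgNode.merge_from_file."""
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_into(base[key], value)
        else:
            base[key] = value

def load_config(overrides):
    """Deep-copy the defaults first so callers never mutate G_CFG."""
    ret = copy.deepcopy(G_CFG)
    merge_into(ret, overrides)
    return ret
```

Without the `deepcopy`, an override applied in one test would leak into every later test that falls back to the default config.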