Commit b23eee0c authored by Stzpz, committed by Francisco Massa

Supported FBNet architecture. (#463)

* Supported any feature map size for average pooling, since different models may produce feature maps of different sizes.

* Used the registry to register keypoint and mask heads.

* Passed in/out channels between modules when creating the model. This simplifies the code that computes the input channels for feature extractors and makes the predictors independent of the backbone architecture.
* Passed in_channels to the RPN and head builders.
* Set out_channels on model modules, including the backbone and feature extractors.
* Moved cfg.MODEL.BACKBONE.OUT_CHANNELS to cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS, since it is not used by all architectures. Updated the config files accordingly.

For new architecture modules, the returned module needs to expose a field called `out_channels` that indicates its output channel size, as sketched below.
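A minimal sketch of that contract for a hypothetical custom backbone (the registry name and layer width below are made up for illustration; the pattern mirrors `build_resnet_backbone` in this diff):

```python
from collections import OrderedDict

import torch.nn as nn

from maskrcnn_benchmark.modeling import registry


@registry.BACKBONES.register("MyTinyBackbone")  # hypothetical name, illustration only
def build_my_tiny_backbone(cfg):
    body = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1),
        nn.ReLU(inplace=True),
    )
    model = nn.Sequential(OrderedDict([("body", body)]))
    # Downstream builders (RPN, ROI heads) read this attribute instead of a
    # hard-coded cfg.MODEL.BACKBONE.OUT_CHANNELS value.
    model.out_channels = 64
    return model
```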

* Added unit tests for box_coder and nms.
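A hedged sketch of the kind of checks intended, assuming the compiled `nms` extension is available and the `BoxCoder`/`nms` signatures used elsewhere in this repo (the actual tests live under `tests/` and may differ):

```python
import torch

from maskrcnn_benchmark.layers import nms
from maskrcnn_benchmark.modeling.box_coder import BoxCoder


def test_nms_suppresses_overlapping_box():
    boxes = torch.tensor(
        [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0], [50.0, 50.0, 60.0, 60.0]]
    )
    scores = torch.tensor([0.9, 0.8, 0.7])
    # The second box overlaps the first with IoU > 0.5 and should be suppressed.
    keep = nms(boxes, scores, 0.5)
    assert keep.tolist() == [0, 2]


def test_box_coder_roundtrip():
    coder = BoxCoder(weights=(10.0, 10.0, 5.0, 5.0))
    proposals = torch.tensor([[0.0, 0.0, 10.0, 10.0]])
    gt_boxes = torch.tensor([[1.0, 2.0, 9.0, 12.0]])
    deltas = coder.encode(gt_boxes, proposals)
    decoded = coder.decode(deltas, proposals)
    assert torch.allclose(decoded, gt_boxes, atol=1e-4)
```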

* Added FBNet architecture.

* FBNet is a general architecture definition that supports efficient architecture search and MaskRCNN2GO.
* Included various efficient building blocks (inverted residual, shuffle, separate dw conv, dw upsampling, etc.).
* Supported building the backbone, RPN, detection, keypoint, and mask heads from these building blocks.
* Architectures can be defined in `fbnet_modeldef.py` or passed directly via `cfg.MODEL.FBNET.ARCH_DEF` (see the sketch below).
* A few baseline architectures are included.
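A hedged sketch of the second option, passing the architecture directly through the config. The dict below is a placeholder rather than a tested architecture; the exact schema is defined by `fbnet_builder.py`/`fbnet_modeldef.py`, and the arch name must not collide with an entry in `MODEL_ARCH`:

```python
import json

from maskrcnn_benchmark.config import cfg

# Placeholder architecture definition (same shape as the entries in
# fbnet_modeldef.py); not a tuned or tested architecture.
arch_def = {
    "block_op_type": [["ir_k3"]],
    "block_cfg": {
        "first": [32, 2],
        "stages": [[[1, 16, 1, 1]]],  # [t, c, n, s]
        "last": [1280, 0.0],
        "backbone": [0],
    },
}

cfg.merge_from_list([
    "MODEL.BACKBONE.CONV_BODY", "FBNet",
    "MODEL.FBNET.ARCH", "my_custom_arch",          # hypothetical name, not in MODEL_ARCH
    "MODEL.FBNET.ARCH_DEF", json.dumps(arch_def),  # ARCH_DEF is passed as a JSON string
])
```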

* Added various unit tests.

* build and run backbones.
* build and run feature extractors.
* build and run predictors.

* Added a unit test to verify all config files are loadable.
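Such a test can be roughly as simple as cloning the default config and merging every YAML file under `configs/` (a sketch; the path and the actual test added here may differ):

```python
import glob
import os

from maskrcnn_benchmark.config import cfg


def test_all_configs_are_loadable(config_root="configs"):  # assumed location
    for path in glob.glob(os.path.join(config_root, "**", "*.yaml"), recursive=True):
        local_cfg = cfg.clone()
        # merge_from_file raises if a key is unknown or has the wrong type.
        local_cfg.merge_from_file(path)
```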
parent 192261db
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/35857890/e2e_faster_rcnn_R-101-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/35857345/e2e_faster_rcnn_R-50-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/36761737/e2e_faster_rcnn_X-101-32x8d-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/37697547/e2e_keypoint_rcnn_R-50-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/35861795/e2e_mask_rcnn_R-101-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/35858933/e2e_mask_rcnn_R-50-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/37129812/e2e_mask_rcnn_X-152-32x8d-FPN-IN5k_1.44x"
   BACKBONE:
     CONV_BODY: "R-152-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,11 @@ MODEL:
   WEIGHT: "catalog://Caffe2Detectron/COCO/36761843/e2e_mask_rcnn_X-101-32x8d-FPN_1x"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
+    STRIDE_IN_1X1: False
+    NUM_GROUPS: 32
+    WIDTH_PER_GROUP: 8
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
@@ -27,10 +31,6 @@ MODEL:
     POOLER_SAMPLING_RATIO: 2
     RESOLUTION: 28
     SHARE_BOX_FEATURE_EXTRACTOR: False
-  RESNETS:
-    STRIDE_IN_1X1: False
-    NUM_GROUPS: 32
-    WIDTH_PER_GROUP: 8
   MASK_ON: True
 DATASETS:
   TEST: ("coco_2014_minival",)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,6 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
@@ -20,6 +19,7 @@ MODEL:
     FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
     PREDICTOR: "FPNPredictor"
   RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     NUM_GROUPS: 32
     WIDTH_PER_GROUP: 8
...
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "default"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
RPN:
ANCHOR_SIZES: (16, 32, 64, 128, 256)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (320, )
MAX_SIZE_TRAIN: 640
MIN_SIZE_TEST: 320
MAX_SIZE_TEST: 640
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "default"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
RPN:
ANCHOR_SIZES: (32, 64, 128, 256, 512)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 256
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (600, )
MAX_SIZE_TRAIN: 1000
MIN_SIZE_TEST: 600
MAX_SIZE_TEST: 1000
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,11 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
+    STRIDE_IN_1X1: False
+    NUM_GROUPS: 32
+    WIDTH_PER_GROUP: 8
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
@@ -27,10 +31,6 @@ MODEL:
     POOLER_SAMPLING_RATIO: 2
     RESOLUTION: 28
     SHARE_BOX_FEATURE_EXTRACTOR: False
-  RESNETS:
-    STRIDE_IN_1X1: False
-    NUM_GROUPS: 32
-    WIDTH_PER_GROUP: 8
   MASK_ON: True
 DATASETS:
   TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
...
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "default"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
DET_HEAD_LAST_SCALE: -1.0
RPN:
ANCHOR_SIZES: (16, 32, 64, 128, 256)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 256
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
ROI_MASK_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head_mask
PREDICTOR: "MaskRCNNConv1x1Predictor"
RESOLUTION: 12
SHARE_BOX_FEATURE_EXTRACTOR: False
MASK_ON: True
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (320, )
MAX_SIZE_TRAIN: 640
MIN_SIZE_TEST: 320
MAX_SIZE_TEST: 640
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
MODEL:
META_ARCHITECTURE: "GeneralizedRCNN"
BACKBONE:
CONV_BODY: FBNet
FBNET:
ARCH: "xirb16d_dsmask"
BN_TYPE: "bn"
WIDTH_DIVISOR: 8
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
DET_HEAD_LAST_SCALE: -1.0
RPN:
ANCHOR_SIZES: (16, 32, 64, 128, 256)
ANCHOR_STRIDE: (16, )
BATCH_SIZE_PER_IMAGE: 256
PRE_NMS_TOP_N_TRAIN: 6000
PRE_NMS_TOP_N_TEST: 6000
POST_NMS_TOP_N_TRAIN: 2000
POST_NMS_TOP_N_TEST: 1000
RPN_HEAD: FBNet.rpn_head
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
ROI_BOX_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head
NUM_CLASSES: 81
ROI_MASK_HEAD:
POOLER_RESOLUTION: 6
FEATURE_EXTRACTOR: FBNet.roi_head_mask
PREDICTOR: "MaskRCNNConv1x1Predictor"
RESOLUTION: 12
SHARE_BOX_FEATURE_EXTRACTOR: False
MASK_ON: True
DATASETS:
TRAIN: ("coco_2014_train", "coco_2014_valminusminival")
TEST: ("coco_2014_minival",)
SOLVER:
BASE_LR: 0.06
WARMUP_FACTOR: 0.1
WEIGHT_DECAY: 0.0001
STEPS: (60000, 80000)
MAX_ITER: 90000
IMS_PER_BATCH: 128 # for 8GPUs
# TEST:
# IMS_PER_BATCH: 8
INPUT:
MIN_SIZE_TRAIN: (320, )
MAX_SIZE_TRAIN: 640
MIN_SIZE_TEST: 320
MAX_SIZE_TEST: 640
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,8 +8,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,8 +8,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,8 +8,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,8 +8,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50-GN"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,9 +8,9 @@ MODEL:
   WEIGHT: "" # no pretrained model
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
     FREEZE_CONV_BODY_AT: 0 # finetune all layers
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,9 +8,9 @@ MODEL:
   WEIGHT: "" # no pretrained model
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
     FREEZE_CONV_BODY_AT: 0 # finetune all layers
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,9 +8,9 @@ MODEL:
   WEIGHT: "" # no pretrained model
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
     FREEZE_CONV_BODY_AT: 0 # finetune all layers
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
 INPUT:
-  MIN_SIZE_TRAIN: 800
+  MIN_SIZE_TRAIN: (800,)
   MAX_SIZE_TRAIN: 1333
   MIN_SIZE_TEST: 800
   MAX_SIZE_TEST: 1333
@@ -8,9 +8,9 @@ MODEL:
   WEIGHT: "" # no pretrained model
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
     FREEZE_CONV_BODY_AT: 0 # finetune all layers
   RESNETS: # use GN for backbone
+    BACKBONE_OUT_CHANNELS: 256
     STRIDE_IN_1X1: False
     TRANS_FUNC: "BottleneckWithGN"
     STEM_FUNC: "StemWithGN"
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/MSRA/R-50"
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -3,7 +3,8 @@ MODEL:
   WEIGHT: "catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d"
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -4,7 +4,8 @@ MODEL:
   RPN_ONLY: True
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -5,7 +5,8 @@ MODEL:
   RETINANET_ON: True
   BACKBONE:
     CONV_BODY: "R-101-FPN-RETINANET"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     FG_IOU_THRESHOLD: 0.5
...
@@ -5,7 +5,8 @@ MODEL:
   RETINANET_ON: True
   BACKBONE:
     CONV_BODY: "R-101-FPN-RETINANET"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     FG_IOU_THRESHOLD: 0.5
...
@@ -5,7 +5,8 @@ MODEL:
   RETINANET_ON: True
   BACKBONE:
     CONV_BODY: "R-50-FPN-RETINANET"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     FG_IOU_THRESHOLD: 0.5
...
@@ -5,7 +5,8 @@ MODEL:
   RETINANET_ON: True
   BACKBONE:
     CONV_BODY: "R-50-FPN-RETINANET"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     FG_IOU_THRESHOLD: 0.5
...
@@ -5,7 +5,8 @@ MODEL:
   RETINANET_ON: True
   BACKBONE:
     CONV_BODY: "R-50-FPN-RETINANET"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     FG_IOU_THRESHOLD: 0.5
...
@@ -5,7 +5,8 @@ MODEL:
   RETINANET_ON: True
   BACKBONE:
     CONV_BODY: "R-101-FPN-RETINANET"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     FG_IOU_THRESHOLD: 0.5
...
@@ -4,7 +4,8 @@ MODEL:
   RPN_ONLY: True
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -4,7 +4,8 @@ MODEL:
   RPN_ONLY: True
   BACKBONE:
     CONV_BODY: "R-50-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -4,7 +4,8 @@ MODEL:
   RPN_ONLY: True
   BACKBONE:
     CONV_BODY: "R-101-FPN"
-    OUT_CHANNELS: 256
+  RESNETS:
+    BACKBONE_OUT_CHANNELS: 256
   RPN:
     USE_FPN: True
     ANCHOR_STRIDE: (4, 8, 16, 32, 64)
...
@@ -92,7 +92,6 @@ _C.MODEL.BACKBONE.CONV_BODY = "R-50-C4"
 # Add StopGrad at a specified stage so the bottom layers are frozen
 _C.MODEL.BACKBONE.FREEZE_CONV_BODY_AT = 2
-_C.MODEL.BACKBONE.OUT_CHANNELS = 256 * 4
 # GN for backbone
 _C.MODEL.BACKBONE.USE_GN = False
@@ -271,6 +270,7 @@ _C.MODEL.RESNETS.STEM_FUNC = "StemWithFixedBatchNorm"
 # Apply dilation in stage "res5"
 _C.MODEL.RESNETS.RES5_DILATION = 1
+_C.MODEL.RESNETS.BACKBONE_OUT_CHANNELS = 256 * 4
 _C.MODEL.RESNETS.RES2_OUT_CHANNELS = 256
 _C.MODEL.RESNETS.STEM_OUT_CHANNELS = 64
@@ -335,6 +335,44 @@ _C.MODEL.RETINANET.INFERENCE_TH = 0.05
 # NMS threshold used in RetinaNet
 _C.MODEL.RETINANET.NMS_TH = 0.4
# ---------------------------------------------------------------------------- #
# FBNet options
# ---------------------------------------------------------------------------- #
_C.MODEL.FBNET = CN()
_C.MODEL.FBNET.ARCH = "default"
# custom arch
_C.MODEL.FBNET.ARCH_DEF = ""
_C.MODEL.FBNET.BN_TYPE = "bn"
_C.MODEL.FBNET.SCALE_FACTOR = 1.0
# the output channels will be divisible by WIDTH_DIVISOR
_C.MODEL.FBNET.WIDTH_DIVISOR = 1
_C.MODEL.FBNET.DW_CONV_SKIP_BN = True
_C.MODEL.FBNET.DW_CONV_SKIP_RELU = True
# > 0 scale, == 0 skip, < 0 same dimension
_C.MODEL.FBNET.DET_HEAD_LAST_SCALE = 1.0
_C.MODEL.FBNET.DET_HEAD_BLOCKS = []
# overwrite the stride for the head, 0 to use original value
_C.MODEL.FBNET.DET_HEAD_STRIDE = 0
# > 0 scale, == 0 skip, < 0 same dimension
_C.MODEL.FBNET.KPTS_HEAD_LAST_SCALE = 0.0
_C.MODEL.FBNET.KPTS_HEAD_BLOCKS = []
# overwrite the stride for the head, 0 to use original value
_C.MODEL.FBNET.KPTS_HEAD_STRIDE = 0
# > 0 scale, == 0 skip, < 0 same dimension
_C.MODEL.FBNET.MASK_HEAD_LAST_SCALE = 0.0
_C.MODEL.FBNET.MASK_HEAD_BLOCKS = []
# overwrite the stride for the head, 0 to use original value
_C.MODEL.FBNET.MASK_HEAD_STRIDE = 0
# 0 to use all blocks defined in arch_def
_C.MODEL.FBNET.RPN_HEAD_BLOCKS = 0
_C.MODEL.FBNET.RPN_BN_TYPE = ""
 # ---------------------------------------------------------------------------- #
 # Solver
 # ---------------------------------------------------------------------------- #
...
@@ -4,6 +4,7 @@ import torch
 from .batch_norm import FrozenBatchNorm2d
 from .misc import Conv2d
 from .misc import ConvTranspose2d
+from .misc import BatchNorm2d
 from .misc import interpolate
 from .nms import nms
 from .roi_align import ROIAlign
@@ -15,6 +16,6 @@ from .sigmoid_focal_loss import SigmoidFocalLoss
 __all__ = ["nms", "roi_align", "ROIAlign", "roi_pool", "ROIPool",
            "smooth_l1_loss", "Conv2d", "ConvTranspose2d", "interpolate",
-           "FrozenBatchNorm2d", "SigmoidFocalLoss"
+           "BatchNorm2d", "FrozenBatchNorm2d", "SigmoidFocalLoss"
           ]
@@ -26,7 +26,6 @@ class _NewEmptyTensorOp(torch.autograd.Function):
         return _NewEmptyTensorOp.apply(grad, shape), None
 class Conv2d(torch.nn.Conv2d):
     def forward(self, x):
         if x.numel() > 0:
@@ -64,6 +63,15 @@ class ConvTranspose2d(torch.nn.ConvTranspose2d):
         return _NewEmptyTensorOp.apply(x, output_shape)
+class BatchNorm2d(torch.nn.BatchNorm2d):
+    def forward(self, x):
+        if x.numel() > 0:
+            return super(BatchNorm2d, self).forward(x)
+        # get output shape
+        output_shape = x.shape
+        return _NewEmptyTensorOp.apply(x, output_shape)
+
 def interpolate(
     input, size=None, scale_factor=None, mode="nearest", align_corners=None
 ):
...
 # Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
 from .backbone import build_backbone
+from . import fbnet
@@ -16,6 +16,7 @@ from . import resnet
 def build_resnet_backbone(cfg):
     body = resnet.ResNet(cfg)
     model = nn.Sequential(OrderedDict([("body", body)]))
+    model.out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
     return model
@@ -25,7 +26,7 @@ def build_resnet_backbone(cfg):
 def build_resnet_fpn_backbone(cfg):
     body = resnet.ResNet(cfg)
     in_channels_stage2 = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
-    out_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
+    out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
     fpn = fpn_module.FPN(
         in_channels_list=[
             in_channels_stage2,
@@ -40,14 +41,16 @@ def build_resnet_fpn_backbone(cfg):
         top_blocks=fpn_module.LastLevelMaxPool(),
     )
     model = nn.Sequential(OrderedDict([("body", body), ("fpn", fpn)]))
+    model.out_channels = out_channels
     return model
 @registry.BACKBONES.register("R-50-FPN-RETINANET")
 @registry.BACKBONES.register("R-101-FPN-RETINANET")
 def build_resnet_fpn_p3p7_backbone(cfg):
     body = resnet.ResNet(cfg)
     in_channels_stage2 = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
-    out_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
+    out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
     in_channels_p6p7 = in_channels_stage2 * 8 if cfg.MODEL.RETINANET.USE_C5 \
         else out_channels
     fpn = fpn_module.FPN(
@@ -64,8 +67,10 @@ def build_resnet_fpn_p3p7_backbone(cfg):
         top_blocks=fpn_module.LastLevelP6P7(in_channels_p6p7, out_channels),
     )
     model = nn.Sequential(OrderedDict([("body", body), ("fpn", fpn)]))
+    model.out_channels = out_channels
     return model
 def build_backbone(cfg):
     assert cfg.MODEL.BACKBONE.CONV_BODY in registry.BACKBONES, \
         "cfg.MODEL.BACKBONE.CONV_BODY: {} are not registered in registry".format(
...
from __future__ import absolute_import, division, print_function, unicode_literals
import copy
import json
import logging
from collections import OrderedDict
from . import (
fbnet_builder as mbuilder,
fbnet_modeldef as modeldef,
)
import torch.nn as nn
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.rpn import rpn
from maskrcnn_benchmark.modeling import poolers
logger = logging.getLogger(__name__)
def create_builder(cfg):
bn_type = cfg.MODEL.FBNET.BN_TYPE
if bn_type == "gn":
bn_type = (bn_type, cfg.GROUP_NORM.NUM_GROUPS)
factor = cfg.MODEL.FBNET.SCALE_FACTOR
arch = cfg.MODEL.FBNET.ARCH
arch_def = cfg.MODEL.FBNET.ARCH_DEF
if len(arch_def) > 0:
arch_def = json.loads(arch_def)
if arch in modeldef.MODEL_ARCH:
if len(arch_def) > 0:
assert (
arch_def == modeldef.MODEL_ARCH[arch]
), "Two architectures with the same name {},\n{},\n{}".format(
arch, arch_def, modeldef.MODEL_ARCH[arch]
)
arch_def = modeldef.MODEL_ARCH[arch]
else:
assert arch_def is not None and len(arch_def) > 0
arch_def = mbuilder.unify_arch_def(arch_def)
rpn_stride = arch_def.get("rpn_stride", None)
if rpn_stride is not None:
assert (
cfg.MODEL.RPN.ANCHOR_STRIDE[0] == rpn_stride
), "Needs to set cfg.MODEL.RPN.ANCHOR_STRIDE to {}, got {}".format(
rpn_stride, cfg.MODEL.RPN.ANCHOR_STRIDE
)
width_divisor = cfg.MODEL.FBNET.WIDTH_DIVISOR
dw_skip_bn = cfg.MODEL.FBNET.DW_CONV_SKIP_BN
dw_skip_relu = cfg.MODEL.FBNET.DW_CONV_SKIP_RELU
logger.info(
"Building fbnet model with arch {} (without scaling):\n{}".format(
arch, arch_def
)
)
builder = mbuilder.FBNetBuilder(
width_ratio=factor,
bn_type=bn_type,
width_divisor=width_divisor,
dw_skip_bn=dw_skip_bn,
dw_skip_relu=dw_skip_relu,
)
return builder, arch_def
def _get_trunk_cfg(arch_def):
""" Get all stages except the last one """
num_stages = mbuilder.get_num_stages(arch_def)
trunk_stages = arch_def.get("backbone", range(num_stages - 1))
ret = mbuilder.get_blocks(arch_def, stage_indices=trunk_stages)
return ret
class FBNetTrunk(nn.Module):
def __init__(
self, builder, arch_def, dim_in,
):
super(FBNetTrunk, self).__init__()
self.first = builder.add_first(arch_def["first"], dim_in=dim_in)
trunk_cfg = _get_trunk_cfg(arch_def)
self.stages = builder.add_blocks(trunk_cfg["stages"])
# return features for each stage
def forward(self, x):
y = self.first(x)
y = self.stages(y)
ret = [y]
return ret
@registry.BACKBONES.register("FBNet")
def add_conv_body(cfg, dim_in=3):
builder, arch_def = create_builder(cfg)
body = FBNetTrunk(builder, arch_def, dim_in)
model = nn.Sequential(OrderedDict([("body", body)]))
model.out_channels = builder.last_depth
return model
def _get_rpn_stage(arch_def, num_blocks):
rpn_stage = arch_def.get("rpn")
ret = mbuilder.get_blocks(arch_def, stage_indices=rpn_stage)
if num_blocks > 0:
logger.warn('Use last {} blocks in {} as rpn'.format(num_blocks, ret))
block_count = len(ret["stages"])
assert num_blocks <= block_count, "use block {}, block count {}".format(
num_blocks, block_count
)
blocks = range(block_count - num_blocks, block_count)
ret = mbuilder.get_blocks(ret, block_indices=blocks)
return ret["stages"]
class FBNetRPNHead(nn.Module):
def __init__(
self, cfg, in_channels, builder, arch_def,
):
super(FBNetRPNHead, self).__init__()
assert in_channels == builder.last_depth
rpn_bn_type = cfg.MODEL.FBNET.RPN_BN_TYPE
if len(rpn_bn_type) > 0:
builder.bn_type = rpn_bn_type
use_blocks = cfg.MODEL.FBNET.RPN_HEAD_BLOCKS
stages = _get_rpn_stage(arch_def, use_blocks)
self.head = builder.add_blocks(stages)
self.out_channels = builder.last_depth
def forward(self, x):
x = [self.head(y) for y in x]
return x
@registry.RPN_HEADS.register("FBNet.rpn_head")
def add_rpn_head(cfg, in_channels, num_anchors):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
assert in_channels == builder.last_depth
# builder.name_prefix = "[rpn]"
rpn_feature = FBNetRPNHead(cfg, in_channels, builder, model_arch)
rpn_regressor = rpn.RPNHeadConvRegressor(
cfg, rpn_feature.out_channels, num_anchors)
return nn.Sequential(rpn_feature, rpn_regressor)
def _get_head_stage(arch, head_name, blocks):
# use default name 'head' if the specific name 'head_name' does not existed
if head_name not in arch:
head_name = "head"
head_stage = arch.get(head_name)
ret = mbuilder.get_blocks(arch, stage_indices=head_stage, block_indices=blocks)
return ret["stages"]
# name mapping for head names in arch def and cfg
ARCH_CFG_NAME_MAPPING = {
"bbox": "ROI_BOX_HEAD",
"kpts": "ROI_KEYPOINT_HEAD",
"mask": "ROI_MASK_HEAD",
}
class FBNetROIHead(nn.Module):
def __init__(
self, cfg, in_channels, builder, arch_def,
head_name, use_blocks, stride_init, last_layer_scale,
):
super(FBNetROIHead, self).__init__()
assert in_channels == builder.last_depth
assert isinstance(use_blocks, list)
head_cfg_name = ARCH_CFG_NAME_MAPPING[head_name]
self.pooler = poolers.make_pooler(cfg, head_cfg_name)
stage = _get_head_stage(arch_def, head_name, use_blocks)
assert stride_init in [0, 1, 2]
if stride_init != 0:
stage[0]["block"][3] = stride_init
blocks = builder.add_blocks(stage)
last_info = copy.deepcopy(arch_def["last"])
last_info[1] = last_layer_scale
last = builder.add_last(last_info)
self.head = nn.Sequential(OrderedDict([
("blocks", blocks),
("last", last)
]))
# output_blob = builder.add_final_pool(
# # model, output_blob, kernel_size=cfg.FAST_RCNN.ROI_XFORM_RESOLUTION)
# model,
# output_blob,
# kernel_size=int(cfg.FAST_RCNN.ROI_XFORM_RESOLUTION / stride_init),
# )
self.out_channels = builder.last_depth
def forward(self, x, proposals):
x = self.pooler(x, proposals)
x = self.head(x)
return x
@registry.ROI_BOX_FEATURE_EXTRACTORS.register("FBNet.roi_head")
def add_roi_head(cfg, in_channels):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
# builder.name_prefix = "_[bbox]_"
return FBNetROIHead(
cfg, in_channels, builder, model_arch,
head_name="bbox",
use_blocks=cfg.MODEL.FBNET.DET_HEAD_BLOCKS,
stride_init=cfg.MODEL.FBNET.DET_HEAD_STRIDE,
last_layer_scale=cfg.MODEL.FBNET.DET_HEAD_LAST_SCALE,
)
@registry.ROI_KEYPOINT_FEATURE_EXTRACTORS.register("FBNet.roi_head_keypoints")
def add_roi_head_keypoints(cfg, in_channels):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
# builder.name_prefix = "_[kpts]_"
return FBNetROIHead(
cfg, in_channels, builder, model_arch,
head_name="kpts",
use_blocks=cfg.MODEL.FBNET.KPTS_HEAD_BLOCKS,
stride_init=cfg.MODEL.FBNET.KPTS_HEAD_STRIDE,
last_layer_scale=cfg.MODEL.FBNET.KPTS_HEAD_LAST_SCALE,
)
@registry.ROI_MASK_FEATURE_EXTRACTORS.register("FBNet.roi_head_mask")
def add_roi_head_mask(cfg, in_channels):
builder, model_arch = create_builder(cfg)
builder.last_depth = in_channels
# builder.name_prefix = "_[mask]_"
return FBNetROIHead(
cfg, in_channels, builder, model_arch,
head_name="mask",
use_blocks=cfg.MODEL.FBNET.MASK_HEAD_BLOCKS,
stride_init=cfg.MODEL.FBNET.MASK_HEAD_STRIDE,
last_layer_scale=cfg.MODEL.FBNET.MASK_HEAD_LAST_SCALE,
)
from __future__ import absolute_import, division, print_function, unicode_literals
def add_archs(archs):
global MODEL_ARCH
for x in archs:
assert x not in MODEL_ARCH, "Duplicated model name {} existed".format(x)
MODEL_ARCH[x] = archs[x]
MODEL_ARCH = {
"default": {
"block_op_type": [
# stage 0
["ir_k3"],
# stage 1
["ir_k3"] * 2,
# stage 2
["ir_k3"] * 3,
# stage 3
["ir_k3"] * 7,
# stage 4, bbox head
["ir_k3"] * 4,
# stage 5, rpn
["ir_k3"] * 3,
# stage 5, mask head
["ir_k3"] * 5,
],
"block_cfg": {
"first": [32, 2],
"stages": [
# [t, c, n, s]
# stage 0
[[1, 16, 1, 1]],
# stage 1
[[6, 24, 2, 2]],
# stage 2
[[6, 32, 3, 2]],
# stage 3
[[6, 64, 4, 2], [6, 96, 3, 1]],
# stage 4, bbox head
[[4, 160, 1, 2], [6, 160, 2, 1], [6, 240, 1, 1]],
# [[6, 160, 3, 2], [6, 320, 1, 1]],
# stage 5, rpn head
[[6, 96, 3, 1]],
# stage 6, mask head
[[4, 160, 1, 1], [6, 160, 3, 1], [3, 80, 1, -2]],
],
# [c, channel_scale]
"last": [1280, 0.0],
"backbone": [0, 1, 2, 3],
"rpn": [5],
"bbox": [4],
"mask": [6],
},
},
"xirb16d_dsmask": {
"block_op_type": [
# stage 0
["ir_k3"],
# stage 1
["ir_k3"] * 2,
# stage 2
["ir_k3"] * 3,
# stage 3
["ir_k3"] * 7,
# stage 4, bbox head
["ir_k3"] * 4,
# stage 5, mask head
["ir_k3"] * 5,
# stage 6, rpn
["ir_k3"] * 3,
],
"block_cfg": {
"first": [16, 2],
"stages": [
# [t, c, n, s]
# stage 0
[[1, 16, 1, 1]],
# stage 1
[[6, 32, 2, 2]],
# stage 2
[[6, 48, 3, 2]],
# stage 3
[[6, 96, 4, 2], [6, 128, 3, 1]],
# stage 4, bbox head
[[4, 128, 1, 2], [6, 128, 2, 1], [6, 160, 1, 1]],
# stage 5, mask head
[[4, 128, 1, 2], [6, 128, 2, 1], [6, 128, 1, -2], [3, 64, 1, -2]],
# stage 6, rpn head
[[6, 128, 3, 1]],
],
# [c, channel_scale]
"last": [1280, 0.0],
"backbone": [0, 1, 2, 3],
"rpn": [6],
"bbox": [4],
"mask": [5],
},
},
"mobilenet_v2": {
"block_op_type": [
# stage 0
["ir_k3"],
# stage 1
["ir_k3"] * 2,
# stage 2
["ir_k3"] * 3,
# stage 3
["ir_k3"] * 7,
# stage 4
["ir_k3"] * 4,
],
"block_cfg": {
"first": [32, 2],
"stages": [
# [t, c, n, s]
# stage 0
[[1, 16, 1, 1]],
# stage 1
[[6, 24, 2, 2]],
# stage 2
[[6, 32, 3, 2]],
# stage 3
[[6, 64, 4, 2], [6, 96, 3, 1]],
# stage 4
[[6, 160, 3, 1], [6, 320, 1, 1]],
],
# [c, channel_scale]
"last": [1280, 0.0],
"backbone": [0, 1, 2, 3],
"bbox": [4],
},
},
}
@@ -187,6 +187,7 @@ class ResNetHead(nn.Module):
             stride = None
             self.add_module(name, module)
             self.stages.append(name)
+        self.out_channels = out_channels
     def forward(self, x):
         for stage in self.stages:
...
@@ -27,8 +27,8 @@ class GeneralizedRCNN(nn.Module):
         super(GeneralizedRCNN, self).__init__()
         self.backbone = build_backbone(cfg)
-        self.rpn = build_rpn(cfg)
-        self.roi_heads = build_roi_heads(cfg)
+        self.rpn = build_rpn(cfg, self.backbone.out_channels)
+        self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)
     def forward(self, images, targets=None):
         """
...
@@ -119,3 +119,15 @@ class Pooler(nn.Module):
             result[idx_in_level] = pooler(per_level_feature, rois_per_level)
         return result
+
+
+def make_pooler(cfg, head_name):
+    resolution = cfg.MODEL[head_name].POOLER_RESOLUTION
+    scales = cfg.MODEL[head_name].POOLER_SCALES
+    sampling_ratio = cfg.MODEL[head_name].POOLER_SAMPLING_RATIO
+    pooler = Pooler(
+        output_size=(resolution, resolution),
+        scales=scales,
+        sampling_ratio=sampling_ratio,
+    )
+    return pooler
@@ -3,6 +3,10 @@
 from maskrcnn_benchmark.utils.registry import Registry
 BACKBONES = Registry()
+RPN_HEADS = Registry()
 ROI_BOX_FEATURE_EXTRACTORS = Registry()
 ROI_BOX_PREDICTOR = Registry()
-RPN_HEADS = Registry()
+ROI_KEYPOINT_FEATURE_EXTRACTORS = Registry()
+ROI_KEYPOINT_PREDICTOR = Registry()
+ROI_MASK_FEATURE_EXTRACTORS = Registry()
+ROI_MASK_PREDICTOR = Registry()
@@ -13,10 +13,11 @@ class ROIBoxHead(torch.nn.Module):
     Generic Box Head class.
     """
-    def __init__(self, cfg):
+    def __init__(self, cfg, in_channels):
         super(ROIBoxHead, self).__init__()
-        self.feature_extractor = make_roi_box_feature_extractor(cfg)
+        self.feature_extractor = make_roi_box_feature_extractor(cfg, in_channels)
-        self.predictor = make_roi_box_predictor(cfg)
+        self.predictor = make_roi_box_predictor(
+            cfg, self.feature_extractor.out_channels)
         self.post_processor = make_roi_box_post_processor(cfg)
         self.loss_evaluator = make_roi_box_loss_evaluator(cfg)
@@ -61,10 +62,10 @@ class ROIBoxHead(torch.nn.Module):
         )
-def build_roi_box_head(cfg):
+def build_roi_box_head(cfg, in_channels):
     """
     Constructs a new box head.
     By default, uses ROIBoxHead, but if it turns out not to be enough, just register a new class
     and make it a parameter in the config
     """
-    return ROIBoxHead(cfg)
+    return ROIBoxHead(cfg, in_channels)
@@ -12,7 +12,7 @@ from maskrcnn_benchmark.modeling.make_layers import make_fc
 @registry.ROI_BOX_FEATURE_EXTRACTORS.register("ResNet50Conv5ROIFeatureExtractor")
 class ResNet50Conv5ROIFeatureExtractor(nn.Module):
-    def __init__(self, config):
+    def __init__(self, config, in_channels):
         super(ResNet50Conv5ROIFeatureExtractor, self).__init__()
         resolution = config.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION
@@ -38,6 +38,7 @@ class ResNet50Conv5ROIFeatureExtractor(nn.Module):
         self.pooler = pooler
         self.head = head
+        self.out_channels = head.out_channels
     def forward(self, x, proposals):
         x = self.pooler(x, proposals)
@@ -51,7 +52,7 @@ class FPN2MLPFeatureExtractor(nn.Module):
     Heads for FPN for classification
     """
-    def __init__(self, cfg):
+    def __init__(self, cfg, in_channels):
         super(FPN2MLPFeatureExtractor, self).__init__()
         resolution = cfg.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION
@@ -62,12 +63,13 @@ class FPN2MLPFeatureExtractor(nn.Module):
             scales=scales,
             sampling_ratio=sampling_ratio,
         )
-        input_size = cfg.MODEL.BACKBONE.OUT_CHANNELS * resolution ** 2
+        input_size = in_channels * resolution ** 2
         representation_size = cfg.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM
         use_gn = cfg.MODEL.ROI_BOX_HEAD.USE_GN
         self.pooler = pooler
         self.fc6 = make_fc(input_size, representation_size, use_gn)
         self.fc7 = make_fc(representation_size, representation_size, use_gn)
+        self.out_channels = representation_size
     def forward(self, x, proposals):
         x = self.pooler(x, proposals)
@@ -85,7 +87,7 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
     Heads for FPN for classification
     """
-    def __init__(self, cfg):
+    def __init__(self, cfg, in_channels):
         super(FPNXconv1fcFeatureExtractor, self).__init__()
         resolution = cfg.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION
@@ -99,7 +101,6 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
         self.pooler = pooler
         use_gn = cfg.MODEL.ROI_BOX_HEAD.USE_GN
-        in_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
         conv_head_dim = cfg.MODEL.ROI_BOX_HEAD.CONV_HEAD_DIM
         num_stacked_convs = cfg.MODEL.ROI_BOX_HEAD.NUM_STACKED_CONVS
         dilation = cfg.MODEL.ROI_BOX_HEAD.DILATION
@@ -133,6 +134,7 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
         input_size = conv_head_dim * resolution ** 2
         representation_size = cfg.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM
         self.fc6 = make_fc(input_size, representation_size, use_gn=False)
+        self.out_channels = representation_size
     def forward(self, x, proposals):
         x = self.pooler(x, proposals)
@@ -142,8 +144,8 @@ class FPNXconv1fcFeatureExtractor(nn.Module):
         return x
-def make_roi_box_feature_extractor(cfg):
+def make_roi_box_feature_extractor(cfg, in_channels):
     func = registry.ROI_BOX_FEATURE_EXTRACTORS[
         cfg.MODEL.ROI_BOX_HEAD.FEATURE_EXTRACTOR
     ]
-    return func(cfg)
+    return func(cfg, in_channels)
@@ -5,16 +5,14 @@ from torch import nn
 @registry.ROI_BOX_PREDICTOR.register("FastRCNNPredictor")
 class FastRCNNPredictor(nn.Module):
-    def __init__(self, config, pretrained=None):
+    def __init__(self, config, in_channels):
         super(FastRCNNPredictor, self).__init__()
-        stage_index = 4
-        stage2_relative_factor = 2 ** (stage_index - 1)
-        res2_out_channels = config.MODEL.RESNETS.RES2_OUT_CHANNELS
-        num_inputs = res2_out_channels * stage2_relative_factor
+        assert in_channels is not None
+        num_inputs = in_channels
         num_classes = config.MODEL.ROI_BOX_HEAD.NUM_CLASSES
-        self.avgpool = nn.AvgPool2d(kernel_size=7, stride=7)
+        self.avgpool = nn.AdaptiveAvgPool2d(1)
         self.cls_score = nn.Linear(num_inputs, num_classes)
         num_bbox_reg_classes = 2 if config.MODEL.CLS_AGNOSTIC_BBOX_REG else num_classes
         self.bbox_pred = nn.Linear(num_inputs, num_bbox_reg_classes * 4)
@@ -35,10 +33,10 @@ class FastRCNNPredictor(nn.Module):
 @registry.ROI_BOX_PREDICTOR.register("FPNPredictor")
 class FPNPredictor(nn.Module):
-    def __init__(self, cfg):
+    def __init__(self, cfg, in_channels):
         super(FPNPredictor, self).__init__()
         num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES
-        representation_size = cfg.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM
+        representation_size = in_channels
         self.cls_score = nn.Linear(representation_size, num_classes)
         num_bbox_reg_classes = 2 if cfg.MODEL.CLS_AGNOSTIC_BBOX_REG else num_classes
@@ -50,12 +48,15 @@ class FPNPredictor(nn.Module):
         nn.init.constant_(l.bias, 0)
     def forward(self, x):
+        if x.ndimension() == 4:
+            assert list(x.shape[2:]) == [1, 1]
+            x = x.view(x.size(0), -1)
         scores = self.cls_score(x)
         bbox_deltas = self.bbox_pred(x)
         return scores, bbox_deltas
-def make_roi_box_predictor(cfg):
+def make_roi_box_predictor(cfg, in_channels):
     func = registry.ROI_BOX_PREDICTOR[cfg.MODEL.ROI_BOX_HEAD.PREDICTOR]
-    return func(cfg)
+    return func(cfg, in_channels)
...@@ -7,11 +7,12 @@ from .loss import make_roi_keypoint_loss_evaluator ...@@ -7,11 +7,12 @@ from .loss import make_roi_keypoint_loss_evaluator
class ROIKeypointHead(torch.nn.Module): class ROIKeypointHead(torch.nn.Module):
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(ROIKeypointHead, self).__init__() super(ROIKeypointHead, self).__init__()
self.cfg = cfg.clone() self.cfg = cfg.clone()
self.feature_extractor = make_roi_keypoint_feature_extractor(cfg) self.feature_extractor = make_roi_keypoint_feature_extractor(cfg, in_channels)
self.predictor = make_roi_keypoint_predictor(cfg) self.predictor = make_roi_keypoint_predictor(
cfg, self.feature_extractor.out_channels)
self.post_processor = make_roi_keypoint_post_processor(cfg) self.post_processor = make_roi_keypoint_post_processor(cfg)
self.loss_evaluator = make_roi_keypoint_loss_evaluator(cfg) self.loss_evaluator = make_roi_keypoint_loss_evaluator(cfg)
...@@ -46,5 +47,5 @@ class ROIKeypointHead(torch.nn.Module): ...@@ -46,5 +47,5 @@ class ROIKeypointHead(torch.nn.Module):
return x, proposals, dict(loss_kp=loss_kp) return x, proposals, dict(loss_kp=loss_kp)
def build_roi_keypoint_head(cfg): def build_roi_keypoint_head(cfg, in_channels):
return ROIKeypointHead(cfg) return ROIKeypointHead(cfg, in_channels)
from torch import nn from torch import nn
from torch.nn import functional as F from torch.nn import functional as F
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.poolers import Pooler from maskrcnn_benchmark.modeling.poolers import Pooler
from maskrcnn_benchmark.layers import Conv2d from maskrcnn_benchmark.layers import Conv2d
@registry.ROI_KEYPOINT_FEATURE_EXTRACTORS.register("KeypointRCNNFeatureExtractor")
class KeypointRCNNFeatureExtractor(nn.Module): class KeypointRCNNFeatureExtractor(nn.Module):
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(KeypointRCNNFeatureExtractor, self).__init__() super(KeypointRCNNFeatureExtractor, self).__init__()
resolution = cfg.MODEL.ROI_KEYPOINT_HEAD.POOLER_RESOLUTION resolution = cfg.MODEL.ROI_KEYPOINT_HEAD.POOLER_RESOLUTION
...@@ -20,7 +22,7 @@ class KeypointRCNNFeatureExtractor(nn.Module): ...@@ -20,7 +22,7 @@ class KeypointRCNNFeatureExtractor(nn.Module):
) )
self.pooler = pooler self.pooler = pooler
input_features = cfg.MODEL.BACKBONE.OUT_CHANNELS input_features = in_channels
layers = cfg.MODEL.ROI_KEYPOINT_HEAD.CONV_LAYERS layers = cfg.MODEL.ROI_KEYPOINT_HEAD.CONV_LAYERS
next_feature = input_features next_feature = input_features
self.blocks = [] self.blocks = []
...@@ -32,6 +34,7 @@ class KeypointRCNNFeatureExtractor(nn.Module): ...@@ -32,6 +34,7 @@ class KeypointRCNNFeatureExtractor(nn.Module):
self.add_module(layer_name, module) self.add_module(layer_name, module)
next_feature = layer_features next_feature = layer_features
self.blocks.append(layer_name) self.blocks.append(layer_name)
self.out_channels = layer_features
def forward(self, x, proposals): def forward(self, x, proposals):
x = self.pooler(x, proposals) x = self.pooler(x, proposals)
...@@ -40,13 +43,8 @@ class KeypointRCNNFeatureExtractor(nn.Module): ...@@ -40,13 +43,8 @@ class KeypointRCNNFeatureExtractor(nn.Module):
return x return x
_ROI_KEYPOINT_FEATURE_EXTRACTORS = { def make_roi_keypoint_feature_extractor(cfg, in_channels):
"KeypointRCNNFeatureExtractor": KeypointRCNNFeatureExtractor func = registry.ROI_KEYPOINT_FEATURE_EXTRACTORS[
}
def make_roi_keypoint_feature_extractor(cfg):
func = _ROI_KEYPOINT_FEATURE_EXTRACTORS[
cfg.MODEL.ROI_KEYPOINT_HEAD.FEATURE_EXTRACTOR cfg.MODEL.ROI_KEYPOINT_HEAD.FEATURE_EXTRACTOR
] ]
return func(cfg) return func(cfg, in_channels)
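# Illustrative sketch (hypothetical "TinyKeypointExtractor") of what the
# ROI_KEYPOINT_FEATURE_EXTRACTORS registry above expects from a custom extractor:
# a (cfg, in_channels) constructor and an `out_channels` attribute that
# ROIKeypointHead forwards to the predictor builder.
from torch import nn
from maskrcnn_benchmark.modeling import registry


@registry.ROI_KEYPOINT_FEATURE_EXTRACTORS.register("TinyKeypointExtractor")
class TinyKeypointExtractor(nn.Module):
    def __init__(self, cfg, in_channels):
        super(TinyKeypointExtractor, self).__init__()
        self.conv = nn.Conv2d(in_channels, 128, kernel_size=3, padding=1)
        self.out_channels = 128  # read via feature_extractor.out_channels

    def forward(self, x, proposals):
        # a real extractor would pool per-proposal features first (see the
        # Pooler usage above); this sketch only exercises the interface
        return nn.functional.relu(self.conv(x))

# Selecting it is then a config change:
#   cfg.MODEL.ROI_KEYPOINT_HEAD.FEATURE_EXTRACTOR = "TinyKeypointExtractor"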
from torch import nn from torch import nn
from torch.nn import functional as F
from maskrcnn_benchmark import layers from maskrcnn_benchmark import layers
from maskrcnn_benchmark.modeling import registry
@registry.ROI_KEYPOINT_PREDICTOR.register("KeypointRCNNPredictor")
class KeypointRCNNPredictor(nn.Module): class KeypointRCNNPredictor(nn.Module):
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(KeypointRCNNPredictor, self).__init__() super(KeypointRCNNPredictor, self).__init__()
input_features = cfg.MODEL.ROI_KEYPOINT_HEAD.CONV_LAYERS[-1] input_features = in_channels
num_keypoints = cfg.MODEL.ROI_KEYPOINT_HEAD.NUM_CLASSES num_keypoints = cfg.MODEL.ROI_KEYPOINT_HEAD.NUM_CLASSES
deconv_kernel = 4 deconv_kernel = 4
self.kps_score_lowres = layers.ConvTranspose2d( self.kps_score_lowres = layers.ConvTranspose2d(
...@@ -22,6 +23,7 @@ class KeypointRCNNPredictor(nn.Module): ...@@ -22,6 +23,7 @@ class KeypointRCNNPredictor(nn.Module):
) )
nn.init.constant_(self.kps_score_lowres.bias, 0) nn.init.constant_(self.kps_score_lowres.bias, 0)
self.up_scale = 2 self.up_scale = 2
self.out_channels = num_keypoints
def forward(self, x): def forward(self, x):
x = self.kps_score_lowres(x) x = self.kps_score_lowres(x)
...@@ -31,9 +33,6 @@ class KeypointRCNNPredictor(nn.Module): ...@@ -31,9 +33,6 @@ class KeypointRCNNPredictor(nn.Module):
return x return x
_ROI_KEYPOINT_PREDICTOR = {"KeypointRCNNPredictor": KeypointRCNNPredictor} def make_roi_keypoint_predictor(cfg, in_channels):
func = registry.ROI_KEYPOINT_PREDICTOR[cfg.MODEL.ROI_KEYPOINT_HEAD.PREDICTOR]
return func(cfg, in_channels)
def make_roi_keypoint_predictor(cfg):
func = _ROI_KEYPOINT_PREDICTOR[cfg.MODEL.ROI_KEYPOINT_HEAD.PREDICTOR]
return func(cfg)
...@@ -34,11 +34,12 @@ def keep_only_positive_boxes(boxes): ...@@ -34,11 +34,12 @@ def keep_only_positive_boxes(boxes):
class ROIMaskHead(torch.nn.Module): class ROIMaskHead(torch.nn.Module):
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(ROIMaskHead, self).__init__() super(ROIMaskHead, self).__init__()
self.cfg = cfg.clone() self.cfg = cfg.clone()
self.feature_extractor = make_roi_mask_feature_extractor(cfg) self.feature_extractor = make_roi_mask_feature_extractor(cfg, in_channels)
self.predictor = make_roi_mask_predictor(cfg) self.predictor = make_roi_mask_predictor(
cfg, self.feature_extractor.out_channels)
self.post_processor = make_roi_mask_post_processor(cfg) self.post_processor = make_roi_mask_post_processor(cfg)
self.loss_evaluator = make_roi_mask_loss_evaluator(cfg) self.loss_evaluator = make_roi_mask_loss_evaluator(cfg)
...@@ -78,5 +79,5 @@ class ROIMaskHead(torch.nn.Module): ...@@ -78,5 +79,5 @@ class ROIMaskHead(torch.nn.Module):
return x, all_proposals, dict(loss_mask=loss_mask) return x, all_proposals, dict(loss_mask=loss_mask)
def build_roi_mask_head(cfg): def build_roi_mask_head(cfg, in_channels):
return ROIMaskHead(cfg) return ROIMaskHead(cfg, in_channels)
...@@ -3,18 +3,23 @@ from torch import nn ...@@ -3,18 +3,23 @@ from torch import nn
from torch.nn import functional as F from torch.nn import functional as F
from ..box_head.roi_box_feature_extractors import ResNet50Conv5ROIFeatureExtractor from ..box_head.roi_box_feature_extractors import ResNet50Conv5ROIFeatureExtractor
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.modeling.poolers import Pooler from maskrcnn_benchmark.modeling.poolers import Pooler
from maskrcnn_benchmark.layers import Conv2d
from maskrcnn_benchmark.modeling.make_layers import make_conv3x3 from maskrcnn_benchmark.modeling.make_layers import make_conv3x3
registry.ROI_MASK_FEATURE_EXTRACTORS.register(
"ResNet50Conv5ROIFeatureExtractor", ResNet50Conv5ROIFeatureExtractor
)
@registry.ROI_MASK_FEATURE_EXTRACTORS.register("MaskRCNNFPNFeatureExtractor")
class MaskRCNNFPNFeatureExtractor(nn.Module): class MaskRCNNFPNFeatureExtractor(nn.Module):
""" """
Heads for FPN for classification Heads for FPN for classification
""" """
def __init__(self, cfg): def __init__(self, cfg, in_channels):
""" """
Arguments: Arguments:
num_classes (int): number of output classes num_classes (int): number of output classes
...@@ -31,7 +36,7 @@ class MaskRCNNFPNFeatureExtractor(nn.Module): ...@@ -31,7 +36,7 @@ class MaskRCNNFPNFeatureExtractor(nn.Module):
scales=scales, scales=scales,
sampling_ratio=sampling_ratio, sampling_ratio=sampling_ratio,
) )
input_size = cfg.MODEL.BACKBONE.OUT_CHANNELS input_size = in_channels
self.pooler = pooler self.pooler = pooler
use_gn = cfg.MODEL.ROI_MASK_HEAD.USE_GN use_gn = cfg.MODEL.ROI_MASK_HEAD.USE_GN
...@@ -42,12 +47,14 @@ class MaskRCNNFPNFeatureExtractor(nn.Module): ...@@ -42,12 +47,14 @@ class MaskRCNNFPNFeatureExtractor(nn.Module):
self.blocks = [] self.blocks = []
for layer_idx, layer_features in enumerate(layers, 1): for layer_idx, layer_features in enumerate(layers, 1):
layer_name = "mask_fcn{}".format(layer_idx) layer_name = "mask_fcn{}".format(layer_idx)
module = make_conv3x3(next_feature, layer_features, module = make_conv3x3(
next_feature, layer_features,
dilation=dilation, stride=1, use_gn=use_gn dilation=dilation, stride=1, use_gn=use_gn
) )
self.add_module(layer_name, module) self.add_module(layer_name, module)
next_feature = layer_features next_feature = layer_features
self.blocks.append(layer_name) self.blocks.append(layer_name)
self.out_channels = layer_features
def forward(self, x, proposals): def forward(self, x, proposals):
x = self.pooler(x, proposals) x = self.pooler(x, proposals)
...@@ -58,12 +65,8 @@ class MaskRCNNFPNFeatureExtractor(nn.Module): ...@@ -58,12 +65,8 @@ class MaskRCNNFPNFeatureExtractor(nn.Module):
return x return x
_ROI_MASK_FEATURE_EXTRACTORS = { def make_roi_mask_feature_extractor(cfg, in_channels):
"ResNet50Conv5ROIFeatureExtractor": ResNet50Conv5ROIFeatureExtractor, func = registry.ROI_MASK_FEATURE_EXTRACTORS[
"MaskRCNNFPNFeatureExtractor": MaskRCNNFPNFeatureExtractor, cfg.MODEL.ROI_MASK_HEAD.FEATURE_EXTRACTOR
} ]
return func(cfg, in_channels)
def make_roi_mask_feature_extractor(cfg):
func = _ROI_MASK_FEATURE_EXTRACTORS[cfg.MODEL.ROI_MASK_HEAD.FEATURE_EXTRACTOR]
return func(cfg)
...@@ -4,21 +4,16 @@ from torch.nn import functional as F ...@@ -4,21 +4,16 @@ from torch.nn import functional as F
from maskrcnn_benchmark.layers import Conv2d from maskrcnn_benchmark.layers import Conv2d
from maskrcnn_benchmark.layers import ConvTranspose2d from maskrcnn_benchmark.layers import ConvTranspose2d
from maskrcnn_benchmark.modeling import registry
@registry.ROI_MASK_PREDICTOR.register("MaskRCNNC4Predictor")
class MaskRCNNC4Predictor(nn.Module): class MaskRCNNC4Predictor(nn.Module):
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(MaskRCNNC4Predictor, self).__init__() super(MaskRCNNC4Predictor, self).__init__()
num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES
dim_reduced = cfg.MODEL.ROI_MASK_HEAD.CONV_LAYERS[-1] dim_reduced = cfg.MODEL.ROI_MASK_HEAD.CONV_LAYERS[-1]
num_inputs = in_channels
if cfg.MODEL.ROI_HEADS.USE_FPN:
num_inputs = dim_reduced
else:
stage_index = 4
stage2_relative_factor = 2 ** (stage_index - 1)
res2_out_channels = cfg.MODEL.RESNETS.RES2_OUT_CHANNELS
num_inputs = res2_out_channels * stage2_relative_factor
self.conv5_mask = ConvTranspose2d(num_inputs, dim_reduced, 2, 2, 0) self.conv5_mask = ConvTranspose2d(num_inputs, dim_reduced, 2, 2, 0)
self.mask_fcn_logits = Conv2d(dim_reduced, num_classes, 1, 1, 0) self.mask_fcn_logits = Conv2d(dim_reduced, num_classes, 1, 1, 0)
...@@ -36,9 +31,27 @@ class MaskRCNNC4Predictor(nn.Module): ...@@ -36,9 +31,27 @@ class MaskRCNNC4Predictor(nn.Module):
return self.mask_fcn_logits(x) return self.mask_fcn_logits(x)
_ROI_MASK_PREDICTOR = {"MaskRCNNC4Predictor": MaskRCNNC4Predictor} @registry.ROI_MASK_PREDICTOR.register("MaskRCNNConv1x1Predictor")
class MaskRCNNConv1x1Predictor(nn.Module):
def __init__(self, cfg, in_channels):
super(MaskRCNNConv1x1Predictor, self).__init__()
num_classes = cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES
num_inputs = in_channels
self.mask_fcn_logits = Conv2d(num_inputs, num_classes, 1, 1, 0)
for name, param in self.named_parameters():
if "bias" in name:
nn.init.constant_(param, 0)
elif "weight" in name:
# Caffe2 implementation uses MSRAFill, which in fact
# corresponds to kaiming_normal_ in PyTorch
nn.init.kaiming_normal_(param, mode="fan_out", nonlinearity="relu")
def forward(self, x):
return self.mask_fcn_logits(x)
def make_roi_mask_predictor(cfg): def make_roi_mask_predictor(cfg, in_channels):
func = _ROI_MASK_PREDICTOR[cfg.MODEL.ROI_MASK_HEAD.PREDICTOR] func = registry.ROI_MASK_PREDICTOR[cfg.MODEL.ROI_MASK_HEAD.PREDICTOR]
return func(cfg) return func(cfg, in_channels)
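# Usage sketch for the registry-based predictors above: the predictor is chosen
# through cfg.MODEL.ROI_MASK_HEAD.PREDICTOR and built with whatever channel
# count the feature extractor reports (128 below is just an illustrative value).
from maskrcnn_benchmark.config import cfg as g_cfg
from maskrcnn_benchmark.modeling.roi_heads.mask_head.roi_mask_predictors import (
    make_roi_mask_predictor,
)

cfg = g_cfg.clone()
cfg.MODEL.ROI_MASK_HEAD.PREDICTOR = "MaskRCNNConv1x1Predictor"
predictor = make_roi_mask_predictor(cfg, in_channels=128)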
...@@ -55,7 +55,7 @@ class CombinedROIHeads(torch.nn.ModuleDict): ...@@ -55,7 +55,7 @@ class CombinedROIHeads(torch.nn.ModuleDict):
return x, detections, losses return x, detections, losses
def build_roi_heads(cfg): def build_roi_heads(cfg, in_channels):
# individually create the heads, that will be combined together # individually create the heads, that will be combined together
# afterwards # afterwards
roi_heads = [] roi_heads = []
...@@ -63,11 +63,11 @@ def build_roi_heads(cfg): ...@@ -63,11 +63,11 @@ def build_roi_heads(cfg):
return [] return []
if not cfg.MODEL.RPN_ONLY: if not cfg.MODEL.RPN_ONLY:
roi_heads.append(("box", build_roi_box_head(cfg))) roi_heads.append(("box", build_roi_box_head(cfg, in_channels)))
if cfg.MODEL.MASK_ON: if cfg.MODEL.MASK_ON:
roi_heads.append(("mask", build_roi_mask_head(cfg))) roi_heads.append(("mask", build_roi_mask_head(cfg, in_channels)))
if cfg.MODEL.KEYPOINT_ON: if cfg.MODEL.KEYPOINT_ON:
roi_heads.append(("keypoint", build_roi_keypoint_head(cfg))) roi_heads.append(("keypoint", build_roi_keypoint_head(cfg, in_channels)))
# combine individual heads in a single module # combine individual heads in a single module
if roi_heads: if roi_heads:
......
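# Sketch of how a model-building script can now wire the pieces together: the
# backbone advertises out_channels and the rpn / roi-head builders receive it,
# instead of everyone reading cfg.MODEL.BACKBONE.OUT_CHANNELS. The exact wiring
# inside GeneralizedRCNN may differ slightly; this only shows the new signatures.
from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.modeling.backbone import build_backbone
from maskrcnn_benchmark.modeling.rpn.rpn import build_rpn
from maskrcnn_benchmark.modeling.roi_heads.roi_heads import build_roi_heads

backbone = build_backbone(cfg)
rpn = build_rpn(cfg, backbone.out_channels)
roi_heads = build_roi_heads(cfg, backbone.out_channels)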
...@@ -15,7 +15,7 @@ class RetinaNetHead(torch.nn.Module): ...@@ -15,7 +15,7 @@ class RetinaNetHead(torch.nn.Module):
Adds a RetinaNet head with classification and regression heads Adds a RetinaNet head with classification and regression heads
""" """
def __init__(self, cfg): def __init__(self, cfg, in_channels):
""" """
Arguments: Arguments:
in_channels (int): number of channels of the input feature in_channels (int): number of channels of the input feature
...@@ -24,7 +24,6 @@ class RetinaNetHead(torch.nn.Module): ...@@ -24,7 +24,6 @@ class RetinaNetHead(torch.nn.Module):
super(RetinaNetHead, self).__init__() super(RetinaNetHead, self).__init__()
# TODO: Implement the sigmoid version first. # TODO: Implement the sigmoid version first.
num_classes = cfg.MODEL.RETINANET.NUM_CLASSES - 1 num_classes = cfg.MODEL.RETINANET.NUM_CLASSES - 1
in_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
num_anchors = len(cfg.MODEL.RETINANET.ASPECT_RATIOS) \ num_anchors = len(cfg.MODEL.RETINANET.ASPECT_RATIOS) \
* cfg.MODEL.RETINANET.SCALES_PER_OCTAVE * cfg.MODEL.RETINANET.SCALES_PER_OCTAVE
...@@ -92,13 +91,13 @@ class RetinaNetModule(torch.nn.Module): ...@@ -92,13 +91,13 @@ class RetinaNetModule(torch.nn.Module):
RetinaNet outputs and losses. Only tested with FPN for now. RetinaNet outputs and losses. Only tested with FPN for now.
""" """
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(RetinaNetModule, self).__init__() super(RetinaNetModule, self).__init__()
self.cfg = cfg.clone() self.cfg = cfg.clone()
anchor_generator = make_anchor_generator_retinanet(cfg) anchor_generator = make_anchor_generator_retinanet(cfg)
head = RetinaNetHead(cfg) head = RetinaNetHead(cfg, in_channels)
box_coder = BoxCoder(weights=(10., 10., 5., 5.)) box_coder = BoxCoder(weights=(10., 10., 5., 5.))
box_selector_test = make_retinanet_postprocessor(cfg, box_coder, is_train=False) box_selector_test = make_retinanet_postprocessor(cfg, box_coder, is_train=False)
...@@ -149,5 +148,5 @@ class RetinaNetModule(torch.nn.Module): ...@@ -149,5 +148,5 @@ class RetinaNetModule(torch.nn.Module):
return boxes, {} return boxes, {}
def build_retinanet(cfg): def build_retinanet(cfg, in_channels):
return RetinaNetModule(cfg) return RetinaNetModule(cfg, in_channels)
...@@ -10,6 +10,66 @@ from .loss import make_rpn_loss_evaluator ...@@ -10,6 +10,66 @@ from .loss import make_rpn_loss_evaluator
from .anchor_generator import make_anchor_generator from .anchor_generator import make_anchor_generator
from .inference import make_rpn_postprocessor from .inference import make_rpn_postprocessor
class RPNHeadConvRegressor(nn.Module):
"""
A simple RPN Head for classification and bbox regression
"""
def __init__(self, cfg, in_channels, num_anchors):
"""
Arguments:
cfg : config
in_channels (int): number of channels of the input feature
num_anchors (int): number of anchors to be predicted
"""
super(RPNHeadConvRegressor, self).__init__()
self.cls_logits = nn.Conv2d(in_channels, num_anchors, kernel_size=1, stride=1)
self.bbox_pred = nn.Conv2d(
in_channels, num_anchors * 4, kernel_size=1, stride=1
)
for l in [self.cls_logits, self.bbox_pred]:
torch.nn.init.normal_(l.weight, std=0.01)
torch.nn.init.constant_(l.bias, 0)
def forward(self, x):
assert isinstance(x, (list, tuple))
logits = [self.cls_logits(y) for y in x]
bbox_reg = [self.bbox_pred(y) for y in x]
return logits, bbox_reg
class RPNHeadFeatureSingleConv(nn.Module):
"""
Adds a simple RPN Head with one conv to extract the feature
"""
def __init__(self, cfg, in_channels):
"""
Arguments:
cfg : config
in_channels (int): number of channels of the input feature
"""
super(RPNHeadFeatureSingleConv, self).__init__()
self.conv = nn.Conv2d(
in_channels, in_channels, kernel_size=3, stride=1, padding=1
)
for l in [self.conv]:
torch.nn.init.normal_(l.weight, std=0.01)
torch.nn.init.constant_(l.bias, 0)
self.out_channels = in_channels
def forward(self, x):
assert isinstance(x, (list, tuple))
x = [F.relu(self.conv(z)) for z in x]
return x
@registry.RPN_HEADS.register("SingleConvRPNHead") @registry.RPN_HEADS.register("SingleConvRPNHead")
class RPNHead(nn.Module): class RPNHead(nn.Module):
""" """
...@@ -52,14 +112,13 @@ class RPNModule(torch.nn.Module): ...@@ -52,14 +112,13 @@ class RPNModule(torch.nn.Module):
proposals and losses. Works for both FPN and non-FPN. proposals and losses. Works for both FPN and non-FPN.
""" """
def __init__(self, cfg): def __init__(self, cfg, in_channels):
super(RPNModule, self).__init__() super(RPNModule, self).__init__()
self.cfg = cfg.clone() self.cfg = cfg.clone()
anchor_generator = make_anchor_generator(cfg) anchor_generator = make_anchor_generator(cfg)
in_channels = cfg.MODEL.BACKBONE.OUT_CHANNELS
rpn_head = registry.RPN_HEADS[cfg.MODEL.RPN.RPN_HEAD] rpn_head = registry.RPN_HEADS[cfg.MODEL.RPN.RPN_HEAD]
head = rpn_head( head = rpn_head(
cfg, in_channels, anchor_generator.num_anchors_per_location()[0] cfg, in_channels, anchor_generator.num_anchors_per_location()[0]
...@@ -138,11 +197,11 @@ class RPNModule(torch.nn.Module): ...@@ -138,11 +197,11 @@ class RPNModule(torch.nn.Module):
return boxes, {} return boxes, {}
def build_rpn(cfg): def build_rpn(cfg, in_channels):
""" """
This gives the gist of it. Not super important because it doesn't change as much This gives the gist of it. Not super important because it doesn't change as much
""" """
if cfg.MODEL.RETINANET_ON: if cfg.MODEL.RETINANET_ON:
return build_retinanet(cfg) return build_retinanet(cfg, in_channels)
return RPNModule(cfg) return RPNModule(cfg, in_channels)
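# Sketch chaining the two new RPN building blocks above; how they are composed
# for a particular architecture (e.g. FBNet) may differ, this only exercises
# their interfaces: both take and return a list of per-level feature maps.
import torch
from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.modeling.rpn.rpn import (
    RPNHeadConvRegressor,
    RPNHeadFeatureSingleConv,
)

in_channels, num_anchors = 64, 3
feature = RPNHeadFeatureSingleConv(cfg, in_channels)
regressor = RPNHeadConvRegressor(cfg, feature.out_channels, num_anchors)

fpn_features = [torch.rand(2, in_channels, 32, 32) for _ in range(3)]
objectness, box_regression = regressor(feature(fpn_features))
# objectness[i]: [2, num_anchors, 32, 32]; box_regression[i]: [2, num_anchors * 4, 32, 32]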
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import os
def get_config_root_path():
''' Path to configs for unit tests '''
# cur_file_dir is root/tests/env_tests
cur_file_dir = os.path.dirname(os.path.abspath(os.path.realpath(__file__)))
ret = os.path.dirname(os.path.dirname(cur_file_dir))
ret = os.path.join(ret, "configs")
return ret
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register backbones
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
BACKBONE_CFGS = {
"R-50-FPN": "e2e_faster_rcnn_R_50_FPN_1x.yaml",
"R-101-FPN": "e2e_faster_rcnn_R_101_FPN_1x.yaml",
"R-152-FPN": "e2e_faster_rcnn_R_101_FPN_1x.yaml",
"R-50-FPN-RETINANET": "retinanet/retinanet_R-50-FPN_1x.yaml",
"R-101-FPN-RETINANET": "retinanet/retinanet_R-101-FPN_1x.yaml",
}
class TestBackbones(unittest.TestCase):
def test_build_backbones(self):
''' Make sure backbones run '''
self.assertGreater(len(registry.BACKBONES), 0)
for name, backbone_builder in registry.BACKBONES.items():
print('Testing {}...'.format(name))
if name in BACKBONE_CFGS:
cfg = load_config(BACKBONE_CFGS[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
backbone = backbone_builder(cfg)
# make sure the backbone has `out_channels`
self.assertIsNotNone(
getattr(backbone, 'out_channels', None),
'Need to provide out_channels for backbone {}'.format(name)
)
N, C_in, H, W = 2, 3, 224, 256
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
out = backbone(input)
for cur_out in out:
self.assertEqual(
cur_out.shape[:2],
torch.Size([N, backbone.out_channels])
)
if __name__ == "__main__":
unittest.main()
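# Sketch of a backbone that would satisfy the test above ("TINY-CONV" is a
# hypothetical name): the builder takes cfg, the returned module exposes
# out_channels, and forward returns a list of feature maps.
from torch import nn
from maskrcnn_benchmark.modeling import registry


class TinyConvBody(nn.Module):
    def __init__(self):
        super(TinyConvBody, self).__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.out_channels = 32

    def forward(self, x):
        # one feature map per pyramid level; a single level in this sketch
        return [self.stem(x)]


@registry.BACKBONES.register("TINY-CONV")
def build_tiny_conv_backbone(cfg):
    return TinyConvBody()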
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import numpy as np
import torch
from maskrcnn_benchmark.modeling.box_coder import BoxCoder
class TestBoxCoder(unittest.TestCase):
def test_box_decoder(self):
""" Match unit test UtilsBoxesTest.TestBboxTransformRandom in
caffe2/operators/generate_proposals_op_util_boxes_test.cc
"""
box_coder = BoxCoder(weights=(1.0, 1.0, 1.0, 1.0))
bbox = torch.from_numpy(
np.array(
[
175.62031555,
20.91103172,
253.352005,
155.0145874,
169.24636841,
4.85241556,
228.8605957,
105.02092743,
181.77426147,
199.82876587,
192.88427734,
214.0255127,
174.36262512,
186.75761414,
296.19091797,
231.27906799,
22.73153877,
92.02596283,
135.5695343,
208.80291748,
]
)
.astype(np.float32)
.reshape(-1, 4)
)
deltas = torch.from_numpy(
np.array(
[
0.47861834,
0.13992102,
0.14961673,
0.71495209,
0.29915856,
-0.35664671,
0.89018666,
0.70815367,
-0.03852064,
0.44466892,
0.49492538,
0.71409376,
0.28052918,
0.02184832,
0.65289006,
1.05060139,
-0.38172557,
-0.08533806,
-0.60335309,
0.79052375,
]
)
.astype(np.float32)
.reshape(-1, 4)
)
gt_bbox = (
np.array(
[
206.949539,
-30.715202,
297.387665,
244.448486,
143.871216,
-83.342888,
290.502289,
121.053398,
177.430283,
198.666245,
196.295273,
228.703079,
152.251892,
145.431564,
387.215454,
274.594238,
5.062420,
11.040955,
66.328903,
269.686218,
]
)
.astype(np.float32)
.reshape(-1, 4)
)
results = box_coder.decode(deltas, bbox)
np.testing.assert_allclose(results.detach().numpy(), gt_bbox, atol=1e-4)
if __name__ == "__main__":
unittest.main()
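# Hand-worked version of what decode computes for weights (1, 1, 1, 1), written
# with numpy; the +1 on widths/heights follows the legacy Detectron/Caffe2 box
# convention this test matches (the real BoxCoder additionally clamps dw/dh).
import numpy as np

def decode_one(delta, box):
    dx, dy, dw, dh = delta
    x1, y1, x2, y2 = box
    w, h = x2 - x1 + 1, y2 - y1 + 1
    ctr_x, ctr_y = x1 + 0.5 * w, y1 + 0.5 * h
    pred_ctr_x, pred_ctr_y = dx * w + ctr_x, dy * h + ctr_y
    pred_w, pred_h = np.exp(dw) * w, np.exp(dh) * h
    return np.array([
        pred_ctr_x - 0.5 * pred_w, pred_ctr_y - 0.5 * pred_h,
        pred_ctr_x + 0.5 * pred_w - 1, pred_ctr_y + 0.5 * pred_h - 1,
    ])

# decode_one([0.47862, 0.13992, 0.14962, 0.71495], [175.62, 20.91, 253.35, 155.01])
# gives roughly [206.95, -30.72, 297.39, 244.45], the first row of gt_bbox above.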
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import glob
import os
import utils
class TestConfigs(unittest.TestCase):
def test_configs_load(self):
''' Make sure configs are loadable '''
cfg_root_path = utils.get_config_root_path()
files = glob.glob(
os.path.join(cfg_root_path, "./**/*.yaml"), recursive=True)
self.assertGreater(len(files), 0)
for fn in files:
print('Loading {}...'.format(fn))
utils.load_config_from_file(fn)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import numpy as np
import torch
import maskrcnn_benchmark.modeling.backbone.fbnet_builder as fbnet_builder
TEST_CUDA = torch.cuda.is_available()
def _test_primitive(self, device, op_name, op_func, N, C_in, C_out, expand, stride):
op = op_func(C_in, C_out, expand, stride).to(device)
input = torch.rand([N, C_in, 7, 7], dtype=torch.float32).to(device)
output = op(input)
self.assertEqual(
output.shape[:2], torch.Size([N, C_out]),
'Primitive {} failed for shape {}.'.format(op_name, input.shape)
)
class TestFBNetBuilder(unittest.TestCase):
def test_identity(self):
id_op = fbnet_builder.Identity(20, 20, 1)
input = torch.rand([10, 20, 7, 7], dtype=torch.float32)
output = id_op(input)
np.testing.assert_array_equal(np.array(input), np.array(output))
id_op = fbnet_builder.Identity(20, 40, 2)
input = torch.rand([10, 20, 7, 7], dtype=torch.float32)
output = id_op(input)
np.testing.assert_array_equal(output.shape, [10, 40, 4, 4])
def test_primitives(self):
''' Make sure the primitives run '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
_test_primitive(
self, "cpu",
op_name, op_func,
N=20, C_in=16, C_out=32, expand=4, stride=1
)
@unittest.skipIf(not TEST_CUDA, "no CUDA detected")
def test_primitives_cuda(self):
''' Make sure the primitives run on CUDA '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
_test_primitive(
self, "cuda",
op_name, op_func,
N=20, C_in=16, C_out=32, expand=4, stride=1
)
def test_primitives_empty_batch(self):
''' Make sure the primitives run with an empty batch '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
# test empty batch size
_test_primitive(
self, "cpu",
op_name, op_func,
N=0, C_in=16, C_out=32, expand=4, stride=1
)
@unittest.skipIf(not TEST_CUDA, "no CUDA detected")
def test_primitives_cuda_empty_batch(self):
''' Make sure the primitives run on CUDA with an empty batch '''
for op_name, op_func in fbnet_builder.PRIMITIVES.items():
print('Testing {}'.format(op_name))
# test empty batch size
_test_primitive(
self, "cuda",
op_name, op_func,
N=0, C_in=16, C_out=32, expand=4, stride=1
)
if __name__ == "__main__":
unittest.main()
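# Quick interactive check of a single primitive, using the same
# (C_in, C_out, expand, stride) call convention as _test_primitive above; the
# first registered key is taken so no primitive name has to be hard-coded.
import torch
import maskrcnn_benchmark.modeling.backbone.fbnet_builder as fbnet_builder

op_name, op_func = next(iter(fbnet_builder.PRIMITIVES.items()))
op = op_func(16, 32, 4, 1)
out = op(torch.rand(2, 16, 7, 7))
print(op_name, tuple(out.shape))  # channels become 32; spatial size depends on stride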
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register feature extractors
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling.roi_heads.roi_heads import build_roi_heads # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.structures.bounding_box import BoxList
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
FEATURE_EXTRACTORS_CFGS = {
}
# overwrite configs if specified, otherwise default config is used
FEATURE_EXTRACTORS_INPUT_CHANNELS = {
# this extractor ignores in_channels and derives its input size from the config,
# so pass the matching value (1024)
"ResNet50Conv5ROIFeatureExtractor": 1024,
}
def _test_feature_extractors(
self, extractors, overwrite_cfgs, overwrite_in_channels
):
''' Make sure roi feature extractors run '''
self.assertGreater(len(extractors), 0)
in_channels_default = 64
for name, builder in extractors.items():
print('Testing {}...'.format(name))
if name in overwrite_cfgs:
cfg = load_config(overwrite_cfgs[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
in_channels = overwrite_in_channels.get(
name, in_channels_default)
fe = builder(cfg, in_channels)
self.assertIsNotNone(
getattr(fe, 'out_channels', None),
'Need to provide out_channels for feature extractor {}'.format(name)
)
N, C_in, H, W = 2, in_channels, 24, 32
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
bboxes = [[1, 1, 10, 10], [5, 5, 8, 8], [2, 2, 3, 4]]
img_size = [384, 512]
box_list = BoxList(bboxes, img_size, "xyxy")
out = fe([input], [box_list] * N)
self.assertEqual(
out.shape[:2],
torch.Size([N * len(bboxes), fe.out_channels])
)
class TestFeatureExtractors(unittest.TestCase):
def test_roi_box_feature_extractors(self):
''' Make sure roi box feature extractors run '''
_test_feature_extractors(
self,
registry.ROI_BOX_FEATURE_EXTRACTORS,
FEATURE_EXTRACTORS_CFGS,
FEATURE_EXTRACTORS_INPUT_CHANNELS,
)
def test_roi_keypoints_feature_extractors(self):
''' Make sure roi keypoints feature extractors run '''
_test_feature_extractors(
self,
registry.ROI_KEYPOINT_FEATURE_EXTRACTORS,
FEATURE_EXTRACTORS_CFGS,
FEATURE_EXTRACTORS_INPUT_CHANNELS,
)
def test_roi_mask_feature_extractors(self):
''' Make sure roi mask feature extractors run '''
_test_feature_extractors(
self,
registry.ROI_MASK_FEATURE_EXTRACTORS,
FEATURE_EXTRACTORS_CFGS,
FEATURE_EXTRACTORS_INPUT_CHANNELS,
)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import numpy as np
import torch
from maskrcnn_benchmark.layers import nms as box_nms
class TestNMS(unittest.TestCase):
def test_nms_cpu(self):
""" Match unit test UtilsNMSTest.TestNMS in
caffe2/operators/generate_proposals_op_util_nms_test.cc
"""
inputs = (
np.array(
[
10,
10,
50,
60,
0.5,
11,
12,
48,
60,
0.7,
8,
9,
40,
50,
0.6,
100,
100,
150,
140,
0.9,
99,
110,
155,
139,
0.8,
]
)
.astype(np.float32)
.reshape(-1, 5)
)
boxes = torch.from_numpy(inputs[:, :4])
scores = torch.from_numpy(inputs[:, 4])
test_thresh = [0.1, 0.3, 0.5, 0.8, 0.9]
gt_indices = [[1, 3], [1, 3], [1, 3], [1, 2, 3, 4], [0, 1, 2, 3, 4]]
for thresh, gt_index in zip(test_thresh, gt_indices):
keep_indices = box_nms(boxes, scores, thresh)
keep_indices = np.sort(keep_indices)
np.testing.assert_array_equal(keep_indices, np.array(gt_index))
def test_nms1_cpu(self):
""" Match unit test UtilsNMSTest.TestNMS1 in
caffe2/operators/generate_proposals_op_util_nms_test.cc
"""
boxes = torch.from_numpy(
np.array(
[
[350.9821, 161.8200, 369.9685, 205.2372],
[250.5236, 154.2844, 274.1773, 204.9810],
[471.4920, 160.4118, 496.0094, 213.4244],
[352.0421, 164.5933, 366.4458, 205.9624],
[166.0765, 169.7707, 183.0102, 232.6606],
[252.3000, 183.1449, 269.6541, 210.6747],
[469.7862, 162.0192, 482.1673, 187.0053],
[168.4862, 174.2567, 181.7437, 232.9379],
[470.3290, 162.3442, 496.4272, 214.6296],
[251.0450, 155.5911, 272.2693, 203.3675],
[252.0326, 154.7950, 273.7404, 195.3671],
[351.7479, 161.9567, 370.6432, 204.3047],
[496.3306, 161.7157, 515.0573, 210.7200],
[471.0749, 162.6143, 485.3374, 207.3448],
[250.9745, 160.7633, 264.1924, 206.8350],
[470.4792, 169.0351, 487.1934, 220.2984],
[474.4227, 161.9546, 513.1018, 215.5193],
[251.9428, 184.1950, 262.6937, 207.6416],
[252.6623, 175.0252, 269.8806, 213.7584],
[260.9884, 157.0351, 288.3554, 206.6027],
[251.3629, 164.5101, 263.2179, 202.4203],
[471.8361, 190.8142, 485.6812, 220.8586],
[248.6243, 156.9628, 264.3355, 199.2767],
[495.1643, 158.0483, 512.6261, 184.4192],
[376.8718, 168.0144, 387.3584, 201.3210],
[122.9191, 160.7433, 172.5612, 231.3837],
[350.3857, 175.8806, 366.2500, 205.4329],
[115.2958, 162.7822, 161.9776, 229.6147],
[168.4375, 177.4041, 180.8028, 232.4551],
[169.7939, 184.4330, 181.4767, 232.1220],
[347.7536, 175.9356, 355.8637, 197.5586],
[495.5434, 164.6059, 516.4031, 207.7053],
[172.1216, 194.6033, 183.1217, 235.2653],
[264.2654, 181.5540, 288.4626, 214.0170],
[111.7971, 183.7748, 137.3745, 225.9724],
[253.4919, 186.3945, 280.8694, 210.0731],
[165.5334, 169.7344, 185.9159, 232.8514],
[348.3662, 184.5187, 354.9081, 201.4038],
[164.6562, 162.5724, 186.3108, 233.5010],
[113.2999, 186.8410, 135.8841, 219.7642],
[117.0282, 179.8009, 142.5375, 221.0736],
[462.1312, 161.1004, 495.3576, 217.2208],
[462.5800, 159.9310, 501.2937, 224.1655],
[503.5242, 170.0733, 518.3792, 209.0113],
[250.3658, 195.5925, 260.6523, 212.4679],
[108.8287, 163.6994, 146.3642, 229.7261],
[256.7617, 187.3123, 288.8407, 211.2013],
[161.2781, 167.4801, 186.3751, 232.7133],
[115.3760, 177.5859, 163.3512, 236.9660],
[248.9077, 188.0919, 264.8579, 207.9718],
[108.1349, 160.7851, 143.6370, 229.6243],
[465.0900, 156.7555, 490.3561, 213.5704],
[107.5338, 173.4323, 141.0704, 235.2910],
]
).astype(np.float32)
)
scores = torch.from_numpy(
np.array(
[
0.1919,
0.3293,
0.0860,
0.1600,
0.1885,
0.4297,
0.0974,
0.2711,
0.1483,
0.1173,
0.1034,
0.2915,
0.1993,
0.0677,
0.3217,
0.0966,
0.0526,
0.5675,
0.3130,
0.1592,
0.1353,
0.0634,
0.1557,
0.1512,
0.0699,
0.0545,
0.2692,
0.1143,
0.0572,
0.1990,
0.0558,
0.1500,
0.2214,
0.1878,
0.2501,
0.1343,
0.0809,
0.1266,
0.0743,
0.0896,
0.0781,
0.0983,
0.0557,
0.0623,
0.5808,
0.3090,
0.1050,
0.0524,
0.0513,
0.4501,
0.4167,
0.0623,
0.1749,
]
).astype(np.float32)
)
gt_indices = np.array(
[
1,
6,
7,
8,
11,
12,
13,
14,
17,
18,
19,
21,
23,
24,
25,
26,
30,
32,
33,
34,
35,
37,
43,
44,
47,
50,
]
)
keep_indices = box_nms(boxes, scores, 0.5)
keep_indices = np.sort(keep_indices)
np.testing.assert_array_equal(keep_indices, gt_indices)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register predictors
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling.roi_heads.roi_heads import build_roi_heads # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
PREDICTOR_CFGS = {
}
# overwrite configs if specified, otherwise default config is used
PREDICTOR_INPUT_CHANNELS = {
}
def _test_predictors(
self, predictors, overwrite_cfgs, overwrite_in_channels,
hwsize,
):
''' Make sure predictors run '''
self.assertGreater(len(predictors), 0)
in_channels_default = 64
for name, builder in predictors.items():
print('Testing {}...'.format(name))
if name in overwrite_cfgs:
cfg = load_config(overwrite_cfgs[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
in_channels = overwrite_in_channels.get(
name, in_channels_default)
fe = builder(cfg, in_channels)
N, C_in, H, W = 2, in_channels, hwsize, hwsize
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
out = fe(input)
yield input, out, cfg
class TestPredictors(unittest.TestCase):
def test_roi_box_predictors(self):
''' Make sure roi box predictors run '''
for cur_in, cur_out, cur_cfg in _test_predictors(
self,
registry.ROI_BOX_PREDICTOR,
PREDICTOR_CFGS,
PREDICTOR_INPUT_CHANNELS,
hwsize=1,
):
self.assertEqual(len(cur_out), 2)
scores, bbox_deltas = cur_out[0], cur_out[1]
self.assertEqual(
scores.shape[1], cur_cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES)
self.assertEqual(scores.shape[0], cur_in.shape[0])
self.assertEqual(scores.shape[0], bbox_deltas.shape[0])
self.assertEqual(scores.shape[1] * 4, bbox_deltas.shape[1])
def test_roi_keypoints_predictors(self):
''' Make sure roi keypoint predictors run '''
for cur_in, cur_out, cur_cfg in _test_predictors(
self,
registry.ROI_KEYPOINT_PREDICTOR,
PREDICTOR_CFGS,
PREDICTOR_INPUT_CHANNELS,
hwsize=14,
):
self.assertEqual(cur_out.shape[0], cur_in.shape[0])
self.assertEqual(
cur_out.shape[1], cur_cfg.MODEL.ROI_KEYPOINT_HEAD.NUM_CLASSES)
def test_roi_mask_predictors(self):
''' Make sure roi mask predictors run '''
for cur_in, cur_out, cur_cfg in _test_predictors(
self,
registry.ROI_MASK_PREDICTOR,
PREDICTOR_CFGS,
PREDICTOR_INPUT_CHANNELS,
hwsize=14,
):
self.assertEqual(cur_out.shape[0], cur_in.shape[0])
self.assertEqual(
cur_out.shape[1], cur_cfg.MODEL.ROI_BOX_HEAD.NUM_CLASSES)
if __name__ == "__main__":
unittest.main()
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import unittest
import copy
import torch
# import modules to register rpn heads
from maskrcnn_benchmark.modeling.backbone import build_backbone # NoQA
from maskrcnn_benchmark.modeling.rpn.rpn import build_rpn # NoQA
from maskrcnn_benchmark.modeling import registry
from maskrcnn_benchmark.config import cfg as g_cfg
from utils import load_config
# overwrite configs if specified, otherwise default config is used
RPN_CFGS = {
}
class TestRPNHeads(unittest.TestCase):
def test_build_rpn_heads(self):
''' Make sure rpn heads run '''
self.assertGreater(len(registry.RPN_HEADS), 0)
in_channels = 64
num_anchors = 10
for name, builder in registry.RPN_HEADS.items():
print('Testing {}...'.format(name))
if name in RPN_CFGS:
cfg = load_config(RPN_CFGS[name])
else:
# Use default config if config file is not specified
cfg = copy.deepcopy(g_cfg)
rpn = builder(cfg, in_channels, num_anchors)
N, C_in, H, W = 2, in_channels, 24, 32
input = torch.rand([N, C_in, H, W], dtype=torch.float32)
LAYERS = 3
out = rpn([input] * LAYERS)
self.assertEqual(len(out), 2)
logits, bbox_reg = out
for idx in range(LAYERS):
self.assertEqual(
logits[idx].shape,
torch.Size([
input.shape[0], num_anchors,
input.shape[2], input.shape[3],
])
)
self.assertEqual(
bbox_reg[idx].shape,
torch.Size([
logits[idx].shape[0], num_anchors * 4,
logits[idx].shape[2], logits[idx].shape[3],
]),
)
if __name__ == "__main__":
unittest.main()
from __future__ import absolute_import, division, print_function, unicode_literals
# Set up custom environment before nearly anything else is imported
# NOTE: this should be the first import (do not reorder)
from maskrcnn_benchmark.utils.env import setup_environment # noqa F401 isort:skip
import env_tests.env as env_tests
import os
import copy
from maskrcnn_benchmark.config import cfg as g_cfg
def get_config_root_path():
return env_tests.get_config_root_path()
def load_config(rel_path):
''' Load config from file path specified as path relative to config_root '''
cfg_path = os.path.join(env_tests.get_config_root_path(), rel_path)
return load_config_from_file(cfg_path)
def load_config_from_file(file_path):
''' Load config from file path specified as absolute path '''
ret = copy.deepcopy(g_cfg)
ret.merge_from_file(file_path)
return ret