Questions like:
I use the following code to extract features based on the FPN model. The config is:
"COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml"
Feature extraction code:
```python
import cv2
import torch

# `predictor` is a detectron2 DefaultPredictor built from the config above.
img_path = "input.jpg"
img_ori = cv2.imread(img_path)
height, width = img_ori.shape[:2]
img = predictor.transform_gen.get_transform(img_ori).apply_image(img_ori)
img = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
inputs = [{"image": img, "height": height, "width": width}]
with torch.no_grad():
    imglists = predictor.model.preprocess_image(inputs)  # don't forget to preprocess
    features = predictor.model.backbone(imglists.tensor)  # set of CNN feature maps
    proposals, _ = predictor.model.proposal_generator(imglists, features, None)  # RPN
    proposal_boxes = [x.proposal_boxes for x in proposals]
    features_list = [features[f] for f in predictor.model.roi_heads.in_features]
    proposal_rois = predictor.model.roi_heads.box_pooler(features_list, proposal_boxes)
    box_features = predictor.model.roi_heads.box_head(proposal_rois)
```
I use box_features as the per-object feature. But its dimension is 1024, which is inconsistent with the original bottom-up-attention image features, which are 2048-dimensional. Both use ResNet-101 as the backbone network, so why do the feature dimensions differ?
I apologize if the answer is obvious; I am very new to object detection.
Thank you!
The GitHub version is converted from the original bottom-up-attention repo. It uses a specific detector trained on Visual Genome. The dimension mismatch comes from the head architecture, not the backbone: the COCO FPN model's box head ends in two 1024-unit FC layers, while the original bottom-up-attention model uses the C4 architecture, where the res5 stage produces 2048-channel features that are pooled per box. Please use the configurations under the folder py-bottom-up-attention/configs/VG-Detection/. Hope this helps!
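To make the dimension difference concrete, here is a minimal toy sketch using plain torch.nn modules. These are stand-ins, not the actual detectron2 classes; the layer shapes follow detectron2's default FPN and C4 configurations (256-channel FPN features pooled to 7x7, and a res5-style stage that maps 1024 channels to 2048):

```python
import torch
import torch.nn as nn

# FPN box head: each pooled ROI (256 x 7 x 7) is flattened and passed through
# two 1024-unit FC layers, so box_features comes out 1024-dimensional.
fpn_box_head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(256 * 7 * 7, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
)
fpn_out = fpn_box_head(torch.randn(10, 256, 7, 7))  # 10 proposals

# C4 head (what bottom-up-attention uses): the res5 stage raises the channel
# count to 2048, then the spatial map is average-pooled to one vector per box.
# The conv below is a hypothetical stand-in for the full res5 block.
c4_head = nn.Sequential(
    nn.Conv2d(1024, 2048, kernel_size=3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
c4_out = c4_head(torch.randn(10, 1024, 14, 14))  # 10 proposals

print(fpn_out.shape)  # torch.Size([10, 1024])
print(c4_out.shape)   # torch.Size([10, 2048])
```

So with the same ResNet-101 backbone, the head you attach determines the feature width, which is why the VG-Detection (C4) configs give 2048-d features.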