Commit 6fad548: support rec

hhaAndroid committed Nov 22, 2023
1 parent 44d1281, commit 6fad548

Showing 4 changed files with 198 additions and 45 deletions.
configs/grounding_dino/README.md (29 additions, 3 deletions)

@@ -78,16 +78,42 @@ Note:
## LVIS Results

| Model | MiniVal APr | MiniVal APc | MiniVal APf | MiniVal AP | Val1.0 APr | Val1.0 APc | Val1.0 APf | Val1.0 AP | Pre-Train Data | Config | Download |
| :---------------: | :---------: | :---------: | :---------: | :--------: | :--------: | :--------: | :--------: | :-------: | :----------------------------------------------: | :------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------: |
| Grounding DINO-T | 18.8 | 24.2 | 34.7 | 28.8 | 10.1 | 15.3 | 29.9 | 20.1 | O365,GoldG,Cap4M | [config](lvis/grounding_dino_swin-t_pretrain_zeroshot_mini-lvis.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth) |
| Grounding DINO-B | 27.9 | 33.4 | 37.2 | 34.7 | 19.0 | 24.1 | 32.9 | 26.7 | COCO,O365,GoldG,Cap4M,OpenImage,ODinW-35,RefCOCO | [config](lvis/grounding_dino_swin-b_pretrain_zeroshot_mini-lvis.py) | [model](https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth) |

Note:

1. The above are zero-shot evaluation results.
2. The evaluation metric we used is LVIS Fixed AP. For details, please refer to [Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details](https://arxiv.org/pdf/2102.01066.pdf).
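
The zero-shot LVIS results above can be reproduced with the same distributed test script used for the referring-expression evaluation below. A sketch, using the config paths and checkpoints from the table above; the trailing `8` is the GPU count and should match your machine:

```shell
cd mmdetection
# Grounding DINO-T, zero-shot on the LVIS MiniVal split
./tools/dist_test.sh configs/grounding_dino/lvis/grounding_dino_swin-t_pretrain_zeroshot_mini-lvis.py https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth 8
# Grounding DINO-B, zero-shot on the LVIS MiniVal split
./tools/dist_test.sh configs/grounding_dino/lvis/grounding_dino_swin-b_pretrain_zeroshot_mini-lvis.py https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth 8
```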

## Referring Expression Comprehension Results

| Method | Grounding DINO-T | Grounding DINO-B |
|------------------------|-------------------|-------------------|
| RefCOCO val @1,5,10 | 50.77/89.45/94.86 | 84.61/97.88/99.10 |
| RefCOCO testA @1,5,10 | 57.45/91.29/95.62 | 88.65/98.89/99.63 |
| RefCOCO testB @1,5,10 | 44.97/86.54/92.88 | 80.51/96.64/98.51 |
| RefCOCO+ val @1,5,10 | 51.64/86.35/92.57 | 73.67/96.60/98.65 |
| RefCOCO+ testA @1,5,10 | 57.25/86.74/92.65 | 82.19/97.92/99.09 |
| RefCOCO+ testB @1,5,10 | 46.35/84.05/90.67 | 64.10/94.25/97.46 |
| RefCOCOg val @1,5,10 | 60.42/92.10/96.18 | 78.33/97.28/98.57 |
| RefCOCOg test @1,5,10 | 59.74/92.08/96.28 | 78.11/97.06/98.65 |

Note:

1. `@1,5,10` refers to precision at the top 1, 5, and 10 positions in a predicted ranked list.
2. The pretraining data used by Grounding DINO-T is `O365,GoldG,Cap4M`, and the corresponding evaluation configuration is [grounding_dino_swin-t_pretrain_zeroshot_refcoco](refcoco/grounding_dino_swin-t_pretrain_zeroshot_refcoco.py).
3. The pretraining data used by Grounding DINO-B is `COCO,O365,GoldG,Cap4M,OpenImage,ODinW-35,RefCOCO`, and the corresponding evaluation configuration is [grounding_dino_swin-b_pretrain_zeroshot_refcoco](refcoco/grounding_dino_swin-b_pretrain_zeroshot_refcoco.py).

Test Command

```shell
cd mmdetection
./tools/dist_test.sh configs/grounding_dino/refcoco/grounding_dino_swin-t_pretrain_zeroshot_refexp.py https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth 8
./tools/dist_test.sh configs/grounding_dino/refcoco/grounding_dino_swin-b_pretrain_zeroshot_refexp.py https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swinb_cogcoor_mmdet-55949c9c.pth 8
```
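
With a single GPU, the same configs can be run through MMDetection's plain test entry point instead of the distributed launcher. A sketch, assuming the standard `tools/test.py` interface:

```shell
cd mmdetection
python tools/test.py configs/grounding_dino/refcoco/grounding_dino_swin-t_pretrain_zeroshot_refexp.py https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth
```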

## Custom Dataset

New file (@@ -0,0 +1,14 @@): the Swin-B RefExp evaluation config referenced by the test command above.

```python
_base_ = './grounding_dino_swin-t_pretrain_zeroshot_refexp.py'

# Same RefExp evaluation setup as the Swin-T base config, with the backbone
# and neck switched to the Swin-B settings of the pre-trained checkpoint.
model = dict(
    type='GroundingDINO',
    backbone=dict(
        pretrain_img_size=384,
        embed_dims=128,
        depths=[2, 2, 18, 2],
        num_heads=[4, 8, 16, 32],
        window_size=12,
        drop_path_rate=0.3,
        patch_norm=True),
    neck=dict(in_channels=[256, 512, 1024]),
)
```

This file was deleted.

New file (@@ -0,0 +1,155 @@): the Swin-T RefExp evaluation config that the file above builds on and that the test command references.

```python
_base_ = '../grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py'

model = dict(test_cfg=dict(max_per_img=15))

data_root = 'data/coco/'

test_pipeline = [
    dict(
        type='LoadImageFromFile', backend_args=None,
        imdecode_backend='pillow'),
    dict(
        type='FixScaleResize',
        scale=(800, 1333),
        keep_ratio=True,
        backend='pillow'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor', 'text', 'custom_entities', 'tokens_positive'))
]

# -------------------------------------------------#
ann_file = 'mdetr_annotations/final_refexp_val.json'
val_dataset_all_val = dict(
    type='MDETRStyleRefCocoDataset',
    data_root=data_root,
    ann_file=ann_file,
    data_prefix=dict(img='train2014/'),
    test_mode=True,
    return_classes=True,
    pipeline=test_pipeline,
    backend_args=None)
val_evaluator_all_val = dict(
    type='RefExpMetric',
    ann_file=data_root + ann_file,
    metric='bbox',
    iou_thrs=0.5,
    topk=(1, 5, 10))

# -------------------------------------------------#
ann_file = 'mdetr_annotations/finetune_refcoco_testA.json'
val_dataset_refcoco_testA = dict(
    type='MDETRStyleRefCocoDataset',
    data_root=data_root,
    ann_file=ann_file,
    data_prefix=dict(img='train2014/'),
    test_mode=True,
    return_classes=True,
    pipeline=test_pipeline,
    backend_args=None)

val_evaluator_refcoco_testA = dict(
    type='RefExpMetric',
    ann_file=data_root + ann_file,
    metric='bbox',
    iou_thrs=0.5,
    topk=(1, 5, 10))

# -------------------------------------------------#
ann_file = 'mdetr_annotations/finetune_refcoco_testB.json'
val_dataset_refcoco_testB = dict(
    type='MDETRStyleRefCocoDataset',
    data_root=data_root,
    ann_file=ann_file,
    data_prefix=dict(img='train2014/'),
    test_mode=True,
    return_classes=True,
    pipeline=test_pipeline,
    backend_args=None)

val_evaluator_refcoco_testB = dict(
    type='RefExpMetric',
    ann_file=data_root + ann_file,
    metric='bbox',
    iou_thrs=0.5,
    topk=(1, 5, 10))

# -------------------------------------------------#
ann_file = 'mdetr_annotations/finetune_refcoco+_testA.json'
val_dataset_refcoco_plus_testA = dict(
    type='MDETRStyleRefCocoDataset',
    data_root=data_root,
    ann_file=ann_file,
    data_prefix=dict(img='train2014/'),
    test_mode=True,
    return_classes=True,
    pipeline=test_pipeline,
    backend_args=None)

val_evaluator_refcoco_plus_testA = dict(
    type='RefExpMetric',
    ann_file=data_root + ann_file,
    metric='bbox',
    iou_thrs=0.5,
    topk=(1, 5, 10))

# -------------------------------------------------#
ann_file = 'mdetr_annotations/finetune_refcoco+_testB.json'
val_dataset_refcoco_plus_testB = dict(
    type='MDETRStyleRefCocoDataset',
    data_root=data_root,
    ann_file=ann_file,
    data_prefix=dict(img='train2014/'),
    test_mode=True,
    return_classes=True,
    pipeline=test_pipeline,
    backend_args=None)

val_evaluator_refcoco_plus_testB = dict(
    type='RefExpMetric',
    ann_file=data_root + ann_file,
    metric='bbox',
    iou_thrs=0.5,
    topk=(1, 5, 10))

# -------------------------------------------------#
ann_file = 'mdetr_annotations/finetune_refcocog_test.json'
val_dataset_refcocog_test = dict(
    type='MDETRStyleRefCocoDataset',
    data_root=data_root,
    ann_file=ann_file,
    data_prefix=dict(img='train2014/'),
    test_mode=True,
    return_classes=True,
    pipeline=test_pipeline,
    backend_args=None)

val_evaluator_refcocog_test = dict(
    type='RefExpMetric',
    ann_file=data_root + ann_file,
    metric='bbox',
    iou_thrs=0.5,
    topk=(1, 5, 10))
# -------------------------------------------------#
datasets = [
    val_dataset_all_val, val_dataset_refcoco_testA, val_dataset_refcoco_testB,
    val_dataset_refcoco_plus_testA, val_dataset_refcoco_plus_testB,
    val_dataset_refcocog_test
]
dataset_prefixes = [
    'val', 'refcoco_testA', 'refcoco_testB', 'refcoco+_testA',
    'refcoco+_testB', 'refcocog_test'
]
metrics = [
    val_evaluator_all_val, val_evaluator_refcoco_testA,
    val_evaluator_refcoco_testB, val_evaluator_refcoco_plus_testA,
    val_evaluator_refcoco_plus_testB, val_evaluator_refcocog_test
]

val_dataloader = dict(
    dataset=dict(_delete_=True, type='ConcatDataset', datasets=datasets))
test_dataloader = val_dataloader

val_evaluator = dict(
    _delete_=True,
    type='MultiDatasetsEvaluator',
    metrics=metrics,
    dataset_prefixes=dataset_prefixes)
test_evaluator = val_evaluator
```
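
From the paths in this config (`data_root = 'data/coco/'`, `data_prefix=dict(img='train2014/')`, and the `mdetr_annotations/*.json` files), the expected data layout is roughly the following; only the files referenced above are listed:

```text
data/coco/
├── train2014/                          # COCO train2014 images used by RefCOCO/RefCOCO+/RefCOCOg
└── mdetr_annotations/
    ├── final_refexp_val.json           # combined val split
    ├── finetune_refcoco_testA.json
    ├── finetune_refcoco_testB.json
    ├── finetune_refcoco+_testA.json
    ├── finetune_refcoco+_testB.json
    └── finetune_refcocog_test.json
```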
