-
Paper: Training data-efficient image transformers & distillation through attention
-
Origin Repo: facebookresearch/deit
-
Code: deit.py
-
Evaluate Transforms:
```python
# backend: pil
# input_size: 224x224
transforms = T.Compose([
    T.Resize(248, interpolation='bicubic'),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# backend: pil
# input_size: 384x384
transforms = T.Compose([
    T.Resize(384, interpolation='bicubic'),
    T.CenterCrop(384),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```
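For reference, a minimal preprocessing sketch for the 224x224 setting is shown below. It assumes `T` is `paddle.vision.transforms` (PIL backend, as noted in the comments above); the image path is a placeholder. The resulting NCHW tensor is what a DeiT model built from deit.py expects at evaluation time.

```python
import paddle
import paddle.vision.transforms as T
from PIL import Image

# 224x224 evaluation transforms (same as above)
transforms = T.Compose([
    T.Resize(248, interpolation='bicubic'),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open('example.jpg').convert('RGB')  # placeholder image path
x = transforms(img)                             # CHW float32 tensor
x = paddle.unsqueeze(x, axis=0)                 # add batch dim -> [1, 3, 224, 224]
print(x.shape)
```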
-
Model Details:
| Model | Model Name | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Pretrained Model |
| --- | --- | --- | --- | --- | --- | --- |
| DeiT-tiny | deit_ti | 5.7 | 1.1 | 72.18 | 91.11 | Download |
| DeiT-small | deit_s | 22.0 | 4.2 | 79.85 | 95.04 | Download |
| DeiT-base | deit_b | 86.4 | 16.8 | 81.99 | 95.74 | Download |
| DeiT-tiny distilled | deit_ti_distilled | 5.9 | 1.1 | 74.50 | 91.89 | Download |
| DeiT-small distilled | deit_s_distilled | 22.4 | 4.3 | 81.22 | 95.39 | Download |
| DeiT-base distilled | deit_b_distilled | 87.2 | 16.9 | 83.39 | 96.49 | Download |
| DeiT-base 384 | deit_b_384 | 86.4 | 49.3 | 83.10 | 96.37 | Download |
| DeiT-base distilled 384 | deit_b_distilled_384 | 87.2 | 49.4 | 85.43 | 97.33 | Download |
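The distilled variants follow the paper's distillation-through-attention setup: a distillation token is trained alongside the class token, and at inference the softmax outputs of the two classifier heads are added. A minimal sketch of that fusion step, with placeholder random logits standing in for the two heads' outputs:

```python
import paddle
import paddle.nn.functional as F

# Placeholder logits standing in for the class-token head and the
# distillation-token head of a distilled DeiT (batch of 1, 1000 classes).
cls_logits = paddle.randn([1, 1000])
dist_logits = paddle.randn([1, 1000])

# Inference-time fusion as described in the paper: add (equivalently,
# average) the softmax outputs of the two heads.
probs = (F.softmax(cls_logits, axis=-1) + F.softmax(dist_logits, axis=-1)) / 2
top1 = paddle.argmax(probs, axis=-1)
```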
-
Citation:
```
@article{touvron2020deit,
  title   = {Training data-efficient image transformers & distillation through attention},
  author  = {Hugo Touvron and Matthieu Cord and Matthijs Douze and Francisco Massa and Alexandre Sablayrolles and Hervé Jégou},
  journal = {arXiv preprint arXiv:2012.12877},
  year    = {2020}
}
```