Our code builds on mdistiller (https://github.com/megvii-research/mdistiller.git) for knowledge distillation and on Transformer-Explainability (https://github.com/hila-chefer/Transformer-Explainability.git) for the explainability tests.
On CIFAR-100 (top-1 accuracy, %):

| Teacher <br> Student | ResNet56 <br> ResNet20 | ResNet110 <br> ResNet32 | ResNet32x4 <br> ResNet8x4 | WRN-40-2 <br> WRN-16-2 | WRN-40-2 <br> WRN-40-1 | VGG13 <br> VGG8 |
|---|---|---|---|---|---|---|
| KD | 70.66 | 73.08 | 73.33 | 74.92 | 73.54 | 72.98 |
| Exp-KD | 71.77 | 74.13 | 77.36 | 76.27 | 74.77 | 74.85 |

| Teacher <br> Student | ResNet32x4 <br> ShuffleNet-V1 | WRN-40-2 <br> ShuffleNet-V1 | VGG13 <br> MobileNet-V2 | ResNet50 <br> MobileNet-V2 | ResNet32x4 <br> MobileNet-V2 |
|---|---|---|---|---|---|
| KD | 74.07 | 74.83 | 67.37 | 67.35 | 74.45 |
| Exp-KD | 78.20 | 77.75 | 70.63 | 71.74 | 78.98 |
On ImageNet (top-1 accuracy, %):

| Teacher <br> Student | ResNet34 <br> ResNet18 | ResNet50 <br> MobileNet-V1 |
|---|---|---|
| KD | 71.03 | 70.50 |
| Exp-KD | 71.74 | 72.43 |
Environments:
- Python 3.6
- PyTorch 1.9.0
- torchvision 0.10.0
Install the package:

```bash
sudo pip3 install -r requirements.txt
sudo python3 setup.py develop
```
- Wandb is used as the logger.
  - Registration: https://wandb.ai/home.
  - If you don't want wandb as your logger, set `CFG.LOG.WANDB` to `False` in `mdistiller/engine/cfg.py`.
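  For reference, disabling wandb is a one-line change in the config file. The snippet below is only a sketch of that edit; the surrounding definitions in `mdistiller/engine/cfg.py` are omitted and may differ slightly.

  ```python
  # mdistiller/engine/cfg.py (sketch of the single edited line; context omitted)
  CFG.LOG.WANDB = False  # disable wandb logging
  ```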
- Evaluation
  - You can evaluate the performance of models trained by yourself.
  - If you test models on ImageNet, please download the dataset from https://image-net.org/ and put it in `./data/imagenet`.

  ```bash
  # evaluate teachers
  python3 tools/eval.py -m resnet32x4  # resnet32x4 on cifar100
  python3 tools/eval.py -m ResNet34 -d imagenet  # ResNet34 on imagenet
  # evaluate students
  python3 tools/eval.py -m model_name -c output/your_exp/student_best  # your checkpoints
  ```
- Training on CIFAR-100
  - Download `cifar_teachers.tar` from https://github.com/megvii-research/mdistiller/releases/tag/checkpoints and untar it to `./download_ckpts` via `tar xvf cifar_teachers.tar`.

  ```bash
  # for instance, our Exp-KD method.
  python3 tools/train.py --cfg configs/cifar100/cam/res32x4_res8x4.yaml

  # you can also change settings at the command line
  python3 tools/train.py --cfg configs/cifar100/cam/res32x4_res8x4.yaml SOLVER.BATCH_SIZE 128 SOLVER.LR 0.1
  ```
- Training on ImageNet
  - Download the dataset from https://image-net.org/ and put it in `./data/imagenet`.

  ```bash
  # for instance, our Exp-KD method.
  python3 tools/train.py --cfg configs/imagenet/r34_r18/cam.yaml
  ```
- Training on MS-COCO
  - See `detection.md`.
- Extension: Visualizations
  - Jupyter notebooks: `tsne` and `correlation_matrices`.
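  As a rough, hedged illustration of the kind of plot the `tsne` notebook produces, the sketch below embeds student features from CIFAR-100 with scikit-learn. It is not the notebook's actual code: the assumption that the model's forward pass returns `(logits, feature_dict)` with a `"pooled_feat"` entry follows mdistiller-style models and may need adapting, and `student` / `val_loader` must be built with the repo's own utilities.

  ```python
  # Hedged sketch: t-SNE of student penultimate features on CIFAR-100
  # (illustrative only; the repo's tsne notebook may differ).
  import torch
  import matplotlib.pyplot as plt
  from sklearn.manifold import TSNE


  def collect_features(model, loader, device="cuda"):
      """Collect penultimate-layer features and labels over a data loader.

      Assumes the forward pass returns (logits, feature_dict) with a
      "pooled_feat" entry, as in mdistiller-style models.
      """
      feats, labels = [], []
      model.eval().to(device)
      with torch.no_grad():
          for images, targets in loader:
              _, feat_dict = model(images.to(device))
              feats.append(feat_dict["pooled_feat"].cpu())
              labels.append(targets)
      return torch.cat(feats).numpy(), torch.cat(labels).numpy()


  # Example usage (student and val_loader come from the repo's own builders):
  # feats, labels = collect_features(student, val_loader)
  # emb = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(feats)
  # plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=3, cmap="tab20")
  # plt.savefig("tsne.png", dpi=200)
  ```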
MDistiller is released under the MIT license. See LICENSE for details.