Install ☘️
# It is recommended to create a separate virtual environment
conda create -n vision python=3.10
conda activate vision
# torch==2.0.1 (lower versions also work) -> https://pytorch.org/get-started/locally/
conda install pytorch torchvision torchaudio cpuonly -c pytorch # cpu-version
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia # cuda-version
pip install -r requirements.txt
# Without Arial.ttf, inference may be slow due to network IO.
mkdir -p ~/.config/DuKe
cp misc/Arial.ttf ~/.config/DuKe
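To confirm the font ended up where the install steps put it, a small check along these lines may help. The directory comes from the `mkdir`/`cp` commands above; the helper name `font_path` is hypothetical and not part of the repo.

```python
from pathlib import Path


def font_path(config_dir: str = "~/.config/DuKe") -> Path:
    """Return the expected location of Arial.ttf (directory taken from the install steps)."""
    return Path(config_dir).expanduser() / "Arial.ttf"


if __name__ == "__main__":
    path = font_path()
    if path.is_file():
        print(f"Font found: {path}")
    else:
        # Per the install notes, a missing font may cause slow inference due to network IO.
        print(f"Font missing: {path} (copy misc/Arial.ttf there)")
```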
Training 🌟️
# single machine, single GPU
python main.py --cfgs configs/task/pet.yaml
# single machine, multiple GPUs
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node 4 main.py --cfgs configs/classification/pet.yaml
--sync_bn [Optional: synchronized BatchNorm; slows down training]
--resume [Optional: resume training from a checkpoint]
--load_from [Optional: fine-tune from existing weights]
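The difference between `--resume` and `--load_from` can be sketched as follows. This is a framework-free illustration of the usual convention, not the repo's code: the function name, the checkpoint keys (`weights`, `optimizer`, `epoch`), and the exact semantics are assumptions. In PyTorch the corresponding steps would typically go through `model.load_state_dict(...)` and `optimizer.load_state_dict(...)`.

```python
from typing import Any, Dict, Optional

Checkpoint = Dict[str, Any]


def init_training_state(
    fresh_weights: Dict[str, float],
    resume: Optional[Checkpoint] = None,
    load_from: Optional[Checkpoint] = None,
) -> Dict[str, Any]:
    """Build the starting state of a run (assumed semantics, hypothetical keys)."""
    state: Dict[str, Any] = {"weights": dict(fresh_weights), "optimizer": {}, "epoch": 0}
    if resume is not None:
        # Resuming continues an interrupted run: weights, optimizer state,
        # and the number of completed epochs are all restored.
        state["weights"] = dict(resume["weights"])
        state["optimizer"] = dict(resume["optimizer"])
        state["epoch"] = resume["epoch"]
    elif load_from is not None:
        # Fine-tuning starts a new run: only the weights are reused, while
        # the optimizer state and epoch counter start from scratch.
        state["weights"].update(load_from["weights"])
    return state


if __name__ == "__main__":
    ckpt = {"weights": {"backbone": 1.0}, "optimizer": {"momentum": 0.9}, "epoch": 5}
    print(init_training_state({"backbone": 0.0}, resume=ckpt))
    print(init_training_state({"backbone": 0.0, "head": 0.0}, load_from=ckpt))
```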
- [Apr. 2024] Face Recognition Task (FRT) is now supported 🚀️️! ResNet, EfficientNet, and Swin Transformer are available as backbones; ArcFace, CircleLoss, MagFace, and MV Softmax can be used as heads for training. Note: parts of the implementation are based on JD-FaceX.
- [Jun. 2023] Image Classification Task (ICT) has launched 🚀️️! It supports many powerful strategies, such as progressive learning, online augmentation, a beautiful training interface, and exponential moving average. Models from torchvision are fully integrated.
- [May. 2023] Initial release of Vision.
Method | Paper |
---|---|
SAM | Sharpness-Aware Minimization for Efficiently Improving Generalization |
Progressive Learning | EfficientNetV2: Smaller Models and Faster Training |
OHEM | Training Region-based Object Detectors with Online Hard Example Mining |
Focal Loss | Focal Loss for Dense Object Detection |
Cosine Annealing | SGDR: Stochastic Gradient Descent with Warm Restarts |
Label Smoothing | Rethinking the Inception Architecture for Computer Vision |
Mixup | mixup: Beyond Empirical Risk Minimization |
CutOut | Improved Regularization of Convolutional Neural Networks with Cutout |
Attention Pool | Augmenting Convolutional networks with attention-based aggregation |
GradCAM | Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization |
ArcFace | ArcFace: Additive Angular Margin Loss for Deep Face Recognition |
CircleLoss | Circle Loss: A Unified Perspective of Pair Similarity Optimization |
MagFace | MagFace: A Universal Representation for Face Recognition and Quality Assessment |
MV Softmax | Mis-classified Vector Guided Softmax Loss for Face Recognition |
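Several of the methods in the table reduce to short formulas. The sketch below writes four of them in plain Python for illustration, following the cited papers; it is not the repo's implementation, and all function names and default values here are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_annealing_lr(step: int, total_steps: int, lr_max: float, lr_min: float = 0.0) -> float:
    """Cosine annealing (single cycle): lr_max at step 0, lr_min at total_steps."""
    cos = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos)


def smooth_labels(target: int, num_classes: int, eps: float = 0.1) -> List[float]:
    """Label smoothing: blend the one-hot target with a uniform distribution."""
    uniform = eps / num_classes
    return [(1.0 - eps) + uniform if k == target else uniform for k in range(num_classes)]


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample, given the predicted probability of the true class."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


def mixup(
    x_a: Sequence[float], x_b: Sequence[float],
    y_a: Sequence[float], y_b: Sequence[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """mixup: apply the same convex combination to the inputs and the soft labels."""
    def blend(a: Sequence[float], b: Sequence[float]) -> List[float]:
        return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]
    return blend(x_a, x_b), blend(y_a, y_b)


if __name__ == "__main__":
    print([round(cosine_annealing_lr(s, 10, 0.1), 4) for s in range(0, 11, 5)])
    print(smooth_labels(target=1, num_classes=4))
    print(focal_loss(0.9), focal_loss(0.9, gamma=0.0))
    print(mixup([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam=0.75))
```

With `gamma=0` and `alpha=1` the focal loss reduces to ordinary cross-entropy, which makes the down-weighting of easy examples easy to check. In mixup, `lam` is usually drawn from a Beta distribution, for example with `random.betavariate(a, a)`.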
Model | Paper | Name in configs (e.g. torchvision-mobilenet_v2) |
---|---|---|
MobileNetv2 | MobileNetV2: Inverted Residuals and Linear Bottlenecks | mobilenet_v2 |
MobileNetv3 | Searching for MobileNetV3 | mobilenet_v3_small, mobilenet_v3_large |
ShuffleNetv2 | ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design | shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5, shufflenet_v2_x2_0 |
ResNet | Deep Residual Learning for Image Recognition | resnet18, resnet34, resnet50, resnet101, resnet152 |
ResNeXt | Aggregated Residual Transformations for Deep Neural Networks | resnext50_32x4d, resnext101_32x8d, resnext101_64x4d |
ConvNeXt | A ConvNet for the 2020s | convnext_tiny, convnext_small, convnext_base, convnext_large |
EfficientNet | EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | efficientnet_b{0..7} |
EfficientNetv2 | EfficientNetV2: Smaller Models and Faster Training | efficientnet_v2_s, efficientnet_v2_m, efficientnet_v2_l |
Swin Transformer | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | swin_t, swin_s, swin_b |
Swin Transformerv2 | Swin Transformer V2: Scaling Up Capacity and Resolution | swin_v2_t, swin_v2_s, swin_v2_b |
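As a rough illustration of how the config names in the table relate to torchvision, the sketch below expands the `{0..7}` shorthand, splits a name such as `torchvision-mobilenet_v2` into its prefix and architecture, and builds the model with `torchvision.models.get_model` (available in torchvision 0.14 and later). The helper names and the parsing rule are assumptions; the repo may handle config names differently.

```python
import re
from typing import List, Tuple


def expand_braces(pattern: str) -> List[str]:
    """Expand one `{a..b}` integer range, e.g. efficientnet_b{0..7}."""
    match = re.search(r"\{(\d+)\.\.(\d+)\}", pattern)
    if match is None:
        return [pattern]
    start, stop = int(match.group(1)), int(match.group(2))
    head, tail = pattern[: match.start()], pattern[match.end():]
    return [f"{head}{i}{tail}" for i in range(start, stop + 1)]


def split_config_name(name: str) -> Tuple[str, str]:
    """Split 'torchvision-mobilenet_v2' into ('torchvision', 'mobilenet_v2')."""
    source, sep, arch = name.partition("-")
    if not sep or not source or not arch:
        raise ValueError(f"expected '<source>-<architecture>', got {name!r}")
    return source, arch


def build_model(name: str, num_classes: int):
    """Instantiate an untrained model via torchvision (hypothetical helper)."""
    source, arch = split_config_name(name)
    if source != "torchvision":
        raise ValueError(f"unsupported model source: {source!r}")
    # Imported lazily so the parsing helpers work without torchvision installed.
    from torchvision.models import get_model
    return get_model(arch, weights=None, num_classes=num_classes)


if __name__ == "__main__":
    print(expand_braces("efficientnet_b{0..7}"))
    print(split_config_name("torchvision-mobilenet_v2"))
```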
- Split the dataset into training and validation sets
python tools/data_prepare.py --postfix <jpg or png> --root <absolute path to your data> --frac <training split ratio(s), e.g. 0.9 0.6 0.3 0.9 0.9>
- Visualize data augmentation
cd visiondk
python -m tools.test_augment
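The train/validation split above can be pictured with the following sketch. It assumes one ratio per class, applied in alphabetical class order, which is only a guess based on the five example values passed to `--frac`; the function name and data layout are hypothetical and `tools/data_prepare.py` may behave differently.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_by_class(
    files_by_class: Dict[str, Sequence[str]],
    fracs: Sequence[float],
    seed: int = 0,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Shuffle each class's files and cut them into train/val parts by ratio."""
    if len(fracs) != len(files_by_class):
        raise ValueError("expected one training ratio per class")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: Dict[str, List[str]] = {}
    val: Dict[str, List[str]] = {}
    for (cls, files), frac in zip(sorted(files_by_class.items()), fracs):
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * frac)  # number of training files for this class
        train[cls], val[cls] = shuffled[:cut], shuffled[cut:]
    return train, val


if __name__ == "__main__":
    data = {
        "cat": [f"cat_{i}.jpg" for i in range(10)],
        "dog": [f"dog_{i}.jpg" for i in range(10)],
    }
    train, val = split_by_class(data, fracs=[0.9, 0.6])
    for cls in data:
        print(cls, len(train[cls]), len(val[cls]))
```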
- If you enjoy reproducing papers and algorithms, pull requests are welcome.
- If anything about the repo is unclear, please open an issue.