EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction [paper]

Efficient vision foundation models for high-resolution generation and perception.
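
As a quick orientation for the title: the sketch below shows plain ReLU linear attention, the kernel the paper's multi-scale attention module is built around. It is a standalone illustration of the formulation on flat token sequences at a single scale, not the code in this repo, which additionally aggregates multi-scale tokens before applying the attention.

# Minimal sketch of ReLU linear attention. Illustrative only: the repo's
# implementation adds multi-scale token aggregation and works on feature maps.
import torch

def relu_linear_attention(q, k, v, eps=1e-6):
    """q, k, v: (batch, tokens, dim). Cost is linear in the number of tokens."""
    q = torch.relu(q)
    k = torch.relu(k)
    # Associate (K^T V) first: a (dim, dim) matrix instead of the
    # (tokens, tokens) map that softmax attention would materialize.
    kv = torch.einsum("bnd,bne->bde", k, v)
    out = torch.einsum("bnd,bde->bne", q, kv)
    # Per-token normalizer: ReLU(q_i) . sum_j ReLU(k_j)
    z = torch.einsum("bnd,bd->bn", q, k.sum(dim=1)).unsqueeze(-1)
    return out / (z + eps)

# Example: a 64x64 feature map flattened into 4096 tokens of width 32.
x = torch.randn(1, 4096, 32)
print(relu_linear_attention(x, x, x).shape)  # torch.Size([1, 4096, 32])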

Content

Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models [paper] [readme]


Figure 1: We address the reconstruction accuracy drop of high spatial-compression autoencoders.

Figure 2 (Video): DC-AE delivers significant training and inference speedup without a performance drop (see the token-count sketch after this content list).

Figure 3: DC-AE enables efficient text-to-image generation on a laptop.

EfficientViT-SAM: Accelerated Segment Anything Model Without Accuracy Loss [paper] [online demo] [readme]

EfficientViT-Classification [paper] [readme]

EfficientViT-Segmentation [paper] [readme]


EfficientViT-GazeSAM [readme]

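To make the speedup in Figure 2 concrete, here is a back-of-the-envelope sketch of how the diffusion model's token count scales with the autoencoder's spatial compression ratio. The f8 baseline and the f64 DC-AE setting are taken from the paper; latent channel widths are left out since they vary per model.

# Illustration of why higher spatial compression speeds up latent diffusion:
# the diffusion model's token count shrinks quadratically with the
# autoencoder's downsampling factor f.
resolution = 1024  # input image is 1024 x 1024

for f in (8, 64):
    latent_hw = resolution // f
    tokens = latent_hw * latent_hw
    print(f"f{f}: latent grid {latent_hw}x{latent_hw} -> {tokens} tokens")

# f8:  latent grid 128x128 -> 16384 tokens
# f64: latent grid 16x16   ->   256 tokens (64x fewer than f8)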

News

If you are interested in getting updates, please join our mailing list here.

Getting Started

conda create -n efficientvit python=3.10
conda activate efficientvit
pip install -r requirements.txt
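
After installation, running a pretrained classification model looks roughly like the sketch below. The efficientvit.cls_model_zoo import, its arguments, and the checkpoint path are assumptions used for illustration; the classification readme linked above documents the actual entry points and weight downloads.

# Illustrative sketch only. The model-zoo import, its arguments, and the
# checkpoint path are assumptions -- see the classification readme in this
# repo for the real API and download links.
import torch
from PIL import Image
from torchvision import transforms

from efficientvit.cls_model_zoo import create_cls_model  # assumed entry point

model = create_cls_model(
    name="efficientvit-b1",                      # assumed model name
    weight_url="assets/checkpoints/cls/b1.pt",   # assumed local checkpoint path
)
model.eval()

# Standard ImageNet-style preprocessing for a 224x224 input.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.inference_mode():
    logits = model(image)
print("predicted class index:", logits.argmax(dim=-1).item())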

Third-Party Implementation/Integration

Contact

Han Cai

Reference

If EfficientViT, EfficientViT-SAM, or DC-AE is useful or relevant to your research, please kindly recognize our contributions by citing our papers:

@inproceedings{cai2023efficientvit,
  title={Efficientvit: Lightweight multi-scale attention for high-resolution dense prediction},
  author={Cai, Han and Li, Junyan and Hu, Muyan and Gan, Chuang and Han, Song},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={17302--17313},
  year={2023}
}
@article{zhang2024efficientvit,
  title={EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss},
  author={Zhang, Zhuoyang and Cai, Han and Han, Song},
  journal={arXiv preprint arXiv:2402.05008},
  year={2024}
}
@article{chen2024deep,
  title={Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models},
  author={Chen, Junyu and Cai, Han and Chen, Junsong and Xie, Enze and Yang, Shang and Tang, Haotian and Li, Muyang and Lu, Yao and Han, Song},
  journal={arXiv preprint arXiv:2410.10733},
  year={2024}
}