TensorFlow code for training Vision Transformers with the self-supervised learning method DINO

TanyaChutani/DINO_Tf2.x

Self-Supervised Vision Transformers with DINO

Introduction

While the volume of data collected for vision-based tasks has grown rapidly in recent years, annotating every unstructured dataset is practically impossible.

DINO, which is based on self-supervised learning, does not require large amounts of labelled data to achieve state-of-the-art results on segmentation tasks, unlike traditional supervised methods.

To be specific, DINO stands for self-DIstillation with NO labels, in which two models (a teacher and a student) are used. While they share the same model architecture, the teacher is not trained by gradient descent: its parameters are updated as an exponential moving average of the student model's parameters.
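The teacher update described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code from this repository: the function name `ema_update` and the flat lists of floats are hypothetical stand-ins for the models' TensorFlow variables, and the default momentum of 0.996 is the base value reported in the DINO paper, assumed here rather than taken from `main.py`.

```python
def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Return updated teacher weights as an exponential moving average.

    Each teacher weight moves a small step towards the matching student
    weight: new_teacher = momentum * teacher + (1 - momentum) * student.
    The inputs are plain lists of floats (a hypothetical stand-in for
    the real model variables).
    """
    if len(teacher_weights) != len(student_weights):
        raise ValueError("teacher and student must have the same number of weights")
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


# Toy example: with momentum 0.5 the teacher moves halfway to the student.
teacher = ema_update([0.0, 2.0], [1.0, 4.0], momentum=0.5)
print(teacher)
```

With a momentum close to 1, the teacher changes slowly and acts as a smoothed version of the student, which is the role the averaging plays in self-distillation.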

This technique was introduced in the research paper by Facebook AI titled "Emerging Properties in Self-Supervised Vision Transformers".

Visualizing the generated attention maps shows that DINO can learn class-specific features automatically, which helps generate accurate segmentation maps without the need for labelled data in vision-based tasks.
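One way to turn an attention map into a rough segmentation mask is to keep only the patches that carry most of the attention. The sketch below assumes the attention of the class token over the image patches is already available as a flat list of non-negative weights. The function name `attention_to_mask`, the grid arguments, and the `keep_mass` default of 0.6 (the fraction used for visualizations in the DINO paper) are assumptions for illustration and are not taken from this repository.

```python
def attention_to_mask(cls_attention, grid_h, grid_w, keep_mass=0.6):
    """Build a binary patch mask from class-token attention weights.

    Patches are selected in order of decreasing attention until the
    selected patches hold at least `keep_mass` of the total attention.
    Returns a grid_h x grid_w nested list of 0/1 values.
    """
    if len(cls_attention) != grid_h * grid_w:
        raise ValueError("attention length must equal grid_h * grid_w")

    total = sum(cls_attention)
    # Patch indices sorted from most to least attended.
    order = sorted(range(len(cls_attention)),
                   key=lambda i: cls_attention[i], reverse=True)

    keep = [0] * len(cls_attention)
    running = 0.0
    for i in order:
        if running >= keep_mass * total:
            break
        keep[i] = 1
        running += cls_attention[i]

    # Reshape the flat mask back into the patch grid, row by row.
    return [keep[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


# Toy 2x2 patch grid where the top row receives most of the attention.
mask = attention_to_mask([5.0, 3.0, 1.0, 1.0], grid_h=2, grid_w=2)
print(mask)
```

The resulting patch-level mask would still need to be upsampled to the image resolution before it can be overlaid on the input.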

Commands to download the data

gdown --id 1Lw_XPTbkoHUtWpG4U9ByYIwwmLlufvyj
unzip PASCALVOC2007.zip

Train model

git clone https://github.com/TanyaChutani/DINO_Tf2.x.git
cd DINO_Tf2.x
pip install -r requirements.txt
python main.py

Results
