Extract depth maps from videos in real time with the intel-isl/MiDaS depth estimation model, without first splitting the video into individual frames.
For best performance, use a CUDA-capable GPU, preferably NVIDIA Pascal or newer. Install PyTorch with CUDA support:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
For systems without a CUDA-capable GPU:
pip3 install torch torchvision torchaudio
Install additional requirements:
pip install -r requirements.txt
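After installation, a quick check like the one below can confirm whether PyTorch sees a CUDA device. This is only a minimal sketch: `pick_device` is a hypothetical helper name, not something this repository provides, and it falls back to the CPU when PyTorch is missing or no GPU is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a CUDA GPU, else "cpu"."""
    # Hypothetical helper for illustration; not part of this repository.
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed at all
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Inference device: {pick_device()}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, the CPU-only PyTorch build was most likely installed instead of the CUDA one.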
Thanks to the following contributors and projects:
Planned features (roadmap):
- FrameSkip: run depth estimation on every 2nd frame and fill in the skipped frames using video frame interpolation (VFI).
- FP16 (half precision) inference. (Done: it is FP16 now :D)
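The FrameSkip idea can be sketched roughly as follows. Note the assumptions: this is not the repository's implementation, `depth_with_frame_skip` and `estimate_depth` are hypothetical names, and a plain average of the two neighbouring depth maps stands in for a real VFI model. The depth values only need to support `+` and `*`, so NumPy arrays, PyTorch tensors, or plain floats all work.

```python
def depth_with_frame_skip(frames, estimate_depth):
    """Estimate depth on even-indexed frames and fill in the odd ones.

    Hypothetical sketch: a simple average of neighbouring depth maps
    is used here in place of a real VFI model.
    """
    n = len(frames)
    depths = [None] * n

    # Pass 1: run the expensive depth model only on frames 0, 2, 4, ...
    for i in range(0, n, 2):
        depths[i] = estimate_depth(frames[i])

    # Pass 2: fill each skipped frame from its already computed neighbours.
    for i in range(1, n, 2):
        if i + 1 < n:
            depths[i] = 0.5 * depths[i - 1] + 0.5 * depths[i + 1]
        else:
            # The last frame has no successor, so reuse the previous depth.
            depths[i] = depths[i - 1]
    return depths
```

With this scheme the model runs on roughly half of the frames, at the cost of one frame of latency, since a skipped frame can only be filled in once the next frame has been processed.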
Place your files in the designated input folder, then run the script from a terminal.
Currently available options:
- -height
- -width
- -half (use CUDA half precision for better performance with close to no quality loss; True or False, default True)
- -nt (number of threads to use; default 1)
- -v (show images; True or False, default True)
Example command to run in the terminal:
python inference.py -video -height 1280 -width 720 -half True -nt 2 -v False
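For readers curious how options like these can be parsed, here is a minimal `argparse` sketch. It is an assumption about the interface, not a copy of `inference.py`: `build_parser` and `str_to_bool` are hypothetical names, `-video` is treated as a simple on/off switch because the example passes it without a value, and `-height` and `-width` have no default because the list above does not state one.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert "True"/"False" style strings into real booleans."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the options listed above.
    parser = argparse.ArgumentParser(description="Video depth estimation")
    parser.add_argument("-video", action="store_true")  # assumed to be a switch
    parser.add_argument("-height", type=int, default=None)
    parser.add_argument("-width", type=int, default=None)
    parser.add_argument("-half", type=str_to_bool, default=True)
    parser.add_argument("-nt", type=int, default=1)
    parser.add_argument("-v", type=str_to_bool, default=True)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

The explicit converter matters because `type=bool` would turn any non-empty string, including `"False"`, into `True`.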
- Note: The images are compressed; keep this in mind when assessing them.
Explore the GitHub repository for detailed information and updates. Your feedback and contributions are greatly appreciated!