Major updates and new features to this project will be listed in this document.
- Added pose estimation with PoseNet and pre-trained models
- Added monocular depth estimation with DepthNet and pre-trained models
- Added support for `cudaMemcpy()` from Python
- Added support for drawing 2D shapes with CUDA (see the sketch below)
- Added initial support for running in Docker containers
- Changed OpenGL behavior to show window on first frame
- Minor bug fixes and improvements
Note: API changes from this update are intended to be backwards-compatible, so previous code should still run.
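Below is a minimal sketch of the new Python additions above, assuming the `jetson.utils` bindings described in the project docs; the image size, coordinates, and colors are arbitrary placeholders, and the exact signatures are assumptions to check against the API reference.

```python
import jetson.utils

# allocate an image in shared CPU/GPU memory (size and format are arbitrary here)
img = jetson.utils.cudaAllocMapped(width=640, height=480, format='rgb8')

# draw some 2D shapes directly into the image with CUDA
jetson.utils.cudaDrawCircle(img, (320, 240), 50, (255, 0, 0, 200))            # center, radius, color
jetson.utils.cudaDrawRect(img, (100, 100, 250, 200), (0, 255, 0, 120))        # (left, top, right, bottom), color
jetson.utils.cudaDrawLine(img, (25, 25), (325, 325), (255, 0, 200, 200), 10)  # start, end, color, line width

# copy the image into another buffer from Python
copy = jetson.utils.cudaAllocMapped(width=img.width, height=img.height, format=img.format)
jetson.utils.cudaMemcpy(copy, img)    # dst, src

# the one-argument form allocates the destination and returns the copy
clone = jetson.utils.cudaMemcpy(img)
```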
- Re-training SSD-Mobilenet Object Detection tutorial with PyTorch
- Support for collection of object detection datasets and bounding-box labeling in the `camera-capture` tool
- New `videoSource` and `videoOutput` APIs for C++/Python that support multiple types of video streams (see the sketch after this list)
- Unified the `-console` and `-camera` samples to process both images and video streams
- Support for `uchar3/uchar4/float3/float4` images (default is now `uchar3` as opposed to `float4`)
- Replaced opaque Python memory capsule with `jetson.utils.cudaImage` object (see Image Capsules in Python for more info)
- Images are now subscriptable/indexable from Python to directly access the pixel data
- Numpy ndarray conversion now supports `uchar3/uchar4/float3/float4` formats
- Added `cudaConvertColor()` automated colorspace conversion function (RGB, BGR, YUV, Bayer, grayscale, etc.)
- Python CUDA bindings for `cudaResize()`, `cudaCrop()`, `cudaNormalize()`, `cudaOverlay()`
- See Image Manipulation with CUDA and `cuda-examples.py` for examples of using these
- Transitioned to using Python3 by default since Python 2.7 is now past EOL
- DIGITS tutorial is now marked as deprecated (replaced by PyTorch transfer learning tutorial)
- Logging can now be controlled/disabled from the command line (e.g. `--log-level=verbose`)
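A minimal sketch of how the new streaming and CUDA image APIs fit together, assuming a MIPI CSI camera at `csi://0` and an attached display; the stream URIs, formats, and sizes are examples only, so treat the calls as illustrative rather than definitive.

```python
import jetson.utils

# open the input and output streams (cameras, video files, and RTP/RTSP are also supported)
input = jetson.utils.videoSource("csi://0")
output = jetson.utils.videoOutput("display://0")

while output.IsStreaming():
    img = input.Capture()    # returns a jetson.utils.cudaImage (uchar3/rgb8 by default)

    # colorspace conversion to grayscale with cudaConvertColor()
    gray = jetson.utils.cudaAllocMapped(width=img.width, height=img.height, format='gray8')
    jetson.utils.cudaConvertColor(img, gray)

    # downsample with cudaResize() (buffers allocated per-frame here only for brevity)
    small = jetson.utils.cudaAllocMapped(width=img.width // 2, height=img.height // 2, format=img.format)
    jetson.utils.cudaResize(img, small)

    pixel = img[0, 0]                       # cudaImage objects are directly indexable
    array = jetson.utils.cudaToNumpy(img)   # and map to numpy ndarrays

    output.Render(img)
```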
Thanks to everyone from the forums and GitHub who helped to test these updates in advance!
- Added new pre-trained FCN-ResNet18 semantic segmentation models:
Dataset | Resolution | CLI Argument | Accuracy | Jetson Nano | Jetson Xavier |
---|---|---|---|---|---|
Cityscapes | 512x256 | `fcn-resnet18-cityscapes-512x256` | 83.3% | 48 FPS | 480 FPS |
Cityscapes | 1024x512 | `fcn-resnet18-cityscapes-1024x512` | 87.3% | 12 FPS | 175 FPS |
Cityscapes | 2048x1024 | `fcn-resnet18-cityscapes-2048x1024` | 89.6% | 3 FPS | 47 FPS |
DeepScene | 576x320 | `fcn-resnet18-deepscene-576x320` | 96.4% | 26 FPS | 360 FPS |
DeepScene | 864x480 | `fcn-resnet18-deepscene-864x480` | 96.9% | 14 FPS | 190 FPS |
Multi-Human | 512x320 | `fcn-resnet18-mhp-512x320` | 86.5% | 34 FPS | 370 FPS |
Multi-Human | 640x360 | `fcn-resnet18-mhp-640x360` | 87.1% | 23 FPS | 325 FPS |
Pascal VOC | 320x320 | `fcn-resnet18-voc-320x320` | 85.9% | 45 FPS | 508 FPS |
Pascal VOC | 512x320 | `fcn-resnet18-voc-512x320` | 88.5% | 34 FPS | 375 FPS |
SUN RGB-D | 512x400 | `fcn-resnet18-sun-512x400` | 64.3% | 28 FPS | 340 FPS |
SUN RGB-D | 640x512 | `fcn-resnet18-sun-640x512` | 65.1% | 17 FPS | 224 FPS |
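The CLI arguments above also serve as network names for the Python API; below is a rough sketch under that assumption, where the filenames are placeholders and the segNet calls are a sketch rather than the canonical sample.

```python
import jetson.inference
import jetson.utils

# load one of the pre-trained segmentation models by its CLI argument
net = jetson.inference.segNet("fcn-resnet18-cityscapes-512x256")

# segment an image and produce the class overlay
img = jetson.utils.loadImage("street.jpg")     # placeholder filename
net.Process(img)

overlay = jetson.utils.cudaAllocMapped(width=img.width, height=img.height, format=img.format)
net.Overlay(overlay, filter_mode="linear")

jetson.utils.saveImage("street_overlay.jpg", overlay)
```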
- Python API support for imageNet, detectNet, and camera/display utilities (see the sketch after this list)
- Python examples for processing static images and live camera streaming
- Support for interacting with numpy ndarrays from CUDA
- Onboard re-training of ResNet-18 models with PyTorch
- Example datasets: 800MB Cat/Dog and 1.5GB PlantCLEF
- Camera-based tool for collecting and labeling custom datasets
- Text UI tool for selecting/downloading pre-trained models
- New pre-trained image classification models (on 1000-class ImageNet ILSVRC)
- ResNet-18, ResNet-50, ResNet-101, ResNet-152
- VGG-16, VGG-19
- Inception-v4
- New pre-trained object detection models (on 90-class MS-COCO)
- SSD-Mobilenet-v1
- SSD-Mobilenet-v2
- SSD-Inception-v2
- API Reference documentation for C++ and Python
- Command line usage info for all examples, run with `--help`
- Output of network profiler times, including pre/post-processing
- Improved font rasterization using system TTF fonts
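A rough sketch of the Python API mentioned above, using network names from the model lists; the image filename is a placeholder, and the API reference should be checked for the precise signatures.

```python
import jetson.inference
import jetson.utils

img = jetson.utils.loadImage("my_image.jpg")   # placeholder filename

# classify the image with one of the pre-trained ImageNet models
net = jetson.inference.imageNet("resnet-18")
class_idx, confidence = net.Classify(img)
print("image is '{:s}' ({:.2f}%)".format(net.GetClassDesc(class_idx), confidence * 100))

# detect objects with one of the pre-trained MS-COCO models
detector = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
for d in detector.Detect(img):
    print("{:s} at ({:.0f}, {:.0f}) {:.1f}%".format(detector.GetClassDesc(d.ClassID), d.Left, d.Top, d.Confidence * 100))
```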