Roadmap #4

voldemortX · 2021-02-21T13:34:11Z

Roadmap for our users (to state feature requests) and contributors.

\* marks low-priority tasks.

2022Q? (2022.4.2 - ):

Lane detection methods
- *ConvNeXt
- Add and test VGG16-RESA
- Explore ERFNet-RESA create model erfnet_resa #74
- Add and test ERFNet-LSTR
- LaneATT laneatt #90 LaneATT TensorRT support #102
- Explore CondLaneNet
- *Explore FOLOLanes
Semantic segmentation methods
- *MobileNetV2
- *MobileNetV3
- *RepVGG
- *Swin Transformer V1
- *ConvNeXt
- *Add ERFNet training from scratch
Datasets and pre-processings
- *Support CurveLanes
- *Support Comma10K
Visualization
- Comparison with GT in lane detection [BC-Break] Migrate private repo GT & curve visualizations #72
Framework
- *ONNX inference support
- *TensorRT inference support
- *SCNN TensorRT conversion
- Semi-real profiling benchmark
- *Get rid of mmcv dependency How can we Inference the BezierLaneNet model on a custom Video? #65 Deformable #94 Trying to convert bezierlanenet to onnx #97
Documentation
- More advanced tutorials [BC-Break] Migrate private repo GT & curve visualizations #72
- Per-model docs

2022Q1 (2022.1.10 - 2022.3.31):

Lane detection methods
- BezierLaneNet [BC-Break] BézierLaneNet #60
- MobileNetV2 MobileNets #53
- MobileNetV3 MobileNets #53
- RepVGGs Rc repvgg #54
- Swin Transformer V1 swin #56
- *ConvNeXt
- Add and test VGG16-RESA
- Explore ERFNet-RESA
- Add and test ERFNet-LSTR
- LaneATT
- Explore CondLaneNet
- Explore FOLOLanes
Semantic segmentation methods
- *MobileNetV2
- *MobileNetV3
- *RepVGG
- *Swin Transformer V1
- *ConvNeXt
- *Add ERFNet training from scratch
Datasets and pre-processings
- *Support CurveLanes
- Cherry-pick keypoint affine transform from private branch
- Support Comma10K
Visualization
- Comparison with GT in lane detection
Framework
- *Merge private branch
- *ONNX inference support
- *TensorRT inference support
- *SCNN TensorRT conversion
- Semi-real profiling benchmark
- *Get rid of mmcv dependency How can we Inference the BezierLaneNet model on a custom Video? #65
Documentation
- More advanced tutorials
- Per-model docs

2021Q4 (2021.10.1 - 2021.12.31):

Lane detection methods
- Add and test VGG16-RESA (moved from Q3)
- Test RESA (ResNets) (moved from Q3) RESA implementation follow-up #27 RESA TuSimple results #31
- Explore ERFNet-RESA (moved from Q3)
- Add and test ResNet34-LSTR (moved from Q3) LSTR resnet34 CULane #29
- Add and test ERFNet-LSTR (moved from Q3)
- Explore CondLaneNet
- Explore FOLOLanes
Semantic segmentation methods
- *Add ERFNet training from scratch (moved from Q3)
Datasets and pre-processings
- *Support CurveLanes (moved from Q3)
- Cherry-pick keypoint affine transform from private branch
- Support Comma10K
Visualization
- Comparison with GT in lane detection
- *More inference-free visualizations (folder, etc.) visualize the folder must use provided model？condlanenet？ #48 [BC-Break] The Great Refactor with config files #45
Framework
- Refactor with configs [BC-Break] The Great Refactor with config files #45
- *Merge private branch
- requirements.txt Sugggestion: Missing requirements.txt file. #37
  ~~*Replace thop with fvcore~~
- *Torch -> ONNX Torch to ONNX conversion #43
- *ONNX -> TensorRT To TensorRT #47
- *ONNX inference support
- *TensorRT inference support
- *SCNN TensorRT conversion

2021Q3 (2021.7.1 - 2021.9.30):

Datasets and pre-processings
- *Investigate a possible shared memory leak from padding mask, Python native List or transforms in keypoint datasets
Lane detection methods
- Add RESA (ResNets) Implement RESA #22
- Add and test VGG16-RESA
- Test RESA (ResNets)
- Explore ERFNet-RESA
- Add and test ResNet34-LSTR
- Add and test ERFNet-LSTR
- Test LSTR with simple data augmentation on TuSimple 844ebd7
- Test LSTR on CULane 0c5bcc5
- Test ResNet34, ERFNet with strong data augmentation on TuSimple 721fc26
Semantic segmentation methods
- *Add ERFNet training from scratch
Datasets and pre-processings
- *Support BDD100K
- *Support CurveLanes
Visualization (demo/inference code? #7)
- Support demo with video input for lane detection Inference & Visualization for lane detection #24
- Support demo with video input for semantic segmentation Segmentation demo and inference on videos and dir of images #23
- Support demo with image dir input for lane detection Inference & Visualization for lane detection #24
- Support demo with image dir input for semantic segmentation Segmentation demo and inference on videos and dir of images #23
- *Support demo with camera input for lane detection
- *Support demo with camera input for semantic segmentation
Framework
- Support multi-GPU training with Torch DDP 6a31436
- Support lower PyTorch/CUDA/CuDNN versions Support lower PyTorch & CuDNN versions #25
- Support PyTorch cross-version loading solution

2021Q2 (2021.4.1 - 2021.6.30):

Lane detection methods
- Add RESA (VGG16, ResNets)
- Explore ERFNet-RESA
- Add ResNet18-LSTR LSTR implementation (keep track of all the refactors) #11 57c7acc [BC-Breaking] TuSimple max lane number testing constrain bug fix #13 HorizontalFlip for keypoints #18 RandomCrop and related keypoint transforms #19 Strong augmentation for lane detection #20 c260e6a
- Add ResNet34-LSTR/ERFNet-LSTR
  ~~Add ERFNet-PRNet~~ Awaiting more info
  *Add ENet-SAD Unable to re-implement
  *Explore ERFNet-SAD Unable to re-implement
Semantic segmentation methods
- *Add ERFNet training from scratch
Datasets and pre-processings
- *Support general affine transforms for keypoints
- *Support BDD100K
- Support LLAMAS llamas #15
Visualization (demo/inference code? #7)
- *Support demo with video input for lane detection
- *Support demo with video input for semantic segmentation
- *Support demo with camera input for lane detection
- *Support demo with camera input for semantic segmentation
Benchmark
- Investigate the "ENet slower than ERFNet" problem
- Count fps/flops/mem for transformer-based method LSTR 82d2c5d
Documentation
- *Explanations/Descriptions for re-implemented methods, especially the improved parts
Framework
- *Support multi-GPU training

2021Q1 (-2021.3.31):

Lane detection methods
- Add ResNet backbones cea2ce8
  ~~Add RESA (VGG16, ResNets)~~
  ~~Explore ERFNet-RESA~~
  ~~Add ResNet18-LSTR~~
  ~~Try to add an LSTR that is directly comparable with other methods, i.e. on a common backbone~~
  ~~Add ERFNet-PRNet~~
- Add ENet Baseline ed3f739
  ~~Add ENet-SAD~~
  *Explore ERFNet-SAD
Semantic segmentation methods
- Add and test ENet on Cityscapes ee3444b
  *Add ERFNet training from scratch
- Add --workers option 89d3695
Datasets and pre-processings
- Support TuSimple and CULane loading as keypoints & keypoint transforms (Rotation, Resize) Keep track of keypoint transforms and keypoint data loading #5, 839d096
  ~~Support BDD100K~~
  *Support LLAMAS
Visualization
- Add lane markers visualization toolkit for images 6b29f02 (only supports visualization from files)
- Organize segmentation result visualization toolkit for images 5ed43d9
Benchmark
- *Explore FPS tests 93b2a21
- *Try to provide FLOPs and memory counts for implemented methods 93b2a21
Documentation
- Better guide for downloading and preparing datasets (partly addressed by Add dataset docs and refactor #8)
- *Guide for visualization toolkits c680c90


voldemortX · 2021-03-31T13:17:14Z

The maintainers have rather low bandwidth these days; about half of the features planned for Q1 remain unfinished and have been pushed to Q2. Any help would be much appreciated!

SikandAlex · 2021-11-12T07:22:26Z

I am interested in helping to support this library. I am in a position to add support for comma10k as well as GAN-based weather augmentations soon. I have previously trained this on BDD100K: https://github.com/hustvl/YOLOP

I really do not recommend this architecture: despite seeming very attractive, being very flexible in training, and learning all the tasks well, I could not convert it to TensorRT. I opened issues both there and in the tensorrtx repository, but the HUST students did not help me at all, and many files were missing from their repository.

See here:

hustvl/YOLOP#12
wang-xinyu/tensorrtx#793

I would also like to move these two papers into the Roadmap:

https://arxiv.org/abs/2105.05003 CondLaneNet
https://arxiv.org/abs/2105.13680 FOLOLane

I am an MS in AI candidate at Boston University (taking a semester off because the courses are useless). I see that you are a student at SJTU, @voldemortX. I am a huge fan of your university and have read many papers from there. I regard them as the #1 world leaders in many computer vision application fields, particularly surveillance. I would enjoy working with you.

Right now I have trained DDRNet for real-time semantic segmentation on the comma10k dataset. I think comma10k is hugely useful for the community because it is fully permissive, so we can augment it with the labels it is missing, new formats, etc. I will update when I can submit some code. I will release it slightly after I build it into my pipeline, to give my company a slight edge before I make it open-source. I do not have much experience with pull requests but I will do my best.

voldemortX · 2021-11-12T07:37:14Z

@SikandAlex It will be an honor to have your help as well!

> I am interested in helping to support this library. I am in a position to add support for comma10k as well as GAN-based weather augmentations soon. I have previously trained this on BDD100K: https://github.com/hustvl/YOLOP

Any new dataset support is welcome!

> I would also like to move these two papers into the Roadmap:
>
> https://arxiv.org/abs/2105.05003 CondLaneNet
> https://arxiv.org/abs/2105.13680 FOLOLane

CondLaneNet is open-sourced and could be easier to implement, while FOLOLane might prove a harder method that needs more work, since we do not yet have one of its backbones (BiSeNet).

I'll add them to the Roadmap for Q4 and they can of course continue into 22. You can submit PRs whenever you have a ready-to-go piece of code (e.g. you have implemented one of the dataset classes and tested its loading, or finished an algorithm).
Thanks again for your help!

voldemortX · 2021-11-12T07:46:20Z

TensorRT support is also what @cedricgsh and I have talked about recently. We too agree that pytorch-auto-drive should not stop at being a research codebase. Our primary aim would be a TensorRT benchmark for model speed and op-based FLOPs calculation with fvcore. But given our current bandwidth, I think that would need to wait until at least 22Q1 (a refactor of the framework might be required).
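As a rough illustration of what a model-speed benchmark involves, here is a minimal sketch using only the Python standard library. The function name `benchmark_fps` and the zero-argument `infer` callable are hypothetical and not part of pytorch-auto-drive. A real GPU benchmark would also need the callable to wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch) before the timer stops.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark_fps(infer: Callable[[], object],
                  warmup: int = 10,
                  runs: int = 50) -> Dict[str, float]:
    """Time repeated calls to `infer` and report latency and FPS.

    `infer` is a hypothetical zero-argument callable wrapping one forward
    pass. For a GPU model it should block until the result is ready,
    otherwise only the kernel launch time is measured.
    """
    if warmup < 0 or runs <= 0:
        raise ValueError("warmup must be >= 0 and runs must be > 0")

    # Warm-up calls are discarded: they absorb lazy initialization and
    # cache fills that would skew the first timings.
    for _ in range(warmup):
        infer()

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        latencies.append(time.perf_counter() - start)

    mean_latency = statistics.fmean(latencies)
    return {
        "mean_ms": mean_latency * 1e3,
        "median_ms": statistics.median(latencies) * 1e3,
        # Guard against a zero reading on very coarse timers.
        "fps": 1.0 / mean_latency if mean_latency > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real model call.
    print(benchmark_fps(lambda: sum(i * i for i in range(10_000))))
```

Reporting the median next to the mean makes occasional slow iterations (for example from background processes) easier to spot.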

SikandAlex · 2021-11-12T08:29:49Z

I have an AGX Xavier on hand, so I will hopefully be able to provide benchmarks for certain models. Unfortunately I'm no expert on TensorRT custom layers etc., and it seems many developers are unable to get some operations supported.

There are so many papers that claim a certain FPS when deployed to embedded GPUs, but they never release their code, or they release only testing code and no training code, or the results are not reproducible even when there is training code, so really it is a huge mess. I have spent the past 2 months trying to determine the best papers and best approach as of Fall 2021 given my computational limitations.

After reading as many papers as I can over the past 3 months, driving around in my friend's Tesla, seeing many other models (amazing AI research from Asia is just destroying us here in US in my opinion), there are not that many critical components to Level 2 highway autonomy (which should be the first step).

Object Detection for Vehicles/Pedestrians

YOLOX by Megvii (won AI City Challenge for Streaming Perception for Autonomous Driving 2021, lot of hype, good for edge)
YOLOV5 by Ultralytics (great support and community, I have extensive experience)
Yolo-FastestV2 https://github.com/dog-qiuqiu/Yolo-FastestV2 (super lightweight variant)

Segmentation of Driveable Surface / Road

After much research I narrowed down the candidate models to the following:

SFNet
DDRNet (I successfully trained a model that runs inference at 100+ FPS on an RTX 2080 Ti)
DF-Seg
AttaNet
STDC

Lane Detection (Segmentation with Post-Processing OR Polynomial/Keypoint/Row-Wise/Transformer/Non-Segmentation etc)

I have found that real-time segmentation approaches in regard to lane markings need to incorporate high-resolution features and not resize the input image before doing anything as many networks do. This is because the semantics of the lane lines beyond a certain distance from the vehicle are lost at lower resolution and then you can't predict the path far enough in advance. I also do not know how to post-process the segmentation based approach properly as I have not yet experimented with DBSCAN or RANSAC. From all my readings, this shows the most potential to me but it seems to require the 4-lane probability output rather than just the binary mask for lane segmentation I am currently producing:

https://github.com/czming/RONELD-Lane-Detection
https://arxiv.org/abs/2010.09548
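For readers who have not used RANSAC as a post-processing step, the following is a minimal sketch of the idea, assuming the pixels of one lane have already been extracted from the segmentation mask as (x, y) coordinates. It fits a straight line x = a * y + b, which is a simplification of real lane shapes, and it is not the method used by RONELD or by this repository. The function name `ransac_lane_line` and its parameters are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates


def ransac_lane_line(points: Sequence[Point],
                     iterations: int = 200,
                     tol: float = 2.0,
                     seed: Optional[int] = 0
                     ) -> Tuple[float, float, List[Point]]:
    """Fit x = a * y + b to lane pixels while ignoring outliers.

    x is modeled as a function of y because lanes are closer to
    vertical than horizontal in a front-facing camera image.
    Returns (a, b, inliers).
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    pts = list(points)
    best_inliers: List[Point] = []

    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(pts, 2)
        if y1 == y2:
            continue  # this pair cannot define x as a function of y
        a = (x2 - x1) / (y2 - y1)
        b = x1 - a * y1
        # Inliers are points whose horizontal residual is within `tol` pixels.
        inliers = [(x, y) for x, y in pts if abs(x - (a * y + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers

    if len(best_inliers) < 2:
        raise ValueError("no valid line model found")

    # Refine the best candidate with least squares over its inliers.
    n = len(best_inliers)
    mean_x = sum(x for x, _ in best_inliers) / n
    mean_y = sum(y for _, y in best_inliers) / n
    var_y = sum((y - mean_y) ** 2 for _, y in best_inliers)
    if var_y == 0:
        raise ValueError("inliers share a single y value")
    a = sum((y - mean_y) * (x - mean_x) for x, y in best_inliers) / var_y
    b = mean_x - a * mean_y
    return a, b, best_inliers
```

A multi-lane pipeline would first need to group pixels per lane, which is where a per-lane probability output or a clustering step such as DBSCAN would come in. Curved lanes would need a higher-order model than a line.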

When I tested the models at https://github.com/Turoad/lanedet#Benchmark-and-model-zoo
the CondLaneNet model performed the best, which is why I recommended that it be placed on the roadmap, although it uses a heavy ResNet-101 backbone.

Depth (Monocular/Stereo, ideally monocular)

PyDNet
MobileStereoNet
MiDaS
MonoDepth(2/Wavelet)
FastDepth
LapDepth
HITNet

3D Object Detection (Maybe can replace Depth)

FCOS3D
FCOS3D++/PGD
DD3D

Bird's Eye View / Top View / Projection Transform

Finally, this is the best GitHub project that I've been able to find related to self-driving. All the models seem to come from OpenVINO / PINO model library.

https://github.com/iwatake2222/self-driving-ish_computer_vision_system

I'm not sure what this model is trained on, or what architecture, but it seems to work well?
https://docs.openvino.ai/2018_R5/_docs_Transportation_segmentation_curbs_release1_caffe_desc_road_segmentation_adas_0001.html

On the control side of things, an MPC solver is the solution to use for lateral/longitudinal control.

This is the extensive information I have been able to collect. I automatically discarded models that didn't have code implementations, but I think I could have made a mistake here or there. I was going to keep all this information to myself, but all I do is code non-stop all day and have no life, and I still only make slow progress. My friends just work at companies like Facebook and Google and don't want to work on anything exciting. It is impossible to get them to give up stock to work with me, and there is also a learning curve. So hopefully by giving back to the open-source community, they will give back to me and we can all improve science and also make money.

voldemortX · 2021-11-12T09:06:42Z

Well my knowledge about self-driving is kind of only in the research stage for now. And I certainly learned a lot from your comments. Though I'm still skeptical about deep learning's performance in actual applications of self-driving, especially on RGB inputs. So now what I do in SenseTime is more about human-centric applications.

Btw, we'll release our own lane detector later, perhaps in early 22 at the latest. It achieves reasonable end-to-end performance (~75 on CULane, ~95 on LLAMAS) at 150 FPS in PyTorch with a small model, which we believe could be beneficial to applications. But it needs to remain in a private repo for now.

voldemortX · 2021-12-05T15:48:31Z

@SikandAlex We have now pushed initial support for ONNX and TensorRT conversion; maybe it can be helpful for your applications? Refer to DEPLOY.md.


voldemortX added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Feb 21, 2021

voldemortX pinned this issue Feb 21, 2021

voldemortX linked a pull request Mar 10, 2021 that will close this issue

Keep track of keypoint transforms and keypoint data loading #5

Merged

5 tasks

voldemortX mentioned this issue Mar 27, 2021

Questions about LSTR's performance. liuruijin17/LSTR#19

Closed

voldemortX mentioned this issue Apr 15, 2021

PRNet-related implementation (PRNet相关实现) #12

Open

voldemortX linked a pull request May 8, 2021 that will close this issue

llamas #15

Merged

cedricgsh closed this as completed in #15 May 10, 2021

voldemortX reopened this May 10, 2021

cedricgsh pushed a commit that referenced this issue Dec 15, 2021

New architecture with Hungarian (#4)

68b4c5d

* FCOS-like architecture * loss framework * debug * stash * loss complete * hooks * bug fix * bezier testing fix * debug
