Roadmap #4

Open
voldemortX opened this issue Feb 21, 2021 · 7 comments · Fixed by #5 or #15
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@voldemortX
Owner

voldemortX commented Feb 21, 2021

Roadmap for our users (to state feature requests) and contributors.

* marks low-priority tasks.

2022Q? (2022.4.2 - ):

2022Q1 (2022.1.10 - 2022.3.31):

  • Lane detection methods
  • Semantic segmentation methods
    • *MobileNetV2
    • *MobileNetV3
    • *RepVGG
    • *Swin Transformer V1
    • *ConvNeXt
    • *Add ERFNet training from scratch
  • Datasets and pre-processings
    • *Support CurveLanes
    • Cherry-pick keypoint affine transform from private branch
    • Support Comma10K
  • Visualization
    • Comparison with GT in lane detection
  • Framework
  • Documentation
    • More advanced tutorials
    • Per-model docs

2021Q4 (2021.10.1 - 2021.12.31):

2021Q3 (2021.7.1 - 2021.9.30):

2021Q2 (2021.4.1 - 2021.6.30):

2021Q1 (-2021.3.31):

  • Lane detection methods
    • Add ResNet backbones cea2ce8
    • Add RESA (VGG16, ResNets)
    • Explore ERFNet-RESA
    • Add ResNet18-LSTR
    • Try adding an LSTR that is directly comparable with other methods, i.e. on a common backbone
    • Add ERFNet-PRNet
    • Add ENet Baseline ed3f739
    • Add ENet-SAD
    • *Explore ERFNet-SAD
  • Semantic segmentation methods
    • Add and test ENet on Cityscapes ee3444b
    • *Add ERFNet training from scratch
    • Add --workers option 89d3695
  • Datasets and pre-processings
  • Visualization
    • Add lane markers visualization toolkit for images 6b29f02 (only supports visualization from files)
    • Organize segmentation result visualization toolkit for images 5ed43d9
  • Benchmark
    • *Explore FPS tests 93b2a21
    • *Try to provide FLOPs and memory counts for implemented methods 93b2a21
  • Documentation
@voldemortX voldemortX added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Feb 21, 2021
@voldemortX voldemortX pinned this issue Feb 21, 2021
@voldemortX voldemortX linked a pull request Mar 10, 2021 that will close this issue
5 tasks
@voldemortX
Owner Author

The maintainers have rather low bandwidth these days, so about half of the features planned for Q1 remain unfinished and have been pushed to Q2. Any help would be much appreciated!

@voldemortX voldemortX linked a pull request May 8, 2021 that will close this issue
@voldemortX voldemortX reopened this May 10, 2021
@SikandAlex

SikandAlex commented Nov 12, 2021

I am interested in helping to support this library. I am in a position to add support for comma10k as well as GAN-based weather augmentations soon. I have previously trained this on BDD100K: https://github.com/hustvl/YOLOP

I really do not recommend this architecture because, despite seeming very attractive, being very flexible in training, and learning all the tasks well, I could not convert it to TensorRT. I opened issues both there and in the tensorrtx repository, but the HUST students did not help me at all, and many files were missing from their repository.

See here:

hustvl/YOLOP#12
wang-xinyu/tensorrtx#793

I would also like to move these two papers into the Roadmap:

https://arxiv.org/abs/2105.05003 CondLaneNet
https://arxiv.org/abs/2105.13680 FOLOLane

I am an MS in AI candidate at Boston University (taking a semester off because courses are useless). I see that you are a student at SJTU, @voldemortX. I am a huge fan of your university and have read many papers from there. I regard them as the #1 world leaders in many computer vision application fields, particularly surveillance. I would enjoy working with you.

Right now I have trained DDRNet for real-time semantic segmentation on the comma10k dataset. I think the comma10k dataset is hugely useful to the community because it is fully permissive, so we can augment it with the labels it is missing, new formats, etc. I will update when I can submit some code; I will release it slightly after I build it into my pipeline, to give my company a slight edge before I make it open source. I do not have much experience with pull requests but I will do my best.

@voldemortX
Owner Author

@SikandAlex It will be an honor to have your help as well!

> I am interested in helping to support this library. I am in a position to add support for comma10k as well as GAN-based weather augmentations soon. I have previously trained this on BDD100K: https://github.com/hustvl/YOLOP

Any new dataset support is welcome!

> I would also like to move these two papers into the Roadmap:
>
> https://arxiv.org/abs/2105.05003 CondLaneNet
> https://arxiv.org/abs/2105.13680 FOLOLane

CondLaneNet is open-sourced and could be easier to implement, while FOLOLane might prove a harder method that needs more work, since we do not yet have one of its backbones (BiSeNet).

I'll add them to the Roadmap for Q4, and they can of course continue into 2022. You can submit PRs whenever you have a ready-to-go batch of code (e.g. you have implemented one of the dataset classes and tested its loading, or finished an algorithm).
Thanks again for your help!

@voldemortX
Owner Author

TensorRT support is also what @cedricgsh and I have talked about recently. We too agree that pytorch-auto-drive should not stop at being a research codebase. Our primary aim would be a TensorRT benchmark for model speed and op-based FLOPs calculation from fvcore. But given our current bandwidth, I think that would need to wait until 22Q1 at least (a refactor of the framework might be required).
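
As a rough illustration of the model speed benchmark mentioned above, here is a minimal, framework-agnostic sketch of an FPS measurement with a warm-up phase. It is not the project's benchmark code; `measure_fps` and `dummy_forward` are hypothetical names, and the dummy workload only stands in for a real forward pass.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], object], warmup: int = 10, iters: int = 100) -> float:
    """Average calls per second of `run_once`, excluding warm-up calls."""
    if iters <= 0:
        raise ValueError("iters must be positive")
    # Warm-up calls are not timed, so one-off costs (caches, lazy
    # initialization) do not distort the average.
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return iters / elapsed if elapsed > 0 else float("inf")


def dummy_forward() -> int:
    # Hypothetical stand-in for one model forward pass.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(f"{measure_fps(dummy_forward):.1f} FPS")
```

For a GPU model, `run_once` would also need to wait for the device to finish (in PyTorch, by calling `torch.cuda.synchronize()`), otherwise asynchronous kernel launches make the timing look better than it is.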

@SikandAlex

SikandAlex commented Nov 12, 2021

I have an AGX Xavier on hand, so I will hopefully be able to provide some benchmarks on it for certain models. Unfortunately I'm no expert at TensorRT custom layers etc., and it seems many developers are unable to get some operations supported.

There are so many papers that claim a certain FPS when deployed to embedded GPUs but never release their code, or release only testing code and no training code, or have results that are not reproducible even with the training code, so really it is a huge mess. I have spent the past 2 months trying to determine the best papers and best approach as of Fall 2021 given my computational limitations.

After reading as many papers as I can over the past 3 months, driving around in my friend's Tesla, and seeing many other models (amazing AI research from Asia is just destroying us here in the US, in my opinion), I think there are not that many critical components to Level 2 highway autonomy (which should be the first step).

  1. Object Detection for Vehicles/Pedestrians
  • YOLOX by Megvii (won AI City Challenge for Streaming Perception for Autonomous Driving 2021, lot of hype, good for edge)
  • YOLOV5 by Ultralytics (great support and community, I have extensive experience)
  • Yolo-FastestV2 https://github.com/dog-qiuqiu/Yolo-FastestV2 (super lightweight variant)
  2. Segmentation of Driveable Surface / Road

After much research I narrowed down the candidate models to the following:

  • SFNet
  • DDRNet (I successfully trained a model that runs inference at 100+ FPS on an RTX 2080 Ti)
  • DF-Seg
  • AttaNet
  • STDC
  3. Lane Detection (Segmentation with Post-Processing OR Polynomial/Keypoint/Row-Wise/Transformer/Non-Segmentation etc)

I have found that real-time segmentation approaches for lane markings need to incorporate high-resolution features and should not resize the input image before doing anything else, as many networks do. This is because the semantics of the lane lines beyond a certain distance from the vehicle are lost at lower resolution, and then you can't predict the path far enough in advance. I also do not know how to post-process the segmentation-based approach properly, as I have not yet experimented with DBSCAN or RANSAC. From all my readings, this shows the most potential to me, but it seems to require the 4-lane probability output rather than just the binary mask for lane segmentation I am currently producing:

https://github.com/czming/RONELD-Lane-Detection
https://arxiv.org/abs/2010.09548
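
To make the RANSAC option mentioned above more concrete, here is a minimal sketch that fits one straight lane line to pixel coordinates taken from a binary mask. It is only an illustration under stated assumptions, not the RONELD method or anything tested in this thread: the straight-line model `x = a * y + b` is an assumption (real lanes curve and may need a polynomial), and `ransac_line` and its parameters are hypothetical names.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates of a lane pixel


def ransac_line(points: List[Point], iters: int = 200, tol: float = 2.0,
                seed: int = 0) -> Tuple[Tuple[float, float], List[Point]]:
    """Fit x = a * y + b with RANSAC; return ((a, b), inliers)."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    best_inliers: List[Point] = []
    for _ in range(iters):
        # Hypothesize a line from two random points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if y1 == y2:
            continue  # cannot express a horizontal pair as x = a * y + b
        a = (x2 - x1) / (y2 - y1)
        b = x1 - a * y1
        # Points within `tol` pixels (measured along x) count as inliers.
        inliers = [(x, y) for x, y in points if abs(x - (a * y + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if not best_inliers:
        raise ValueError("no valid line hypothesis found")
    # Refit on all inliers with ordinary least squares.
    n = len(best_inliers)
    mean_x = sum(x for x, _ in best_inliers) / n
    mean_y = sum(y for _, y in best_inliers) / n
    s_yy = sum((y - mean_y) ** 2 for _, y in best_inliers)
    s_xy = sum((y - mean_y) * (x - mean_x) for x, y in best_inliers)
    a = s_xy / s_yy
    b = mean_x - a * mean_y
    return (a, b), best_inliers


if __name__ == "__main__":
    # Synthetic lane pixels on x = 0.5 * y + 100, plus a few stray pixels.
    lane = [(0.5 * y + 100.0, float(y)) for y in range(0, 100, 5)]
    noise = [(300.0, 10.0), (20.0, 50.0), (400.0, 90.0)]
    (a, b), inliers = ransac_line(lane + noise)
    print(a, b, len(inliers))
```

The line is parameterized as x in terms of y because lane markings are usually closer to vertical than horizontal in the image, which keeps the slope finite. With several lanes in one mask, the points would first have to be separated per lane, for example by clustering.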

When I tested the models at https://github.com/Turoad/lanedet#Benchmark-and-model-zoo
the CondLaneNet model performed the best, which is why I recommended it be placed on the Roadmap, although it uses a heavy ResNet-101 backbone.

  4. Depth (Monocular/Stereo, ideally monocular)
  • PyDNet
  • MobileStereoNet
  • MiDaS
  • MonoDepth(2/Wavelet)
  • FastDepth
  • LapDepth
  • HITNet
  5. 3D Object Detection (Maybe can replace Depth)
  • FCOS3D
  • FCOS3D++/PGD
  • DD3D
  6. Bird's Eye View / Top View / Projection Transform

Finally, this is the best GitHub project that I've been able to find related to self-driving. All the models seem to come from the OpenVINO / PINTO model library.

https://github.com/iwatake2222/self-driving-ish_computer_vision_system

I'm not sure what this model is trained on, or what architecture it uses, but it seems to work well?
https://docs.openvino.ai/2018_R5/_docs_Transportation_segmentation_curbs_release1_caffe_desc_road_segmentation_adas_0001.html

On the control side of things, an MPC solver is the solution to use for lateral/longitudinal control.

This is the extensive information I have been able to collect. I automatically discarded models that didn't have code implementations, but I think I could have made a mistake here or there. I was going to keep all this information to myself, but all I do is code non-stop all day and have no life, and I still only make slow progress. My friends just work at companies like Facebook and Google and don't want to work on anything exciting. It is impossible to get them to give up stock to work with me, and there is also a learning curve. So hopefully by giving back to the open-source community, they will give back to me and we can all improve science and also make money.

@voldemortX
Owner Author

voldemortX commented Nov 12, 2021

Well, my knowledge about self-driving is kind of only at the research stage for now, and I certainly learned a lot from your comments. Though I'm still skeptical about deep learning's performance in actual applications of self-driving, especially on RGB inputs. So what I do at SenseTime now is more about human-centric applications.

Btw, we'll release our own lane detector later, perhaps early 22 at the latest. It is end-to-end and achieves reasonable performance (~75 CULane, ~95 LLAMAS) at 150 FPS in PyTorch with a small model, which we believe could be beneficial to applications. But it needs to remain in a private repo for now.

@voldemortX
Owner Author

@SikandAlex We have now pushed initial support for ONNX and TensorRT conversions; maybe it can be helpful to your applications? Refer to DEPLOY.md.

cedricgsh pushed a commit that referenced this issue Dec 15, 2021
* FCOS-like architecture

* loss framework

* debug

* stash

* loss complete

* hooks

* bug fix

* bezier testing fix

* debug