- The experiments are run with PyTorch 1.1, CUDA 10.0, and cuDNN 7.5.
- The training is conducted on 4 V100 GPUs in a DGX server.
- Testing times are measured on a TITAN RTX GPU with batch size 1 (see the timing sketch after this list).
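For reference, single-frame latency can be measured as in the minimal sketch below. This is not the repo's benchmarking code; `model` and `example` are placeholders for a detector and one preprocessed frame.

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, example, warmup=10, iters=100):
    """Time `iters` forward passes at batch size 1 and return frames per second."""
    model.eval().cuda()
    for _ in range(warmup):           # warm up CUDA kernels and caches
        model(example)
    torch.cuda.synchronize()          # drain queued kernels before timing
    start = time.time()
    for _ in range(iters):
        model(example)
    torch.cuda.synchronize()          # wait for the last forward pass to finish
    return iters / (time.time() - start)
```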
We provide training / validation configurations, pretrained models, and prediction files for all models in the paper. To access these pretrained models, please send us an email with your name, institute, a screenshot of the Waymo Open Dataset registration confirmation email, and your intended usage. If we don't get back to you within two days, please send a follow-up email. Please note that the Waymo Open Dataset is under a strict non-commercial license, so we are not allowed to share the models if they will be used for any profit-oriented activities.
Model | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
---|---|---|---|---|---|
VoxelNet | 66.2 | 62.6 | 67.6 | 65.5 | 13 |
In the paper, our models only detect Vehicle and Pedestrian. Here, we provide a three-class config that also enables cyclist detection (and performs similarly). We encourage the community to also report three-class performance in the future.
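For illustration, switching between the two settings amounts to changing the task definition in the config. This is a hedged det3d-style sketch; see the actual config files for the exact field names and values.

```python
# Illustrative task definition; a two-class model simply drops CYCLIST.
tasks = [
    dict(num_class=3, class_names=["VEHICLE", "PEDESTRIAN", "CYCLIST"]),
]
```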
CenterPoint is fast to train and converges in as few as 3~6 epochs. We tried a few training schedules for CenterPoint-Voxel and list their performance below.
Schedule | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | Training Time |
---|---|---|---|---|---|
36 epochs | 66.2 | 62.6 | 67.6 | 65.5 | 84hr |
12 epochs | 65.6 | 61.3 | 67.1 | 64.7 | 28hr |
6 epochs | 65.5 | 59.5 | 66.4 | 63.4 | 14hr |
3 epochs | 61.5 | 56.2 | 64.5 | 60.7 | 7hr |
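Reproducing a shorter schedule should only require changing the epoch count, since the learning-rate schedule is defined relative to total training length. A hedged sketch of the relevant det3d-style config fields (the values here are examples; check the provided configs for the exact settings):

```python
# Illustrative schedule settings; the one-cycle LR schedule spans the whole
# run, so it adapts automatically when total_epochs changes.
total_epochs = 6  # 36 / 12 / 6 / 3 in the table above
lr_config = dict(
    type="one_cycle", lr_max=0.003, moms=[0.95, 0.85], div_factor=10.0, pct_start=0.4
)
```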
By default, we finetune a pretrained one-stage model for 6 epochs. To save GPU memory, we also freeze the backbone weights; a minimal sketch of this step follows.
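This is a hedged PyTorch sketch of loading the one-stage checkpoint and freezing the backbone; the attribute names (`backbone`, `state_dict`) are illustrative rather than the repo's exact API.

```python
import torch

def load_and_freeze(model, ckpt_path="one_stage.pth"):
    """Load one-stage weights, then freeze the backbone (names illustrative)."""
    state = torch.load(ckpt_path, map_location="cpu")["state_dict"]
    model.load_state_dict(state, strict=False)  # second-stage params stay random
    for p in model.backbone.parameters():
        p.requires_grad = False                 # skip gradients and optimizer state
    model.backbone.eval()                       # keep BN statistics fixed
    return model
```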
Model | Split | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
---|---|---|---|---|---|---|
VoxelNet | Val | 67.9 | 65.6 | 68.6 | 67.4 | 13 |
VoxelNet | Test | 71.9 | 67.0 | 68.2 | 69.0 | 13 |
To provide richer input information and to enable more reliable velocity estimation, we transform the Lidar points of the previous frame into the current frame and merge the two sweeps. This two-frame model significantly boosts detection performance.
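A hedged numpy sketch of the merging step, assuming per-frame vehicle poses (4x4 frame-to-global matrices) are available; the extra per-point channel records each point's time lag so the network can distinguish the two sweeps.

```python
import numpy as np

def merge_two_frames(curr_pts, prev_pts, curr_pose, prev_pose, time_lag):
    """curr_pts/prev_pts: (N, 3+C) xyz + features; poses: (4, 4) frame-to-global."""
    rel = np.linalg.inv(curr_pose) @ prev_pose   # previous frame -> current frame
    xyz1 = np.concatenate([prev_pts[:, :3], np.ones((len(prev_pts), 1))], axis=1)
    prev_in_curr = prev_pts.copy()
    prev_in_curr[:, :3] = (xyz1 @ rel.T)[:, :3]
    # tag each point with its time lag: 0 for the current sweep, dt for the previous
    curr = np.concatenate([curr_pts, np.zeros((len(curr_pts), 1))], axis=1)
    prev = np.concatenate([prev_in_curr, np.full((len(prev_pts), 1), time_lag)], axis=1)
    return np.concatenate([curr, prev], axis=0)
```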
Model | Split | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
---|---|---|---|---|---|---|
One-stage | Val | 67.3 | 67.5 | 69.9 | 68.2 | 11 |
Two-stage | Val | 69.7 | 70.3 | 70.9 | 70.3 | 11 |
Two-stage | Test | 73.0 | 71.5 | 71.3 | 71.9 | 11 |
Model | Veh_L2 | Ped_L2 | Cyc_L2 | MAPH | FPS |
---|---|---|---|---|---|
centerpoint_pillar | 65.5 | 55.1 | 60.2 | 60.3 | 19 |
centerpoint_pillar_two_stage | 66.7 | 55.9 | 61.7 | 61.4 | 16 |
For PointPillars, we notice a 1.5 mAPH drop when converting from the two-class model to the three-class model. You can refer to the ONE_STAGE and TWO_STAGE configs to reproduce the two-class results.
For 3D tracking, we apply our center-based tracking on top of the two-frame model's detection results.
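The core of the tracker is greedy closest-center matching: each detection is projected back one frame using its predicted velocity and matched to the nearest previous track center. The sketch below is simplified (no track birth/death handling, a single distance threshold) and the field names are illustrative.

```python
import numpy as np

def greedy_match(dets, tracks, dt, max_dist=1.0):
    """dets/tracks: lists of dicts with 'ct' (x, y center) and 'vel' (vx, vy)."""
    matches, used = [], set()
    for i, det in enumerate(dets):
        # project the detection back by its predicted velocity
        ct_prev = det["ct"] - det["vel"] * dt
        dists = [np.linalg.norm(ct_prev - t["ct"]) if j not in used else np.inf
                 for j, t in enumerate(tracks)]
        if dists and min(dists) < max_dist:
            j = int(np.argmin(dists))
            used.add(j)
            matches.append((i, j))   # detection i continues track j
    return matches
```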
Model | Split | Veh_L2 | Ped_L2 | Cyc_L2 | MOTA | FPS |
---|---|---|---|---|---|---|
centerpoint_voxel_two_sweep | Val | 55.0 | 55.0 | 57.4 | 55.8 | 11 |
centerpoint_voxel_two_sweep | Test | 59.4 | 56.6 | 60.0 | 58.7 | 11 |