Estimation of the visible and hidden traversable space from a single color image – CVPR 2020


Jamie Watson, Michael Firman, Aron Monszpart and Gabriel J. Brostow – CVPR 2020 (Oral presentation)

[Link to Paper]

We introduce Footprints, a method for estimating the visible and hidden traversable space from a single RGB image.

Rig results

Understanding the shape of a scene from a single color image is a formidable computer vision task. Most methods aim to predict the geometry of surfaces that are visible to the camera, which is of limited use when planning paths for robots or augmented reality agents. Models which predict beyond the line of sight often parameterize the scene with voxels or meshes, which can be expensive to use in machine learning frameworks.

Our method predicts the hidden ground geometry and extent from a single image:

Web version of figure 1

Our predictions enable virtual characters to more realistically explore their environment.

  • Baseline: The virtual character can only explore the ground visible to the camera.
  • Ours: The penguin can explore both the visible and hidden ground.

⚙️ Setup

Our code and models were developed with PyTorch 1.3.1. The environment.yml and requirements.txt list our dependencies.

We recommend installing and activating a new conda environment from these files with:

conda env create -f environment.yml -n footprints
conda activate footprints

🖼️ Prediction

We provide three pretrained models:

  • kitti, a model trained on the KITTI driving dataset with a resolution of 192x640,
  • matterport, a model trained on the indoor Matterport dataset with a resolution of 512x640, and
  • handheld, a model trained on our own handheld stereo footage with a resolution of 256x448.

We provide code to make predictions for a single image, or a whole folder of images, using any of these pretrained models. Models will be automatically downloaded when required, and input images will be automatically resized to the correct input resolution for each model.
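As a rough illustration of the automatic resizing described above, the sketch below maps each model name to its input resolution and computes the per-axis scale factors for a given image. It assumes the resolutions are written as height x width and that resizing is a plain stretch to the target size. The names `MODEL_HEIGHT_WIDTH` and `resize_plan` are hypothetical and are not taken from the repository.

```python
# Input resolutions quoted in the model list, read here as (height, width).
# That ordering is an assumption, not something the README states.
MODEL_HEIGHT_WIDTH = {
    "kitti": (192, 640),
    "matterport": (512, 640),
    "handheld": (256, 448),
}


def resize_plan(model_name, image_height, image_width):
    """Return the target (height, width) and per-axis scale factors for a model.

    Hypothetical helper for illustration only.
    """
    if model_name not in MODEL_HEIGHT_WIDTH:
        raise ValueError(f"unknown model: {model_name!r}")
    target_h, target_w = MODEL_HEIGHT_WIDTH[model_name]
    return {
        "target": (target_h, target_w),
        "scale_y": target_h / image_height,
        "scale_x": target_w / image_width,
    }
```

For example, `resize_plan("kitti", 384, 1280)` gives a target of `(192, 640)` with both scale factors equal to `0.5`. When the two factors differ, a plain stretch changes the aspect ratio of the image.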

Single image prediction:

python -m footprints.predict --image test_data/cyclist.jpg --model kitti

Multi image prediction:

python -m footprints.predict --image test_data --model handheld

By default, .npy predictions and .jpg visualisations will be saved to the predictions folder; this can be changed with the --save_dir flag.
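To inspect the saved .npy predictions, something like the following sketch may help. It only assumes that the files can be read with `numpy.load`. The array layout (shape, channels, value range) is not described here, so the code reports it rather than assuming it. The function name `summarize_predictions` is hypothetical.

```python
from pathlib import Path

import numpy as np


def summarize_predictions(save_dir="predictions"):
    """List (file name, array shape, dtype) for every .npy file under save_dir.

    Hypothetical helper: it searches subfolders too, since the exact
    folder layout written by the prediction script is not assumed.
    """
    summaries = []
    for npy_path in sorted(Path(save_dir).rglob("*.npy")):
        prediction = np.load(npy_path)
        summaries.append((npy_path.name, prediction.shape, str(prediction.dtype)))
    return summaries


if __name__ == "__main__":
    for name, shape, dtype in summarize_predictions():
        print(f"{name}: shape={shape}, dtype={dtype}")
```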

Training code is coming soon.

⏳ Evaluation

To evaluate a folder of predictions, run:

python -m footprints.evaluate \
    --datatype kitti \
    --metric iou \
    --predictions path/to/predictions/folder

The following options are provided:

  • --datatype can be either kitti or matterport.
  • --metric can be either iou or depth.
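For readers unfamiliar with the iou metric, here is a minimal sketch of intersection over union between two binary masks. It shows the general definition only and is not necessarily the exact protocol used by footprints.evaluate, which may apply extra masking or averaging. The function name `binary_iou` is hypothetical.

```python
import numpy as np


def binary_iou(pred_mask, gt_mask):
    """Intersection over union of two binary masks with identical shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    if pred.shape != gt.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty. Returning 1.0 is a convention chosen for
        # this sketch, not something taken from the repository.
        return 1.0
    intersection = np.logical_and(pred, gt).sum()
    return float(intersection) / float(union)
```

With `pred = [[1, 1], [0, 0]]` and `gt = [[1, 0], [1, 0]]`, the masks overlap in one pixel and cover three pixels in total, so the IoU is 1/3.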

If necessary, the ground truth files will be automatically downloaded and placed in the ground_truth_files folder.

You can also download the KITTI annotations directly from here. For each image, there are 3 .png files:

  • XXXXX_ground.png contains the mask of the boundary of visible and hidden ground, ignoring all objects
  • XXXXX_objects.png contains the mask of the ground space taken up by objects (the footprints)
  • XXXXX_combined.png contains the full evaluation mask - the visible and hidden ground, taking into account object footprints
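The naming scheme above can be captured in a small helper that builds the three annotation paths for one image. This is a sketch that assumes XXXXX stands for a per-image identifier and that all three files sit in the same folder. The function name `annotation_paths` and both of its parameters are hypothetical.

```python
from pathlib import Path

# The three annotation types listed above.
ANNOTATION_SUFFIXES = ("ground", "objects", "combined")


def annotation_paths(annotation_dir, frame_id):
    """Map each annotation type to its expected .png path for one image.

    Hypothetical helper: frame_id plays the role of the XXXXX placeholder.
    """
    return {
        suffix: Path(annotation_dir) / f"{frame_id}_{suffix}.png"
        for suffix in ANNOTATION_SUFFIXES
    }
```

Calling `annotation_paths("annotations", "00001")["combined"]` yields a path ending in `00001_combined.png`. The folder name and identifier here are made up for the example.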

E.g. evaluating on the KITTI test set (assuming images are in a folder named KITTI_test_rgbs) could be done by:

python -m footprints.predict \
 --image KITTI_test_rgbs \
 --model kitti \
 --save_dir ./predictions
python -m footprints.evaluate \
 --datatype kitti \
 --metric iou \
 --predictions ./predictions/outputs

Method and further results

We learn from stereo video sequences, using camera poses, per-frame depth and semantic segmentation to form training data, which is used to supervise an image-to-image network.

Video version of figure 3

More results on the KITTI dataset:

KITTI results

✏️ 📄 Citation

If you find our work useful or interesting, please consider citing our paper:

@inproceedings{watson-2020-footprints,
 title   = {Footprints and Free Space from a Single Color Image},
 author  = {Jamie Watson and
            Michael Firman and
            Aron Monszpart and
            Gabriel J. Brostow},
 booktitle = {Computer Vision and Pattern Recognition ({CVPR})},
 year = {2020}
}

👩‍⚖️ License

Copyright © Niantic, Inc. 2020. Patent Pending. All rights reserved. Please see the license file for terms.
