Visual odometry (VO) methods based on classical 3D geometry have been around for years, relying on either indirect feature matching or direct visual error minimization. Recently, learning-based methods that combine matching and geometry estimation in a single network have achieved impressive results; DeepTAM is one such method. Furthermore, classical methods have been shown to benefit from the extended field of view provided by multiple cameras, yet such setups have so far been ignored by learning-based methods.
In this work, we extend the existing DeepTAM pipeline to leverage a multi-camera setup with known geometry. We demonstrate the generalizability of DeepTAM to other monocular setups and highlight the scenarios in which it performs poorly. Through simulation-based experiments, we show that our proposed multi-camera VO pipeline yields better pose estimates.
Contributors: Mayank Mittal, Rohit Suri, Fadhil Ginting, Parker Ewen
This code has been tested on a computer with the following specifications:
- OS Platform and Distribution: Linux Ubuntu 16.04 LTS
- CUDA/cuDNN version: CUDA 9.0.176, cuDNN 7.1.4
- GPU model and memory: NVidia GeForce GTX 1070-MaxQ, 8GB
- Python: 3.5.2
- TensorFlow: 1.9.0
For each camera, the directory should be organized as shown below:
data/
└── cam_1/
├── depth
├── depth.txt
├── groundtruth.txt
├── rgb
└── rgb.txt
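As a quick sanity check of this layout, here is a minimal sketch (a hypothetical helper, not part of this repo) that verifies each camera directory contains the expected entries:

```python
# Hypothetical layout checker: reports required entries missing from each
# camera directory under data/ (assumes the layout shown above).
import os

REQUIRED = ["depth", "depth.txt", "groundtruth.txt", "rgb", "rgb.txt"]

def missing_entries(cam_dir):
    """Return the required entries that are absent from cam_dir."""
    return [name for name in REQUIRED
            if not os.path.exists(os.path.join(cam_dir, name))]

for cam in sorted(os.listdir("data")):
    missing = missing_entries(os.path.join("data", cam))
    if missing:
        print("{}: missing {}".format(cam, missing))
```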
The text files should be similar to those in the TUM RGBD sequences, i.e., each line should contain the timestamp followed by the data:
- Images: the data is the file path relative to the sequence directory specified in the `config.yaml` file
- Groundtruth: the data is the Cartesian position and quaternion orientation of that particular camera (in the world/camera frame)
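For example (the values below are illustrative; the format follows the TUM RGBD convention):

```
# rgb.txt: timestamp followed by the image path
1305031452.791720 rgb/1305031452.791720.png

# groundtruth.txt: timestamp tx ty tz qx qy qz qw
1305031452.7916 1.3452 0.6273 1.6627 0.6583 0.6112 -0.2938 -0.3266
```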
An example YAML configuration file for the RGBD Freiburg1 Desk Sequence is available here.
NOTE: Please ensure that the sequence directory and camera intrinsics are correctly modified according to the dataset.
For a multi-camera setup, an additional YAML file needs to be written that contains the paths to the configuration files of all the cameras in the system. An example of such a file for the AirSim dataset is available here.
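A minimal sketch of how such a file could be structured and parsed; the `camera_configs` key and the per-camera file names are assumptions for illustration, not the repo's actual schema (check the AirSim example above for the real one):

```python
# Hypothetical multi-camera YAML: a list of per-camera config file paths.
import yaml  # provided by the PyYAML package

multicam_yaml = """
camera_configs:
  - ../resources/hyperparameters/suncg3cameras/cam_1_config.yaml
  - ../resources/hyperparameters/suncg3cameras/cam_2_config.yaml
  - ../resources/hyperparameters/suncg3cameras/cam_3_config.yaml
"""

config = yaml.safe_load(multicam_yaml)
for path in config["camera_configs"]:
    print("camera config:", path)
```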
To install the virtual environment and all required dependencies, run:
./install.sh
Activate the installed virtual environment:
workon deeptam_py
Download the pre-trained weights for the DeepTAM tracking network:
cd resources/weights
chmod +x download_weights.sh
./download_weights.sh
Input Arguments:
- `--config_file` or `-f`: path to the configuration YAML file
- `--output_dir` or `-o`: path to the directory where outputs are written (default: `''`)
- `--weights` or `-w`: path to the weights of the DeepTAM tracking network (without the `.index`, `.meta`, or `.data` extensions)
- `--tracking_network` or `-n`: path to the tracking network (default: path to the module `deeptam_tracker.models.networks`)
- `--disable_vis` or `-v`: disable the frame-by-frame visualization for speed-up
Example:
- Download the RGBD Freiburg1 Desk Sequence:
cd resources/data
chmod +x download_testdata_rgbd_freiburg1.sh
./download_testdata_rgbd_freiburg1.sh
- To run DeepTAM with a single camera setup, run:
cd scripts
# run the python script
python single_camera_tracking.py \
--config_file ../resources/hyperparameters/freiburg1_config.yaml \
--weights ../resources/weights/deeptam_tracker_weights/snapshot-300000
Input Arguments:
- `--config_file` or `-f`: path to the multi-camera configuration YAML file
- `--weights` or `-w`: path to the weights of the DeepTAM tracking network (without the `.index`, `.meta`, or `.data` extensions)
- `--tracking_network` or `-n`: path to the tracking network (default: path to the module `deeptam_tracker.models.networks`)
- `--disable_vis` or `-v`: disable the frame-by-frame visualization for speed-up
- `--method` or `-m`: type of pose fusion method to use (default: `"rejection"`; options: `"rejection"`, `"naive"`, `"sift"`); see the fusion sketch after the example below
Example:
- To download the SUNCG data with the camera rig, download the zip file from here, and copy the extracted data to the `resources/data` directory.
- To run DeepTAM with a multi-camera setup with inlier-based averaging, run:
cd scripts
# run the python script
python multi_camera_tracking.py \
--config_file ../resources/hyperparameters/suncg3cameras/suncg_config.yaml \
--weights ../resources/weights/deeptam_tracker_weights/snapshot-300000 \
--method rejection
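For intuition, below is a minimal sketch of what inlier-based ("rejection") fusion of per-camera pose estimates could look like. The function name, the median-distance inlier test, and the threshold are assumptions for illustration, not the repo's actual implementation; a "naive" fusion would simply average all estimates without the rejection step.

```python
# Hypothetical sketch of inlier-based ("rejection") pose fusion.
# Each camera yields a translation estimate for the rig; outliers are
# rejected by their distance to the per-axis median before averaging.
# (Orientation fusion, e.g. quaternion averaging, is omitted for brevity.)
import numpy as np

def fuse_translations(translations, inlier_thresh=0.05):
    """Average per-camera rig translations, discarding outliers.

    translations: (N, 3) array of rig translations, one per camera.
    inlier_thresh: max distance (meters) from the median to count as inlier.
    """
    t = np.asarray(translations)
    median = np.median(t, axis=0)
    dists = np.linalg.norm(t - median, axis=1)
    inliers = dists < inlier_thresh
    if not inliers.any():  # fall back to the median if everything is rejected
        return median
    return t[inliers].mean(axis=0)

# Example: the third camera disagrees and gets rejected.
estimates = [[0.10, 0.01, 0.00],
             [0.11, 0.00, 0.01],
             [0.45, 0.20, 0.10]]
print(fuse_translations(estimates))  # ~[0.105, 0.005, 0.005]
```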