An overview of a typical place recognition pipeline: first, the input data is encoded into a query descriptor; then, a K-nearest neighbors search is performed between the query and the database descriptors; finally, the position of the closest database match is returned as the answer.
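For illustration, here is a minimal sketch of the retrieval step with faiss, assuming descriptors are already extracted; `db_descriptors`, `db_poses`, and `query` are hypothetical placeholders, not part of the library API:

```python
import faiss
import numpy as np

# hypothetical data: database descriptors, their positions, and a query
db_descriptors = np.random.rand(1000, 256).astype("float32")  # (N, D)
db_poses = np.random.rand(1000, 2).astype("float32")          # (N, 2) positions
query = np.random.rand(1, 256).astype("float32")              # (1, D)

index = faiss.IndexFlatL2(256)     # exact L2 nearest-neighbor index
index.add(db_descriptors)          # fill the index with the database
_, idx = index.search(query, 1)    # K-nearest neighbors search with K=1
answer = db_poses[idx[0, 0]]       # position of the closest match is the answer
```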
Detailed description of featured library modules can be found in the docs/modules.md document.
- PlaceRecognitionPipeline
- SequencePointcloudRegistrationPipeline
- PlaceRecognitionPipeline with semantics
- ArucoLocalizationPipeline
- LocalizationPipeline without dynamic objects
- Localization by specific scene elements (Semantic Object Context (SOC) module)
- Module for generating global vector representations of multimodal outdoor data
- MultimodalPlaceRecognitionTrainer
- TextLabelsPlaceRecognitionPipeline
- DepthReconstruction
- ITLPCampus
The library requires the PyTorch, MinkowskiEngine, and (optionally) faiss libraries to be installed manually.
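A sketch of what the manual installation might look like; the exact PyTorch build, MinkowskiEngine compilation flags, and faiss package depend on your CUDA setup, so treat these commands as illustrative rather than the project's official instructions:

```bash
# illustrative only -- match versions to your CUDA toolkit
pip install torch torchvision
pip install MinkowskiEngine   # may need to be built from source, see its docs
pip install faiss-gpu         # optional; faiss-cpu works for CPU-only setups
```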
Another option is to use the Docker image. You can find a detailed description in docker/README.md. Quick-start commands to build, start, and enter the container:

```bash
# from repo root dir
bash docker/build_devel.sh
bash docker/start.sh [DATASETS_DIR]
bash docker/into.sh
```
After the pre-requisites are met, install the Open Place Recognition library with the following command:

```bash
pip install -e .
```
If you want to use the GeoTransformer model for pointcloud registration, you should install the package located in the `third_party` directory:

```bash
# load submodules from git
git submodule update --init
# change dir
cd third_party/GeoTransformer/
# install the package
bash setup.sh
```
You can download the weights from the public Google Drive folder.
Developers only
We use DVC to manage the weights storage. To download the weights, run the following command (assuming that DVC is already installed):

```bash
dvc pull
```

You will be asked to authorize Google Drive access. After that, the weights will be downloaded to the `weights` directory. For more details, see the DVC documentation.
We introduce the ITLP-Campus dataset. The dataset was recorded on the Husky wheeled robotic platform on a university campus and consists of tracks recorded at different times of day (day/dusk/night) and different seasons (winter/spring). You can find more details in the VitalyyBezuglyj/ITLP-Campus repository.
The `opr.datasets` subpackage contains dataset classes and functions.
Usage example:
```python
from opr.datasets import OxfordDataset

train_dataset = OxfordDataset(
    dataset_root="/home/docker_opr/Datasets/pnvlad_oxford_robotcar_full/",
    subset="train",
    data_to_load=["image_stereo_centre", "pointcloud_lidar"],
)
```
The iterator will return a dictionary with the following keys:
- `"idx"`: index of the sample in the dataset, single-number Tensor
- `"utm"`: UTM coordinates of the sample, Tensor of shape `(2)`
- (optional) `"image_stereo_centre"`: image Tensor of shape `(C, H, W)`
- (optional) `"pointcloud_lidar_feats"`: point cloud features Tensor of shape `(N, 1)`
- (optional) `"pointcloud_lidar_coords"`: point cloud coordinates Tensor of shape `(N, 3)`
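For instance, a single sample can be inspected directly (a minimal sketch; which keys are present depends on the `data_to_load` argument above):

```python
sample = train_dataset[0]

print(sample["idx"])                            # single-number Tensor
print(sample["utm"].shape)                      # torch.Size([2])
print(sample["image_stereo_centre"].shape)      # (C, H, W)
print(sample["pointcloud_lidar_coords"].shape)  # (N, 3)
```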
More details can be found in the demo_datasets.ipynb notebook.
The `opr.losses` subpackage contains ready-to-use loss functions implemented in PyTorch, featuring a common interface.
Usage example:
```python
from opr.losses import BatchHardTripletMarginLoss

loss_fn = BatchHardTripletMarginLoss(margin=0.2)

idxs = sample_batch["idxs"]
positives_mask = dataset.positives_mask[idxs][:, idxs]
negatives_mask = dataset.negatives_mask[idxs][:, idxs]

loss, stats = loss_fn(output["final_descriptor"], positives_mask, negatives_mask)
```
The loss functions introduce a unified interface:
- Input:
  - `embeddings`: descriptor Tensor of shape `(B, D)`
  - `positives_mask`: boolean mask Tensor of shape `(B, B)`
  - `negatives_mask`: boolean mask Tensor of shape `(B, B)`
- Output:
  - `loss`: loss value Tensor
  - `stats`: dictionary with additional statistics
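The masks mark which pairs of batch elements are true positives and true negatives. A minimal sketch of how such masks could be derived from UTM coordinates (the distance thresholds are illustrative assumptions, not the library defaults):

```python
import torch

utm = torch.rand(8, 2) * 100                   # (B, 2) coordinates of a batch
dists = torch.cdist(utm, utm)                  # (B, B) pairwise distances
self_mask = torch.eye(len(utm), dtype=torch.bool)

positives_mask = (dists < 10.0) & ~self_mask   # nearby pairs, excluding self
negatives_mask = dists > 50.0                  # sufficiently distant pairs
```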
More details can be found in the demo_losses.ipynb notebook.
The `opr.models` subpackage contains ready-to-use neural networks implemented in PyTorch, featuring a common interface.
Usage example:
```python
from opr.models.place_recognition import MinkLoc3D

model = MinkLoc3D()

# forward pass
output = model(batch)
```
The models introduce unified input and output formats:
- Input: a `batch` dictionary with the following keys (all keys are optional, depending on the model and dataset):
  - `"images_<camera_name>"`: images Tensor of shape `(B, 3, H, W)`
  - `"masks_<camera_name>"`: semantic segmentation masks Tensor of shape `(B, 1, H, W)`
  - `"pointclouds_lidar_coords"`: point cloud coordinates Tensor of shape `(B * N_points, 4)`
  - `"pointclouds_lidar_feats"`: point cloud features Tensor of shape `(B * N_points, C)`
- Output: a dictionary with the required key `"final_descriptor"` and optional keys for intermediate descriptors:
  - `"final_descriptor"`: final descriptor Tensor of shape `(B, D)`
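A sketch of assembling such a batch by hand for a LiDAR-only model; in practice the library's dataloaders do this for you, and the voxelization step below is only illustrative:

```python
import torch

B, N = 2, 4096
clouds = [torch.rand(N, 3) * 100 for _ in range(B)]  # per-sample point clouds

# Batched sparse coordinates: an integer batch index prepended to the
# quantized xyz, giving shape (B * N, 4).
voxel_size = 0.5  # illustrative value
coords = torch.cat(
    [
        torch.cat(
            [torch.full((N, 1), i, dtype=torch.int32), (c / voxel_size).int()],
            dim=1,
        )
        for i, c in enumerate(clouds)
    ]
)
feats = torch.ones(B * N, 1)  # dummy per-point features, shape (B * N, C)

batch = {
    "pointclouds_lidar_coords": coords,
    "pointclouds_lidar_feats": feats,
}
output = model(batch)  # model from the example above
```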
More details can be found in the demo_models.ipynb notebook.
The `opr.trainers` subpackage contains ready-to-use training algorithms.
Usage example:
```python
from opr.trainers.place_recognition import UnimodalPlaceRecognitionTrainer

trainer = UnimodalPlaceRecognitionTrainer(
    checkpoints_dir=checkpoints_dir,
    model=model,
    loss_fn=loss_fn,
    optimizer=optimizer,
    scheduler=scheduler,
    batch_expansion_threshold=cfg.batch_expansion_threshold,
    wandb_log=(not cfg.debug and not cfg.wandb.disabled),
    device=cfg.device,
)

trainer.train(
    epochs=cfg.epochs,
    train_dataloader=dataloaders["train"],
    val_dataloader=dataloaders["val"],
    test_dataloader=dataloaders["test"],
)
```
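The `optimizer` and `scheduler` arguments are ordinary PyTorch objects; a sketch of how they might be created (the hyperparameters are illustrative):

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[40, 60], gamma=0.1
)
```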
The `opr.pipelines` subpackage contains ready-to-use pipelines for model inference.
Usage example:
```python
from opr.models.place_recognition import MinkLoc3Dv2
from opr.pipelines.place_recognition import PlaceRecognitionPipeline

pipe = PlaceRecognitionPipeline(
    database_dir="/home/docker_opr/Datasets/ITLP_Campus/ITLP_Campus_outdoor/databases/00",
    model=MinkLoc3Dv2(),
    model_weights_path=None,
    device="cuda",
)

out = pipe.infer(sample)
```
The pipeline introduces a unified interface for model inference:
- Input: a dictionary with the following keys (all keys are optional, depending on the model and dataset):
  - `"image_<camera_name>"`: image Tensor of shape `(3, H, W)`
  - `"mask_<camera_name>"`: semantic segmentation mask Tensor of shape `(1, H, W)`
  - `"pointcloud_lidar_coords"`: point cloud coordinates Tensor of shape `(N_points, 4)`
  - `"pointcloud_lidar_feats"`: point cloud features Tensor of shape `(N_points, C)`
- Output: a dictionary with the following keys:
  - `"idx"`: predicted index in the database
  - `"pose"`: predicted pose in the format `[tx, ty, tz, qx, qy, qz, qw]`
  - `"descriptor"`: predicted descriptor
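For example, the output can be consumed like this (a minimal sketch, reusing `out` from the example above):

```python
db_idx = out["idx"]                        # index of the matched database entry
tx, ty, tz, qx, qy, qz, qw = out["pose"]   # translation + orientation quaternion
descriptor = out["descriptor"]             # descriptor of the query
```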
More details can be found in the demo_pipelines.ipynb notebook.
| Model | Modality | Train Dataset | Config | Weights |
|---|---|---|---|---|
| MinkLoc3D (paper) | LiDAR | NCLT | minkloc3d.yaml | minkloc3d_nclt.pth |
| Custom | Multi-Image, Multi-Semantic, LiDAR | NCLT | multi-image_multi-semantic_lidar_late-fusion.yaml | multi-image_multi-semantic_lidar_late-fusion_nclt.pth |
| Custom | Multi-Image, LiDAR | NCLT | multi-image_lidar_late-fusion.yaml | multi-image_lidar_late-fusion_nclt.pth |
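A sketch of loading one of these checkpoints into the corresponding model; the path is a placeholder, and the snippet assumes the file stores a plain state dict:

```python
import torch
from opr.models.place_recognition import MinkLoc3D

model = MinkLoc3D()
state_dict = torch.load("weights/minkloc3d_nclt.pth", map_location="cpu")
model.load_state_dict(state_dict)  # assumption: checkpoint is a raw state dict
model.eval()
```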
- OPR-Project/OpenPlaceRecognition-ROS2 - ROS-2 implementation of OpenPlaceRecognition modules
- OPR-Project/ITLP-Campus - ITLP-Campus dataset tools.
- KirillMouraviev/simple_toposlam_model - An implementation of the Topological SLAM method that uses the OPR library.
MIT License (the license is subject to change in future versions)