Release: 0.2.0 (#4)

* downgraded cuda and torch versions for backward compatibility
* fixed configs and added train script
* add username to docker image tag to avoid conflicts when building on the same server
* added logging to wandb
* add script for building positive and non-negative indexes for datasets
* BUGFIX: tmp workaround in nclt dataset no longer needed
* add image-only model config + option which modality to test
* implemented multi-camera setup
* add multicamera to oxford
* Add naive semantic modality
* Fix merging collisions
* save config in checkpoint file
* add test script
* coords_limit arg in nclt dataset
* text baseline notebook for itl_mipt
* removed outdated scripts
* fixed default config to MinkLoc++ baseline
* self-attention layers implemented
* tmp code to test multi-image models with reranking
* add MLP layer + multi-image model with mlp + moved general config into separate dir
* separated multiimage config to _add and _concat + add general 4/5 rgb configs
* add cross-attention fusion for images
* nclt next baseline
* add single cross-attention fusion module + fixed old cross-attention
* add calculation of clip embeddings
* Fix merging collisions
* add train on CLIP embs
* Add ohe-hot encoding
* Add chonky, not tested
* Fix issues
* Chonky works
* add 2rgb+lidar config + fixed batch_size_limit to 128
* fix: test_modality in general multimodal configs -> fusion
* add 128-dim img configs
* fix: aaded general config for image-only
* Add chinky + cloud setup
* Add im-onehot-sem-cloud model config
* Add semantics to Oxford Dataset script
* Fix oxford ds script
* text+cam exps
* Fix OH Sem + add multi-configs
* Multi setup (trivial) works!
* Update multimodal config
* add multi-semantic module and configs for mssplace
* add all modalities configs
* add configs for 2rgb+2text+lidar
* add dataset_root arg to test script
* updated requirements with plotly
* add support for Intensity Oxford dataset
* save only last and best checkpoints
* add config for minkloc3dv2
* Add some configs
* WIP: SVT-Net implementation
* fix: svtnet config - set number of layers according to official implementation
* chore: install pyenv with miltiple python versions in dockerfile
* feat: move to src/ layout
* chore: move project metadata to pyproject.toml
* chore: move flake8 config to .flake8
* chore: add new linters to dev requirements
* chore: add script to start docker container with xhost sharing
* chore: configure flake8 with new extensions
* chore: configure pre-commit to use local flake8 installation
* chore: configure pre-commit to use local mypy installation
* chore: add nox to dev dependencies
* chore: add coverage[toml]==7.2.7 and pytest-cov==4.1.0 to dev dependencies
* fix: enable multiple python versions in dockerfile globally
* fix: pre-commit mypy configuration to check only added files
* chore: mypy ignore nox missing imports
* chore: add base noxfile
* chore: parametrize multiple python and pytorch versions in nox
* chore: ingore ('3.11', '1.13.1') python and pytorch combination
* chore: remove 3.11 nox tests because it is not supported by MinkowskiEngine
* chore: add linting session to nox
* feat: package version infer from pyproject.toml file only
* chore: fix noxfile to install MinkowskiEngine correctly
* chore: ignore unused import error in __init__.py files
* chore: ignore ANN101 flake8 error
* chore: ignore B905 flake8 error
* chore: ignore missing imports in pre-commit mypy
* test: minkloc3d instantiate from config file
* chore: configure pre-commit to use remote mirrors-mypy repo
* feat: minkloc3d model in new framework architecture
* chore: add sphinx to dev requirements
* refactor: remove redundant 'models.multimodal' module
* test: add e2e marker for end-to-end testing
* test: add __init__.py files in tests dirs
* test: move load_config function to tests.utils module
* chore: add default 'not e2e' marker in nox tests session
* test: mark test_minkloc3d_instantiate as end-to-end test
* test: test that OxfordDataset can be instantiated from real config with real data
* refactor: base dataset and oxford dataset code
* refactor: optimize building positive and non-negative indexes speed
* doc: remove link to MinkLoc3D-SI from Oxford dataset docstring
* fix: set idx type in dataset __getitem__ to tensor
* fix: modify base dataset collate_fn signature to take input data_list
* test: collate_fn works in end-to-end configuration
* feat: add collate_fn implementation to OxfordDataset
* fix: add __init__.py to opr.layers module
* fix: add docstring to opr.datasets module
* refactor: create new opr.samplers module
* refactor: delete old opr.datasets.samplers module
* test: init tests.samplers module
* test: batch sampler can be instantiated from config in end-to-end test
* refactor: move batch_sampler to opr.samplers module
* feat: add trainers module
* feat: add trainers.place_recognition module
* test: init dir for miners tests
* fix: typo
* test: add tests for BatchHardTripletMiner
* refactor: major refactor for BatchHardTripletMiner class
* feat: script for NCLT preprocessing
* feat: add util module for preprocessing scripts
* feat: add check_in_buffer_set to utils
* feat: script for splitting NCLT dataset
* refactor: set img size constants in the beginning of the file
* fix: large_cam_dir referenced before assigned
* fix: torch.cat(lidar_feats_list)
* test: e2e test for NCLT dataset
* feat: NCLT dataset re-implemented
* feat: nclt dataset config
* fix: nclt dataset feature quantization
* feat: use_intensity_values for NCLT dataset
* fix: workaround for compatibility with oxford dataset
* feat: add nclt dataset
* feat: option to extract data from all tracks of NCLT
* feat: base dataset class now have positive and negative masks and distributed collate fn
* feat: update OxfordDataset to new format
* fix: remove redundant distributed_collate_fn
* refactor: move transform functions to base dataset class
* docs: fix descriptions of transform arguments
* feat: update NCLT dataset according to new format
* feat: add NCLTDataset to init
* refactor: remove many of redundant configs
* feat: add distributed wrapper for batch sampler
* docs: remove outdated info from readme
* chore: remove mypy from pre-commit
* refactor: simplify place recognition models code structure
* feat: building blocks for models are now in 'modules' subpackage
* refactor: move fusion modules into 'opr.modules' subpackage
* refactor: move gem modules into 'opr.modules' subpackage
* refactor: move eca modules into 'opr.modules' subpackage
* feat: add MinkResNetFPN feature extractor module
* feat: MinkLoc3D implementation
* feat: MinkLoc3Dv2 model implementation
* fix: remove redundant resnet module from opr.models subpackage
* fix: remove redundant opr.layers subpackage
* fix: remove redundant opr.models.layers subpackage
* chore: remove gitlab-ci config
* chore: remove strange output.txt file
* chore: move notebooks to separate directory
* docs: opr.datasets description
* refactor: unified output type for place recognition models
* docs: add notebook with usage example for opr.datasets subpackage
* docs: usage example for opr.models subpackage
* chore: add faiss dependency
* chore: better MinkowskiEngine installation in Dockerfile
* feat: init PlaceRecognitionPipeline
* docs: usage example for opr.pipelines
* fix: return type in infer method is np.ndarray
* docs: add docstring to PlaceRecognitionPipeline.__init__ method
* fix: rename miner and fix linter issues
* fix: BatchHardTripletMiner forward method docstring - embeddings is Tensor, not Dictionary
* refactor: new loss format
* docs: losses usage sample
* fix: utils module linter fixes
* refactor: move 'parse_device' method to opr.utils module as function
* feat: replace print with logging
* refactor: remove outdated configs
* fix: more reasonable default value for quantization size in NCLT dataset
* fix: fist filter point by distance, then convert to spherical
* fix: remove outdated configs
* refactor: remove miner config
* feat: UnimodalPlaceRecognitionTrainer
* feat: config files for training
* feat: training script
* chore: remove outdated training module
* feat: update make_dataloaders and make_distributed_dataloaders
* fix: remove ComposedModel import from testing module
* feat: new config for minkloc3dv2
* feat: ITLP dataset
* feat: ResNet18 image-based Place Recognition model
* fix: copy device value when creating new Conv2d
* feat: add SemanticModel and SemanticResNet18 for Place Recognition based on semantic segmentation masks
* feat: move test code to separate function
* feat: 2-layer MLP module
* feat: SelfAttention module
* feat: init new project submodule
* feat: new ConvNeXt feature extractor
* Revert "feat: init new project submodule"

  This reverts commit bba8e4b.

* chore: bump version to 0.2.0

---------

Co-authored-by: Vitaly Bezuglyj <[email protected]>
Co-authored-by: Bezuglyj Vitaly <[email protected]>
Co-authored-by: Iliia Petryashin <[email protected]>
OPR-Project · Nov 28, 2023 · 34b8a77
1 parent 1685996
commit 34b8a77
Showing 113 changed files with 12,896 additions and 1,668 deletions.
diff --git a/.flake8 b/.flake8
@@ -0,0 +1,11 @@
+[flake8]
+select = ANN,B,B9,BLK,C,D,DAR,E,F,I,S,W
+ignore = ANN101,ANN401,B905,E203,E501,W503
+per-file-ignores =
+    tests/*:S101
+    __init__.py:F401
+max-complexity = 10
+max-line-length = 110
+exclude = .git,venv,build,__pycache__
+docstring-convention = google
+strictness = short
diff --git a/.gitignore b/.gitignore
@@ -1,5 +1,9 @@
-.vscode/
+checkpoints/
+outputs/
+wandb/
+*.joblib
 
+.vscode/
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -154,9 +158,27 @@ dmypy.json
 # Cython debug symbols
 cython_debug/
 
+
+outputs/
+checkpoints/
+wandb/
 # PyCharm
 #  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
 #  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+.gitignore
+
+configs/general/multimodal.yaml
+configs/nclt_lidar_cam5.yaml
+configs/config_nclt_text_PCA_concat.yaml
+configs/config_nclt_text_PCA_concat_lr_0.0001.yaml
+.gitignore
+configs/config_nclt_text_PCA_add_lr_0.0001.yaml
+configs/config_itlp_default.yaml
+configs/config_nclt_text_PCA_add.yaml
+configs/nclt_lidar_cam5_texts.yaml
+configs/config_nclt_text.yaml
+.gitignore
+configs/config_nclt_text.yaml
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -30,44 +30,10 @@ repos:
   - id: requirements-txt-fixer
   - id: trailing-whitespace
 
-# code formatter
-- repo: https://github.com/psf/black
-  rev: 23.3.0
-  hooks:
-  - id: black
-    args: ["--line-length=110"]
-
-# sort imports
-- repo: https://github.com/PyCQA/isort
-  rev: 5.12.0
-  hooks:
-  - id: isort
-    args: ["--profile=black"]
-
-# flake8 linter
-- repo: https://github.com/PyCQA/flake8
-  rev: 6.0.0
+- repo: local
   hooks:
   - id: flake8
-    exclude: __init__.py|tests
-    args:
-    - --max-line-length=110
-    - --docstring-convention=google
-    - --extend-select=B950
-    - --extend-ignore=E203,E501
-    additional_dependencies:
-    - pep8-naming
-    - flake8-bugbear
-    - flake8-docstrings
-
-# type annotations linter
-- repo: https://github.com/pre-commit/mirrors-mypy
-  rev: v1.2.0
-  hooks:
-  - id: mypy
-
-# darglint docstring linter
-- repo: https://github.com/terrencepreilly/darglint
-  rev: v1.8.1
-  hooks:
-  - id: darglint
+    name: flake8
+    entry: flake8
+    language: system
+    types: [python]
diff --git a/README.md b/README.md
@@ -4,7 +4,10 @@
 
 ### Pre-requisites
 
-- The library requires PyTorch~=1.13 and MinkowskiEngine library to be installed manually. See [PyTorch website](https://pytorch.org/get-started/previous-versions/) and [MinkowskiEngine repository](https://github.com/NVIDIA/MinkowskiEngine) for the detailed instructions.
+- The library requires `PyTorch`, `MinkowskiEngine` and (optionally) `faiss` libraries to be installed manually:
+  - [PyTorch Get Started](https://pytorch.org/get-started/locally/)
+  - [MinkowskiEngine repository](https://github.com/NVIDIA/MinkowskiEngine)
+  - [faiss repository](https://github.com/facebookresearch/faiss)
 
 - Another option is to use the suggested Dockerfile. The following commands should be used to build, start and enter the container:
 
@@ -34,18 +37,121 @@
     pip install .
     ```
 
-## Usage
+## Package Structure
 
-Currently only MinkLoc++ pretrained on Oxford RobotCar available. You can download it using [google drive link](https://drive.google.com/file/d/1zlfdX217Nh3_QL5r0XAHUjDFjIPxUmMg/view?usp=share_link) (the link is subject to change).
+### opr.datasets
 
-If everything is installed correctly, you can use the library like below:
+Subpackage containing dataset classes and functions.
+
+Usage example:
 
 ```python
-from opr.models import minkloc_multimodal
+from opr.datasets import OxfordDataset
 
-baseline_model = minkloc_multimodal(weights="path_to_checkpoint")
+train_dataset = OxfordDataset(
+    dataset_root="/home/docker_opr/Datasets/pnvlad_oxford_robotcar_full/",
+    subset="train",
+    data_to_load=["image_stereo_centre", "pointcloud_lidar"]
+)
 ```
 
+The iterator will return a dictionary with the following keys:
+- `"idx"`: index of the sample in the dataset, single number Tensor
+- `"utm"`: UTM coordinates of the sample, Tensor of shape `(2,)`
+- (optional) `"image_stereo_centre"`: image Tensor of shape `(C, H, W)`
+- (optional) `"pointcloud_lidar_feats"`: point cloud features Tensor of shape `(N, 1)`
+- (optional) `"pointcloud_lidar_coords"`: point cloud coordinates Tensor of shape `(N, 3)`
+
+More details can be found in the [demo_datasets.ipynb](./notebooks/demo_datasets.ipynb) notebook.
+
+### opr.losses
+
+The `opr.losses` subpackage contains ready-to-use loss functions implemented in PyTorch, featuring a common interface.
+
+Usage example:
+
+```python
+from opr.losses import BatchHardTripletMarginLoss
+
+loss_fn = BatchHardTripletMarginLoss(margin=0.2)
+
+idxs = sample_batch["idxs"]
+positives_mask = dataset.positives_mask[idxs][:, idxs]
+negatives_mask = dataset.negatives_mask[idxs][:, idxs]
+
+loss, stats = loss_fn(output["final_descriptor"], positives_mask, negatives_mask)
+```
+
+The loss functions introduce a unified interface:
+- **Input:**
+  - `embeddings`: descriptor Tensor of shape `(B, D)`
+  - `positives_mask`: boolean mask Tensor of shape `(B, B)`
+  - `negatives_mask`: boolean mask Tensor of shape `(B, B)`
+- **Output:**
+  - `loss`: loss value Tensor
+  - `stats`: dictionary with additional statistics
+
+More details can be found in the [demo_losses.ipynb](./notebooks/demo_losses.ipynb) notebook.
+
+### opr.models
+
+The `opr.models` subpackage contains ready-to-use neural networks implemented in PyTorch, featuring a common interface.
+
+Usage example:
+
+```python
+from opr.models.place_recognition import MinkLoc3D
+
+model = MinkLoc3D()
+
+# forward pass
+output = model(batch)
+```
+
+The models introduce unified input and output formats:
+- **Input:** a `batch` dictionary with the following keys
+  (all keys are optional, depending on the model and dataset):
+  - `"images_<camera_name>"`: images Tensor of shape `(B, 3, H, W)`
+  - `"masks_<camera_name>"`: semantic segmentation masks Tensor of shape `(B, 1, H, W)`
+  - `"pointclouds_lidar_coords"`: point cloud coordinates Tensor of shape `(B * N_points, 4)`
+  - `"pointclouds_lidar_feats"`: point cloud features Tensor of shape `(B * N_points, C)`
+- **Output:** a dictionary with the required key `"final_descriptor"`
+  and optional keys for intermediate descriptors:
+  - `"final_descriptor"`: final descriptor Tensor of shape `(B, D)`
+
+More details can be found in the [demo_models.ipynb](./notebooks/demo_models.ipynb) notebook.
+
+### opr.pipelines
+
+The `opr.pipelines` subpackage contains ready-to-use pipelines for model inference.
+
+Usage example:
+
+```python
+from opr.models.place_recognition import MinkLoc3Dv2
+from opr.pipelines.place_recognition import PlaceRecognitionPipeline
+
+pipe = PlaceRecognitionPipeline(
+    database_dir="/home/docker_opr/Datasets/ITLP_Campus/ITLP_Campus_outdoor/databases/00",
+    model=MinkLoc3Dv2(),
+    model_weights_path=None,
+    device="cuda",
+)
+
+out = pipe.infer(sample)
+```
+
+The pipeline introduces a unified interface for model inference:
+- **Input:** a dictionary with the following keys
+  (all keys are optional, depending on the model and dataset):
+  - `"image_<camera_name>"`: image Tensor of shape `(3, H, W)`
+  - `"mask_<camera_name>"`: semantic segmentation mask Tensor of shape `(1, H, W)`
+  - `"pointcloud_lidar_coords"`: point cloud coordinates Tensor of shape `(N_points, 4)`
+  - `"pointcloud_lidar_feats"`: point cloud features Tensor of shape `(N_points, C)`
+- **Output:** a dictionary with keys:
+  - `"pose"` for predicted pose in the format `[tx, ty, tz, qx, qy, qz, qw]`,
+  - `"descriptor"` for predicted descriptor.
+
 ## License
 
 [MIT License](./LICENSE) (**_the license is subject to change in future versions_**)
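
Note on the README diff above: the unified loss interface it describes (`embeddings`, `positives_mask`, `negatives_mask` in, `loss` and `stats` out) can be illustrated with a small batch-hard triplet margin loss. This is a minimal sketch on plain Python lists, not the `opr.losses` implementation: the function name, the mean reduction over anchors, and the `stats` keys are all assumptions made for illustration.

```python
import math


def batch_hard_triplet_loss(embeddings, positives_mask, negatives_mask, margin=0.2):
    """Toy batch-hard triplet margin loss (hypothetical helper, not opr's code).

    embeddings: list of B descriptors, each a list of D floats.
    positives_mask / negatives_mask: B x B nested lists of booleans.
    Returns (loss, stats), mirroring the interface described in the README.
    """
    per_anchor = []
    for a, anchor in enumerate(embeddings):
        pos = [math.dist(anchor, e) for j, e in enumerate(embeddings) if positives_mask[a][j]]
        neg = [math.dist(anchor, e) for j, e in enumerate(embeddings) if negatives_mask[a][j]]
        if not pos or not neg:
            continue  # this anchor has no valid triplet in the batch
        # Hardest positive is the farthest one, hardest negative is the closest one.
        per_anchor.append(max(0.0, max(pos) - min(neg) + margin))

    # Assumed reduction: mean over anchors that have a valid triplet.
    loss = sum(per_anchor) / len(per_anchor) if per_anchor else 0.0
    stats = {
        "num_triplets": len(per_anchor),  # hypothetical key
        "num_non_zero_triplets": sum(1 for v in per_anchor if v > 0.0),  # hypothetical key
    }
    return loss, stats


embeddings = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.0]]
positives_mask = [[False, True, False], [True, False, False], [False, False, False]]
negatives_mask = [[False, False, True], [False, False, True], [True, True, False]]
loss, stats = batch_hard_triplet_loss(embeddings, positives_mask, negatives_mask)
```

In this toy batch the third sample has no positives, so it contributes no triplet; the other two anchors each see a negative that is closer than their positive, which is the case the margin penalizes.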
diff --git a/configs/config.yaml b/configs/config.yaml
diff --git a/configs/dataset/nclt.yaml b/configs/dataset/nclt.yaml
@@ -1,18 +1,17 @@
-dataset:
-  _target_: opr.datasets.nclt.NCLTDataset
+_target_: opr.datasets.nclt.NCLTDataset
 
-  dataset_root: /home/docker_opr/Datasets/NCLT_preprocessed
-  modalities: [image, cloud,]
-  images_subdir: lb3_small/Cam5
-  mink_quantization_size: 0.5
-
-sampler:
-  _target_: opr.datasets.samplers.batch_sampler.BatchSampler
-
-  batch_size: 8
-  batch_size_limit: 160
-  batch_expansion_rate: 1.4
-  positives_per_group: 2
-  seed: 3121999
-
-num_workers: 4
+dataset_root: ???
+data_to_load: [pointcloud_lidar,]
+positive_threshold: 10.0
+negative_threshold: 50.0
+images_dirname: images_small
+masks_dirname: segmentation_masks_small
+pointclouds_dirname: velodyne_data
+pointcloud_quantization_size: 0.5
+max_point_distance: 40.0
+spherical_coords: False
+use_intensity_values: False
+image_transform: null
+semantic_transform: null
+pointcloud_transform: null
+pointcloud_set_transform: null
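
Note on the config above: `positive_threshold` and `negative_threshold` presumably control how the `(B, B)` positive and negative masks used by the losses are derived from the samples' UTM coordinates. The sketch below shows one plausible reading, assuming thresholds in meters, strict comparisons, and samples between the two thresholds belonging to neither mask. The `build_masks` helper is hypothetical and is not taken from `opr.datasets`.

```python
import math


def build_masks(utm, positive_threshold=10.0, negative_threshold=50.0):
    """Build boolean positive/negative masks from 2D coordinates (hypothetical helper).

    utm: list of (x, y) coordinates, one per sample.
    A pair is positive if closer than positive_threshold and negative if
    farther than negative_threshold; anything in between is in neither mask.
    """
    n = len(utm)
    positives = [[False] * n for _ in range(n)]
    negatives = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a sample is neither its own positive nor its own negative
            d = math.dist(utm[i], utm[j])
            positives[i][j] = d < positive_threshold
            negatives[i][j] = d > negative_threshold
    return positives, negatives


utm = [(0.0, 0.0), (5.0, 0.0), (30.0, 0.0), (100.0, 0.0)]
positives, negatives = build_masks(utm)
```

With these coordinates, samples 0 and 1 are positives of each other, sample 3 is a negative for every other sample, and the pair (0, 2) at 30 m falls into the gap between the two thresholds.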
diff --git a/configs/dataset/oxford.yaml b/configs/dataset/oxford.yaml
@@ -1,19 +1,16 @@
-dataset:
-  _target_: opr.datasets.oxford.OxfordDataset
+_target_: opr.datasets.OxfordDataset
 
-  dataset_root: /home/docker_opr/Datasets/pnvlad_oxford_robotcar
-  modalities: [image, cloud,]
-  images_subdir: stereo_centre_small
-  random_select_nearest_images: False
-  mink_quantization_size: 0.01
-
-sampler:
-  _target_: opr.datasets.samplers.batch_sampler.BatchSampler
-
-  batch_size: 8
-  batch_size_limit: 160
-  batch_expansion_rate: 1.4
-  positives_per_group: 2
-  seed: 3121999
-
-num_workers: 4
+dataset_root: ???
+data_to_load: [pointcloud_lidar,]
+positive_threshold: 10.0
+negative_threshold: 50.0
+images_dirname: images_small
+masks_dirname: segmentation_masks_small
+pointclouds_dirname: null
+pointcloud_quantization_size: 0.01
+max_point_distance: null
+spherical_coords: False
+image_transform: null
+semantic_transform: null
+pointcloud_transform: null
+pointcloud_set_transform: null
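
Note on `pointcloud_quantization_size` (0.5 in the NCLT config, 0.01 in the Oxford config): sparse-convolution backbones typically voxelize the point cloud before the forward pass, and this value sets the voxel edge length. The sketch below illustrates the idea in plain Python, assuming floor-based voxel indices and keeping the first point per voxel. The `quantize` helper is hypothetical; the actual datasets may rely on MinkowskiEngine utilities instead.

```python
import math


def quantize(points, quantization_size):
    """Map (x, y, z) points to integer voxel indices, one point kept per voxel.

    Hypothetical helper for illustration only.
    """
    seen = {}
    for p in points:
        voxel = tuple(math.floor(c / quantization_size) for c in p)
        seen.setdefault(voxel, p)  # keep the first point that falls into each voxel
    return list(seen.keys()), list(seen.values())


points = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.2, 0.0, 0.0)]
coarse_voxels, coarse_points = quantize(points, 0.5)   # first two points share a voxel
fine_voxels, fine_points = quantize(points, 0.01)      # every point gets its own voxel
```

A larger quantization size merges more points into the same voxel, which reduces the number of coordinates the model has to process at the cost of spatial detail.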