
Question regarding latent embedding #11

Open
HannahHaensen opened this issue Aug 23, 2022 · 6 comments

Comments

@HannahHaensen

Is the latent embedding required?

latent_embeddings = get_latent_embeddings(model_points, estimator)

How do I obtain this for a new dataset? Do I have to train NOCS?

@zubair-irshad
Owner

Thanks for your interest in our work. To train CenterSnap on your own dataset, yes, you need latent embeddings. We obtain them by pre-training an auto-encoder (please see the snippet below from our paper, Figure 2).
[Figure 2 from the paper: shape auto-encoder pre-training pipeline]

To pre-train the auto-encoder, we provide additional scripts under external/shape_pretraining. To train it on your own dataset, you would need CAD models (please see the data preparation here). These CAD models should be the same as the ones used for rendering the RGB images (for synthetic data), or obtained with a scanning tool for training/fine-tuning on real data (just like NOCS). Also see the NOCS object models here.

Note that we do not require CAD models at inference time. It really depends on how you want to train on a new dataset: the shape pre-training stage learns one latent embedding vector per shape, so it may be better to train on any new CAD models combined with the NOCS synthetic CAD models (assuming they fall within the same categories as NOCS), so that the auto-encoder learns a better prior over all the 3D shapes it sees. Hope it helps!
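To make the data flow above concrete, here is a minimal, hedged sketch of the embedding-extraction step. `FrozenEncoderStub` is a hypothetical stand-in (a fixed random projection of simple point-cloud statistics), not CenterSnap's actual network under external/shape_pretraining; only the shape of the pipeline — CAD point cloud in, latent vector out, one per model — is what matters here.

```python
import numpy as np

class FrozenEncoderStub:
    """Hypothetical stand-in for a pre-trained shape auto-encoder's encoder.

    The real encoder is a neural network trained on CAD-model point clouds;
    this stub merely illustrates the interface: (N, 3) points -> (D,) latent.
    """

    def __init__(self, latent_dim=128, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection of 6 global point-cloud statistics.
        self.proj = rng.standard_normal((latent_dim, 6))

    def encode(self, points):
        # points: (N, 3) array of a CAD model's sampled surface points.
        stats = np.concatenate([points.mean(axis=0), points.std(axis=0)])
        return self.proj @ stats  # shape: (latent_dim,)

def get_latent_embeddings(model_points, encoder):
    """Map each CAD model's point cloud to its GT latent embedding."""
    return {name: encoder.encode(pts) for name, pts in model_points.items()}

# Toy usage: two "CAD models" represented as random point clouds.
rng = np.random.default_rng(1)
model_points = {
    "mug_0": rng.random((1024, 3)),
    "bowl_0": rng.random((2048, 3)),
}
encoder = FrozenEncoderStub(latent_dim=128)
embeddings = get_latent_embeddings(model_points, encoder)
print(embeddings["mug_0"].shape)  # (128,)
```

In the real pipeline the encoder is frozen after pre-training and the resulting per-shape latent vectors become the regression targets CenterSnap is trained against.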

@HannahHaensen
Author

Thank you for the detailed answer! I will try it.

@yuanzhen2020

Great work! May I ask whether the GT latent embedding code used for training is generated from the CAD models of the training set? And is it possible to generate a latent embedding code from a partial point cloud (obtained from a masked depth map)?

@zubair-irshad
Owner

  1. Yes, that's correct. We generate the GT latent embedding code by training an auto-encoder on the CAD models available in the training set (please also see the response above for details).
  2. Unfortunately, it is not possible to use partial point clouds (i.e. masked depth maps) directly, since they would not give the proper size of the objects, which is crucial for recovering the absolute pose and, eventually, an accurate reconstruction. Please see this and this on how we obtain sizes and rotated bounding boxes from canonical bounding boxes.

Note that we use these CAD models and the GT latent embeddings as a strong prior so that we can generalize to unseen images (i.e. we do not require any CAD models during testing, only a single RGB-D image), and hence the more accurate the learnt prior is, the better the generalization performance.

Hope it helps!
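The size/rotated-bounding-box step mentioned above can be sketched as follows. This is a hedged illustration, not the repository's actual code: `canonical_box_corners` and `rotated_bbox` are hypothetical helpers showing how, given a canonical unit box plus a predicted per-axis scale, rotation, and translation, the metric size and the 8 rotated box corners can be recovered.

```python
import numpy as np

def canonical_box_corners():
    # Corners of a canonical box centered at the origin with unit extent.
    return np.array([[x, y, z] for x in (-0.5, 0.5)
                               for y in (-0.5, 0.5)
                               for z in (-0.5, 0.5)])  # (8, 3)

def rotated_bbox(scale, rotation, translation):
    # Scale the canonical corners to metric size, then move them into the
    # camera frame with the predicted rotation and translation.
    corners = canonical_box_corners() * scale
    return corners @ rotation.T + translation  # (8, 3)

# Toy usage: a 0.2 m x 0.1 m x 0.3 m object rotated 90 degrees about z,
# placed 1 m in front of the camera.
scale = np.array([0.2, 0.1, 0.3])
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
t = np.array([0.0, 0.0, 1.0])
corners = rotated_bbox(scale, Rz, t)
size = corners.max(axis=0) - corners.min(axis=0)
print(size)  # the z-rotation swaps the x/y extents: [0.1, 0.2, 0.3]
```

This is why a partial point cloud is insufficient: without the full canonical extent of the object, the per-axis scale (and hence the absolute size and pose) cannot be recovered reliably.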

@HannahHaensen
Author

I have another question regarding this pretraining: in the paper it says you use the ShapeNet CAD models, but here the CAD models from NOCS are used. Or am I reading the README wrong?

@zubair-irshad
Owner

@HannahHaensen, the CAD models used by NOCS are exactly the ShapeNet models. The difference is that NOCS trains on only 6 ShapeNet categories and selects a subset of the ShapeNet models, for which they render the synthetic table-top images that form their train set. For the paper, we train only on the subset of ShapeNet models (6 categories) used by NOCS, but we have also trained our auto-encoder on all ShapeNet models (unfortunately we cannot release the pre-trained checkpoints for that). Note that, if you wish, you can train the auto-encoder on all ShapeNet categories using the same train script. Hope this helps!
