
How to use my own dataset for training? #82

Open
xiannsa opened this issue Apr 20, 2023 · 56 comments

@xiannsa

xiannsa commented Apr 20, 2023

Excuse me, is it true that training on my own dataset involves preparing a folder of training images, writing a new dataloader (imitating the wireframe one) and adding it to dataset_util.py, and then following the provided five steps for training?

@rpautrat
Member

Yes, that is a good summary for re-training on a custom dataset!

@xiannsa
Author

xiannsa commented Apr 24, 2023

Excuse me, I imitated Holicity's dataloader, and after the second step of training, no files were generated. So how should I proceed with the third step? I hope you can give some advice.

@abhiagwl4262

abhiagwl4262 commented Apr 24, 2023

@rpautrat

I converted my dataset, which is currently in LCNN format, to the wireframe format using:
https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe.py &
https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe_line.py

I hope that also works.

@rpautrat
Member

Excuse me, I imitated Holicity's dataloader, and after the second step of training, no files were generated. So how should I proceed with the third step? I hope you can give some advice.

Did any error occur? Step 2 should generate the ground truth files. Did you also check that the paths are correct, for example the output path that you selected?

@rpautrat
Member

@rpautrat

I converted my dataset, which is currently in LCNN format, to the wireframe format using https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe.py & https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe_line.py

I hope that also works.

I am not sure I understand the difference between these two formats, but if you use the format of the original Wireframe dataset, this should be fine.

@xiannsa
Author

xiannsa commented Apr 25, 2023

Thank you for your help, I have completed the third step. However, I encountered a new issue while training in the fourth step: "Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same".

@rpautrat
Member

This means that there is a mismatch of types: the model weights are torch.float, while some input (probably your input images) are torch.double. You need to convert your images to float with images.float().
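For illustration, a minimal standalone sketch (dummy data, not the repo's dataloader) of where the double tensor typically comes from and how the cast fixes it:

import numpy as np
import torch

image_np = np.random.rand(512, 512)      # numpy defaults to float64 (double)
image = torch.from_numpy(image_np)       # -> torch.DoubleTensor
image = image.float()                    # -> torch.FloatTensor, matching the model weights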

@xiannsa
Author

xiannsa commented Apr 26, 2023

Hello, thank you for your help earlier. I was able to successfully complete step four, but I noticed that the training performance did not improve significantly as the epoch count increased. During training, I encountered a runtime warning, "invalid value encountered in intersection", pointing at return lib.intersection(a, b, **kwargs). Eventually, when attempting to continue with step five, I received the error KeyError: 'ref_junction_map'. Can you please offer some assistance?

@rpautrat
Member

Hi, did you start training using the pre-trained weights? If so, it would make sense that the training performance will not improve much on your custom dataset.

Regarding the warning, I am not sure; I would need to see the full message, but it is probably not too serious.

For step 5, did you make sure to set the option 'return_type' to 'paired_desc' in the config file of your custom dataset?

@xiannsa
Author

xiannsa commented Apr 26, 2023

[screenshot of training metrics]
Thank you for your message. I have already begun the fifth step of my training. Regarding the fourth step, I used my own dataset and did not use a pre-trained model from the first step. I am curious to know whether these training results are very poor.

@rpautrat
Member

These results are actually not very poor; they are similar to what I get. The training metrics are pixel-level precision and recall, so it is very hard to get a high score on these. But the overall quality of your lines should already be pretty good now. Did you try visualizing them?

@xiannsa
Author

xiannsa commented Apr 26, 2023

[screenshots of line detection and matching results]
"I'm on step 5 of my training, and I used the model file from round 11 that was just trained to draw a picture. The result had more lines than when I used the wireframe.tar you provided, but there were also more mistakes. I hope that the models trained in the future will have better results."

@rpautrat
Member

The matching looks very good at least, but the lines could indeed be improved. Note that you can also tune the hyperparameters to make it better.

One issue here, for example, is the lack of junctions, which prevents the model from discovering more lines. You could try reducing the parameter 'detection_thresh' in the model config file train_full_pipeline.yaml, so that the network can detect more junctions.

@abhiagwl4262

abhiagwl4262 commented Apr 27, 2023

@rpautrat can you help me with the .mat files for the validation data? What scale have the lines been resized to?

Should I scale my lines to (512, 512), (500, 500), or something else?

@rpautrat
Member

I am sorry, which .mat file? What do you want to validate exactly? If you follow the metrics of the paper, no .mat file should be needed; the metrics are computed between line detections in two images related by a homography.
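As a rough illustration of the warping step (a standalone sketch, not the repo's evaluation code): the detected endpoints of one image are mapped into the other image with the 3x3 homography in homogeneous coordinates before the detections are compared.

import numpy as np

def warp_points(points_xy, H):
    """Warp N*2 (x, y) points with a 3x3 homography H (homogeneous coordinates)."""
    pts_h = np.concatenate([points_xy, np.ones((len(points_xy), 1))], axis=1)  # N*3
    warped = (H @ pts_h.T).T
    return warped[:, :2] / warped[:, 2:3]

H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -5.0],
              [0.0, 0.0, 1.0]])                         # toy homography (pure translation)
endpoints = np.array([[100.0, 200.0], [150.0, 220.0]])  # two line endpoints in image 1
print(warp_points(endpoints, H))                        # where they should land in image 2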

@abhiagwl4262

abhiagwl4262 commented Apr 28, 2023

@rpautrat

in this line you require a .mat file:

mat_paths = [p[:-2] + "_line.mat" for p in prefix_paths]

How do I create one for my custom data?

And how do I verify step 2?

@rpautrat
Member

Hi, you should base your dataloader on holicity_dataset.py rather than wireframe_dataset.py when creating one for your own dataset. The Wireframe dataset indeed has some special features (including these .mat files coming from the original ground truth), but a generic dataset like yours has no such special features. So it is better to copy the structure of holicity_dataset.py, where no such .mat file is necessary.
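If it helps, below is a very rough skeleton of what such a dataloader over a plain image folder could look like. The class name and the returned keys are only illustrative; the exact keys, modes, and config fields expected by the pipeline should be copied from holicity_dataset.py.

import os
import cv2
import torch
from torch.utils.data import Dataset

class MyImageFolderDataset(Dataset):
    """Hypothetical minimal dataset: grayscale images in [0, 1] from one folder, no .mat files."""
    def __init__(self, root, img_size=(512, 512)):
        self.paths = sorted(
            os.path.join(root, f) for f in os.listdir(root)
            if f.lower().endswith((".jpg", ".jpeg", ".png"))
        )
        self.img_size = img_size

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.paths[idx], cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, self.img_size[::-1])              # cv2 expects (width, height)
        img = torch.from_numpy(img).float()[None] / 255.0       # 1*H*W in [0, 1]
        return {"image": img, "file_key": os.path.basename(self.paths[idx])}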

@rpautrat
Member

Steps 2 and 3 go together, and you can verify their output in the notebook https://github.com/cvg/SOLD2/blob/main/notebooks/visualize_exported_dataset.ipynb

@abhiagwl4262

abhiagwl4262 commented May 2, 2023

I did try visualizing the exported dataset using the above notebook.

My dataset is in wireframe format.

The keys in the exported data are:
dict_keys(['image', 'junctions', 'junction_map', 'line_map_pos', 'line_map_neg', 'heatmap_pos', 'heatmap_neg', 'valid_mask', 'file_key'])

And it throws the error:

KeyError: 'ref_image'

@rpautrat
Member

rpautrat commented May 2, 2023

Hi, I don't understand, why do you have the exported data in this format? If you follow step 3, the output file should only have keys 'junctions' and 'line_map'. So how did you get yours?

@abhiagwl4262

abhiagwl4262 commented May 3, 2023

@rpautrat If you don't recommend Wireframe, then it should not be the default option of the repo.

In the README itself you mention:
You can download the version of the [Wireframe dataset](https://github.com/huangkuns/wireframe) that we used during our training and testing [here](https://www.polybox.ethz.ch/index.php/s/IfdEf7RoHol7jeg). This repository also includes some files to train on the [Holicity dataset](https://holicity.io/) to add more outdoor images, but note that we did not extensively test this dataset and the original paper was based on the Wireframe dataset only.

If you have worked extensively with Wireframe and you know that all the steps support it well, then Wireframe should be the better choice for working with this repo.

And it would be very helpful if you could explain in the README how to turn LCNN annotations into the Holicity dataset format.
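For reference, the conversion I have in mind would look roughly like the sketch below. It assumes an LCNN-style JSON where each entry has a 'filename' and a 'lines' list of [x1, y1, x2, y2] segments (that layout is an assumption), and it produces per-image junctions plus an N*N line adjacency map, i.e. the 'junctions' / 'line_map' structure mentioned above for the exported ground truth. The output file format is only illustrative.

import json
import numpy as np

def lcnn_entry_to_junctions_and_line_map(entry, decimals=1):
    # entry assumed to look like {"filename": "0001.png", "lines": [[x1, y1, x2, y2], ...]}
    lines = np.asarray(entry["lines"], dtype=np.float64).reshape(-1, 2, 2)   # N*2*2 endpoints
    endpoints = np.round(lines.reshape(-1, 2), decimals)
    junctions, inverse = np.unique(endpoints, axis=0, return_inverse=True)   # merge shared endpoints
    inverse = inverse.ravel()
    line_map = np.zeros((len(junctions), len(junctions)), dtype=np.int32)
    for i in range(len(lines)):
        j1, j2 = inverse[2 * i], inverse[2 * i + 1]
        line_map[j1, j2] = line_map[j2, j1] = 1
    return junctions, line_map

with open("train.json") as f:            # hypothetical LCNN-style annotation file
    annotations = json.load(f)
for entry in annotations:
    junctions, line_map = lcnn_entry_to_junctions_and_line_map(entry)
    out_name = entry["filename"].rsplit(".", 1)[0] + "_gt.npz"
    np.savez(out_name, junctions=junctions, line_map=line_map)   # on-disk format is only illustrative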

@abhiagwl4262

@rpautrat Even if I use the wireframe format, why don't I have the ref_image, ref_junctions, ref_line_map, ref_line_points keys?

@abhiagwl4262

I did try visualizing the exported dataset using the above notebook.

My dataset is in wireframe format.

The keys in the exported data are: dict_keys(['image', 'junctions', 'junction_map', 'line_map_pos', 'line_map_neg', 'heatmap_pos', 'heatmap_neg', 'valid_mask', 'file_key'])

And it throws the error:

KeyError: 'ref_image'

I am not able to visualize the exported data of the original Wireframe dataset either; I am facing the same issue. Please help, because you made it work for the original Wireframe dataset.

@abhiagwl4262

abhiagwl4262 commented May 4, 2023

@rpautrat I also don't see the "ref_image" key in any code related to exporting pseudo labels. Is the notebook https://github.com/cvg/SOLD2/blob/main/notebooks/visualize_exported_dataset.ipynb correct?

I ran grep -rnw . -e "ref_image" and it returned -

./notebooks/visualize_exported_dataset.ipynb:60:    "ref_img = data1['ref_image'].numpy().squeeze()\n",
./notebooks/visualize_exported_dataset.ipynb:129:    "ref_img = data1['ref_image'].numpy().squeeze()\n",
./notebooks/visualize_exported_dataset.ipynb:196:    "ref_img = data1['ref_image'].numpy().squeeze()\n",
Binary file ./sold2/__pycache__/train.cpython-38.pyc matches
./sold2/train.py:238:            input_images = data["ref_image"].cuda()
./sold2/train.py:420:            input_images = data["ref_image"].cuda()

The keys expected here:

def __getitem__(self, idx):
    """Return data
    file_key: str, keys used to retrieve data from the filename dataset.
    image: torch.float, C*H*W range 0~1,
    junctions: torch.float, N*2,
    junction_map: torch.int32, 1*H*W range 0 or 1,
    line_map_pos: torch.int32, N*N range 0 or 1,
    line_map_neg: torch.int32, N*N range 0 or 1,
    heatmap_pos: torch.int32, 1*H*W range 0 or 1,
    heatmap_neg: torch.int32, 1*H*W range 0 or 1,
    valid_mask: torch.int32, 1*H*W range 0 or 1
    """

match the wireframe data that I downloaded from the README link.

@rpautrat
Member

rpautrat commented May 4, 2023

Hi @abhiagwl4262, I think there is some confusion regarding the wireframe dataset. When we train on it, we use our own exported ground truth, not the original ground truth. So yes, wireframe is the default option, but with exported ground truth. The possibility of using the 'official' ground truth is a legacy feature that was only used during evaluation; it is not intended for you to use. I hope this clarifies the confusion.

The right way to use this repo is to export your own ground truth with steps 2 and 3 (on the Wireframe dataset or any other dataset), then use this pseudo ground truth. The wireframe dataloader doesn't support using the 'official' ground truth with pair loading (i.e. to get this 'ref_image').

In case you are looking for it, the 'ref_image' comes from here:

outputs["ref_" + key] = val

It is available when you use an exported ground truth, together with the 'paired_desc' option.

@abhiagwl4262

@rpautrat Do I have to run steps 2 and 3 with the 'paired_desc' option?

@rpautrat
Member

rpautrat commented May 5, 2023

No, 'paired_desc' is only for step 5, when training the descriptor. That is why it is not activated in the default configuration.

@abhiagwl4262

abhiagwl4262 commented May 5, 2023

@rpautrat But you said the keys expected in the notebook would be present when I use the paired_desc option.

Do I have to modify the notebook to see the step 2 and step 3 results?

@rpautrat
Member

rpautrat commented May 5, 2023

Ah maybe that's where the confusion came from. The 'paired_desc' option has to be put in the field 'return_type' of the config/wireframe_dataset.yaml config file. This was explained for step 5, but it was probably not clear that you have to do the same for the notebook.

Steps 2 and 3 do not need this 'paired_desc', and it will be ignored anyway if you use it. So you can reuse your current results.
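For example, a small sketch of making this change from Python instead of editing the YAML by hand (the 'return_type' field is assumed to be top-level in config/wireframe_dataset.yaml; check the actual layout of the file):

import yaml

with open("sold2/config/wireframe_dataset.yaml") as f:
    dataset_cfg = yaml.safe_load(f)

dataset_cfg["return_type"] = "paired_desc"   # field assumed to be top-level; adjust if it is nested

with open("sold2/config/wireframe_dataset_paired.yaml", "w") as f:
    yaml.safe_dump(dataset_cfg, f)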

@abhiagwl4262

abhiagwl4262 commented May 5, 2023

So is this notebook for visualizing the exported data of no use?

@rpautrat
Member

rpautrat commented May 5, 2023

Mmmmh no? Again if you execute steps 2 and 3, then modify the config file config/wireframe_dataset.yaml by adding the 'paired_desc' option, then you can use the notebook.

I will add a line to the notebook to make it automatic in the future.

@abhiagwl4262

@rpautrat That would be nice. please do that.

Also, if you can add details of how to convert lines in LCNN format (JSONs) to the Holicity format, the repo would become really good. I can raise a PR for the LCNN to wireframe conversion.

@rpautrat
Member

rpautrat commented May 5, 2023

I updated the notebook.

A PR with a script converting from LCNN format to holicity one would be nice, yes! Thank you in advance.

@xiannsa
Author

xiannsa commented May 5, 2023

Excuse me, besides drawing lines on images in the Jupyter notebook, can I also draw lines on videos?

@rpautrat
Member

rpautrat commented May 5, 2023

If you want to make a video with lines, one option is to draw lines on each frame, then combine all frames together in a video.
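A minimal sketch of the 'combine the frames into a video' part with OpenCV (paths and frame rate are placeholders):

import glob
import cv2

frame_paths = sorted(glob.glob("frames_with_lines/*.png"))   # frames with lines already drawn
height, width = cv2.imread(frame_paths[0]).shape[:2]

writer = cv2.VideoWriter("lines.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 25, (width, height))
for path in frame_paths:
    writer.write(cv2.imread(path))
writer.release()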

@abhiagwl4262

abhiagwl4262 commented May 5, 2023

I updated the notebook.

A PR with a script converting from LCNN format to holicity one would be nice, yes! Thank you in advance.

@rpautrat
Did you check whether the notebook works?

I got the following error on new_points = (homography @ new_points.T).T

ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

@xiannsa
Author

xiannsa commented May 8, 2023

Excuse me, how can I train it using the original color images instead of grayscale images?

@nmanhong

nmanhong commented May 8, 2023

@xiannsa Big brother, please take me along and teach me; I don't know how to do this. Could you send me your working setup so I can take a look?

@xiannsa
Author

xiannsa commented May 8, 2023

@nmanhong What problem did you run into? Is it something in the training?

@xiannsa
Author

xiannsa commented May 8, 2023

@rpautrat Excuse me, how can I train it using the original color images instead of grayscale images?

@rpautrat
Member

rpautrat commented May 8, 2023

I updated the notebook.
A PR with a script converting from LCNN format to holicity one would be nice, yes! Thank you in advance.

@rpautrat Did you check whether the notebook works?

I got the following error on new_points = (homography @ new_points.T).T

ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

Yes I ran the notebook and it works fine for me.

It seems that the homography is empty in your case, not sure why. You may want to inspect the homography generation.

@rpautrat
Member

rpautrat commented May 8, 2023

@rpautrat Excuse me, how can I train it using the original color images instead of grayscale images?

1. In the config file, change 'input_channel' to 3: https://github.com/cvg/SOLD2/blob/bbce15cca60ded06354106ffb66e3cb3ee23f2ad/sold2/config/train_detector.yaml#LL7C24-L7C24
2. In the dataloader, don't convert images to grayscale, but keep them in RGB.

This should be enough to train on RGB images (see the sketch below for the dataloader side).
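A minimal illustration of the dataloader side (file name and variable names are placeholders, not code from the repo):

import cv2
import torch

img_bgr = cv2.imread("example.jpg", cv2.IMREAD_COLOR)                 # H*W*3 instead of grayscale
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
tensor = torch.from_numpy(img_rgb).permute(2, 0, 1).float() / 255.0   # 3*H*W in [0, 1]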

@nmanhong

nmanhong commented May 9, 2023

@xiannsa Big brother, are you training on your own dataset? I also want to train my own dataset, but I don't know how. Once you have it working, could you send it to me so I can learn from it?

@xiannsa
Author

xiannsa commented May 9, 2023

@rpautrat I changed the input_channel to 3 and modified the 'gray_scale: True' parameter in the holicity_dataset.yaml file to 'gray_scale: False', as you suggested. However, when I reached the fourth step of the training, I encountered the error 'ValueError: operands could not be broadcast together with remapped shapes (512, 512, 3, 1) (512, 512, 1)'. Do you have any suggestions on how to fix it?

@rpautrat
Member

rpautrat commented May 9, 2023

Hi, can you please send the full error that you get?

@xiannsa
Author

xiannsa commented May 10, 2023

Here is the error I encountered while attempting to train with colored images using the checkpoint you provided for the second step. I hope you can give me some advice. Also, I have been able to use video for line matching, but the frame rate is a bit low: it takes 0.7 seconds per frame on my RTX 3090. I would like to know what frame rate you were able to achieve in the video demonstration.
[screenshot of the error traceback]

@xiannsa
Author

xiannsa commented May 10, 2023

@rpautrat Excuse me, if I lower the network depth from 4 to 3 during training, can that improve the efficiency of the model at drawing lines?

@rpautrat
Member

Hi, you cannot reuse the existing checkpoint for colored images, since it was trained for grayscale ones. You need to start training from scratch in your case.

Or another solution is to hack into the function loading the weights:

SOLD2/sold2/train.py

Lines 36 to 57 in bbce15c

def restore_weights(model, state_dict, strict=True):
    """ Restore weights in compatible mode. """
    # Try to directly load state dict
    try:
        model.load_state_dict(state_dict, strict=strict)
    # Deal with some version compatibility issue (catch version incompatible)
    except:
        err = model.load_state_dict(state_dict, strict=False)
        # missing keys are those in model but not in state_dict
        missing_keys = err.missing_keys
        # Unexpected keys are those in state_dict but not in model
        unexpected_keys = err.unexpected_keys
        # Load mismatched keys manually
        model_dict = model.state_dict()
        for idx, key in enumerate(missing_keys):
            dict_keys = [_ for _ in unexpected_keys if not "tracked" in _]
            model_dict[key] = state_dict[dict_keys[idx]]
        model.load_state_dict(model_dict)
    return model

and load all the weights of the pretrained network except the very first one, the conv1 that raises the error. All the other weights should be identical between grayscale and RGB images.
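For example, a rough sketch of such a partial loading (the checkpoint key and the name of the first convolution are assumptions that you should check against your model's state_dict().keys()):

import torch

def load_pretrained_except_first_conv(model, ckpt_path, first_conv_prefix="backbone.conv1"):
    # first_conv_prefix is an assumption: inspect model.state_dict().keys() for the real name.
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state_dict = checkpoint.get("model_state_dict", checkpoint)
    filtered = {k: v for k, v in state_dict.items() if not k.startswith(first_conv_prefix)}
    missing, unexpected = model.load_state_dict(filtered, strict=False)
    print("Left randomly initialized:", missing)   # should only list the first conv
    return model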

@rpautrat
Member

@rpautrat Excuse me, if I lower the network depth from 4 to 3 during training, can that improve the efficiency of the model at drawing lines?

This code has not been optimized for speed; it is research code. There would be ways to make it much more efficient, though. A good starting point would be to run the line matching entirely on GPU instead of partly on GPU and partly on CPU (e.g. here:

def filter_and_match_lines(self, scores):
).

I would not reduce the depth of the backbone from 4 to 3; the results would probably degrade severely.

For the video, we processed the frames offline independently, then combined the frames into a video. So you can choose the frame rate as you wish with this method.

@xiannsa
Author

xiannsa commented May 10, 2023

Thank you for your help. I previously tried training from scratch by modifying the 'input_channel' in 'train_detector.yaml' to 3 and training on a synthetic dataset. However, I encountered an error: 'RuntimeError: Given groups=1, weight of size..., expected input... to have 3 channels, but got 1 channel instead.' Could it be because the synthetic dataset itself is in grayscale?

@rpautrat
Member

Yes, of course everything is by default in grayscale, so the synthetic dataset is also in grayscale. Converting it into rgb might require more work though, especially if you want to create interesting colored texture inside the geometrical shapes...
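One quick workaround, not discussed above and only a sketch, is to replicate the grayscale channel three times in the synthetic dataloader, so the tensors have the 3 channels the RGB model expects even though the content stays gray:

import numpy as np
import torch

gray = np.random.rand(512, 512).astype(np.float32)       # stand-in for one synthetic image
rgb_like = np.repeat(gray[..., None], 3, axis=-1)         # H*W*3, all three channels identical
tensor = torch.from_numpy(rgb_like).permute(2, 0, 1)      # 3*H*W, matches input_channel: 3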

@xiannsa
Author

xiannsa commented May 12, 2023

I apologize for the interruption. May I know which parameters can be adjusted to improve the completeness of the waterline for the ship, as shown in the image? Currently, only a part of the waterline is visible.

@rpautrat
Member

Hi, here the issue is that there is no detected junction at the end of the water line, so you need to increase the number of detected junctions.

For this, you can reduce the 'detection_thresh' of the model config, and potentially also increase the 'max_num_junctions' as well, to increase the maximum number of junctions.

@xiannsa
Author

xiannsa commented May 12, 2023

Thank you for your suggestion. However, I would like to know: after adjusting the parameters, do I need to retrain steps 4 and 5 and modify the corresponding parameters in the "lines_match" section to see the effect?

@rpautrat
Member

No, you don't need to retrain anything there. You can directly change the parameters at test time when using your already trained model.
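A small sketch of what that could look like: load the config used at test time, lower 'detection_thresh', raise 'max_num_junctions', and save a modified config to use for inference (the key nesting is assumed here, so adapt it to your actual file):

import yaml

with open("sold2/config/train_full_pipeline.yaml") as f:
    cfg = yaml.safe_load(f)

model_cfg = cfg.get("model_cfg", cfg)        # key nesting assumed; adapt to your config layout
model_cfg["detection_thresh"] = 0.005        # example value: lower threshold -> more junctions
model_cfg["max_num_junctions"] = 500         # example value: keep more junctions

with open("my_test_config.yaml", "w") as f:
    yaml.safe_dump(cfg, f)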

@VitoriaCarvalho

@rpautrat

Hello! I'm trying to use SOLD² to detect lines in a stack of objects. In this case, I need to detect specific lines, not all straight segments in the image. Therefore, I need to train the model with my ground truth, but I am having difficulty understanding the pipeline. Is it necessary to train with synthetic data even in the case of a problem where it is not necessary to detect all straight segments present in the image?
