
How to use my own dataset for training? #82

Open
xiannsa opened this issue Apr 20, 2023 · 56 comments

@xiannsa

xiannsa commented Apr 20, 2023

Excuse me, is it true that training on my own dataset involves preparing a folder of training images, writing a new dataloader (imitating the wireframe one) and adding it to dataset_util.py, and then following the provided five steps for training?

@rpautrat
Member

Yes, that is a good summary for re-training on a custom dataset!

@xiannsa
Author

xiannsa commented Apr 24, 2023

Excuse me, I imitated Holicity's dataloader, and after the second step of training, no files were generated. So how should I proceed with the third step? I hope you can give some advice.

@abhiagwl4262

abhiagwl4262 commented Apr 24, 2023

@rpautrat

I converted my dataset, which is currently in LCNN format, to the wireframe format using:
https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe.py &
https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe_line.py

I hope that also works.

@rpautrat
Member

Excuse me, I imitated Holicity's dataloader, and after the second step of training, no files were generated. So how should I proceed with the third step? I hope you can give some advice.

Did any error occur? Step 2 should generate the ground truth files. Did you also check that the paths are correct, for example the output path that you selected?

@rpautrat
Member

@rpautrat

I converted my dataset, which is currently in LCNN format, to the wireframe format using https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe.py & https://github.com/Delay-Xili/F-Clip/blob/master/dataset/wireframe_line.py

I hope that also works.

I am not sure I understand the difference between these two formats, but if you use the format of the original Wireframe dataset, this should be fine.

@xiannsa
Author

xiannsa commented Apr 25, 2023

Thank you for your help, I have completed the third step. However, I encountered a new issue while training in the fourth step: "Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same".

@rpautrat
Member

This means that there is a mismatch of types: the model weights are torch.float, while some input (probably your input images) are torch.double. You need to convert your images to float with images.float().
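For illustration, a minimal standalone sketch (dummy data, not the repo's dataloader) of where the double tensor typically comes from and how the cast fixes it:

import numpy as np
import torch

image_np = np.random.rand(512, 512)      # numpy defaults to float64 (double)
image = torch.from_numpy(image_np)       # -> torch.DoubleTensor
image = image.float()                    # -> torch.FloatTensor, matching the model weights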

@xiannsa
Author

xiannsa commented Apr 26, 2023

Hello, thank you for your help earlier. I was able to successfully complete step four, but I noticed that the training performance did not improve significantly as the epoch count increased. During training, I encountered a runtime warning, "invalid value encountered in intersection", pointing at return lib.intersection(a, b, **kwargs). Eventually, when attempting to continue with step five, I received the error KeyError: 'ref_junction_map'. Can you please offer some assistance?

@rpautrat
Member

Hi, did you start training using the pre-trained weights? If so, it would make sense that the training performance will not improve much on your custom dataset.

Regarding the warning, I am not sure; I would need to see the full message, but it is probably not too serious.

For step 5, did you make sure to set the option 'return_type' to 'paired_desc' in the config file of your custom dataset?

@xiannsa
Author

xiannsa commented Apr 26, 2023

[screenshot of training metrics]
Thank you for your message. I have already begun the fifth step of my training. Regarding the fourth step, I used my own dataset and did not use a pre-trained model from the first step. I am curious to know whether these training results are very poor.

@rpautrat
Member

These results are actually not very poor; they are similar to what I get. The training metrics are pixel-level precision and recall, so it is very hard to get a high score on these. But the overall quality of your lines should already be pretty good now. Did you try visualizing them?

@xiannsa
Author

xiannsa commented Apr 26, 2023

[screenshots of line detection and matching results]
"I'm on step 5 of my training, and I used the model file from round 11 that was just trained to draw a picture. The result had more lines than when I used the wireframe.tar you provided, but there were also more mistakes. I hope that the models trained in the future will have better results."

@rpautrat
Member

The matching looks very good at least, but the lines could indeed be improved. Note that you can also tune the hyperparameters to make it better.

One issue here, for example, is the lack of junctions, which prevents the model from discovering more lines. You could try reducing the parameter 'detection_thresh' in the model config file train_full_pipeline.yaml, so that the network can detect more junctions.

@abhiagwl4262

abhiagwl4262 commented Apr 27, 2023

@rpautrat can you help me with the .mat files for the validation data? What scale have the lines been resized to?

Should I scale my lines to (512, 512), (500, 500), or something else?

@rpautrat
Member

I am sorry, which .mat file? What do you want to validate exactly? If you follow the metrics of the paper, no .mat file should be needed; the metrics are computed between line detections in two images related by a homography.
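As a rough illustration of the warping step (a standalone sketch, not the repo's evaluation code): the detected endpoints of one image are mapped into the other image with the 3x3 homography in homogeneous coordinates before the detections are compared.

import numpy as np

def warp_points(points_xy, H):
    """Warp N*2 (x, y) points with a 3x3 homography H (homogeneous coordinates)."""
    pts_h = np.concatenate([points_xy, np.ones((len(points_xy), 1))], axis=1)  # N*3
    warped = (H @ pts_h.T).T
    return warped[:, :2] / warped[:, 2:3]

H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -5.0],
              [0.0, 0.0, 1.0]])                         # toy homography (pure translation)
endpoints = np.array([[100.0, 200.0], [150.0, 220.0]])  # two line endpoints in image 1
print(warp_points(endpoints, H))                        # where they should land in image 2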

@abhiagwl4262

abhiagwl4262 commented Apr 28, 2023

@rpautrat

in this line you require a .mat file:

mat_paths = [p[:-2] + "_line.mat" for p in prefix_paths]

How do I create one for my custom data?

And how do I verify step 2?

@rpautrat
Member

Hi, you should base your dataloader on holicity_dataset.py rather than wireframe_dataset.py when creating one for your own dataset. The Wireframe dataset indeed has some special features (including these .mat files coming from the original ground truth), but a generic dataset like yours has no such special features. So it is better to copy the structure of holicity_dataset.py, where no such .mat file is necessary.
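If it helps, below is a very rough skeleton of what such a dataloader over a plain image folder could look like. The class name and the returned keys are only illustrative; the exact keys, modes, and config fields expected by the pipeline should be copied from holicity_dataset.py.

import os
import cv2
import torch
from torch.utils.data import Dataset

class MyImageFolderDataset(Dataset):
    """Hypothetical minimal dataset: grayscale images in [0, 1] from one folder, no .mat files."""
    def __init__(self, root, img_size=(512, 512)):
        self.paths = sorted(
            os.path.join(root, f) for f in os.listdir(root)
            if f.lower().endswith((".jpg", ".jpeg", ".png"))
        )
        self.img_size = img_size

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.paths[idx], cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, self.img_size[::-1])              # cv2 expects (width, height)
        img = torch.from_numpy(img).float()[None] / 255.0       # 1*H*W in [0, 1]
        return {"image": img, "file_key": os.path.basename(self.paths[idx])}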

@rpautrat
Member

Steps 2 and 3 go together, and you can verify their output in the notebook https://github.com/cvg/SOLD2/blob/main/notebooks/visualize_exported_dataset.ipynb

@abhiagwl4262

abhiagwl4262 commented May 2, 2023

I did try visualizing the exported dataset using the above notebook.

My dataset is in wireframe format.

The keys in the exported data are:
dict_keys(['image', 'junctions', 'junction_map', 'line_map_pos', 'line_map_neg', 'heatmap_pos', 'heatmap_neg', 'valid_mask', 'file_key'])

And it throws the error:

KeyError: 'ref_image'

@rpautrat
Member

rpautrat commented May 2, 2023

Hi, I don't understand, why do you have the exported data in this format? If you follow step 3, the output file should only have keys 'junctions' and 'line_map'. So how did you get yours?

@abhiagwl4262

abhiagwl4262 commented May 3, 2023

@rpautrat If you don't recommend Wireframe, then it should not be the default option of the repo.

In the README itself you mention:
You can download the version of the [Wireframe dataset](https://github.com/huangkuns/wireframe) that we used during our training and testing [here](https://www.polybox.ethz.ch/index.php/s/IfdEf7RoHol7jeg). This repository also includes some files to train on the [Holicity dataset](https://holicity.io/) to add more outdoor images, but note that we did not extensively test this dataset and the original paper was based on the Wireframe dataset only.

If you have worked extensively with Wireframe and you know that all the steps support it well, then Wireframe should be the better choice for working with this repo.

And it would be very helpful if you could explain in the README how to turn LCNN annotations into the Holicity dataset format.
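For reference, the conversion I have in mind would look roughly like the sketch below. It assumes an LCNN-style JSON where each entry has a 'filename' and a 'lines' list of [x1, y1, x2, y2] segments (that layout is an assumption), and it produces per-image junctions plus an N*N line adjacency map, i.e. the 'junctions' / 'line_map' structure mentioned above for the exported ground truth. The output file format is only illustrative.

import json
import numpy as np

def lcnn_entry_to_junctions_and_line_map(entry, decimals=1):
    # entry assumed to look like {"filename": "0001.png", "lines": [[x1, y1, x2, y2], ...]}
    lines = np.asarray(entry["lines"], dtype=np.float64).reshape(-1, 2, 2)   # N*2*2 endpoints
    endpoints = np.round(lines.reshape(-1, 2), decimals)
    junctions, inverse = np.unique(endpoints, axis=0, return_inverse=True)   # merge shared endpoints
    inverse = inverse.ravel()
    line_map = np.zeros((len(junctions), len(junctions)), dtype=np.int32)
    for i in range(len(lines)):
        j1, j2 = inverse[2 * i], inverse[2 * i + 1]
        line_map[j1, j2] = line_map[j2, j1] = 1
    return junctions, line_map

with open("train.json") as f:            # hypothetical LCNN-style annotation file
    annotations = json.load(f)
for entry in annotations:
    junctions, line_map = lcnn_entry_to_junctions_and_line_map(entry)
    out_name = entry["filename"].rsplit(".", 1)[0] + "_gt.npz"
    np.savez(out_name, junctions=junctions, line_map=line_map)   # on-disk format is only illustrative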

@abhiagwl4262

@rpautrat Even if I use the wireframe format, why don't I have the ref_image, ref_junctions, ref_line_map, ref_line_points keys?

@abhiagwl4262

I did try visualizing the exported dataset using the above notebook.

My dataset is in wireframe format.

The keys in the exported data are: dict_keys(['image', 'junctions', 'junction_map', 'line_map_pos', 'line_map_neg', 'heatmap_pos', 'heatmap_neg', 'valid_mask', 'file_key'])

And it throws the error:

KeyError: 'ref_image'

I am not able to visualize the exported data of the original Wireframe dataset either; I am facing the same issue. Please help, because you made it work for the original Wireframe dataset.

@abhiagwl4262

abhiagwl4262 commented May 4, 2023

@rpautrat I also don't see the "ref_image" key in any code related to exporting pseudo labels. Is the notebook https://github.com/cvg/SOLD2/blob/main/notebooks/visualize_exported_dataset.ipynb correct?

I ran grep -rnw . -e "ref_image" and it returned -

./notebooks/visualize_exported_dataset.ipynb:60:    "ref_img = data1['ref_image'].numpy().squeeze()\n",
./notebooks/visualize_exported_dataset.ipynb:129:    "ref_img = data1['ref_image'].numpy().squeeze()\n",
./notebooks/visualize_exported_dataset.ipynb:196:    "ref_img = data1['ref_image'].numpy().squeeze()\n",
Binary file ./sold2/__pycache__/train.cpython-38.pyc matches
./sold2/train.py:238:            input_images = data["ref_image"].cuda()
./sold2/train.py:420:            input_images = data["ref_image"].cuda()

The keys expected here:

def __getitem__(self, idx):
    """Return data
    file_key: str, keys used to retrieve data from the filename dataset.
    image: torch.float, C*H*W range 0~1,
    junctions: torch.float, N*2,
    junction_map: torch.int32, 1*H*W range 0 or 1,
    line_map_pos: torch.int32, N*N range 0 or 1,
    line_map_neg: torch.int32, N*N range 0 or 1,
    heatmap_pos: torch.int32, 1*H*W range 0 or 1,
    heatmap_neg: torch.int32, 1*H*W range 0 or 1,
    valid_mask: torch.int32, 1*H*W range 0 or 1
    """

match the wireframe data that I downloaded from the README link.

@rpautrat
Member

rpautrat commented May 4, 2023

Hi @abhiagwl4262, I think there is some confusion regarding the wireframe dataset. When we train on it, we use our own exported ground truth, not the original ground truth. So yes, wireframe is the default option, but with exported ground truth. The possibility of using the 'official' ground truth is a legacy feature that was only used during evaluation; it is not intended for you to use. I hope this clarifies the confusion.

The right way to use this repo is to export your own ground truth with steps 2 and 3 (on the Wireframe dataset or any other dataset), then use this pseudo ground truth. The wireframe dataloader doesn't support using the 'official' ground truth with pair loading (i.e. to get this 'ref_image').

In case you are looking for it, the 'ref_image' comes from here:

outputs["ref_" + key] = val

It is available when you use an exported ground truth, together with the 'paired_desc' option.

@abhiagwl4262

@rpautrat Do I have to run steps 2 and 3 with the 'paired_desc' option?

@rpautrat
Member

rpautrat commented May 5, 2023

No, 'paired_desc' is only for step 5, when training the descriptor. That is why it is not activated in the default configuration.

@abhiagwl4262

abhiagwl4262 commented May 5, 2023

@rpautrat But you said the keys expected in the notebook would be present when I use the paired_desc option.

Do I have to modify the notebook to see the step 2 and step 3 results?

@rpautrat
Member

rpautrat commented May 5, 2023

Ah maybe that's where the confusion came from. The 'paired_desc' option has to be put in the field 'return_type' of the config/wireframe_dataset.yaml config file. This was explained for step 5, but it was probably not clear that you have to do the same for the notebook.

Steps 2 and 3 do not need this 'paired_desc', and it will be ignored anyway if you use it. So you can reuse your current results.
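For example, a small sketch of making this change from Python instead of editing the YAML by hand (the 'return_type' field is assumed to be top-level in config/wireframe_dataset.yaml; check the actual layout of the file):

import yaml

with open("sold2/config/wireframe_dataset.yaml") as f:
    dataset_cfg = yaml.safe_load(f)

dataset_cfg["return_type"] = "paired_desc"   # field assumed to be top-level; adjust if it is nested

with open("sold2/config/wireframe_dataset_paired.yaml", "w") as f:
    yaml.safe_dump(dataset_cfg, f)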

@abhiagwl4262

abhiagwl4262 commented May 5, 2023

So is this notebook for visualizing the exported data of no use?

@rpautrat
Member

rpautrat commented May 5, 2023

Mmmmh no? Again if you execute steps 2 and 3, then modify the config file config/wireframe_dataset.yaml by adding the 'paired_desc' option, then you can use the notebook.

I will add a line to the notebook to make it automatic in the future.

@abhiagwl4262

@rpautrat That would be nice. please do that.

Also, if you can add details of how to convert lines in LCNN format (JSONs) to the Holicity format, the repo would become really good. I can raise a PR for the LCNN to wireframe conversion.

@rpautrat
Member

rpautrat commented May 5, 2023

I updated the notebook.

A PR with a script converting from LCNN format to holicity one would be nice, yes! Thank you in advance.

@xiannsa
Author

xiannsa commented May 5, 2023

Excuse me, besides drawing lines on images in the Jupyter notebook, can I also draw lines on videos?

@rpautrat
Member

rpautrat commented May 5, 2023

If you want to make a video with lines, one option is to draw lines on each frame, then combine all frames together in a video.
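A minimal sketch of the 'combine the frames into a video' part with OpenCV (paths and frame rate are placeholders):

import glob
import cv2

frame_paths = sorted(glob.glob("frames_with_lines/*.png"))   # frames with lines already drawn
height, width = cv2.imread(frame_paths[0]).shape[:2]

writer = cv2.VideoWriter("lines.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 25, (width, height))
for path in frame_paths:
    writer.write(cv2.imread(path))
writer.release()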

@abhiagwl4262

abhiagwl4262 commented May 5, 2023

I updated the notebook.

A PR with a script converting from LCNN format to holicity one would be nice, yes! Thank you in advance.

@rpautrat
Did you check whether the notebook works?

I got the following error on new_points = (homography @ new_points.T).T

ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

@xiannsa
Author

xiannsa commented May 8, 2023

Excuse me, how can I train it using the original color images instead of grayscale images?

@nmanhong

nmanhong commented May 8, 2023

@xiannsa Big brother, please take me along and teach me; I don't know how to do this. Could you send me your working setup so I can take a look?

@xiannsa
Author

xiannsa commented May 8, 2023

@nmanhong What problem did you run into? Is it something in the training?

@xiannsa
Author

xiannsa commented May 8, 2023

@rpautrat Excuse me, how can I train it using the original color images instead of grayscale images?

@rpautrat
Member

rpautrat commented May 8, 2023

I updated the notebook.
A PR with a script converting from LCNN format to holicity one would be nice, yes! Thank you in advance.

@rpautrat Did you check whether the notebook works?

I got the following error on new_points = (homography @ new_points.T).T

ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

Yes I ran the notebook and it works fine for me.

It seems that the homography is empty in your case, not sure why. You may want to inspect the homography generation.

@rpautrat
Member

rpautrat commented May 8, 2023

@rpautrat Excuse me, how can I train it using the original color images instead of grayscale images?

1. In the config file, change 'input_channel' to 3: https://github.com/cvg/SOLD2/blob/bbce15cca60ded06354106ffb66e3cb3ee23f2ad/sold2/config/train_detector.yaml#LL7C24-L7C24
2. In the dataloader, don't convert images to grayscale, but keep them in RGB.

This should be enough to train on RGB images (see the sketch below for the dataloader side).
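A minimal illustration of the dataloader side (file name and variable names are placeholders, not code from the repo):

import cv2
import torch

img_bgr = cv2.imread("example.jpg", cv2.IMREAD_COLOR)                 # H*W*3 instead of grayscale
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
tensor = torch.from_numpy(img_rgb).permute(2, 0, 1).float() / 255.0   # 3*H*W in [0, 1]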

@nmanhong

nmanhong commented May 9, 2023

@xiannsa Big brother, are you training on your own dataset? I also want to train my own dataset, but I don't know how. Once you have it working, could you send it to me so I can learn from it?

@xiannsa
Author

xiannsa commented May 9, 2023

@rpautrat I changed the input_channel to 3 and modified the 'gray_scale: True' parameter in the holicity_dataset.yaml file to 'gray_scale: False', as you suggested. However, when I reached the fourth step of the training, I encountered the error 'ValueError: operands could not be broadcast together with remapped shapes (512, 512, 3, 1) (512, 512, 1)'. Do you have any suggestions on how to fix it?

@rpautrat
Member

rpautrat commented May 9, 2023

Hi, can you please send the full error that you get?

@xiannsa
Author

xiannsa commented May 10, 2023

Here is the error I encountered while attempting to train with colored images using the checkpoint you provided for the second step. I hope you can give me some advice. Also, I have been able to use video for line matching, but the frame rate is a bit low: it takes 0.7 seconds per frame on my RTX 3090. I would like to know what frame rate you were able to achieve in the video demonstration.
[screenshot of the error traceback]

@xiannsa
Author

xiannsa commented May 10, 2023

@rpautrat Excuse me, if I lower the network depth from 4 to 3 during training, can that improve the efficiency of the model at drawing lines?

@rpautrat
Member

Hi, you cannot reuse the existing checkpoint for colored images, since it was trained for grayscale ones. You need to start training from scratch in your case.

Or another solution is to hack into the function loading the weights:

SOLD2/sold2/train.py

Lines 36 to 57 in bbce15c

def restore_weights(model, state_dict, strict=True):
    """ Restore weights in compatible mode. """
    # Try to directly load state dict
    try:
        model.load_state_dict(state_dict, strict=strict)
    # Deal with some version compatibility issue (catch version incompatible)
    except:
        err = model.load_state_dict(state_dict, strict=False)
        # missing keys are those in model but not in state_dict
        missing_keys = err.missing_keys
        # Unexpected keys are those in state_dict but not in model
        unexpected_keys = err.unexpected_keys
        # Load mismatched keys manually
        model_dict = model.state_dict()
        for idx, key in enumerate(missing_keys):
            dict_keys = [_ for _ in unexpected_keys if not "tracked" in _]
            model_dict[key] = state_dict[dict_keys[idx]]
        model.load_state_dict(model_dict)
    return model

and load all the weights of the pretrained network except the very first one, the conv1 that raises the error. All the other weights should be identical between grayscale and RGB images.
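For example, a rough sketch of such a partial loading (the checkpoint key and the name of the first convolution are assumptions that you should check against your model's state_dict().keys()):

import torch

def load_pretrained_except_first_conv(model, ckpt_path, first_conv_prefix="backbone.conv1"):
    # first_conv_prefix is an assumption: inspect model.state_dict().keys() for the real name.
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state_dict = checkpoint.get("model_state_dict", checkpoint)
    filtered = {k: v for k, v in state_dict.items() if not k.startswith(first_conv_prefix)}
    missing, unexpected = model.load_state_dict(filtered, strict=False)
    print("Left randomly initialized:", missing)   # should only list the first conv
    return model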

@rpautrat
Member

@rpautrat Excuse me, if I lower the network depth from 4 to 3 during training, can that improve the efficiency of the model at drawing lines?

This code has not been optimized for speed; it is research code. There would be ways to make it much more efficient, though. A good starting point would be to run the line matching entirely on GPU instead of partly on GPU and partly on CPU (e.g. here:

def filter_and_match_lines(self, scores):
).

I would not reduce the depth of the backbone from 4 to 3; the results would probably degrade severely.

For the video, we processed the frames offline independently, then combined the frames into a video. So you can choose the frame rate as you wish with this method.

@xiannsa
Author

xiannsa commented May 10, 2023

Thank you for your help. I previously tried training from scratch by modifying the 'input_channel' in 'train_detector.yaml' to 3 and training on a synthetic dataset. However, I encountered an error: 'RuntimeError: Given groups=1, weight of size..., expected input... to have 3 channels, but got 1 channel instead.' Could it be because the synthetic dataset itself is in grayscale?

@rpautrat
Member

Yes, of course everything is by default in grayscale, so the synthetic dataset is also in grayscale. Converting it into rgb might require more work though, especially if you want to create interesting colored texture inside the geometrical shapes...
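One quick workaround, not discussed above and only a sketch, is to replicate the grayscale channel three times in the synthetic dataloader, so the tensors have the 3 channels the RGB model expects even though the content stays gray:

import numpy as np
import torch

gray = np.random.rand(512, 512).astype(np.float32)       # stand-in for one synthetic image
rgb_like = np.repeat(gray[..., None], 3, axis=-1)         # H*W*3, all three channels identical
tensor = torch.from_numpy(rgb_like).permute(2, 0, 1)      # 3*H*W, matches input_channel: 3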

@xiannsa
Author

xiannsa commented May 12, 2023

I apologize for the interruption. May I know which parameters can be adjusted to improve the completeness of the waterline for the ship, as shown in the image? Currently, only a part of the waterline is visible.

@rpautrat
Member

Hi, here the issue is that there is no detected junction at the end of the water line, so you need to increase the number of detected junctions.

For this, you can reduce the 'detection_thresh' of the model config, and potentially also increase the 'max_num_junctions' as well, to increase the maximum number of junctions.

@xiannsa
Author

xiannsa commented May 12, 2023

Thank you for your suggestion. However, I would like to know: after adjusting the parameters, do I need to retrain steps 4 and 5 and modify the corresponding parameters in the "lines_match" section to see the effect?

@rpautrat
Member

No, you don't need to retrain anything there. You can directly change the parameters at test time when using your already trained model.
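A small sketch of what that could look like: load the config used at test time, lower 'detection_thresh', raise 'max_num_junctions', and save a modified config to use for inference (the key nesting is assumed here, so adapt it to your actual file):

import yaml

with open("sold2/config/train_full_pipeline.yaml") as f:
    cfg = yaml.safe_load(f)

model_cfg = cfg.get("model_cfg", cfg)        # key nesting assumed; adapt to your config layout
model_cfg["detection_thresh"] = 0.005        # example value: lower threshold -> more junctions
model_cfg["max_num_junctions"] = 500         # example value: keep more junctions

with open("my_test_config.yaml", "w") as f:
    yaml.safe_dump(cfg, f)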

@VitoriaCarvalho

@rpautrat

Hello! I'm trying to use SOLD² to detect lines in a stack of objects. In this case, I need to detect specific lines, not all straight segments in the image. Therefore, I need to train the model with my ground truth, but I am having difficulty understanding the pipeline. Is it necessary to train with synthetic data even in the case of a problem where it is not necessary to detect all straight segments present in the image?
