Watershed segmentation of particles #18

Open
ivonindima opened this issue Jul 14, 2024 · 9 comments

@ivonindima

Hello!
Do you have any plans to implement a distributed computing mode for the following functions: localmaxima(), cleanmaxima(), labelmaxima(), grow()?
This would significantly speed up the process of watershed segmentation of particles.
Thanks a lot!

@arttumiettinen
Owner

Greetings from the summer holidays and sorry for the delay in replying,

Thanks for the suggestion. In principle, I'm interested in making distributed versions of all the functionality. However, the localmaxima function is non-trivial to convert to distributed processing mode: it defines a local maximum as a connected region of pixels with the same pixel value, where all the neighbouring pixels have smaller values. Because of this, a single local maximum might span almost the entire image. The usual "process in smaller blocks + handle block edges separately" distribution strategies don't work directly here, and the distributed processing would end up quite similar to what analyzeparticles does. I need to think about ways to work around the problem.
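
For concreteness, here is a minimal NumPy/SciPy sketch of that definition (just an illustration, not pi2's implementation, and far too slow for real images):

```python
import numpy as np
from scipy import ndimage

def plateau_local_maxima(img):
    """Boolean mask of plateau local maxima: a connected region of
    equal-valued voxels is a maximum iff every neighbouring voxel
    outside the region has a strictly smaller value."""
    maxima = np.zeros(img.shape, dtype=bool)
    for value in np.unique(img):
        # Connected plateaus at this grey value.
        plateaus, n = ndimage.label(img == value)
        for i in range(1, n + 1):
            region = plateaus == i
            # One-voxel shell just outside the plateau.
            shell = ndimage.binary_dilation(region) & ~region
            # A maximum iff no shell voxel reaches the plateau value.
            if not shell.any() or img[shell].max() < value:
                maxima |= region
    return maxima
```

The point is that a single plateau can reach across any block boundary, which is exactly why the halo-based block strategies break down.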

How large are your images? Would you like to have a distributed computing possibility solely for processing speed, or do you need it to process larger images? Do you have a size limit for the maxima (number of pixels in maximum region, diameter, etc.)?

Best,
Arttu

@ivonindima
Author

Hello and thanks for the reply!

I work with 3D images obtained with a microtomograph. Their typical size is 2000x2000x4000 voxels. For desktop processing, I select a region of interest of 1000x1000x1000 voxels. On a workstation, the full image can also be processed with pi2. First of all, I would like to speed up the watershed segmentation process as much as possible at these sizes.

I did some profiling, which showed that most of the computing time is spent in the localmaxima and grow functions, in roughly a 20/80 ratio. Both of these functions run single-threaded (the CPU load does not exceed 10%), while the dmap calculation takes only a few seconds at 100% CPU load.

I attach an example image (one slice from the 3D array). The quality of the separation is excellent, and I am looking for a way to speed up the computation. In previous work, we used Dask to solve the same problem (da.overlap.overlap and da.overlap.trim_internal), but no other library matches pi2's performance, and pi2 is not compatible with Dask.
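
Schematically, our Dask pipeline looked something like this (segment_block and the halo depth of 16 are placeholders, not our real code):

```python
import dask.array as da

def segment_block(block):
    # Placeholder for the per-block processing applied to each chunk.
    return block

vol = da.random.random((2000, 2000, 4000), chunks=(500, 500, 500))
# Pad every chunk with a 16-voxel halo copied from its neighbours...
halo = da.overlap.overlap(vol, depth=16, boundary="reflect")
blocks = halo.map_blocks(segment_block)
# ...and trim the halos off again so the chunks tile the volume.
out = da.overlap.trim_internal(blocks, {0: 16, 1: 16, 2: 16})
```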

Hope this information is useful.

[screenshot: one slice from the 3D array showing the separated particles]

@arttumiettinen
Owner

Hi,

Thanks for the additional information. This changes things a lot! For the image sizes you are dealing with, it is probably not worth bothering with distributed processing, at least not before the bottleneck you have identified, i.e. the grow command, is properly optimized and parallelized.

Indeed, there has been some work towards a faster grow command in the past. Going through the old code and finishing it should not be a big task. I'll check what we have already, and let you know in the coming days when there is something to test.

Best,
Arttu

@arttumiettinen
Owner

Hi,

The latest commit in the experimental branch introduces a new version of the grow function. For my test data of 500^3 pixels, its runtime is approximately 10% of the old version's. There is a caveat, however: the results of the old grow and the new grow are not 100% equal. They should be close, but not identical, because the growing algorithm was changed slightly in the new version. The main user-visible change is that the order of the seed points no longer affects the result, whereas previously it did.
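
To illustrate the idea, here is a simplified Python sketch of one way to make seeded growing independent of seed order (not necessarily what the C++ code does): unlabelled voxels are claimed in priority order of the weight image, with coordinates as deterministic tie-breakers.

```python
import heapq
import numpy as np

def grow_priority(labels, weight):
    """Seeded growing in which unlabelled voxels are claimed in order
    of decreasing weight; coordinate tie-breakers make the result
    deterministic and independent of the seed ordering."""
    labels = labels.copy()
    # Start from all seed voxels, keyed by -weight (heapq is a min-heap).
    heap = [(-weight[tuple(p)], *p) for p in np.argwhere(labels != 0)]
    heapq.heapify(heap)
    nbrs = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
            (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while heap:
        _, z, y, x = heapq.heappop(heap)
        for dz, dy, dx in nbrs:
            q = (z + dz, y + dy, x + dx)
            if all(0 <= q[i] < labels.shape[i] for i in range(3)) \
                    and labels[q] == 0:
                labels[q] = labels[z, y, x]
                heapq.heappush(heap, (-weight[q], *q))
    return labels
```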

Please give the new version a try and report back. I'm looking forward to hearing if things are faster now with your large dataset.

Best,
Arttu

@ivonindima
Author

Hi!
The pi.grow() function in the experimental branch is much faster! I will do additional tests on different images to see how the results differ from previous versions, but it is already clear that the performance has improved significantly.
Thanks for your work!

P.S. The source code built without problems under Linux, but when I run the compiled version under Windows, I get the error: "Could not find module '\src\pi2\pi.dll' (or one of its dependencies)". Can I ask you to provide the binary versions for Windows? I think there is something wrong with my libjpeg build.
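
One thing I will also try: since Python 3.8, Windows no longer searches PATH for the dependencies of an extension DLL, so the directory may need to be added explicitly (the path below is just an example):

```python
import os

# Directories containing pi.dll and its dependencies (libjpeg etc.)
# must be registered explicitly on Python 3.8+ under Windows.
# The path here is hypothetical.
os.add_dll_directory(r"C:\src\pi2")
```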

@arttumiettinen
Owner

Hi,

Windows builds are sometimes problematic. The newest build is now available at https://github.com/arttumiettinen/pi2/releases/tag/v4.4.6

Best,
Arttu

@ivonindima
Author

Hi,

> Windows builds are sometimes problematic. The newest build is now available at https://github.com/arttumiettinen/pi2/releases/tag/v4.4.6
>
> Best, Arttu

Thanks for the reply, but only the two source code archives are attached to that release. Am I missing something?

@arttumiettinen
Owner

Sorry, the .zip file was missing from the release for some reason. Now it should be there.

@ivonindima
Author

Hello and apologies for the delayed response.

I tested the new algorithm on different images, and it works great. On my test image, the execution time of the pi.grow() function has decreased by more than a factor of 20, and the total execution time of our workflow has been reduced by more than a factor of three.

Returning to the computation times of pi.localmaxima() and pi.grow(): in the previous version, pi.grow() took on average four times longer than pi.localmaxima(), but in the new version the ratio has reversed, and pi.grow() now runs five times faster than pi.localmaxima().
In our specific case, we could limit the maximum region size to, for example, 1/5 of the total image, or some other value (in the future, I could suggest an algorithm for automatically estimating this value from the properties of the input image). I would consider it optimal to keep both options: a "precise" localmaxima (single-threaded) and a "fast" localmaxima that runs faster but imposes a size limit on the maxima; a rough sketch of the block-wise idea is below.
If you decide to work on speeding up localmaxima, I am ready to provide additional data.
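
Something like the following is what I have in mind (just a sketch; scikit-image's plateau-based local_maxima stands in for pi.localmaxima, and R is the assumed size limit):

```python
import numpy as np
import dask.array as da
from skimage.morphology import local_maxima

R = 64  # assumed upper bound on the diameter of a maximum region, in voxels

img = np.random.random((512, 512, 512)).astype(np.float32)
vol = da.from_array(img, chunks=(256, 256, 256))
# With a halo of R voxels, no maximal plateau of diameter <= R can
# straddle a padded block boundary, so the block-wise result matches
# the global one (except possibly at the true image border).
maxima = vol.map_overlap(local_maxima, depth=R, boundary="reflect",
                         dtype=bool)
mask = maxima.compute()
```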

Thank you for developing pi2 – it has no analogs at the moment.
