Watershed segmentation of particles #18

Hello!
Do you have any plans to implement a distributed computing mode for the following functions: localmaxima(), cleanmaxima(), labelmaxima(), grow()?
This would significantly speed up the watershed segmentation of particles.
Thanks a lot!

Comments
Greetings from the summer holidays, and sorry for the delay in replying. Thanks for the suggestion. In principle, I'm interested in making distributed versions of all the functionalities. In this case, however, the localmaxima function is non-trivial to convert to distributed processing mode: it defines a local maximum as a connected region of pixels with the same pixel value, where all the neighbouring pixels have smaller values. Because of this, a single local maximum might span almost the entire image. The usual "process in smaller blocks + handle block edges separately" distribution strategies don't work directly in this case, and the distributed processing would end up quite similar to what analyzeparticles does. I need to think about ways around the problem.

How large are your images? Would you like a distributed computing possibility solely for processing speed, or do you need it to process larger images? Do you have a size limit for the maxima (number of pixels in the maximum region, diameter, etc.)?

Best,
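A minimal sketch of the plateau-based definition described above, assuming NumPy and SciPy; the helper name is hypothetical, and this is an illustration rather than pi2's implementation:

```python
import numpy as np
from scipy import ndimage

def plateau_maxima_sketch(img):
    """Illustrative only: mark voxels with no strictly larger neighbour,
    then label connected equal-valued plateaus among them."""
    # Grey dilation returns the neighbourhood maximum (incl. the voxel itself).
    dilated = ndimage.grey_dilation(img, size=(3,) * img.ndim)
    candidates = img == dilated  # no neighbour is strictly larger
    labels, n = ndimage.label(candidates)
    # Caveat: a correct implementation must also discard plateaus that touch
    # an equal-valued voxel outside the candidate mask. That is exactly the
    # property that lets a single "maximum" cross block boundaries and
    # breaks naive blockwise distribution.
    return labels, n
```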
Hello, and thanks for the reply! I work with 3D images obtained with a microtomograph. Their typical size is 2000x2000x4000 voxels. For desktop processing, I select a zone of interest of 1000x1000x1000 voxels; on a workstation, the full image can also be processed with pi2.

First of all, I would like to speed up the watershed segmentation process as much as possible at these sizes. Profiling showed that most of the computing time is spent in the localmaxima and grow functions, in roughly a 20/80 ratio. Both of these functions run single-threaded (the CPU load does not exceed 10%), while the dmap calculation takes only a few seconds at 100% CPU load. I attach an example image (one slice from a 3D array). The quality of the separation is excellent, and I am looking for a way to speed up the computation.

In previous work, we used Dask to solve the same problem (da.overlap.overlap and da.overlap.trim_internal), but other libraries cannot match pi2's performance, and pi2 is not compatible with Dask. I hope this information is useful.
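For context, the Dask overlap pattern mentioned above looks roughly like this. The per-block filter is a stand-in placeholder (not pi2's localmaxima), and the sizes and halo depth are illustrative:

```python
import dask.array as da
from scipy import ndimage

# Chunked 3D volume; shape and chunking are illustrative.
x = da.random.random((1000, 1000, 1000), chunks=(256, 256, 256))

# Extend each chunk with a halo of neighbouring voxels...
depth = {0: 16, 1: 16, 2: 16}
overlapped = da.overlap.overlap(x, depth=depth, boundary="reflect")

# ...process each extended block independently (stand-in filter)...
filtered = overlapped.map_blocks(ndimage.maximum_filter, size=3, dtype=x.dtype)

# ...then trim the halos so the blocks tile the original volume again.
result = da.overlap.trim_internal(filtered, depth)
```

This pattern works when a feature is guaranteed to fit inside the halo, which is precisely the assumption that plateau-style maxima can violate.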
Hi, Thanks for the additional information. This changes things a lot! For the image sizes you are dealing with, it is probably not worth bothering with distributed processing, at least not before the bottleneck you have identified, i.e. the grow command, is properly optimized and parallelized.

Indeed, there has been some work towards a faster grow command in the past. Going through the old code and finishing it should not be a big task. I'll check what we have already and let you know in the coming days when there is something to test.

Best,
Hi, The latest commit in the experimental branch introduces a new version of the grow function. For my test data of 500^3 pixels, its runtime is approximately 10% of the old version's.

There is a caveat, however: the results of the old grow and the new grow are not 100% equal. They should be close, but not identical, because the growing algorithm was changed slightly in the new version. The main change visible to the user is that the order of the seed points no longer affects the result, whereas previously it did.

Please give the new version a try and report back. I'm looking forward to hearing whether things are faster now with your large dataset.

Best,
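As a rough illustration of how seed-order independence can be achieved (an assumed, generic technique in plain Python, not pi2's actual code; the function name and signature are hypothetical):

```python
import heapq
import numpy as np

def grow_by_priority(weights, seeds):
    """Seeded growing driven by a global priority queue: the frontier
    with the highest weight always expands first, so the result does
    not depend on the order in which seeds were supplied."""
    labels = seeds.copy()
    heap = []
    counter = 0  # deterministic tie-breaker for equal weights
    for idx in map(tuple, np.argwhere(seeds > 0)):
        heapq.heappush(heap, (-weights[idx], counter, idx))
        counter += 1
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while heap:
        _, _, idx = heapq.heappop(heap)
        for off in offsets:
            nb = tuple(i + o for i, o in zip(idx, off))
            if all(0 <= c < s for c, s in zip(nb, labels.shape)) \
                    and labels[nb] == 0:
                labels[nb] = labels[idx]  # claim the unlabeled neighbour
                heapq.heappush(heap, (-weights[nb], counter, nb))
                counter += 1
    return labels
```

Because the global priority queue always expands the highest-weight frontier first, shuffling the seed list leaves the labeling unchanged, which matches the behaviour described above.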
Hi! P.S. The source code built without problems under Linux, but when I run the compiled version under Windows, I get the error: "Could not find module '\src\pi2\pi.dll' (or one of its dependencies)". Can I ask you to provide the binary versions for Windows? I think there is something wrong with my libjpeg build.
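A side note on that error, offered only as a guess at the cause: on Windows with Python 3.8+, DLL dependencies are no longer resolved from PATH, so registering the DLL folder explicitly sometimes helps. The path and module name below are illustrative assumptions:

```python
import os

# Windows + Python 3.8+: directories containing native dependencies
# (e.g. libjpeg) must be registered explicitly; PATH is not searched.
# The directory below is purely illustrative.
os.add_dll_directory(r"C:\src\pi2\bin-win64")

import pi2py2  # assumed name of the pi2 Python bindings
```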
Hi, Windows builds are sometimes problematic. The newest build is now available at https://github.com/arttumiettinen/pi2/releases/tag/v4.4.6

Best,
Thanks for the reply. But only two source code archives are attached to this release. Am I missing something?
Sorry, the .zip file was missing from the release for some reason. Now it should be there.
Hello, and apologies for the delayed response. I tested the new algorithm on different images, and it works great. On my test image, the execution time of the pi.grow() function has decreased by more than 20 times, and the total execution time of our workflow has been reduced by more than three times.

Returning to the discussion of computation time for pi.localmaxima() and pi.grow(): in the previous version, pi.grow() took on average four times longer than pi.localmaxima(), but in the new version the ratio has reversed, and pi.grow() now runs five times faster than pi.localmaxima().

Thank you for developing pi2; it has no analogs at the moment.