-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for iterative pruning based on user-specified discard conditions #214
base: main
Are you sure you want to change the base?
Conversation
Introduces the function `iteratively_prune_paths()` which will prune paths of type 1 and 3 until there are a user specified (via `min_skeleton`) number of paths left. * Documentation on pruning skeleton included. * Retains the `keep_images` option by adding it as an attribute if supplied. Required because the `.prune_paths()` method when applied returns the pruned skeleton but didn't retain this option (as far as I could work out). * Tests on dummy skeletons are included
Refines the steps taken in pruning such that large loops are retained as the target if present or if skeletons are linear then the longest path is retained. Tests are included for a range of dummy skeletons.
Refines the steps taken in pruning such that large loops are retained as the target if present or if skeletons are linear then the longest path is retained. Tests are included for a range of dummy skeletons.
I've made some changes to the code, still needs work but hopefully enough to get the gist. Ultimately, I think we need to be able to go from Skeleton to Graph and back losslessly. Like sparse array formats, each representation has strengths and weaknesses and by "allowing" either format we enable different algorithms. |
I was looking to convert NetworkX objects back to Skeletons and stumbled on the `nx.to_numpy_array()` function and the related `nx.from_numpy_array()`. Since `csr.skeleton_image` are typically Numpy Arrays I figured this would be a convenient and efficient method of converting between the two formats so have attempted to replace `skeleton_to_nx()` with this. I had written some tests for `skeleton_to_nx()` first before making undertaking refactoring. I found that `nx.from_numpy_array()` doesn't handle non-square arrays, ok not a problem we can pad them with zeros (wrote a small function to do this along with tests), but on applying `nx.from_numpy_array()` to these padded arrays the tests I wrote to originally failed, which surprised me as, in my naivety, I wouldn't expect padding with zero's to adjacency graphs to affect the resulting network representation in terms of the number of nodes and the number of edges.
@ns-rse looking at the description and implementation (not tests) of 0ad90df, I don't see how it could ever work, because a skeleton image is not what The naive way to produce the image is to make an empty image, then iterate over all the edges and set the edge coordinates to 1 in the image, before passing that to |
Yes, I'm realising I have deficiency in my udnerstanding of graph theory NetworkX and how Skan should be interacting with it. The tests are hopefully going to be useful and your code to go from |
Introduces conversion of Numpy arrays (_not_ adjacency matrices) to NetworkX graphs. This is done by assuming that each element of a 2-D boolean array[1] is a node connected by edges, we look at what values the neighbours are and > 0 the two points form an edge which is added to a NetworkX graph. The shape of the original Numpy array is saved within the `Networkx.graph` object which makes reconstructing the original Numpy array straight-forward (as we may be pruning some of the original co-ordinates may be removed). Thanks to @MaxGamill-Sheffield for guiding me to this approach and as noted a solution found on SO for how to find the co-ordinates of neighbouring cells.
@jni Spent some time on this today and have a method of going back and forth from Numpy <--> NetworkX (thanks to my colleague @MaxGamill-Sheffield who shared some code that I refactored abit). One of the problems I had cognitively was reconciling what the TopoStats work calls a skeleton with the Let me know what you think. |
Ah, sorry, I didn't have time to review this at the time. I don't think that's going to fly unfortunately. Actually, a bit of lore: all the main implementations of skan are in a file called So, I think that the 2D pixel graph is too low level to try to do a round trip — we will be swapping one performance issue for another. The correct level of abstraction for the Skeleton, imho, is the junction-junction graph, which is what you get when you make use the node id pairs (src, dst) from the |
I've been having a play around with this on ns-rse/networkx and had a go at passing the coordinates as edge attributes to NetworkX and reonconstructing paths between co-ordinates using Bresnham's Line algorithm. It works in some instances but in writing tests I discovered there is insufficient data retained. For example from skan import Skeleton, summarize
from skan._testdata import skeleton1
summarize(Skeleton(skeleton1))[['skeleton-id', 'coord-src-0', 'coord-src-1', 'coord-dst-0', 'coord-dst-1']]
skeleton-id coord-src-0 coord-src-1 coord-dst-0 coord-dst-1
0 0 2 1 3 3
1 0 2 1 3 3
2 0 2 1 4 0
3 0 3 3 4 6 ...and so we've no idea how those two points differ in terms of the path taken between them. This is going to be true of any non-linear edge and so whilst we could recombine the path co-ordinates of edges after pruning in Networkx as you suggested we would have no idea the path taken between those points. The only solution I can think of so far that doesn't involve using every point as a node is to pass the co-ordinates of the points that make up an edge as an attribute to NetworkX and concatenate those after pruning. Not tried this yet and have been asked to work on something else in the coming weeks but will think about it and try and find some time to try this out. |
@ns-rse sorry about the long wait here, as it has been very busy trying to get napari 0.4.19 out the door (still not done) and also family stuff. But I haven't forgotten this! Quick thoughts:
I don't think this should be needed because I think we should keep the path coordinates and they should be pixel-perfect — there should be no need to infer any pixels. You need to keep the info from both the summary and the skeleton using
Yes like that. But that still vastly reduces the number of nodes in the graph. |
I will try to come back to this in the next couple of weeks but didn't want to leave you hanging any longer, again, sorry about the wait! |
Don't worry at all @jni we all have lives. Hope you managed some time away from the computer. That is however super-useful I hadn't clocked I'll have a tinker next week and see how I get on. Thank you and Happy New Year |
@ns-rse would you mind rebasing this on main? I can do it if you are too busy right now though. |
Adapting to NetworkX graphs. Currently tests fail for src/skan/test/test_csr.py::test_iteratively_prune_[paths.multiple_paths] as there is no 'discard' argument. Will revisit and try and work this out, been so long I don't understand how this argument evaluates to boolean and how to construct the tests.
@jni I merged main in and solved the conflicts, tests failed and so I tried to sort those out locally. Fixed the failing ones but there were more that failed and I haven't got them all sorted yet though as I don't understand how the |
Even when there is only one edge, for consistency, every edge in a multi-graph also contains a key (starting at 0). We use the key everywhere, which makes the code a bit more complicated if you don't expect multigraph conventions, but more consistent overall.
btw @ns-rse it's a cool data sample — can you tell me what the xy pixel spacing/resolution is? We can incorporate that into the example. |
This is useful for debugging - can make a parameter later
Hi @jni
😆 I encountered this with the linear molecules though, it just kept on pruning everything back. I think I solved it by looking for the longest path, but not sure how that transfers to loops/closed structures where a break is then introduced.
The pixel to nanometre scaling on these images varies depending on the resolution used in the scanning. I think this image is from Thanks for continuing to hack away at this 👍 Let me know if there is anything I can help with. |
... nm/pixel? ... That's cool/crazy! 😃 |
First stab at iterative pruning.