Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

potential improvement of distance matrix calculation #60

Open
ohagen opened this issue May 2, 2022 · 3 comments
Open

potential improvement of distance matrix calculation #60

ohagen opened this issue May 2, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@ohagen
Copy link
Member

ohagen commented May 2, 2022

Joe at Uni Bristol looked into ways of improving distance storage and calculations. I past here some text.
Problem: I have sets of global climate data at a reasonably high spatial resolution (720x360), but when I first tried out gen3sis with these landscape grids, I quickly ran into vector size allocation problems. I've been running my simulations at a lower resolution (180x90) with great results.
Possible general solution: Ways in which the simulation capacity could be boosted and I found a potential solution in the tools I was developing.
Description: In short, it relies on representing the gridded landscape as a graph of local connections which needs much less storage space than a full sparse distance matrix for a landscape grid of a given size. Any pairwise distances that are needed in the other internal functions can then be found very quickly using the cppRouting R package, maintaining the efficiency of the simulation loop at run time. The graph method also has a few other potential benefits, which are what led me onto the problem of working with gridded landscapes in the first place I am still figuring out a few issues with my implementation (in particular, I am struggling with maintaining the optimality of the method in the tbdscan C++ functions), but if this would potentially be of interest to you, I would be very happy to send over some of the code I have put together. I have tried to make the edits as compatible with the current structure of gen3sis as possible, but they are still non-trivial so I thought it might be best to contact you via email, rather than submitting any edits via Github or similar.

@ohagen ohagen added the enhancement New feature or request label May 2, 2022
@ohagen
Copy link
Member Author

ohagen commented May 2, 2022

Joe, thank you very much!!!! Question, would this involve changing the landscape distance object? Are there further dependencies besides cppRouting?

@benj919
Copy link
Collaborator

benj919 commented May 2, 2022

I agree that the current implementation of the distance matrices is the part of the simulation with the worst scaling behaviour.

Part of this idea has already been implemented. The distances can be saved as a sparse pair-wise matrix (local_distances). Currently we can either save the local distances or the full all-to-all distances. We can calculate limited distance matrices during a simulation run only (for the moment). This will be updated to allow pre-computing those partial distance matrices "soonish".

Calculating pair-wise distances on demand will have tremendous impacts on performance and/or require a quite elaborate caching scheme. The distance matrices are used for different things, obviously in the dispersal, but also for the clustering in both the evolution and speciation functions.

Switching over to such a scheme would completely replace the tdbscan function calls and "should" work with the cpp implementation for the distances by setting all inhabited cells as starting cells for the distance calculations.

It should be "straight forward" to change those implementations over to a graph based approach, but the computational overhead is going to be tremendously expensive as those calculations will have to either be cached or repeated multiple times for each species in the simulation.

If you (Joe) are interested in exploring this in detail I suggest we do a direct call to discuss this upfront?

@ohagen
Copy link
Member Author

ohagen commented May 2, 2022

Tks Benji for jumping into this. We should have a meeting soon. I will let you know when.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants