Dynamic coding for GridPROTEUS #162
Comments
I would suggest holding off on major work on this until we have refactored the proteus (single) run code. My hope is to fold [...]
Hi, I edited tools/grid_proteus.py on my computer with new parameters that I wanted to test for the escape (Pxuv, efficiency and semimajoraxis) and I ran into the following error:
@nichollsh suggested having [...]
Do you know when the error was introduced?
This might have been here since the script was created. I have not encountered this error on my system at any point in the past. This article (with the fix mentioned above) suggests that it depends on the platform in some strange way: https://medium.com/devopss-hole/python-multiprocessing-pickle-issue-e2d35ccf96a9
You could argue about whether fork or spawn is better, but note that fork is not available on Windows, is buggy on macOS, and is discouraged because forking has a lot of undefined behaviour. This can lead to bugs that are very hard to reproduce. Looking at the code, my take is that it should use the threading module instead of multiprocessing (e.g. via a [...]
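To make the threading suggestion concrete, here is a minimal sketch and not the current `grid_proteus.py` logic. It assumes each grid point can be launched as its own child process; the names `run_case`, `run_grid` and `demo_commands` are made up for illustration, and the demo commands are placeholders rather than real PROTEUS invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_case(index, command):
    """Run one grid point as a child process and return (index, exit code)."""
    # The thread only waits on the child process, so the GIL is not a
    # bottleneck, and nothing has to be pickled (unlike multiprocessing
    # with the "spawn" start method).
    completed = subprocess.run(command, capture_output=True, text=True)
    return index, completed.returncode


def run_grid(commands, max_workers=4):
    """Run all commands, with at most max_workers running at the same time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_case, i, cmd) for i, cmd in enumerate(commands)]
        return dict(future.result() for future in futures)


# Placeholder commands standing in for individual runs (hypothetical).
demo_commands = [[sys.executable, "-c", f"print({i})"] for i in range(3)]
```

Calling `run_grid(demo_commands, max_workers=2)` returns a dict mapping each case index to its exit code. Because the workers are threads that only launch and wait on subprocesses, the behaviour should not depend on the platform's fork/spawn default.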
There's definitely a lot of room for improvement in this script. Using thread pools makes more sense than the current bespoke formulation for queueing. Ideally, it would also have an option to run the grid across multiple nodes using the Python SLURM library. This is not particularly high priority at the moment though, as long as the current script works for everyone.
I agree with your assessment that it is not very high priority. SLURM or subprocess is not the major development bottleneck here imo, if the rest of the code is well factored.
Coming back to this a bit, because I believe dynamic coding for grid runs is quite essential to the functioning of the code. How could this be implemented with the current config files? For example, instead of [...] If not, what are the alternatives?
I like this idea a lot. It's intuitive and doesn't require the user to learn a new configuration file format. Seems quite doable, since TOML already supports arrays/lists. I am not so much a fan of the automatic detection thing, since the user should be clear on what exactly is going to be run when they pass a configuration file. Maybe we could have [...]
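As a rough illustration of the explicit (non-automatic) variant, here is a sketch that works on an already-parsed config, i.e. the nested dict a TOML parser would return. Everything about the layout is an assumption: the `grid` table, the dotted keys and the values in `example` are hypothetical and not an existing PROTEUS config schema.

```python
import copy
import itertools


def set_dotted(config, dotted_key, value):
    """Set config["a"]["b"] = value for dotted_key "a.b", creating tables as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value


def expand_grid(config):
    """Yield one plain config per combination of the lists in config["grid"]."""
    base = copy.deepcopy(config)
    axes = base.pop("grid", {})  # hypothetical table: dotted key -> list of values
    keys = list(axes)
    for combo in itertools.product(*(axes[key] for key in keys)):
        case = copy.deepcopy(base)
        for key, value in zip(keys, combo):
            set_dotted(case, key, value)
        yield case


# Equivalent of a TOML file with an [escape] table and a [grid] table (made up).
example = {
    "escape": {"efficiency": 0.1, "Pxuv": 1.0e-2},
    "grid": {
        "escape.efficiency": [0.1, 0.5],
        "escape.Pxuv": [1.0e-3, 1.0e-2, 1.0e-1],
    },
}
```

Here `list(expand_grid(example))` gives six ordinary configs, each without the `grid` table. A file without that table yields exactly one config, so a single run stays a single run and nothing is detected implicitly.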
This is a good idea and will simplify the implementation. I want to briefly connect this to #204. Assuming we would use pymultinest for an inverse-PROTEUS implementation, the call function to run a PROTEUS grid will be a critical choice to also be passed to an external module. Alternatively, pymultinest could become a standard PROTEUS sub-module, and the config file would get an additional "retrieval" section of parameters that can be retrieved. Once the appropriate flag is passed (for example, via [...]
I like the idea as well from a user point of view, but overloading the config object in this way will be a hugely complex task. There are libraries that can help with this sort of work (e.g. for use in uncertainty quantification studies).
I can see how this might get complicated under the current formulation. Which libraries did you have in mind?
Here are a bunch from when I last looked for them. There might be more; these are in the context of UQ work: [...] They can all help with setting up a Cartesian grid, hypercube sampling, and plotting the results if you are interested in UQ work. All we need to do is code the interface. Let's say you want to vary [...] Some of these tools have support for submitting them to a cluster as well. I know that at least easyvvuq works with SLURM. The point is that libraries exist to help with this sort of work, and I think it is worth investing a bit of time there.
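For reference, the two sampling schemes mentioned here are small enough to sketch with the standard library alone. This is not the API of easyvvuq or any other UQ package, just a hand-rolled illustration; the function names and any parameter ranges passed in are assumptions.

```python
import itertools
import random


def cartesian_grid(axes):
    """Return every combination of the listed values, one dict per grid point."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in itertools.product(*axes.values())]


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw n_samples points such that each parameter range is split into
    n_samples equal strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each stratum, then shuffle the stratum order
        # so the pairing between parameters is random.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]
```

For example, `cartesian_grid({"efficiency": [0.1, 0.5], "semimajoraxis": [0.5, 1.0, 2.0]})` gives six points, while `latin_hypercube({"efficiency": (0.0, 1.0)}, 5)` gives five points that cover the range evenly. The number of Cartesian points grows multiplicatively with every added parameter, which is the usual argument for hypercube sampling in larger studies.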
Currently, a large amount of the configuration of GridPROTEUS.py is hardcoded in the `if __name__` statement. This would ideally be configured in another manner, such as by a configuration file or with a CLI.
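As one possible shape for the CLI option, here is a minimal `argparse` sketch. The argument names (`config`, `--output`, `--workers`, `--dry-run`) and their defaults are illustrative assumptions, not options that GridPROTEUS.py currently provides.

```python
import argparse


def build_parser():
    """Build a parser for the settings that are otherwise hardcoded in the script."""
    parser = argparse.ArgumentParser(description="Run a grid of PROTEUS simulations.")
    parser.add_argument("config", help="path to the base configuration file")
    parser.add_argument("--output", default="grid_output",
                        help="directory in which to store the grid results")
    parser.add_argument("--workers", type=int, default=1,
                        help="maximum number of cases to run at the same time")
    parser.add_argument("--dry-run", action="store_true",
                        help="set up the grid but do not start any runs")
    return parser


def main(argv=None):
    """Parse the arguments; argv=None makes argparse read sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    # The grid would be built and dispatched here; this sketch only
    # returns the parsed settings.
    return vars(args)
```

The `if __name__` block of the script would then shrink to a single call to `main()`, and the same function can be called from tests or other modules with an explicit argument list.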