Basic test setup #80
I'm trying to set it up so that Meson can run a few test cases (to begin with, simply run the binary). One thing that is not clear to me is what the …
At the moment the …
Thanks for the quick reply @AleksiNummelin. Reading through the responses in your PR, it seems that this is something that needs to be decided before I can move forward with testing.
Alternatively, for testing we would just need a minimal workable …
We discussed this today in the BLOM-core meeting. For now, one can use for example the channel setup for testing, but we agreed that a first proper test case should be built upon the fuk95 case. In this context, I'd imagine testing would imply building, running, and checking diagnostics against a reference (these cases could also be used for testing scalability, etc.). Therefore a full-fledged test case should also include checks of the physical parameters (tracer conservation, matching reference kinetic and potential energy budgets, etc.). Related to this discussion, we thought that there is also a need to move some of the idealized cases to another folder. I created another issue (#86) for this, since that discussion might be a bit different from the focus here.
Could you help me set this up? I don't know what would be needed, but I have the general setup for testing and generating an individual …
I have placed a Fortran namelist (“limits”) for the fuk95 test case here: https://gist.github.com/matsbn/718c1419cc1ecc064d78d18f5687439f. This test case does not need any other input files. To shorten the integration time, "NDAY2" can be reduced from 10 to, say, 1.
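A test harness could automate that tweak with something like the sketch below — a minimal example assuming the namelist is a plain-text file named `limits` in the working directory and that NDAY2 holds an integer value (both taken from the comment above; the regex itself is just an illustration):

```python
import re
from pathlib import Path

def shorten_run(namelist_path="limits", nday2=1):
    """Reduce NDAY2 in the 'limits' namelist to shorten the integration."""
    path = Path(namelist_path)
    text = path.read_text()
    # Replace e.g. "NDAY2 = 10" with "NDAY2 = 1" (case-insensitive);
    # assumes an integer value, as in the fuk95 namelist described above.
    text = re.sub(r"(?i)(nday2\s*=\s*)\d+", rf"\g<1>{nday2}", text)
    path.write_text(text)

shorten_run()
```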
I downloaded the … and the run crashed with a segmentation fault:

```
Thread 1 "fuk95_blom" received signal SIGSEGV, Segmentation fault.
0x00000000005016df in mod_dia::diaacc (m=<error reading variable: Cannot access memory at address 0x7fffff46f978>,
    n=<error reading variable: Cannot access memory at address 0x7fffff46f970>,
    mm=<error reading variable: Cannot access memory at address 0x7fffff46f968>,
    nn=<error reading variable: Cannot access memory at address 0x7fffff46f960>,
    k1m=<error reading variable: Cannot access memory at address 0x7fffff46f958>,
    k1n=<error reading variable: Cannot access memory at address 0x7fffff46f950>) at ../phy/mod_dia.F:972
972       subroutine diaacc(m,n,mm,nn,k1m,k1n)
```

I'm developing the unit tests on the branch …
I was able to overcome this locally by increasing the stack size with … The next step is probably to create some scripts that can be run to check the results of the run. I can implement this if I know what the different files mean, their expected values, and their types.
To follow up on this: now that the initial test can be run, we need to check that the output is as expected. For me it would be easiest to write a small script in Python that could check the output, but I need some help with the file type and what the expected output should look like.
There is a tool called "cprnc" that has been created for comparison of output netCDF files for CESM. The cime source that is bundled with NorESM is a bit old, but it seems to work. The most recent version is available at … There is also a Python version of this tool, but it does not seem to be maintained. Basically, the tool takes two netCDF files as input and creates a report on the differences in the data, while ignoring any information related to the specific run. I compiled a version of the tool on Betzy (using the source code from NorESM 2.0.4), which is available from … Maybe something along this line would be useful as a check on the output?
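As a pure-Python fallback, a comparison in the same spirit could look like the sketch below — assuming the netCDF4 package is available; the file names, tolerance, and reporting format are placeholders, not anything cprnc itself defines:

```python
import numpy as np
from netCDF4 import Dataset

def compare_netcdf(ref_path, test_path, rtol=1e-12, atol=0.0):
    """Report per-variable differences between two netCDF files,
    ignoring run-specific metadata (only data variables are compared)."""
    failures = []
    with Dataset(ref_path) as ref, Dataset(test_path) as test:
        for name, var in ref.variables.items():
            if name not in test.variables:
                failures.append(f"{name}: missing in {test_path}")
                continue
            a, b = var[:], test.variables[name][:]
            if a.shape != b.shape:
                failures.append(f"{name}: shape {a.shape} != {b.shape}")
            elif not np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=True):
                failures.append(f"{name}: max abs diff {np.max(np.abs(a - b))}")
    return failures

# Hypothetical file names, for illustration only:
for line in compare_netcdf("reference.nc", "test_run.nc"):
    print(line)
```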
A bit late with a response here, but for a more comprehensive test in the future it would maybe be interesting to use a package like xarray (if we can afford having a conda environment that is fast to install). There is some nice functionality (coming from numpy) that allows for very basic checks as well, for example checking whether two variables are equal to within a tolerance: http://xarray.pydata.org/en/stable/generated/xarray.testing.assert_allclose.html. With xarray it would be easy to implement checks for dynamical consistency (the energy levels etc. we've talked about).
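Using the assert_allclose function linked above, that check could be as short as the following sketch (the file names are placeholders and the tolerance is just an example value):

```python
import xarray as xr
from xarray.testing import assert_allclose

ref = xr.open_dataset("reference.nc")    # hypothetical reference output
test = xr.open_dataset("test_run.nc")    # hypothetical new run output

# Raises an AssertionError if any variable differs beyond the tolerance.
assert_allclose(ref, test, rtol=1e-12)
```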
Also very late with my response here, sorry for that! If one wants to test for bit-identical simulations, I think the available checksum functionality in BLOM should work well. Each BLOM simulation actually dumps "chksum: dp: 0x ..." to stdout at the end of the simulation, which I have found very reliable in detecting simulation differences. This could be extended, e.g. by adding a checksum for a sensitive iHAMOCC field. There is of course value in actually checking the output as well, since the generation of output can also be erroneous.

For detecting simulation differences within an acceptable tolerance, I promised to implement some energy diagnostics in BLOM that would dump, say, global kinetic and potential energy sums to stdout. For "simple" metrics like that, I believe this approach would be easier to integrate into a CI framework than relying on external tools to obtain the metrics. More sophisticated metrics might well be more convenient to develop in something other than Fortran. Unfortunately, I have had no time to implement these energy metrics yet, but hopefully I can make a stab at it very soon.
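In a CI script, that stdout line could be picked up with something like the sketch below — only the "chksum: dp: 0x" prefix comes from the comment above; the log file names and the exact hex format are assumptions:

```python
import re

def read_checksum(log_path):
    """Extract the 'chksum: dp: 0x...' value that BLOM prints at the end of a run."""
    with open(log_path) as f:
        for line in f:
            match = re.search(r"chksum:\s*dp:\s*(0x\s*[0-9a-fA-F]+)", line)
            if match:
                return match.group(1)
    raise RuntimeError(f"no checksum line found in {log_path}")

# Hypothetical log names, for illustration only:
assert read_checksum("blom_run.log") == read_checksum("reference.log")
```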
I think both of these tracks should be followed. Bit-identical checksums are excellent for CI, where we just need to verify that changes do not affect the output of the simulation. However, for day-to-day development it would be better to have tolerance-based tests, so that one can better gauge the effect of changes while developing. Tolerance-based testing is also essential for moving to GPUs, where bit-identical results will be difficult (maybe even impossible) to achieve, and there it would be good to be able to measure the difference in accuracy.
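One simple way to quantify that difference is a maximum-relative-error metric — a minimal sketch, with the field names invented for illustration:

```python
import numpy as np

def max_relative_error(ref, test, eps=1e-30):
    """Largest relative deviation between a reference field and a test field;
    eps guards against division by zero where the reference is zero."""
    return np.max(np.abs(test - ref) / np.maximum(np.abs(ref), eps))

# E.g. for a hypothetical field from a CPU and a GPU run:
# err = max_relative_error(ssh_cpu, ssh_gpu)
# assert err < 1e-10
```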
It would be nice to set up some basic testing for BLOM, both for continuous integration (to ensure that the output of the different compilers actually works) and for assurance when updating the code that nothing is broken. With a good test setup it would also be easier to make performance improvements, since there would be a way to ensure that the new code is up to specification.
Meson has built-in support for unit testing, which we could leverage; the structure of these unit tests is, however, quite free. Ideally, the executable used for a test should be self-contained (meaning it tests a few relevant factors), deterministic (the data it is tested against should not be affected by changes to the code), and quick to run.
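As a concrete shape for the simplest such test, a Meson test() target could invoke a small Python driver along the lines of the sketch below — the binary name and argument handling are assumptions for illustration, not an existing BLOM script:

```python
#!/usr/bin/env python3
"""Minimal test driver a Meson test() target could run:
execute the model binary and fail on a non-zero exit status."""
import subprocess
import sys

def main(argv):
    # The binary path would be passed as the first argument;
    # "./blom" is only a hypothetical fallback for manual use.
    binary = argv[1] if len(argv) > 1 else "./blom"
    result = subprocess.run([binary], capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(main(sys.argv))
```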
Several avenues are open to us when implementing this: …
I would be available to help with this, especially the Meson and CI integration, but I would require some help defining test cases and figuring out how to implement it in BLOM.