
Major updates of the arithmetic PCV code #564

Merged · 14 commits into Colvars:master · Sep 5, 2023
Conversation

HanatoK (Member) commented Aug 2, 2023

The new code has the following improvements over the old code:

  1. Correctly calculate s and z with logsumexp and shifted softmax to
    avoid overflow and underflow;
  2. Correctly calculate the derivatives of s and z with respect to vector
    sub CVs (also use logsumexp and shifted softmax);
  3. Implement debug gradients for s and z with respect to sub CVs that
    have no explicit gradients;
  4. Add new tests for derivatives.
The new aspath and azpath also implement the PCV calculation using Parrinello's formula in atomic Cartesian space.
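For reference, the stabilized evaluation of s and z described above can be sketched as follows. This is a standalone Python illustration of the standard Branduardi–Parrinello path-CV formulas, assuming frame distances `d_i` and a smoothing parameter `lam`; the function name and code are hypothetical and not the actual Colvars implementation:

```python
import numpy as np

def pcv_s_z(d, lam):
    """Path-CV progress (s) and distance (z) from per-frame distances d_i,
    computed with a shifted softmax / log-sum-exp to avoid overflow and
    underflow.  Hypothetical standalone sketch, not the Colvars code."""
    d = np.asarray(d, dtype=float)
    n = d.size
    x = -lam * d                            # exponents of e^{x_i}
    a = x.max()                             # shift by the largest exponent
    w = np.exp(x - a)
    w /= w.sum()                            # shifted-softmax weights, sum to 1
    s = np.dot(np.arange(n), w) / (n - 1)   # progress along the path, in [0, 1]
    lse = a + np.log(np.exp(x - a).sum())   # log-sum-exp of the x_i
    z = -lse / lam                          # distance from the path
    return s, z
```

With very large distances, a direct `exp(-lam * d)` would underflow to zero and produce NaN weights; the shifted form stays finite.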

HanatoK (Member, Author) commented Aug 3, 2023

This implementation computes $s$ as $\exp(\mathrm{LSE}_0 - \mathrm{LSE}_1)$, which seems OK in all my tests, whereas my simulations crashed with the direct calculation. An alternative would be to sum the shifted softmax. According to this paper, in terms of numerical stability, using

$$ g_j = \frac{e^{x_j-a}}{\sum_{i=1}^n e^{x_i-a}} $$

is better than

$$ g_j = \exp\left(x_j - \log\sum_{i=1}^n e^{x_i}\right) $$

I am not sure if it is better in the case of calculating $s$.
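As a small numerical illustration of the two formulas above (a standalone sketch, not Colvars code): with extreme exponents, both stable forms agree to machine precision, while a naive `exp(x_j)` would underflow to zero and make the normalization ill-defined.

```python
import math

x = [-800.0, -801.0, -802.0]   # extreme exponents; math.exp(v) underflows to 0 here

# Shifted softmax: g_j = e^{x_j - a} / sum_i e^{x_i - a}, with a = max(x)
a = max(x)
shifted = [math.exp(v - a) for v in x]
total = sum(shifted)
g_shifted = [v / total for v in shifted]

# Via log-sum-exp: g_j = exp(x_j - log sum_i e^{x_i}); the log-sum-exp itself
# must still be computed with the same shift, otherwise sum(e^{x_i}) == 0.
lse = a + math.log(sum(math.exp(v - a) for v in x))
g_lse = [math.exp(v - lse) for v in x]

# Both stable forms agree to machine precision
print(max(abs(p - q) for p, q in zip(g_shifted, g_lse)))
```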

jhenin (Member) commented Aug 20, 2023

I think our main concern here is stability and performance. Since we use double precision, accuracy should be ok once we avoid under/overflow.

HanatoK (Member, Author) commented Aug 20, 2023

> I think our main concern here is stability and performance. Since we use double precision, accuracy should be ok once we avoid under/overflow.

Thanks @jhenin! The sum of the shifted softmax is used in this PR. As for performance, I think the two approaches are basically the same.

Commit: Using LogSumExp instead of calculating the equations directly to get rid of NaN
Commit: Save the intermediate results from the calculation of s and z, and then use them to accelerate the calculation of derivatives; also remove the separate logsumexp and softmax functions.
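The caching described in that commit can be sketched as follows: the same softmax weights `w` computed once for s and z also give their derivatives with respect to each frame distance in closed form, since $\partial s/\partial d_k = -\lambda w_k (c_k - s)$ and $\partial z/\partial d_k = w_k$. This is a hypothetical standalone Python sketch assuming the standard PCV definitions, not the PR's actual code:

```python
import numpy as np

def pcv_with_derivatives(d, lam):
    """Sketch: s, z and their derivatives w.r.t. each frame distance d_k,
    reusing the shifted-softmax weights w computed once for s and z.
    Hypothetical illustration of the standard PCV formulas, not Colvars code."""
    d = np.asarray(d, dtype=float)
    n = d.size
    x = -lam * d
    a = x.max()
    w = np.exp(x - a)
    w /= w.sum()                       # cached softmax weights
    c = np.arange(n) / (n - 1)         # per-frame progress values in [0, 1]
    s = np.dot(c, w)
    z = -(a + np.log(np.exp(x - a).sum())) / lam
    # Derivatives reuse w instead of re-evaluating any exponentials:
    ds = -lam * w * (c - s)            # ds/dd_k = -lam * w_k * (c_k - s)
    dz = w.copy()                      # dz/dd_k = w_k
    return s, z, ds, dz
```

The analytic derivatives can be checked against finite differences, which is essentially what the PR's new debug-gradient tests do for sub-CVs without explicit gradients.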
giacomofiorin (Member) left a comment

Great changes.

It would be better if the tests were included in the top folder rather than being specific to the NAMD sub-folder. However, this would require some more effort to handle the additional files.

giacomofiorin merged commit e1bc65a into Colvars:master on Sep 5, 2023
15 checks passed
giacomofiorin mentioned this pull request on Aug 5, 2024