Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_openmm_positions includes virutal sites but Interchange.positions.to_openmm() does not #1054

Closed
IAlibay opened this issue Sep 19, 2024 · 4 comments
Labels
documentation Improvements or additions to documentation ergonomics openmm

Comments

@IAlibay
Copy link

IAlibay commented Sep 19, 2024

Expectation

  1. Positions obtained from an interchange object should include all the particles in the system, including virtual sites.
  2. Various API points for getting positions should all lead to the same outcomes

Description

  1. Calling Interchange.positions.to_openmm() yields positions without virtual sites.
  2. Calling to_openmm_positions yields positions with virtual sites.

Reproduction

from openff.toolkit import Molecule, Topology, ForceField
from openff.interchange.components._packmol import solvate_topology
from openff.interchange.interop.openmm import to_openmm_positions
from openff.interchange import Interchange

water = Molecule.from_smiles('O')
water.assign_partial_charges(partial_charge_method='gasteiger')

ligand = Molecule.from_smiles('CCCCC')
ligand.generate_conformers()

off_top = Topology.from_molecules(ligand)
solvated_off_top = solvate_topology(topology=off_top)

ff = ForceField('openff-2.2.0.offxml', 'opc.offxml')
inter = Interchange.from_smirnoff(topology=solvated_off_top, force_field=ff)
print("Number of particles with .positions: ", len(inter.positions.to_openmm()))
print("Number of particles with to_openmm_positions: ", len(to_openmm_positions(inter)))

Output

Number of particles with .positions:  2035
Number of particles with to_openmm_positions:  2707

Software versions

Interchange v0.3.29

@mattwthompson mattwthompson added documentation Improvements or additions to documentation ergonomics labels Sep 20, 2024
@mattwthompson
Copy link
Member

mattwthompson commented Sep 20, 2024

Hmm, yeah that's a bit of a surprises. Each of these things have some reason to behave the way they do but the inconsistency isn't great.

Positions obtained from an interchange object should include all the particles in the system, including virtual sites.

Unfortunately, this is the heart of the matter. This is a consequence of the first-class view mirroring the toolkit's behavior of not knowing about virtual sites but other API points knowing about them. Right now, Interchange.topology can't store virtual sites1, and I'm not sure how I feel about Interchange.positions having a different shape than the number of atoms.2

Various API points for getting positions should all lead to the same outcomes

Lots of API points have arguments like include_virtual_sites wired through them since there are some use cases that only want the atomic particles; I'm not sure how this would implemented on something that's a pydantic.Field, but maybe there's a way to wrap something up that handles both cases?

At very least we should document all of this better, but I'm not sure how best to do that since this touches so many parts of the codebase.

Footnotes

  1. This is because it's a Topology object from the toolkit, and its representation of molecules does not store virtual sites. I actually advocated for this change, since molecules don't have virtual sites and I'm not sure how one would usefully describe virtual sites in the absence of a force field.

  2. In the past, we've talked about the need for a "post-parametrization" representation of a topology. Including virtual sites, which are stored elsewhere in exporters and in pretty fly-by-night ways, is a key use case that would be enabled by this change. I think some of the particle accounting would also be made easier, both internally and for downstream users.

@IAlibay
Copy link
Author

IAlibay commented Sep 20, 2024

At very least we should document all of this better,

To me this sounds like a good starting solution. I think there's a bit of inconsistency in the docs (or maybe I'm looking at the wrong code examples), so if everything encourages to_openmm_positions, then it shouldn't be as easy for folks to fall into this trap.

@mattwthompson
Copy link
Member

Wow, I see I interpreted this originally to be about the difference between .to_openmm_positions and .topology.positions.to_openmm() (even though the API is slightly different) but you were actually getting caught on .positions.to_openmm(). I'm punting on really fixing that since, as mentioned above, just documenting things is a better short-term solution.

In the future, we ought to have some sort of InterchangeTopology subclass that papers over these issues, and maybe something similar with the positions attribute.

In any case, I've changed the default behavior of the OpenMM positions getter to exclude virtual sites by default - arguably what it should have always done - to avoid this particular API inconsistency. I agree that including virtual sites by default is the better of the two options, but flipping that switch requires more internal refactors that I'm unlikely to be able to get to soon.

@mattwthompson
Copy link
Member

Calling this resolved (for now) with #1074, but see above for ideas of future work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation ergonomics openmm
Projects
None yet
Development

No branches or pull requests

2 participants