-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable Metal, Facet, site and terrace specification in Molecule #2452
Conversation
2a4c236
to
a813ff0
Compare
Codecov Report
@@ Coverage Diff @@
## main #2452 +/- ##
==========================================
+ Coverage 48.15% 48.26% +0.11%
==========================================
Files 110 110
Lines 30627 30726 +99
Branches 7989 8032 +43
==========================================
+ Hits 14748 14830 +82
- Misses 14350 14365 +15
- Partials 1529 1531 +2
... and 2 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some comments and edits.
My main concern is lack of unit tests and documentation.
rmgpy/molecule/adjlist.py
Outdated
|
||
if line.split()[0] == 'metal': | ||
if group and "[" in line: | ||
out = line.split('[')[1][:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have used a regex, and perhaps with more error handling. But my main concern is this sort of thing (splits, selecting by index, slicing) is a devil to review for correctness by reading the code (I got it wrong the first 4 times I drafted this comment), especially without any comments saying what its goal is. Can we have some unit tests? (codecov says it's not tested) and should it be documented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, added regex checking and error handling. Added unittests and I've added it to the adjacency list documentation.
rmgpy/molecule/adjlist.py
Outdated
if line.split()[0] == 'metal': | ||
if group and "[" in line: | ||
out = line.split('[')[1][:-1] | ||
metal = out.split(',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean it must be [Cu,Sn,Ag]
and not [Cu, Sn, Ag]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed we have a check above in mistake1
, but that only checks for spaces inside {}
and not []
(despite the example in the comment on line 541)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now loop through the split and strip all of the items so any whitespace there should be fine.
def __init__(self, element=None, radical_electrons=0, charge=0, label='', lone_pairs=-100, coords=np.array([]), | ||
id=-1, props=None): | ||
def __init__(self, element=None, radical_electrons=0, charge=0, label='', lone_pairs=-100, site='', morphology='', | ||
coords=np.array([]), id=-1, props=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just noticed the coords=np.array([])
. I know it's not being changed in this PR, but it did catch my eye. Isn't having a mutable (like an array) default argument in the def
statement a classic code smell? (eg.). Do we have a good reason for it?
Update: I guess a numpy array can't be appended to, and if it's starting empty then it is therefore immutable after all. And by keeping it always an array type, (never a NoneType
) we're perhaps better able to cythonize. So you can ignore this comment.
rmgpy/molecule/group.py
Outdated
""" | ||
if wildcards: | ||
for order in self.order: | ||
if abs(order) <= 1e-9: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "1e-9"? Can you add explanation in the docstring or comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was part of an existing bug that I copied over where the H bonds used to be 0 value, but then they were changed to 0.1 and check was still left against order instead of (order - 0.1). I've fixed that issue here now for H and R bonds.
@@ -873,7 +884,7 @@ def get_bond_string(self): | |||
the atom labels in alphabetical order (i.e. 'C-H' is possible but not 'H-C') | |||
:return: str | |||
""" | |||
bond_symbol_mapping = {0.1: '~', 1: '-', 1.5: ':', 2: '=', 3: '#'} | |||
bond_symbol_mapping = {0.05: '~', 0.1: '~', 1: '-', 1.5: ':', 2: '=', 3: '#'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does 0.1 stand for currently? Is it intended for 0.05 reaction bond to share the same symbol as the 0.1 bond?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0.1 is an H bond, 0 is vdW bond, now 0.05 is R bond. Yes that's intentional.
@@ -524,11 +524,17 @@ def from_adjacency_list(adjlist, group=False, saturate_h=False, check_consistenc | |||
multiplicity = int(line.split()[1]) | |||
if len(lines) == 0: | |||
raise InvalidAdjacencyListError('No atoms specified in adjacency list: \n{0}'.format(adjlist)) | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you run an auto-formatter to remove all these unnecessary blanks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's inappropriate to run the auto-formatter on RMG files that are commonly modified, because I think that will create very large nasty merge conflicts if anyone modified the file and tries to rebase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides what @rwest mentioned, if you use VSCode, there are options where you can choose only run auto-formatter on the lines you modified as well!
# Iterate over the remaining lines, generating Atom or GroupAtom objects | ||
for line in lines: | ||
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary blanks
@@ -249,8 +249,8 @@ def get_features(self): | |||
benzene=[0], lone_pairs=[0]) | |||
# Occupied surface site: | |||
ATOMTYPES['Xo'] = AtomType('Xo', generic=['X'], specific=[], | |||
single=[0, 1], all_double=[0, 1], r_double=[], o_double=[], s_double=[], triple=[0, 1], | |||
quadruple=[0, 1], benzene=[0], lone_pairs=[0]) | |||
single=[0, 1, 2, 3, 4, 5, 6, 7, 8], all_double=[0, 1, 2, 3, 4, 5, 6, 7, 8], r_double=[], o_double=[], s_double=[], triple=[0, 1, 2, 3, 4, 5, 6, 7, 8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean? Do we allow up to 8 single bonds on surface site or something? Can you add comments to this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this makes it possible for the surface site to have up to 8 bonds...the intention is that you can represent local surface structure (what sites neighbor other sites) which affects adsorbate energies and reaction rates on surfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some questions and comments.
@@ -772,7 +836,7 @@ def from_adjacency_list(adjlist, group=False, saturate_h=False): | |||
Saturator.saturate(atoms) | |||
|
|||
# Consistency checks | |||
if not group: | |||
if not group and check_consistency: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why adding the option to skip consistency check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you allow consistency checking for a molecule with reaction bonds, it will get upset because it has bonds RMG doesn't expect it to have, it naturally won't match RMG atomtypes properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might merge this PR anyway, but should we instead (or as well) make the consistency checkers cope with reaction bonds? Currently you could create an adjacancy list with reaction bonds AND other inconsistencies, and not notice the other inconsistencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is true, it would be good to at least consistency check parts of the Molecule that don't have reaction bonds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll let you open a new issue (or mental note) to fix this :)
I assume the final goal is to enable RMG to generate mechanisms with metal, facet, site, and terrace specification? Can one use these specifications to generate a RMG model at this point, or is that intended for another PR? If the former, can you add a regression test? |
Not right now on my end at least. My intention for now is to make it possible to generate good 2D representations for SIDT of surface configurations. |
a813ff0
to
d52a2b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. (I mentioned a few minor questions)
d52a2b6
to
b43d423
Compare
…up objects consistency check
reaction bonds
b43d423
to
c66f580
Compare
This passes all unit, functional and database tests locally and no regression changes are expected. Should be ready. If we're comfortable merging this without CI that would be appreciated, but is not absolutely necessary. |
I restarted the CI and all the unit tests etc ran, but quite a few of the regression tests have different core and edges -- is this to be expected by random fluctuations? what should we do? tests we are in the habit of ignoring are not helpful.
See (Unfortunately there seems to be a bug causing a crash when comparing RMS_liquidSurface_ch4o2cat |
It looks like it doesn't print the differences anymore so we can't tell what's changed? |
The only way I can currently imagine this changing anything would be if it broke isomorphism comparison, but that would change nearly every rate coefficient from families, which should break all or nearly all regression tests on both core and edge. |
Comparing the model edge for the liquid_oxidation
Perhaps an instance of something like #2010 ? This may be related #1027 but is 6 years old and quite likely outdated. We need to decide what counts as a regression worth failing the tests for! |
Regardless I don't think this is related to this PR. |
Merged, since I think all Hao-Wei's comments have been addressed, as well as mine.
|
Also adds a "Reaction" bond order for creating bonds between atoms participating in a reaction so the structure can be considered as a whole without specific labels and enables sites to be bonded to each other.
This enables creation of molecules from adjacency lists such as: