Update OptimizationResultCollection.create_basic_dataset to preserve molecule IDs #303
Description
The goal of this PR is a long-term fix to #297. In short, the current OptimizationResultCollection.create_basic_dataset implementation creates a new dataset by round-tripping through OpenFF Molecule objects, which causes issues with the hashing/equivalence checks in QCArchive and leads to "the same molecules" being considered different. In turn, this causes OptimizationResultCollection.to_basic_result_collection to return fewer records than expected.
The fix is simply to pass the qcelemental Molecule objects along directly instead of reconstructing OpenFF Molecule objects. As it turns out, this was essentially already built into qcsubmit and just required a two-line change to update some keyword arguments in the BasicDataset.add_molecule call.
Todos
Update create_basic_dataset to preserve molecule hashes from the optimization dataset
Status