Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split MMSpec and MLSpec from QCSpec #152

Open
jthorton opened this issue Jun 29, 2021 · 5 comments
Open

Split MMSpec and MLSpec from QCSpec #152

jthorton opened this issue Jun 29, 2021 · 5 comments

Comments

@jthorton
Copy link
Contributor

To allow for more specific validation of calculation keywords and settings it would be good to split the current QCSpec class into 3 separate objects with their own validation. QCSpec for QM calculations MMSpec for MM calculations and MLSpec for ML calculations.

@jthorton
Copy link
Contributor Author

jthorton commented Feb 1, 2024

Thinking about this some more I think it might be better to have a unique spec for each supported program to allow for very specific validation of inputs for example psi4 supports calculating MBIS charges where as other QM programs like pySCF or ORCA don't and they each have different implicit solvation models. I am not sure how backwards compatible this would be with old datasets this will need testing or some converter may need to be implemented.

These models will also be needed in bespokefit, one issue is that bespokefit does not need the spec_name and spec_description fields which qcarchive does which might complicate creating a shared model which can be used by both. It might work to split the specs to have a base spec which handles the validation of the program inputs and then for qcsubmit to wrap this in a new class composed of the name description and the program spec, something like

# Psi4 base spec with validation
class Psi4Spec(_BaseSpec):
    program: Literal["psi4"]
    method: str
    basis: Optional[str] = None
    implicit_solvent: Optiona[Union[PCMSettings, DDXSettings]] = None
    scf_properties: List[str] 
    ....

class NammedSpec(_BaseSpec):
     spec_name: str
     spec_description: str
     spec: Union[Psi4Spec, ANISpec, GFNSpec, SmirnoffSpec ...]
     ...

@mattwthompson
Copy link
Member

How many programs would this be? (Not shooting the idea down, just need to be reminded what "program" means in this context)

@jthorton
Copy link
Contributor Author

jthorton commented Feb 5, 2024

Yeah that's the main issue is that it could quickly grow out of control we currently have validation for 4 (Psi4, Torchani, xTB, OpenMM). Here the program refers to the engine that does the calculation so there is a massive range of possibilities for example qcengine will soon support MACE and AIMNET2 which we also want to be able to use.

@j-wags
Copy link
Member

j-wags commented Feb 5, 2024

Without knowing much about the specifics of this ask, when I was refactoring qcsubmit a lot of the pain was caused by the fact that many QCS objects were just wrappers around QCF/QCEl objects. This meant that, when the QCF/QCEl objects changed, the QCS wrappers needed to change in lockstep.

So I'm nervous about this potentially putting more layers of complexity/behavior between the QCSubmit API and the underlying QCEl interface. Like, if we helpfully validated user input and raised a validation error if they tried to ask ORCA for MBIS charge, and then one day ORCA added support for MBIS charges, we'd need to hustle to make a release to get the validation up to date. It'd be this way for every feature that we wrap/validate.

Is there a clear sweet spot for user experience versus maintenance cost for this sort of thing?

@jthorton
Copy link
Contributor Author

Like, if we helpfully validated user input and raised a validation error if they tried to ask ORCA for MBIS charge, and then one day ORCA added support for MBIS charges, we'd need to hustle to make a release to get the validation up to date.

Yeah that was always the plan with qcsubmit to have strict validation on the programs we are interested in and then everything else we just let through see here and even then it's only for a subset of features which we often use. Ideally each program would tell use what features are available before hand but outside of that I am not sure I see many other options but open to ideas as I would like to get this or someting similar working for bespokefit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants