Adds MLE estimates on gather command #495

richardjgowers · 2023-07-19T18:25:26Z

Developers certificate of origin

I certify that this contribution is covered by the MIT License here and the Developer Certificate of Origin at https://developercertificate.org/.

richardjgowers · 2023-07-20T11:56:03Z

openfecli/commands/gather.py

+    # 5a) write out MLE values
+    for ligA, DG, unc_DG in MLEs:
+        DG, unc_DG = dp2(DG), dp2(unc_DG)
+        output.write(f'DGbind({ligA})\tDG(MLE)\tZero\t{ligA}\t{DG}\t{unc_DG}\n')


@hannahbaumann this will print a line like:

measurement type ligand_i ligand_j estimate (kcal/mol) uncertainty (kcal/mol) DGbind(lig_1234) DG(mle) Zero lig_1234 -4.2 0.23

would this make sense to you? "zero" is meant to be the zero state reference point, not sure if there's a better label for that

It's not a zero state dG though? Unless you normalise against experiment your reference state is the first ligand passed to your MLE - or are you saying that "Zero" is followed by the reference state?

In any case, if it's possible to inspect the original SMCs for stored experimental values this becomes a lot more useful. Otherwise what I would do is return a list of values with the leading entry being the 0 kcal/mol entry. That way it's clear what the central point is.

Yes, I think from that print statement I wouldn't know what zero means, so what the reference is and therefore what meaning the ("relative") dG has. Or is that described elsewhere?

dwhswenson

Rather than trying to cram everything into one table, why don't we make gather take flags to indicate what data to output? We're really offering 3 different tables. Instead of trying for force them into a single table, why not just own it and offer 3 tables?

Gather is also pretty fast, isn't it? So I don't think we save that much time by only walking the filesystem once.

Something like:

$ openfe gather --report ddg results/
$ openfe gather --report dg results/  # MLE, this is probably the default
$ openfe gather --report leg results/  # raw leg DG

If the theory is that we're expecting someone to parse this after, the current approach requires understanding what the special columns in our format mean, which, IMO, takes more teaching than a CLI option that switches between more intuitive output formats for each type of data.

To implement: https://click.palletsprojects.com/en/8.1.x/options/#choice-options

codecov · 2023-08-02T17:11:21Z

Codecov Report

Patch coverage: 93.10% and project coverage change: -0.18% ⚠️

Comparison is base (e797d52) 92.07% compared to head (8c36577) 91.90%.
Report is 1 commits behind head on main.

❗ Current head 8c36577 differs from pull request most recent head 0b4d51a. Consider uploading reports for the commit 0b4d51a to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #495      +/-   ##
==========================================
- Coverage   92.07%   91.90%   -0.18%     
==========================================
  Files         113      116       +3     
  Lines        6938     6952      +14     
==========================================
+ Hits         6388     6389       +1     
- Misses        550      563      +13

Files Changed	Coverage Δ
openfecli/commands/gather.py	`85.25% <90.19%> (-6.79%)`	⬇️
openfecli/clicktypes/__init__.py	`100.00% <100.00%> (ø)`
openfecli/clicktypes/hyphenchoice.py	`100.00% <100.00%> (ø)`
openfecli/tests/clicktypes/test_hyphenchoice.py	`100.00% <100.00%> (ø)`
openfecli/tests/commands/test_gather.py	`100.00% <100.00%> (ø)`
openfecli/tests/test_rbfe_tutorial.py	`100.00% <100.00%> (ø)`

... and 5 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

dwhswenson · 2023-08-03T20:19:22Z

Most of the missing coverage would get hit if we had RHFE example data to test. But still is still good enough that codecov isn't complaining. Should be ready for review.

dwhswenson · 2023-08-04T04:13:38Z

Raising my own concerns on this as of now:

in the MLE DG estimate, reference ligand doesn't seem to show exactly 0.0 value. This makes things ambiguous and suggests a problem. I'm highly concerned about this.
I haven't seen anyone do a proper cinnabar plot of numbers from this vs. experiment. There's no check in anything in this PR to ensure that our numbers are actually accurate.

I'm gone for a week as of now (well, a few of you might be able to get me in the next 13 hours.) So I'm logging these concerns so someone else can carry them through and fix problems in the PR.

hannahbaumann · 2023-08-04T08:03:38Z

Here a cinnabar plot of the DG values (after centralizing as done in cinnabar). I would say this looks reasonable?!
https://drive.google.com/file/d/1VJnrpbIOIRjDRS_2nrOhEMqLs71Fj4K2/view?usp=sharing
This was created using the data from the opencli directory

openfecli/tests/commands/test_gather.py

make room for MLEs to be at top of output

now to update to make output cleaner

Co-authored-by: Irfan Alibay <[email protected]>

richardjgowers · 2023-08-24T08:35:52Z

openfecli/tests/commands/test_gather.py

            t = tarfile.open(f, mode='r')
            t.extractall('.')

        yield

+_EXPECTED_DG = b"""


@hannahbaumann are you happy with how these three outputs look?

Yes, that looks good to me!

richardjgowers · 2023-08-24T08:59:12Z

@dwhswenson re: there's no ligand at DG=0.0, I think that's an error in the cinnabar MLE docs, I think it returns values with a mean of 0.0, rather than one ligand arbitrarily being 0.0

richardjgowers changed the title ~~Mle on gather~~ [wip] Mle on gather Jul 19, 2023

richardjgowers changed the title ~~[wip] Mle on gather~~ Adds MLE estimates on gather command Jul 20, 2023

richardjgowers commented Jul 20, 2023

View reviewed changes

dwhswenson reviewed Jul 20, 2023

View reviewed changes

dwhswenson mentioned this pull request Jul 24, 2023

Split gather #502

Merged

1 task

IAlibay added this to the Release 0.12.0 milestone Jul 31, 2023

dwhswenson requested review from IAlibay, hannahbaumann and RiesBen August 3, 2023 20:19

richardjgowers commented Aug 4, 2023

View reviewed changes

openfecli/tests/commands/test_gather.py Show resolved Hide resolved

richardjgowers and others added 16 commits August 24, 2023 09:35

cli: gather: move writing to outside of DDG calculation

7dd4b4b

make room for MLEs to be at top of output

add cinnabar to ci env

b4d53b1

cli: gather: first pass of MLE in gather

9ad0d87

cli: gather: more paranoid ligand ordering to mle

e4288f3

rough draft refactor gather

5d36ee1

version that seems to work

6e6c320

now to update to make output cleaner

improve estimate/uncertainty rounding/reporting

1b92cb4

update to simplify output tables

3d29bed

use 1 digit in uncertainty; leg => legs

cc8b921

Apply suggestions from code review

2298b00

Co-authored-by: Irfan Alibay <[email protected]>

Tuple => tuple

edc251a

add tests for uncertainty calc stuff

e6fc574

Add HyphenAwareChoice

6477aba

clean up a bit of gather

bc2c979

update gather tests

e5b23ce

update tutorial test

0b4d51a

richardjgowers force-pushed the MLE_on_gather branch from 8c36577 to 0b4d51a Compare August 24, 2023 08:35

richardjgowers commented Aug 24, 2023

View reviewed changes

richardjgowers merged commit b027cff into main Aug 24, 2023
6 of 7 checks passed

richardjgowers deleted the MLE_on_gather branch August 24, 2023 08:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds MLE estimates on gather command #495

Adds MLE estimates on gather command #495

richardjgowers commented Jul 19, 2023 •

edited

Loading

richardjgowers Jul 20, 2023

IAlibay Jul 20, 2023 •

edited

Loading

IAlibay Jul 20, 2023

hannahbaumann Jul 20, 2023

dwhswenson left a comment

codecov bot commented Aug 2, 2023 •

edited

Loading

dwhswenson commented Aug 3, 2023

dwhswenson commented Aug 4, 2023 •

edited

Loading

hannahbaumann commented Aug 4, 2023 •

edited

Loading

richardjgowers Aug 24, 2023

hannahbaumann Aug 24, 2023

richardjgowers commented Aug 24, 2023

Adds MLE estimates on gather command #495

Adds MLE estimates on gather command #495

Conversation

richardjgowers commented Jul 19, 2023 • edited Loading

Developers certificate of origin

richardjgowers Jul 20, 2023

Choose a reason for hiding this comment

IAlibay Jul 20, 2023 • edited Loading

Choose a reason for hiding this comment

IAlibay Jul 20, 2023

Choose a reason for hiding this comment

hannahbaumann Jul 20, 2023

Choose a reason for hiding this comment

dwhswenson left a comment

Choose a reason for hiding this comment

codecov bot commented Aug 2, 2023 • edited Loading

Codecov Report

dwhswenson commented Aug 3, 2023

dwhswenson commented Aug 4, 2023 • edited Loading

hannahbaumann commented Aug 4, 2023 • edited Loading

richardjgowers Aug 24, 2023

Choose a reason for hiding this comment

hannahbaumann Aug 24, 2023

Choose a reason for hiding this comment

richardjgowers commented Aug 24, 2023

richardjgowers commented Jul 19, 2023 •

edited

Loading

IAlibay Jul 20, 2023 •

edited

Loading

codecov bot commented Aug 2, 2023 •

edited

Loading

dwhswenson commented Aug 4, 2023 •

edited

Loading

hannahbaumann commented Aug 4, 2023 •

edited

Loading