Skip to content

Commit

Permalink
Merge branch 'main' of github.com:Becksteinlab/SAMPL7_logP_data into …
Browse files Browse the repository at this point in the history
…main
  • Loading branch information
Shujie Fan committed Jun 17, 2021
2 parents f40d0a2 + bbfe3ce commit d0ef169
Show file tree
Hide file tree
Showing 4 changed files with 617 additions and 0 deletions.
152 changes: 152 additions & 0 deletions submissions/logP_prediction_Iorga_Beckstein_CGenFF.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
# OCTANOL TO WATER (ΔG_octanol - ΔG_water) TRANSFER FREE ENERGY PREDICTIONS
#
# This file will be automatically parsed. It must contain the following four elements:
# predictions, name of method, software listing, and method description.
# These elements must be provided in the order shown with their respective headers.
#
# Any line that begins with a # is considered a comment and will be ignored when parsing.
#
#
# PREDICTION SECTION
#
# It is mandatory to submit water to octanol (ΔG_octanol - ΔG_water) transfer free energy (TFE) predictions for all 22 molecules.
# Incomplete submissions will not be accepted.
# The energy units must be in kcal/mol.
#
# Please report the general molecule `ID tag` in the form of `SMXX` (e.g. SM25, SM26, etc).
# Please indicate the microstate(s) used in the `Molecule ID/IDs considered (no commas)` section (e.g. `SM25_micro000`, SM25_extra001`)
# Please report TFE standard error of the mean (SEM) and TFE model uncertainty.
#
# The data in each prediction line should be structured as follows:
# ID tag, Molecule ID/IDs considered (no commas), TFE, TFE SEM, TFE model uncertainty, (optional) logD, (optional) SEM logD
#
# Your transfer free energy prediction for the neutral form does NOT have to be `SMXX_micro000` (which is the challenge provided neutral microstate).
# If you use a microstate other than the challenge provided microstate, please fill out the `Molecule ID/IDs considered (no commas)` section using a molecule ID in the form of `SMXX_extra001` (number can vary) and please list the molecule ID and it's SMILES string in your methods description in the `METHOD DESCRIPTION SECTION`.
#
# You may optionally provide predicted logD values; these will be used as a consistency check on our estimated logD values if you submit both logP and pKa values.
#
# Only one entry in the second column (`Molecule ID/IDs considered (no commas)`) is required, but you should list all IDs considered/input to your calculations. See challenge instructions.
#
# If you have evaluated additional microstates then the molecule ID used in the `Molecule ID/IDs considered (no commas)` section needs to be in the format: `SMXX_extra001` (number can vary).
# If multiple microstates are used, please report the order of population in the aqueous phase in descending order.
# Please list microstate populations, SMILES strings and the molecule IDs in the `METHOD DESCRIPTION SECTION` section further below.
#
# The list of predictions must begin with the 'Predictions:' keyword as illustrated here.
Predictions:
SM25,SM25_micro004,-6.55,0.18,1.50
SM26,SM26_micro000,-2.39,0.11,1.50
SM27,SM27_micro000,-3.31,0.13,1.50
SM28,SM28_micro000,-2.52,0.13,1.50
SM29,SM29_micro000,-2.64,0.15,1.50
SM30,SM30_micro000,-5.73,0.14,1.50
SM31,SM31_micro000,-4.16,0.13,1.50
SM32,SM32_micro000,-5.95,0.16,1.50
SM33,SM33_micro000,-7.44,0.16,1.50
SM34,SM34_micro000,-6.08,0.14,1.50
SM35,SM35_micro000,-2.70,0.36,1.50
SM36,SM36_micro000,-3.65,0.60,1.50
SM37,SM37_micro000,-3.31,0.75,1.50
SM38,SM38_micro000,-2.26,0.41,1.50
SM39,SM39_micro000,-5.10,0.41,1.50
SM40,SM40_micro000,-3.54,0.31,1.50
SM41,SM41_micro000,-3.53,0.18,1.50
SM42,SM42_micro003,-7.01,0.13,1.50
SM43,SM43_micro004,-5.10,0.16,1.50
SM44,SM44_micro000,-1.15,0.22,1.50
SM45,SM45_micro000,-5.15,0.41,1.50
SM46,SM46_micro000,-3.13,0.29,1.50
#
#
# Please list your name, using only UTF-8 characters as described above. The "Participant name:" entry is required.
Participant name:
Bogdan I. Iorga/Oliver Beckstein
#
#
# Please list your organization/affiliation, using only UTF-8 characters as described above.
Participant organization:
ICSN, CNRS, Gif-sur-Yvette, France/Arizona State University, USA
#
#
# NAME SECTION
#
# Please provide an informal but informative name of the method used.
# The name must not exceed 40 characters.
# The 'Name:' keyword is required as shown here.
Name:
# SAMPL7_logP_MDPOW_CGenFF
MD (CGenFF/TIP3P)
#
#
# COMPUTE TIME SECTION
#
# Please provide the average compute time across all of the molecules.
# For physical methods, report the GPU and/or CPU compute time in hours.
# For empirical methods, report the query time in hours.
# Create a new line for each processor type.
# The 'Compute time:' keyword is required as shown here.
Compute time:
20,000 hours, CPU
#
# COMPUTING AND HARDWARE SECTION
#
# Please provide details of the computing resources that were used to train models and make predictions.
# Please specify compute time for training models and querying separately for empirical prediction methods.
# Provide a detailed description of the hardware used to run the simulations.
# The 'Computing and hardware:' keyword is required as shown here.
Computing and hardware:
All the simulations were performed in parallel (8 cores for each simulation) on cluster nodes running with CentOS6 and 4 CPU Intel Xeon E5-4627 v3 @ 2.60GHz.
#
# SOFTWARE SECTION
#
# List all major software packages used and their versions.
# Create a new line for each software.
# The 'Software:' keyword is required.
Software:
Gromacs 2020.3
MDPOW 0.7.0-dev
CGENFF 2.2.0
#
# METHOD CATEGORY SECTION
#
# State which method category your prediction method is better described as:
# `Physical (MM)`, `Physical (QM)`, `Empirical`, or `Mixed`.
# Pick only one category label.
# The `Category:` keyword is required.
Category:
Physical (MM)
#
# METHOD DESCRIPTION SECTION
#
# Methodology and computational details.
# Level of details should be roughly equivalent to that used in a publication.
# Please include the values of key parameters with units.
# Please explain how statistical uncertainties were estimated.
#
# If you have evaluated additional microstates, please report their SMILES strings and populations of all the microstates in this section.
# If you used a microstate other than the challenge provided microstate (`SMXX_micro000`), please list your chosen `Molecule ID` (in the form of `SMXX_extra001`) along with the SMILES string in your methods description.
#
# Use as many lines of text as you need.
# All text following the 'Method:' keyword will be regarded as part of your free text methods description.
Method:
Alchemical free energy calculations were performed in explicit
solvent, following the protocol described in [1,2]. Parameters were generated with the
PARAMCHEM CGENFF program via the server at
https://cgenff.umaryland.edu/ for CHARMM/CGenFF with the CHARMM TIP3P
water model. Files were prepared for Gromacs 2020.3. The alchemical data were
analyzed with thermodynamic integration. Errors are reported as errors
of the mean (see [1,2]). The
model uncertainty was estimated on the basis of the results from [2].
[1] Kenney, I. M., Beckstein, O., and Iorga, B. I. (2016) Prediction of cyclohexane-water
distribution coefficients for the SAMPL5 data set using molecular dynamics simulations with
the OPLS-AA force field, J. Comput. Aided Mol. Des. 30(11):1045-1058 DOI: 10.1007/s10822-016-9949-5.
[2] Fan, S., Iorga, B. I., and Beckstein, O. (2020) Prediction of octanol-water partition coefficients for
the SAMPL6-logP molecules using molecular dynamics simulations with OPLS-AA, AMBER and CHARMM force fields,
J Comput Aided Mol Des 34(5):543-560 DOI: 10.1007/s10822-019-00267-z.
#
# All submissions must either be ranked or non-ranked.
# Only one ranked submission per participant is allowed.
# Multiple ranked submissions from the same participant will not be judged.
# Non-ranked submissions are accepted so we can verify that they were made before the deadline.
# The "Ranked:" keyword is required, and expects a Boolean value (True/False)
Ranked:
True
157 changes: 157 additions & 0 deletions submissions/logP_prediction_Iorga_Beckstein_GAFF.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,157 @@
# OCTANOL TO WATER (ΔG_octanol - ΔG_water) TRANSFER FREE ENERGY PREDICTIONS
#
# This file will be automatically parsed. It must contain the following four elements:
# predictions, name of method, software listing, and method description.
# These elements must be provided in the order shown with their respective headers.
#
# Any line that begins with a # is considered a comment and will be ignored when parsing.
#
#
# PREDICTION SECTION
#
# It is mandatory to submit water to octanol (ΔG_octanol - ΔG_water) transfer free energy (TFE) predictions for all 22 molecules.
# Incomplete submissions will not be accepted.
# The energy units must be in kcal/mol.
#
# Please report the general molecule `ID tag` in the form of `SMXX` (e.g. SM25, SM26, etc).
# Please indicate the microstate(s) used in the `Molecule ID/IDs considered (no commas)` section (e.g. `SM25_micro000`, SM25_extra001`)
# Please report TFE standard error of the mean (SEM) and TFE model uncertainty.
#
# The data in each prediction line should be structured as follows:
# ID tag, Molecule ID/IDs considered (no commas), TFE, TFE SEM, TFE model uncertainty, (optional) logD, (optional) SEM logD
#
# Your transfer free energy prediction for the neutral form does NOT have to be `SMXX_micro000` (which is the challenge provided neutral microstate).
# If you use a microstate other than the challenge provided microstate, please fill out the `Molecule ID/IDs considered (no commas)` section using a molecule ID in the form of `SMXX_extra001` (number can vary) and please list the molecule ID and it's SMILES string in your methods description in the `METHOD DESCRIPTION SECTION`.
#
# You may optionally provide predicted logD values; these will be used as a consistency check on our estimated logD values if you submit both logP and pKa values.
#
# Only one entry in the second column (`Molecule ID/IDs considered (no commas)`) is required, but you should list all IDs considered/input to your calculations. See challenge instructions.
#
# If you have evaluated additional microstates then the molecule ID used in the `Molecule ID/IDs considered (no commas)` section needs to be in the format: `SMXX_extra001` (number can vary).
# If multiple microstates are used, please report the order of population in the aqueous phase in descending order.
# Please list microstate populations, SMILES strings and the molecule IDs in the `METHOD DESCRIPTION SECTION` section further below.
#
# The list of predictions must begin with the 'Predictions:' keyword as illustrated here.
Predictions:
SM25,SM25_micro004,-4.76,0.33,1.50
SM26,SM26_micro000,-1.60,0.20,1.50
SM27,SM27_micro000,-3.84,0.35,1.50
SM28,SM28_micro000,-3.08,0.34,1.50
SM29,SM29_micro000,-3.18,0.24,1.50
SM30,SM30_micro000,-5.30,0.42,1.50
SM31,SM31_micro000,-4.34,0.25,1.50
SM32,SM32_micro000,-4.57,0.24,1.50
SM33,SM33_micro000,-5.64,0.24,1.50
SM34,SM34_micro000,-5.10,0.32,1.50
SM35,SM35_micro000,-2.44,0.49,1.50
SM36,SM36_micro000,-4.92,0.47,1.50
SM37,SM37_micro000,-3.78,0.42,1.50
SM38,SM38_micro000,-2.59,0.53,1.50
SM39,SM39_micro000,-5.06,0.54,1.50
SM40,SM40_micro000,-4.77,0.49,1.50
SM41,SM41_micro000,-3.37,0.38,1.50
SM42,SM42_micro003,-5.75,0.28,1.50
SM43,SM43_micro004,-3.91,0.24,1.50
SM44,SM44_micro000,-3.11,0.25,1.50
SM45,SM45_micro000,-4.91,0.28,1.50
SM46,SM46_micro000,-4.10,0.19,1.50
#
#
# Please list your name, using only UTF-8 characters as described above. The "Participant name:" entry is required.
Participant name:
Bogdan I. Iorga/Oliver Beckstein
#
#
#
# Please list your organization/affiliation, using only UTF-8 characters as described above.
Participant organization:
ICSN, CNRS, Gif-sur-Yvette, France/Arizona State University, USA
#
#
#
# NAME SECTION
#
# Please provide an informal but informative name of the method used.
# The name must not exceed 40 characters.
# The 'Name:' keyword is required as shown here.
Name:
# SAMPL7_logP_MDPOW_GAFF
MD (GAFF/TIP3P)
#
#
#
# COMPUTE TIME SECTION
#
# Please provide the average compute time across all of the molecules.
# For physical methods, report the GPU and/or CPU compute time in hours.
# For empirical methods, report the query time in hours.
# Create a new line for each processor type.
# The 'Compute time:' keyword is required as shown here.
Compute time:
20,000 hours, CPU
#
#
# COMPUTING AND HARDWARE SECTION
#
# Please provide details of the computing resources that were used to train models and make predictions.
# Please specify compute time for training models and querying separately for empirical prediction methods.
# Provide a detailed description of the hardware used to run the simulations.
# The 'Computing and hardware:' keyword is required as shown here.
Computing and hardware:
All the simulations were performed in parallel (8 cores for each simulation) on cluster nodes running with CentOS6 and 4 CPU Intel Xeon E5-4627 v3 @ 2.60GHz.
#
# SOFTWARE SECTION
#
# List all major software packages used and their versions.
# Create a new line for each software.
# The 'Software:' keyword is required.
Software:
Gromacs 2020.3
MDPOW 0.7.0-dev
AmberTools
ACPYPE 0 (2017)
#
# METHOD CATEGORY SECTION
#
# State which method category your prediction method is better described as:
# `Physical (MM)`, `Physical (QM)`, `Empirical`, or `Mixed`.
# Pick only one category label.
# The `Category:` keyword is required.
Category:
Physical (MM)
#
# METHOD DESCRIPTION SECTION
#
# Methodology and computational details.
# Level of details should be roughly equivalent to that used in a publication.
# Please include the values of key parameters with units.
# Please explain how statistical uncertainties were estimated.
#
# If you have evaluated additional microstates, please report their SMILES strings and populations of all the microstates in this section.
# If you used a microstate other than the challenge provided microstate (`SMXX_micro000`), please list your chosen `Molecule ID` (in the form of `SMXX_extra001`) along with the SMILES string in your methods description.
#
# Use as many lines of text as you need.
# All text following the 'Method:' keyword will be regarded as part of your free text methods description.
Method:
Alchemical free energy calculations were performed in explicit
solvent, following the protocol described in [1,2]. Parameters were generated with
Antechamber from AmberTools and ACPYPE for AMBER (GAFF) with the TIP3P
water model. Files were prepared for Gromacs 2020.3. The alchemical data
were analyzed with thermodynamic integration. Errors are reported as
errors of the mean (see [1,2]). The
model uncertainty was estimated on the basis of the results from [2].
[1] Kenney, I. M., Beckstein, O., and Iorga, B. I. (2016) Prediction of cyclohexane-water
distribution coefficients for the SAMPL5 data set using molecular dynamics simulations with
the OPLS-AA force field, J. Comput. Aided Mol. Des. 30(11):1045-1058 DOI: 10.1007/s10822-016-9949-5.
[2] Fan, S., Iorga, B. I., and Beckstein, O. (2020) Prediction of octanol-water partition coefficients for
the SAMPL6-logP molecules using molecular dynamics simulations with OPLS-AA, AMBER and CHARMM force fields,
J Comput Aided Mol Des 34(5):543-560 DOI: 10.1007/s10822-019-00267-z.
#
#
# All submissions must either be ranked or non-ranked.
# Only one ranked submission per participant is allowed.
# Multiple ranked submissions from the same participant will not be judged.
# Non-ranked submissions are accepted so we can verify that they were made before the deadline.
# The "Ranked:" keyword is required, and expects a Boolean value (True/False)
Ranked:
False
Loading

0 comments on commit d0ef169

Please sign in to comment.