Predicate is a command-line version of the PREDICtor of Antiviral TargEts (PREDICATE) application introduced in: journal.pcbi.1010903. The original code can be found at: pymCADRE repository. The main purposes of this application are to:
- Calculate a virus biomass function and integrate it into an SBML metabolic network.
- Select targets through reaction knockout and host-derived enforcement.
Predicate depends on pymCADRE, which requires Python >= 3.8.5.
Predicate can be installed via the pip package manager:
pip install git+https://github.com/alexOarga/predicate
🔗 ➡️ [This example can be found here]
To calculate a virus biomass function, predicate requires a metabolic model in SBML format, a virus genome in FASTA format and the protein sequences in FASTA format.
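The two FASTA inputs are plain text files of `>header` lines followed by sequence lines. As a rough illustration of the information they carry, here is a minimal standard-library sketch that reads FASTA records and tallies residues per protein. It is not predicate's own parser; the function name `read_fasta` and the two inlined records are hypothetical.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-formatted text stream."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Use the first whitespace-delimited token after '>' as the id.
            header = line[1:].split()
            record_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if record_id is not None:
        yield record_id, "".join(chunks)


# Hypothetical two-protein file, inlined so the sketch runs on its own.
example = StringIO(">protA demo\nMKV\nLA\n>protB\nGGA\n")
residue_counts = {rid: Counter(seq) for rid, seq in read_fasta(example)}
print(residue_counts)
```

The record ids recovered this way are the kind of identifier you would expect to reference when assigning copy numbers to individual proteins.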
To run predicate you will need to create a config.yml file. The file should look as follows:
settings:
  variant_name: variant_name # This is the name that will be used to name the output files.
reference_files:
  fasta_sequences: '<Path to the fasta file with the virus genome>'
  protein_sequences: '<Path to the fasta file with the protein sequences>'
metabolic_network:
  model: '<Path to the SBML file with the metabolic model>'
  objective: '<Id of the current biomass reaction of the model>' # The growth produced by this reaction will be compared with the growth produced by the virus biomass function.
To run predicate, run the following command:
predicate config.yml
This will generate a new SBML metabolic network with the virus biomass function integrated into the model. The id of this reaction will be VBOF.
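If you prefer to script these two steps, the following sketch renders a config.yml from Python and then calls the `predicate` command described above. The helper names (`render_config`, `write_config`, `run_predicate`) and the example paths are hypothetical, and the two-space nesting in the template is an assumption about the expected YAML layout.

```python
import subprocess
from pathlib import Path

# Template mirroring the config keys described above (nesting assumed).
CONFIG_TEMPLATE = """\
settings:
  variant_name: {variant_name}
reference_files:
  fasta_sequences: '{genome}'
  protein_sequences: '{proteins}'
metabolic_network:
  model: '{model}'
  objective: '{objective}'
"""


def render_config(variant_name, genome, proteins, model, objective):
    """Return the text of a minimal config.yml for the given inputs."""
    return CONFIG_TEMPLATE.format(
        variant_name=variant_name,
        genome=genome,
        proteins=proteins,
        model=model,
        objective=objective,
    )


def write_config(path, **fields):
    """Write the rendered config to `path` and return that path."""
    path = Path(path)
    path.write_text(render_config(**fields), encoding="utf-8")
    return path


def run_predicate(config_path):
    """Call the `predicate` CLI on a config file; raises if it exits non-zero."""
    subprocess.run(["predicate", str(config_path)], check=True)


# Hypothetical file names and reaction id, for illustration only.
text = render_config(
    variant_name="demo",
    genome="genome.fasta",
    proteins="proteins.fasta",
    model="model.xml",
    objective="biomass_reaction",
)
print(text)
```

Calling `write_config("config.yml", ...)` followed by `run_predicate("config.yml")` is then equivalent to creating the file by hand and running the command above.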
🔗 ➡️ [This example can be found here]
The previous example assumes that each protein has a copy number of 1, which is generally not the case. To calculate the virus biomass function with copy numbers defined for each protein, first create a config.yml file as follows:
settings:
  variant_name: variant_name
reference_files:
  fasta_sequences: '<Path to the fasta file with the virus genome>'
  protein_sequences: '<Path to the fasta file with the protein sequences>'
metabolic_network:
  model: '<Path to the SBML file with the metabolic model>'
  objective: '<Id of the current biomass reaction of the model>'
structural_proteins:
  - { id: 'YOUR_PROTEIN_1_ID', copy_number: <Protein 1 copy number> }
  - { id: 'YOUR_PROTEIN_2_ID', copy_number: <Protein 2 copy number> }
other_copy_numbers:
  Cnp: 1 # Copy number of non-structural proteins
  Cg: 1 # Copy number of the genome
In this file, you can specify the copy number of each protein in the structural_proteins section. For each structural protein, you need to provide the id of the protein and the copy number. The copy numbers of the non-structural proteins (Cnp) and of the genome (Cg) are set in the other_copy_numbers section.
By default, all proteins that do not have a copy number defined in the structural_proteins section are assumed to be non-structural proteins and therefore take the copy number defined in other_copy_numbers.Cnp.
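The default rule can be pictured with a short sketch: each protein's residue counts are scaled by its copy number, and any protein not listed under structural_proteins falls back to Cnp. This is an illustration of the rule as described here, not predicate's actual implementation; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def weighted_residue_counts(proteins, structural_copy_numbers, cnp=1):
    """Sum residue counts over all proteins, each scaled by its copy number.

    proteins: dict mapping protein id -> amino acid sequence.
    structural_copy_numbers: dict mapping protein id -> copy number.
    cnp: copy number used for every protein missing from the dict above.
    """
    totals = Counter()
    for protein_id, sequence in proteins.items():
        copies = structural_copy_numbers.get(protein_id, cnp)
        for residue, count in Counter(sequence).items():
            totals[residue] += copies * count
    return totals


# Toy data: "S" is listed as structural, "nsp1" falls back to Cnp.
proteins = {"S": "MKA", "nsp1": "MAA"}
totals = weighted_residue_counts(proteins, {"S": 3}, cnp=1)
print(dict(totals))
```

Raising Cnp scales the contribution of every unlisted protein at once, which is why only the structural proteins need individual entries.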
Once you have created the config.yml file, you can run predicate as follows:
predicate config.yml
As before, this will generate a new SBML metabolic network with the virus biomass function integrated into the model. The id of this reaction will be VBOF.
🔗 ➡️ [This example can be found here]
To run host-derived enforcement on the generated virus biomass model, create a config file as before and add the line run_hde: True to the settings section:
settings:
  variant_name: variant_name
  run_hde: True
  ...
🔗 ➡️ [An example containing all the following sections can be found here]
By default, Predicate uses the metabolite identifiers of the BiGG database. If your SBML model uses different identifiers, you need to manually specify the IDs of all amino acids, nucleotides (ATP, CTP, GTP, UTP) and other energy requirements (ADP, H, H2O, PI, PPI). This can be done by adding a metabolites_ids section inside the metabolic_network section of the config.yml file:
metabolic_network:
  model: <your path>
  objective: <your biomass objective reaction>
  metabolites_ids:
    # energy requirements
    - {'name': 'ADP', 'metabolite': 'adp_c'}
    - {'name': 'H_atom', 'metabolite': 'h_c'}
    - {'name': 'H2O', 'metabolite': 'h2o_c'}
    - {'name': 'PI', 'metabolite': 'pi_c'}
    - {'name': 'PPI', 'metabolite': 'ppi_c'}
    # nucleotides
    - {'name': 'ATP', 'metabolite': 'atp_c'}
    - {'name': 'CTP', 'metabolite': 'ctp_c'}
    - {'name': 'GTP', 'metabolite': 'gtp_c'}
    - {'name': 'UTP', 'metabolite': 'utp_c'}
    # amino acids
    - {'name': 'A', 'metabolite': 'ala__L_c'}
    - {'name': 'R', 'metabolite': 'arg__L_c'}
    - {'name': 'N', 'metabolite': 'asn__L_c'}
    - {'name': 'D', 'metabolite': 'asp__L_c'}
    - {'name': 'C', 'metabolite': 'cys__L_c'}
    - {'name': 'E', 'metabolite': 'glu__L_c'}
    - {'name': 'Q', 'metabolite': 'gln__L_c'}
    - {'name': 'G', 'metabolite': 'gly_c'}
    - {'name': 'H', 'metabolite': 'his__L_c'}
    - {'name': 'I', 'metabolite': 'ile__L_c'}
    - {'name': 'L', 'metabolite': 'leu__L_c'}
    - {'name': 'K', 'metabolite': 'lys__L_c'}
    - {'name': 'M', 'metabolite': 'met__L_c'}
    - {'name': 'F', 'metabolite': 'phe__L_c'}
    - {'name': 'P', 'metabolite': 'pro__L_c'}
    - {'name': 'S', 'metabolite': 'ser__L_c'}
    - {'name': 'T', 'metabolite': 'thr__L_c'}
    - {'name': 'W', 'metabolite': 'trp__L_c'}
    - {'name': 'Y', 'metabolite': 'tyr__L_c'}
    - {'name': 'V', 'metabolite': 'val__L_c'}
For example, if ADP is represented in your model as my_adp, you can change the line - {'name': 'ADP', 'metabolite': 'adp_c'} to - {'name': 'ADP', 'metabolite': 'my_adp'}.
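When many identifiers differ, editing the list by hand gets error-prone. The sketch below generates the metabolites_ids entries from a table of defaults plus your overrides. The helper `metabolite_id_lines` is hypothetical, and `DEFAULT_IDS` holds only a subset of the entries listed above, so extend it before real use.

```python
# Subset of the BiGG-style defaults listed above; extend as needed.
DEFAULT_IDS = {
    "ADP": "adp_c",
    "H_atom": "h_c",
    "H2O": "h2o_c",
    "PI": "pi_c",
    "PPI": "ppi_c",
    "ATP": "atp_c",
}


def metabolite_id_lines(overrides, defaults=DEFAULT_IDS):
    """Return one YAML list item per metabolite, with overrides applied."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Catch typos such as 'ADp' before they reach the config file.
        raise KeyError(f"unknown metabolite names: {sorted(unknown)}")
    merged = {**defaults, **overrides}
    return [
        f"- {{'name': '{name}', 'metabolite': '{metabolite}'}}"
        for name, metabolite in merged.items()
    ]


lines = metabolite_id_lines({"ADP": "my_adp"})
print("\n".join(lines))
```

The printed lines can be pasted under metabolites_ids, indented to match the rest of the metabolic_network section.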
If you want to add additional metabolites to the virus biomass function, you can do so by adding an additional_metabolites section inside the metabolic_network section of the config.yml file:
metabolic_network:
  model: <your path>
  objective: <your biomass objective reaction>
  additional_metabolites:
    - {'metabolite': 'pchol_hs_c', 'stoichiometry': 0.038400}
    - {'metabolite': 'pe_hs_c', 'stoichiometry': 0.014566}
    - {'metabolite': 'pail_hs_c', 'stoichiometry': 0.006621}
    - {'metabolite': 'ps_hs_c', 'stoichiometry': 0.001986}
    - {'metabolite': 'chsterol_c', 'stoichiometry': 0.000012}
    - {'metabolite': 'sphmyln_hs_c', 'stoichiometry': 0.001986}
The above example adds six more metabolites to the virus biomass function.
To change the number of grams of nucleotide per mole of virus, you can add a nucleotides_per_virus_mol section to the config.yml file:
nucleotides_per_virus_mol: {
  A: 135.13,
  U: 112.09,
  G: 151.13,
  C: 111.1
}
In the above example, the number of grams of nucleotide A per virus mole is 135.13, the number of grams of nucleotide U per virus mole is 112.09, the number of grams of nucleotide G per virus mole is 151.13 and the number of grams of nucleotide C per virus mole is 111.1.
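To give a feel for how such per-nucleotide weights could enter a calculation, the sketch below multiplies each nucleotide's count in a genome by its weight and by the genome copy number Cg. This is a simplified assumption made for illustration; predicate's actual formula may include further terms. The function name and the toy sequence are hypothetical, and T is mapped to U on the assumption that the genome FASTA may use the DNA alphabet.

```python
from collections import Counter

# Weights taken from the example config above.
WEIGHTS = {"A": 135.13, "U": 112.09, "G": 151.13, "C": 111.1}


def nucleotide_grams(genome, weights, cg=1):
    """Return grams contributed by each nucleotide: count * Cg * weight."""
    # Assumption: treat T in the input as U so DNA-alphabet files also work.
    counts = Counter(genome.upper().replace("T", "U"))
    return {nt: counts[nt] * cg * weight for nt, weight in weights.items()}


# Toy genome with one A, one U, two G and one C.
grams = nucleotide_grams("AUGGC", WEIGHTS)
print(grams)
```

Changing a value in nucleotides_per_virus_mol would rescale the corresponding entry proportionally under this simplified view.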
To change the number of ATP molecules required for the polymerization of a nucleotide, or to change the number of molecules of PPI required to bond 2 nucleotides, you can add an other_parameters section to the config.yml file:
other_parameters:
  kATP: 4
  kPPi: 1
In the above example, 4 ATP molecules are required for the polymerization of a nucleotide and 1 PPI molecule is required to bond 2 nucleotides.
Although this is rarely needed, you can change the molecular weights of the amino acids by adding an amino_acids_and_weights section to the config.yml file:
amino_acids_and_weights:
- { amino_acid: 'A', molecular_weight: 89.1 }
- { amino_acid: 'R', molecular_weight: 174.2 }
- { amino_acid: 'N', molecular_weight: 132.1 }
- { amino_acid: 'D', molecular_weight: 133.1 }
- { amino_acid: 'C', molecular_weight: 121.2 }
- { amino_acid: 'E', molecular_weight: 147.1 }
- { amino_acid: 'Q', molecular_weight: 146.2 }
- { amino_acid: 'G', molecular_weight: 75.1 }
- { amino_acid: 'H', molecular_weight: 155.2 }
- { amino_acid: 'I', molecular_weight: 131.2 }
- { amino_acid: 'L', molecular_weight: 131.2 }
- { amino_acid: 'K', molecular_weight: 146.2 }
- { amino_acid: 'M', molecular_weight: 149.2 }
- { amino_acid: 'F', molecular_weight: 165.2 }
- { amino_acid: 'P', molecular_weight: 115.1 }
- { amino_acid: 'S', molecular_weight: 105.1 }
- { amino_acid: 'T', molecular_weight: 119.1 }
- { amino_acid: 'W', molecular_weight: 204.2 }
- { amino_acid: 'Y', molecular_weight: 181.2 }
- { amino_acid: 'V', molecular_weight: 117.1 }
In the above example, the molecular weight of amino acid A is 89.1, the molecular weight of amino acid R is 174.2, etc.
Mutation visualization and analysis as in the original Predicate code is not implemented yet.
This code was taken directly from the pymCADRE repository and shares the same license. Please see the LICENSE file for details.