feat: single ec-code annotation per reaction #319

edkerk · 2022-06-05T17:05:09Z

Description of the issue:

Many reactions either have no ec-code or are annotated with multiple ec-codes. Instead, each reaction should be annotated with one ec-code, and if no full ec-code (with 4 sets of digits) can be defined, wild-cards can be given.

This might involve manual curation, although parsing gene associations through Uniprot might be helpful.

These single ec-codes can then be used when constructing GECKO models.

Expected feature/value/output:

model.eccodes(:) should give a single ec-code entry for each reaction.

Current feature/value/output:

>> model.eccodes([1,3,6,17,22]) % Some random example reactions

ans =

  5×1 cell array

    {'1.1.2.4;1.1.99.-'                              }
    {'1.1.1.4'                                       }
    {0×0 char                                        }
    {'1.14.13.-;2.1.1.114;2.1.1.201;2.1.1.64;2.7.-.-'}
    {0×0 char                                        }
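To get an overview of how many reactions fall into each case, the entries can be classified as missing, single, or multiple. The following is a minimal sketch in Python (not MATLAB), assuming the `eccodes` field has been exported as a list of strings in which multiple codes are separated by `;` and empty cells become empty strings. The helper names `split_ec` and `classify` are hypothetical and not part of the model or of any toolbox.

```python
from collections import Counter


def split_ec(entry):
    """Split a ';'-separated annotation string into individual EC codes."""
    return [code.strip() for code in entry.split(";") if code.strip()]


def classify(entry):
    """Return 'missing', 'single' or 'multiple' for one reaction's annotation."""
    n_codes = len(split_ec(entry))
    if n_codes == 0:
        return "missing"
    return "single" if n_codes == 1 else "multiple"


# The five example entries from the output above; empty strings stand in
# for the 0x0 char cells.
eccodes = [
    "1.1.2.4;1.1.99.-",
    "1.1.1.4",
    "",
    "1.14.13.-;2.1.1.114;2.1.1.201;2.1.1.64;2.7.-.-",
    "",
]

summary = Counter(classify(entry) for entry in eccodes)
print(dict(summary))
```

Only the reactions classified as `single` already meet the expected output; the other two groups are the ones that need curation.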

I hereby confirm that I have:

Tested my code with all requirements for running the model
Done this analysis in the main branch of the repository
Checked that a similar issue does not exist already
If needed, asked first in the Gitter chat room about the issue

hongzhonglu · 2022-06-06T07:12:53Z

This is a good idea. It should also be noted that reactions are mainly assigned based on sequence, not on EC number, as one EC number can map to multiple reactions (in some cases). Currently, the EC number for each protein can be found in UniProt or SGD. Previously I found that the EC number annotations from different databases are not the same. We can still automate this step by referring to a standard reaction database like Rhea, MetaNetX or ModelSEED.

edkerk · 2022-06-06T14:03:32Z

But by definition, there should really be only one EC number per reaction, although perhaps with some wildcards if the exact substrate or cofactor has not been given a specific EC number (in that case, something like EC1.4.2.-). But indeed, we can automatically pull these from databases, probably MetaNetX will be useful. Manual curation will be required to resolve when we find multiple EC numbers for the same reaction.
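The wildcard convention can be made concrete with a small check. This is a minimal sketch in Python, assuming codes are written without the `EC` prefix, always have four dot-separated fields, and use `-` only for trailing fields (so `1.4.2.-` is accepted but `1.-.2.3` is not). Preliminary codes with an `n` field are not handled. The function names `is_valid_ec` and `ec_matches` are hypothetical.

```python
import re

DIGITS = re.compile(r"[0-9]+")


def is_valid_ec(code):
    """Check that code is a full or partial EC number with trailing wildcards."""
    fields = code.split(".")
    if len(fields) != 4 or not DIGITS.fullmatch(fields[0]):
        return False
    seen_wildcard = False
    for field in fields[1:]:
        if field == "-":
            seen_wildcard = True
        elif DIGITS.fullmatch(field) and not seen_wildcard:
            continue
        else:
            # Either not a number, or a number following a wildcard.
            return False
    return True


def ec_matches(pattern, code):
    """Check whether a specific code is covered by a (partial) EC pattern."""
    return all(
        p == "-" or p == c
        for p, c in zip(pattern.split("."), code.split("."))
    )


print(is_valid_ec("1.4.2.-"))
print(is_valid_ec("1.-.2.3"))
print(ec_matches("2.1.1.-", "2.1.1.114"))
```

A check like `ec_matches` could help when comparing a partial code in the model against a fully specified code pulled from a database.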

mihai-sysbio · 2022-06-08T09:44:59Z

I fully support/encourage automatic mapping of EC numbers. A good way to map to MNX would be via Rhea IDs; however, these are missing from the model annotation, so then I would suggest using the BridgeDB API.

hongzhonglu · 2022-06-09T09:49:51Z

I just checked the EC numbers for each reaction in yeast-GEM. Most reactions have one unique EC number. About 490 reactions have multiple EC numbers. It would be wonderful to have an automatic way to curate these EC numbers.

edkerk · 2022-06-09T10:07:00Z

Indeed, there are many that are already fine.

I'm not 100% certain how automated curation would work, because many gene/protein based approaches (e.g. BridgeDB, Uniprot) are not ideal as it is not uncommon for enzymes to have been assigned multiple ec-numbers (due to them being multifunctional enzymes), or because they are part of a complex where subunits perform dedicated functions. GECKO is however doing this based on Uniprot and KEGG.

But probably a mixture of using GECKO's function, supplemented with looking at other annotations, is the way to go. Most likely we'd have to manually define EC numbers with wild-cards for those reactions that have no specific EC number. We want the EC numbers to exactly match the reaction: if it doesn't match, use a wild-card, until all reactions (except exchange and non-active transport) have been assigned a (partial) EC number.
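One way to picture this mixed approach is as a triage step per reaction. This is only a sketch in Python under stated assumptions: the candidate EC numbers from each source have already been collected into sets, the source names used below are placeholders, and the function `triage` and its status labels are hypothetical. It does not call GECKO or any database.

```python
def triage(candidates_by_source):
    """Decide what to do with one reaction given candidate EC codes per source.

    candidates_by_source maps a source name to a set of EC codes.
    Returns a (status, value) tuple.
    """
    all_codes = set().union(*candidates_by_source.values())
    if not all_codes:
        # No source offers a code: a wildcard code has to be defined by hand.
        return ("needs_wildcard", None)
    if len(all_codes) == 1:
        # Every source that has an opinion gives the same single code.
        return ("accept", next(iter(all_codes)))
    # Conflicting or multiple codes: leave the choice to manual curation.
    return ("manual_curation", sorted(all_codes))


# Illustrative inputs only; the source names are placeholders.
print(triage({"source_a": {"1.1.1.4"}, "source_b": {"1.1.1.4"}}))
print(triage({"source_a": {"1.1.2.4"}, "source_b": {"1.1.99.-"}}))
print(triage({"source_a": set(), "source_b": set()}))
```

Exchange and non-active transport reactions would be filtered out before this step, since they are not expected to get an EC number.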

This all does not have very high priority, but would be good to address.

edkerk self-assigned this Jun 7, 2022
