Spurious annotations #89

Open
mephenor opened this issue Jan 27, 2020 · 4 comments

@mephenor
Collaborator

During a quick look through the models, I found that H2O is additionally annotated as hydroxide, among other things. The question is whether this is correct or should be fixed.
The problem can be reproduced by, e.g., polishing iCHOv1.json from https://github.com/SBRG/bigg_models_data/tree/master/models.

We need to check whether similar things happen for other species, reactions, etc., and whether the queries need to be made more restrictive or rewritten in a different way.
However, this might be quite time-consuming, as the annotations would need to be checked manually for plausibility.

@mephenor mephenor added the bug label Jan 27, 2020
@mephenor mephenor self-assigned this Jan 27, 2020
@mephenor
Collaborator Author

Upon further investigation, annotations for some species reference the same entity in different organisms. Because the code that retrieves a BiGGId from annotations currently cannot determine the correct compartment, they also reference it across different compartments in some cases.

Two things can be done here:

  • check if queries can be adjusted to only get annotations for the correct organism
  • check if the code to retrieve the BiGGId from annotations can somehow retrieve compartment information
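
As a rough illustration of the two points above, here is a minimal Python sketch of restricting annotation candidates to the model's organism and the species' compartment. It is not the Polisher's code: `Candidate`, `split_species_id`, and `filter_candidates` are hypothetical names, and the `M_<bigg_id>_<compartment>` ID layout is an assumption about how the species IDs are written in the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one annotation lookup result."""
    bigg_id: str      # universal BiGG id, e.g. "h2o"
    organism: str     # organism the source record belongs to
    compartment: str  # compartment code, e.g. "c"; "" when unknown


def split_species_id(species_id):
    """Split an id such as 'M_h2o_c' into ('h2o', 'c').

    Assumes the compartment is the part after the last underscore.
    Returns an empty compartment when no underscore is present.
    """
    core = species_id[2:] if species_id.startswith("M_") else species_id
    base, sep, compartment = core.rpartition("_")
    if not sep:
        return core, ""
    return base, compartment


def filter_candidates(candidates, organism, compartment):
    """Keep candidates for the given organism and compartment.

    Candidates with an unknown compartment are kept; that is a design
    choice of this sketch, not a statement about the real queries.
    """
    kept = []
    for cand in candidates:
        if cand.organism != organism:
            continue
        if cand.compartment and cand.compartment != compartment:
            continue
        kept.append(cand)
    return kept


if __name__ == "__main__":
    _, comp = split_species_id("M_h2o_c")
    found = [
        Candidate("h2o", "Cricetulus griseus", "c"),
        Candidate("h2o", "Escherichia coli", "c"),
        Candidate("h2o", "Cricetulus griseus", "e"),
    ]
    for cand in filter_candidates(found, "Cricetulus griseus", comp):
        print(cand)
```

Whether candidates with an unknown compartment should be kept or dropped is an open choice; dropping them would be stricter but might discard valid annotations.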

@glucksfall

Hi @mephenor,

I think I'm late for the party.

I found the same thing while working on a small task to unify the IDs across different models. H2O and OH- share the same annotation (as do ammonia and ammonium). The problem runs deeper when we consider that some annotations refer specifically to water or to OH- (e.g. KEGG C00001 vs. C01328), while others refer unspecifically to both (e.g. XLYOFNOQVPJJNP-UHFFFAOYSA-M is the InChIKey for both).

Additionally, some annotations refer erroneously to water (e.g. MNXM2 = OH-) or are simply wrong, such as META:OXONIUM (OH3+).
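
One way to spot such pairs automatically is to compare InChIKeys, sketched below in Python. This assumes the standard InChIKey layout (14 characters, 10 characters, then one final character), in which the final character is a protonation flag with `N` meaning neutral and neighbouring letters meaning one proton fewer or more. Under that assumption, a `-M` suffix would denote a singly deprotonated form and `-N` the neutral one. The function names are hypothetical and not part of the Polisher.

```python
def split_inchikey(key):
    """Return (skeleton, protonation) for a standard InChIKey.

    skeleton is the key without its final protonation character;
    protonation is the offset of that character from 'N' (neutral),
    so 'M' gives -1 and 'O' gives +1.
    """
    parts = key.strip().upper().split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a standard InChIKey: {key!r}")
    skeleton = parts[0] + "-" + parts[1]
    protonation = ord(parts[2]) - ord("N")
    return skeleton, protonation


def differ_only_in_protonation(key_a, key_b):
    """True when two keys share a skeleton but not a protonation flag."""
    skel_a, prot_a = split_inchikey(key_a)
    skel_b, prot_b = split_inchikey(key_b)
    return skel_a == skel_b and prot_a != prot_b


if __name__ == "__main__":
    key = "XLYOFNOQVPJJNP-UHFFFAOYSA-M"          # key quoted above
    neutral = "XLYOFNOQVPJJNP-UHFFFAOYSA-N"      # same skeleton, neutral flag
    print(split_inchikey(key))
    print(differ_only_in_protonation(key, neutral))
```

A check like this would only flag candidates for manual review; it cannot decide which database entry is the right one for a given species.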

OK... If you would like, we could collaborate to take a deeper look at the issue. Moreover, I would like to add that some models at BIGG have metabolites with the same ID, same name, same molecular formula, but different charges.

Best regards,
Rodrigo

@mephenor
Collaborator Author

mephenor commented Jul 28, 2021

Hi @glucksfall, and sorry for the very late response. I started a new job, did not get the notification, and haven't had much time to look into this issue. The whole Polisher is currently stuck in limbo, with this being the major issue blocking a new release.

I have not found a solution yet, however, regarding your observation:

Moreover, I would like to add that some models at BIGG have metabolites with the same ID, same name, same molecular formula, but different charges.

After having another look at the database, BiGG only seems to store charge information in the model_compartmentalized_component table, which then references the component table, where the bigg_id and name are stored. So the bigg_id does not discriminate between different charge states, and the obvious solution would be to filter the obtained annotations by charge. However, this would require resolving those annotations and reliably retrieving their charge information.
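
To make the idea of a charge filter concrete, here is a minimal Python sketch. It assumes a resolver that maps an annotation to a charge or to `None` when the charge cannot be determined. That resolver is the hard, unsolved part; here it is faked with a small hand-written table. `filter_by_charge`, the identifier strings, and the table contents are illustrative assumptions, not existing Polisher code or database output.

```python
def filter_by_charge(species_charge, annotations, resolve_charge):
    """Sort annotations into (kept, rejected, unresolved).

    kept:       resolved charge equals the species charge
    rejected:   resolved charge differs
    unresolved: resolver returned None, so no decision is possible
    """
    kept, rejected, unresolved = [], [], []
    for annotation in annotations:
        charge = resolve_charge(annotation)
        if charge is None:
            unresolved.append(annotation)
        elif charge == species_charge:
            kept.append(annotation)
        else:
            rejected.append(annotation)
    return kept, rejected, unresolved


if __name__ == "__main__":
    # Hand-written stand-in for a real resolver (illustrative values only).
    known_charges = {
        "kegg.compound:C00001": 0,   # water
        "kegg.compound:C01328": -1,  # hydroxide
    }
    annotations = [
        "kegg.compound:C00001",
        "kegg.compound:C01328",
        "metanetx.chemical:MNXM2",   # not in the table, stays unresolved
    ]
    kept, rejected, unresolved = filter_by_charge(
        0, annotations, known_charges.get
    )
    print("kept:", kept)
    print("rejected:", rejected)
    print("unresolved:", unresolved)
```

Keeping unresolved annotations in a separate bucket leaves the decision about them (drop, keep, or review manually) explicit instead of silently guessing.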

@Schmoho
Collaborator

Schmoho commented Jul 26, 2022

See here for a list of all the annotations that are added to a minimal water species.
