Metadata fbc3 #1237
base: devel
Okay. I'll keep that in mind for next issues. Is this what you meant with the creator for a simpler API? |
Yep, pretty much. I would probably allow the dates to be provided directly as string and have init convert it to datetime. And maybe have some things initialized to default dates. For instance, the creation date could just be the current time by default. Might also be helpful if the creator info can be provided in the configuration and is then used by default. |
History accepts str or datetime as a format for the dates. Creation time now() by default - I can do that. |
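A minimal sketch of the "str or datetime" handling discussed here, assuming ISO 8601 strings. The helper name `to_datetime` is hypothetical and not taken from the PR; it deliberately leaves a missing date as `None` instead of inventing a timestamp.

```python
from datetime import datetime
from typing import Optional, Union


def to_datetime(value: Optional[Union[str, datetime]]) -> Optional[datetime]:
    """Return a datetime for an ISO 8601 string, or pass a datetime through.

    None stays None, so no timestamp is created when the caller gives none.
    (Hypothetical helper for illustration, not the PR's implementation.)
    """
    if value is None or isinstance(value, datetime):
        return value
    if isinstance(value, str):
        # Parses strings such as "2022-07-13T11:54:00+00:00".
        return datetime.fromisoformat(value)
    raise TypeError(f"Expected str or datetime, got {type(value).__name__}")


created = to_datetime("2022-07-13T11:54:00")
```

Whether a missing date should fall back to the current time is a separate policy choice; keeping it out of the conversion helper makes that choice explicit at the call site.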
sbml document history is saved added relevant tests
Hi all, For clarification with the key/value store. This is basically the new fbc-version3 package which just needs a second implementation (which is this implementation) to be officially accepted. I.e. this is basically official as soon as we merge this in develop. So the key-value pair has to stay in. A lot of discussion here. Are there any open things which have to be decided or is everything down to implementation details at this point? |
Hi Matthias,
As far as I remember, there are still some open discussions
1) Should we rename CVTerm to something like standardized?
2) Should creators have only one name, or stay with family name and given
name? Should one name be an option?
#1237 (comment)
3) Renaming the qualifiers, see
#1237 (comment). I
think they should stay as is.
4) What should the JSON schema include? See
#1237 (comment) and
#1240 (reply in thread)
The rest is just tidying, code implementation. I'm still working on it, but
if you have comments now that would be helpful.
Thank you,
Uri David
…On Wed, Jul 13, 2022 at 4:22 AM Matthias König ***@***.***> wrote:
Hi all,
I could review/update the code.
For clarification with the key/value store. This is basically the new
fbc-version3 package which just needs a second implementation (which is
this implementation) to be officially accepted. I.e. this is basically
official as soon as we merge this in develop. So the key-value pair has to
stay in.
A lot of discussion here. Are there any open things which have to be
decided or is everything down to implementation details at this point?
Best Matthias
|
Please don't set any current times! This will create diffs on your SBML models and model files every time you run your scripts resulting in large git diffs! I.e. don't set a current time as default. Perhaps an option for that, but please not the default behavior. The HistoryElement can be set on all core objects, so having diffs on all objects every time a script is executed is a bad idea. |
Right now, the document (libsbml.SBMLDocument), is set to be created at
current time if there is no document creation time.
Nothing else is created, because having diffs is a bad idea. No entities,
and not the libsbml.Model entity either.
I can remove this, but this doesn't seem harmful to me.
…On Wed, Jul 13, 2022 at 11:54 AM Matthias König ***@***.***> wrote:
Yep, pretty much. I would probably allow the dates to be provided directly
as string and have init convert it to datetime. And maybe have some things
initialized to default dates. For instance, the creation date could just be
the current time by default. Might also be helpful if the creator info can
be provided in the configuration and is then used by default.
Please don't set any current times! This will create diffs on your SBML
models and model files every time you run your scripts resulting in large
git diffs! I.e. don't set a current time as default. Perhaps an option for
that, but please not the default behavior. The HistoryElement can be set on
all core objects, so having diffs on all objects every time a script is
executed is a bad idea.
|
Okay. I think if this is what the community wants to do, we should define a
standard of
- What is a valid cobrapy ID?
- How to deal with weird characters like :, that are currently converted to
ASCII code. We can decide to have something like SBML_COLON, SBML_SEMICOLON
(it might exist, I'm not sure)
And get it done **before** cobrapy 1.0
@cdiener @Midnighter @synchon - any comments?
…On Wed, Jul 13, 2022 at 11:48 AM Matthias König ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/cobra/io/schema_v2.json
<#1237 (comment)>:
> + }
+ },
+ "required": [
+ "key"
+ ],
+ "additionalProperties": false
+ }
+ }
+ }
+ }
+ },
+ "type": "object",
+ "properties": {
+ "id": {
+ "type": "string",
+ "pattern": "^[a-zA-Z|_][a-zA-Z0-9_]*$"
Yes, SIds are valid SBML ids. This can be checked very easily with a regex.
Like mentioned above the only id changes are on
- JSON import (IDs are fixed to be valid SIds)
- MAT import (IDS are fixed to be valid SIds)
- OTHER_FORMAT_NOT_SBML (IDS are fixed to be valid SIds)
There would be a single function fixing ids, e.g. fix_invalid_ids used in
both/all importers to have a consistent behavior across formats. Nothing
would be replaced fixed on any exports, because SIds are valid JSON and MAT
Ids.
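As a rough sketch of the idea above, assuming the SId grammar from the SBML specification (a letter or underscore followed by letters, digits, or underscores): `is_valid_sid` checks an ID with a regex, and `fix_invalid_id` is a hypothetical single repair function an importer could call. The `__<code>__` replacement scheme is an assumption chosen to mirror the ASCII-code conversion mentioned in this thread, not the PR's actual behavior.

```python
import re

# SId grammar: a letter or underscore, then letters, digits, or underscores.
SID_PATTERN = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_sid(identifier: str) -> bool:
    """Return True if the whole identifier matches the SId grammar."""
    return SID_PATTERN.fullmatch(identifier) is not None


def fix_invalid_id(identifier: str) -> str:
    """Hypothetical repair: encode each invalid character as __<code point>__."""
    fixed = re.sub(
        r"[^a-zA-Z0-9_]", lambda m: f"__{ord(m.group())}__", identifier
    )
    # An SId may not start with a digit (or be empty), so prefix an underscore.
    if not re.match(r"[a-zA-Z_]", fixed):
        fixed = "_" + fixed
    return fixed


print(is_valid_sid("glc__D_e"))   # valid as-is
print(fix_invalid_id("a:b"))      # colon replaced by its code point
```

Because the repair only runs on import of non-SBML formats, IDs that are already valid SIds pass through unchanged.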
|
Key-value pairs will be in there. They are super useful and a regularly requested feature. The current date was only supposed to be the default value for the constructor. So it would only be set the first time the history is created and only if the user does not specify a date. EDIT: see reply below. Only set on writing if no date is set. The other discussion was whether we should limit the format of IDs for the v2 JSON schema. I don't think there is a good solution here, since the SBML and COBRAPY ID requirements are not compatible. So it will come down to whether we want the JSON schema 100% SBML compliant, with the limitation that not all COBRAPY models can be saved to that format. The rest is just naming of classes etc. to make it a bit more accessible to devs and users without deep knowledge of SBML abstractions. |
Right now, History() is called as part of every annotation, so modifying any default there would lead to every entity having a created date. That's a bad idea.
That can be removed or stay. |
I would highly prefer to have SBML SIds as valid cobrapy ids.
The only exceptions are some databases such as BiGG (no longer maintained, which also provides SBML with valid IDs). These would allow:
I strongly advocate for this solution: just enforce valid SIds and move id_replacement into helper functions, which would be applied on import for formats other than SBML (and serve as a toolbox for people to fix their invalid models). edit: fixed regex |
@cdiener @matthiaskoenig |
Yes, the documentation is better. But please use a string enum for such enums.
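A string enum in Python can be sketched as below. The class name `Qualifier` and the three members shown are an illustrative assumption, not the list used in the PR.

```python
from enum import Enum


class Qualifier(str, Enum):
    """Illustrative subset of qualifier names (assumed, not the PR's list)."""

    BQB_IS = "bqb_is"
    BQB_IS_VERSION_OF = "bqb_isVersionOf"
    BQM_IS = "bqm_is"


# Members compare equal to plain strings and can be looked up by value.
qualifier = Qualifier("bqb_is")
print(qualifier is Qualifier.BQB_IS)
print(qualifier == "bqb_is")
```

Mixing in `str` keeps existing code that passes plain strings working, while an unknown value raises `ValueError` on lookup.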
|
I think this is hard because of backwards compatibility as mentioned above, and it would be a bit inconsistent, for instance when reading models with the JSON v1 schema. But I do see your point and agree that IDs in all the SBML-like formats should be read as is and that there should be helpers to convert them to that format. Point 2 is not fully correct since there are some IDs that are SBML-compliant but don't work in optlang in some solver interfaces (IDs longer than 255 chars in GLPK, or metabolites named "st" in Gurobi, for instance). So basically we push the defaults toward SBML-compliant IDs without breaking other formats such as Matlab etc. |
…ist - need examples (TODO) Qualifiers as string enum CVTermList will output list of dicts, not a dict update_pickles.py minor renaming changed sbml.py according to previous changes. cleaned up processing of history in sbml.py changed test_metadata.py according to these changes
improve equality of History object some tidying of metadata.py and adding __eq__ function added test_old_style_annotation to test_metadata.py
updated some data and tests
Some minor modifications and comments on sbml and test_sbml.py
Updated version of #988. Tried to simplify and tidy the code in addition to merging. Modified according to @cdiener requests
Original description of #988 follows
Metadata Classes: A separate directory for handling metadata information is made inside cobra/core directory. Every object derived from SBase can have meta information like annotation (CVTerms), notes, history attached to it.
The CVTerms class stores externally linked resources for each component derived from SBase. It maintains the new-format and the old-format annotation simultaneously and keeps the two synchronized: changing one modifies the other accordingly. The class can handle any type of annotation data, including nested and alternative annotations. It can read both the old and the new annotation format from JSON and other formats. When the model is written back, the new format is used because it contains the complete data, organized in the same way as in SBML.
The History class, which stores the history and validates dates, is now attached to each component derived from SBase.
The KeyValuePair class stores key-value pairs as defined by fbc-v3.
The last three metadata objects (i.e CVTerms, History, KeyValuePair) are present inside a single attribute of SBase (Object) class and can be accessed via object.annotation.cvterms, object.annotation.history and object.annotation.key_value_data attributes. Calling simply the annotation attribute (object.annotation) will return the annotation data in old format (making it backward compatible).
Group to JSON: The support of the group package is extended to JSON.
JSON schema v2: The version2 of JSON schema has been added which defines the new format annotation, history, key-value pair, notes, group package data, user-defined constraints data and basic SBML info.
@akaviaLab comment - not sure if the v2 schema works or is done. Seems unclear from the code.
Issues Fixed
fixes issue #954: The support of group package has been extended to JSON and other formats.
fixes issue #856: Infinity values can now be stored as bounds.
fixes issue #810: The sbml_info storing basic information of SBML is written to JSON to store the basic SBML document information like packages, level, version, notes, annotation attached to the SBML component etc.
fixes issue #736: If an annotation is present as a list of lists, it is first converted and then read.
fixes issue #706: Single annotation resources are now put into a string. When reading from JSON, a single resource given as a string is first converted to a list and then read into the model.
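The string-to-list fix described for #706 can be sketched roughly as follows. The function name `normalize_annotation` and the sample keys are hypothetical; the PR's own reader may handle more cases.

```python
from typing import Dict, List, Union


def normalize_annotation(
    annotation: Dict[str, Union[str, List[str]]]
) -> Dict[str, List[str]]:
    """Wrap bare string resources in a list so every value is a list.

    Hypothetical helper for illustration, not the PR's implementation.
    """
    return {
        key: [value] if isinstance(value, str) else list(value)
        for key, value in annotation.items()
    }


old_style = {"sbo": "SBO:0000627", "bigg.reaction": ["PGI"]}
print(normalize_annotation(old_style))
```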
fixes issue #684: The complete metadata structure (CVTerms, Notes, History) has been redesigned with backward compatibility. Old format data is read, fixed and then written in new format metadata structures. The fbc-v3 "KeyValuePair" class is also a part of this new format annotation. Its corresponding class and its support to JSON and other format has been added. However, the SBML parser for it has to be updated when libsbml adds support for fbc-v3.
fixes issue #937: The annotation format has been updated, which is backward compatible too.
JSON schema v1: The JSON schema version 1 is modified to resolve some pre-existing issues.
Tests
Tests for all the newly implemented features have been added to check their functionality. A few old tests have also been modified accordingly. Some tests that were initially marked 'xfail' now pass due to the modified formats.