Alignment: inherent property #84
Comments
Or, if I may be so bold: the opposite way around? What I like about
We read it as "should xyz be in GA4GH Digest keys?", and "what is the GA4GH Digest prefix?". What is useful for our purposes is tying together all of the properties associated with the GA4GH digest into one object (see above examples). That said, if
Standardization is obviously what GA4GH is all about, so that is way more important than my personal preferences. So one way or another, I think your suggestion is a good one! But just to add another argument for I am in any case only speaking for myself. Let's hear what others have to say!
@andrewyatz this is a key GKS / LSG alignment point that would be good to have you weigh in on; commenting here to bump this in your inbox.
Sorry, we did speak about this at the latest refget meeting. At first I didn't quite get the concept being described here, but now I can see it. I also didn't twig that it was VRS 2.0. I'm personally on the side of more alignment here, since the two emerging products are both conveying the same concept. I suspect my comment will be similar to what @sveinugu has said here: alignment is important. I think there is perhaps one difference, though, which on the whole might not be important, and it builds on @sveinugu's point that the inherent property tries to flag the inherent attributes. I guess we're reusing this concept to reflect keys. One is talking about inherent identity, the other about keys for a hash. So overall I'll put my hat on the side of alignment, in the hope we can align even though semantically they're not 100% the same. They will be confusing for others. Unless we duplicate, which sounds worse.
I think that using So, I would propose that both specifications use something like: So in JSON Schema a class definition may look something like:

"$defs": {
  "MyDataClass": {
    "ga4gh": {
      "inherent": ["inherentPropertyA", "inherentPropertyB"]
    },
    "type": "object",
    "properties": {
      "inherentPropertyA": { ... },
      "inherentPropertyB": { ... },
      "otherPropertyC": { ... }
    }
  }
}

Would this work for Seq Col?
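To make the proposal above concrete, here is a minimal Python sketch of how a consumer might use the `ga4gh.inherent` list to decide which properties feed a digest. Everything beyond the schema layout is an assumption made for illustration: the property types, the helper names `inherent_view` and `sketch_digest`, the sorted-key JSON serialization, and the truncated SHA-512 encoding all stand in for whatever each specification actually defines.

```python
import base64
import hashlib
import json

# Hypothetical class definition following the proposed layout.
# The property types are placeholders; the thread leaves them as "{ ... }".
SCHEMA = {
    "$defs": {
        "MyDataClass": {
            "ga4gh": {"inherent": ["inherentPropertyA", "inherentPropertyB"]},
            "type": "object",
            "properties": {
                "inherentPropertyA": {"type": "string"},
                "inherentPropertyB": {"type": "integer"},
                "otherPropertyC": {"type": "string"},
            },
        }
    }
}


def inherent_view(class_def, obj):
    """Keep only the properties listed under ga4gh.inherent."""
    keys = class_def.get("ga4gh", {}).get("inherent", [])
    return {k: obj[k] for k in keys if k in obj}


def sketch_digest(class_def, obj):
    """Digest the inherent view of obj.

    Assumption: compact, sorted-key JSON stands in for the canonical
    serialization, and a 24-byte truncated SHA-512 in base64url stands
    in for the digest algorithm.
    """
    blob = json.dumps(
        inherent_view(class_def, obj), sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return base64.urlsafe_b64encode(hashlib.sha512(blob).digest()[:24]).decode("ascii")


class_def = SCHEMA["$defs"]["MyDataClass"]
first = {"inherentPropertyA": "x", "inherentPropertyB": 1, "otherPropertyC": "foo"}
second = {"inherentPropertyA": "x", "inherentPropertyB": 1, "otherPropertyC": "bar"}

# Changing a non-inherent property leaves the digest unchanged.
print(sketch_digest(class_def, first) == sketch_digest(class_def, second))
```

The point of the sketch is only that the list lives next to the class definition, so the same lookup would work for any class that carries a `ga4gh` block.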
@andrewyatz @nsheff @sveinugu @tcezard our next VRS 2.0 PRC meeting is on 10/31. One of our points of feedback was to document the ga4gh keys attribute (ga4gh/vrs#569); we would like to align with you before presenting this to the PRC. Would it be possible to get a decision on the above proposal before then so we may update our documentation accordingly?
Really the only change we would make, then would be:

description: "A collection of biological sequences."
type: object
properties:
  lengths:
    type: array
    collated: true
    description: "Number of elements, such as nucleotides or amino acids, in each sequence."
    items:
      type: integer
  names:
    type: array
    collated: true
    description: "Human-readable labels of each sequence (chromosome names)."
    items:
      type: string
  sequences:
    type: array
    collated: true
    items:
      type: string
    description: "Refget sequences v2 identifiers for sequences."
  accessions:
    type: array
    collated: true
    items:
      type: string
    description: "Unique external accessions for the sequences"
required:
  - names
  - lengths
ga4gh:
  inherent:
    - lengths
    - names
    - sequences

I can work with this, but I don't like how At the end of the day, we can make it work however, but what is the rationale for the added complexity? I guess to collect all the related terms? Assuming we add

description: "A collection of biological sequences."
type: object
properties:
  lengths:
    type: array
    collated: true
    description: "Number of elements, such as nucleotides or amino acids, in each sequence."
    items:
      type: integer
  names:
    type: array
    collated: true
    description: "Human-readable labels of each sequence (chromosome names)."
    items:
      type: string
  sequences:
    type: array
    collated: true
    items:
      type: string
    description: "Refget sequences v2 identifiers for sequences."
  accessions:
    type: array
    collated: true
    items:
      type: string
    description: "Unique external accessions for the sequences"
  ...
required:
  - names
  - lengths
ga4gh:
  inherent:
    - lengths
    - names
    - sequences
  transient:
    - sorted_name_length_pairs
    - sorted_sequences
  passthru:
    - alias
    - author

Edit: fix yaml typo to correctly specify ga4gh section is an object, not an array.
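The split between local qualifiers (`collated` on each property) and global qualifiers (the lists grouped under `ga4gh`) can be sketched in Python as follows. The schema is transcribed from the YAML above into a dict, with descriptions dropped and the elided `...` section left out; the helper names `collated_properties`, `global_qualifiers`, and `undeclared` are hypothetical, and the rule that listed names should be declared properties is an assumption, not something the thread states.

```python
# Transcription of the schema above; the part elided as "..." is omitted,
# so the transient and passthru names have no property definitions here.
SEQCOL_SCHEMA = {
    "description": "A collection of biological sequences.",
    "type": "object",
    "properties": {
        "lengths": {"type": "array", "collated": True, "items": {"type": "integer"}},
        "names": {"type": "array", "collated": True, "items": {"type": "string"}},
        "sequences": {"type": "array", "collated": True, "items": {"type": "string"}},
        "accessions": {"type": "array", "collated": True, "items": {"type": "string"}},
    },
    "required": ["names", "lengths"],
    "ga4gh": {
        "inherent": ["lengths", "names", "sequences"],
        "transient": ["sorted_name_length_pairs", "sorted_sequences"],
        "passthru": ["alias", "author"],
    },
}


def collated_properties(schema):
    """Local qualifier: names of properties marked collated: true."""
    props = schema.get("properties", {})
    return sorted(name for name, spec in props.items() if spec.get("collated") is True)


def global_qualifiers(schema):
    """Global qualifiers: the lists grouped under the ga4gh object."""
    block = schema.get("ga4gh", {})
    if not isinstance(block, dict):
        # Mirrors the YAML fix noted above: ga4gh is an object, not an array.
        raise TypeError("ga4gh section must be an object, not an array")
    return {name: list(values) for name, values in block.items()}


def undeclared(schema, qualifier):
    """Names listed under a global qualifier that lack a property definition."""
    declared = schema.get("properties", {})
    return [n for n in global_qualifiers(schema).get(qualifier, []) if n not in declared]


print(collated_properties(SEQCOL_SCHEMA))
print(undeclared(SEQCOL_SCHEMA, "inherent"))
```

In this transcription every `inherent` name resolves to a declared property, while the `transient` and `passthru` names do not, simply because their definitions sit in the elided part of the schema.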
What about just namespacing the qualifier names, e.g.
The rationale for the While I think this would also be achievable using a
I wouldn't do it with the local modifiers. But I'm fine with grouping them under
Yeah, not sure I thought it was a good idea myself, just wanted to consider it. I think your suggestion looks good with the global qualifiers. It feels slightly off that local qualifiers are exempt, but that is not a good argument to not organise the global qualifiers in this way.
Alright. Just to be very clear here, there is a small difference between what I proposed in this comment and what followed. I am proposing that @nsheff can you comment? Was
No, that's just a typo. The way I wrote that is actually not correct yaml, and I'll edit that now :)
The VRS 2.0 specification uses a property (ga4ghDigest.keys) to indicate keys that are used in creating VRS computed identifiers. This has the same functionality as the SeqCol inherent attribute.
This has been implemented across 16 VRS classes. The ga4ghDigest object is also used for other digest-related keywords, including ga4ghDigest.prefix for VRS objects that are identifiable.
Here are some examples:
This is an opportunity to align these terms before either standard is finalized. Is it feasible to reuse the VRS ga4ghDigest structure for Sequence Collections in lieu of inherent?
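Since the issue describes `ga4ghDigest.keys` and `inherent` as serving the same function, a reader could treat them interchangeably with a small lookup like the Python sketch below. The two class definitions, the `EX` prefix, the property names, and the helper `digest_keys` are all hypothetical and exist only to show the two layouts side by side.

```python
# Hypothetical class definitions: one in the VRS-style layout described in
# the issue, one in the grouped layout discussed in the comments.
VRS_STYLE = {
    "ga4ghDigest": {"prefix": "EX", "keys": ["propertyA", "propertyB"]},
    "type": "object",
}
SEQCOL_STYLE = {
    "ga4gh": {"inherent": ["propertyA", "propertyB"]},
    "type": "object",
}


def digest_keys(class_def):
    """Return the digest-relevant property names from either layout."""
    if "ga4ghDigest" in class_def:
        return list(class_def["ga4ghDigest"].get("keys", []))
    return list(class_def.get("ga4gh", {}).get("inherent", []))


# Both layouts carry the same information about which properties matter.
print(digest_keys(VRS_STYLE) == digest_keys(SEQCOL_STYLE))
```

A shim like this is only a stopgap; aligning on one structure, as the issue asks, would make it unnecessary.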