Commit

Update vrs (#49)
* update MSP / submodule
* update submodule to vrs/common ballot.
larrybabb authored Sep 9, 2024
1 parent 2a8f4ef commit 02d3e20
Showing 17 changed files with 116 additions and 113 deletions.
2 changes: 1 addition & 1 deletion .requirements.txt
@@ -2,6 +2,6 @@ pytest
sphinx ~= 7.2
sphinx-rtd-theme ~= 1.2
pyyaml
-ga4gh.gks.metaschema==0.3.0b11
+ga4gh.gks.metaschema==0.3.0b12
jsonschema
referencing
88 changes: 45 additions & 43 deletions schema/cat-vrs/cat-vrs-source.yaml
@@ -20,8 +20,9 @@ $defs:

CategoricalVariation:
inherits: gks.core-im:DomainEntity
+maturity: draft
description: >-
-A representation of a categorically-defined domain for variation, in which individual
+A representation of a categorically-defined domain for variation, in which individual
contextual variation instances may be members of the domain.
oneOf:
- $ref: "#/$defs/CanonicalAllele"
@@ -37,19 +38,19 @@ $defs:
this categorical variant.
items:
oneOf:
-- $ref: "/ga4gh/schema/vrs/2.x/json/Variation"
-- $ref: "/ga4gh/schema/gks-common/1.x/data-types/json/IRI"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Variation"
+- $ref: "/ga4gh/schema/gks-common/1.0.0-ballot.2024-08.1/data-types/json/IRI"

ProteinSequenceConsequence:
-maturity: draft
type: object
inherits: CategoricalVariation
+maturity: draft
description: >-
A change that occurs in a protein sequence as a result of genomic changes. Due to the degenerate nature
of the genetic code, there are often several genomic changes that can cause a protein sequence consequence.
The protein sequence consequence, like a :ref:`CanonicalAllele`, is defined by an
-`Allele <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#variation>` that is representative of a collection
-of congruent Protein Alleles that share the same altered codon(s).
+`Allele <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/MolecularVariation/Allele.html#>`
+that is representative of a collection of congruent Protein Alleles that share the same altered codon(s).
properties:
type:
extends: type
@@ -58,21 +59,21 @@ $defs:
description: 'MUST be "ProteinSequenceConsequence"'
definingContext:
oneOf:
-- $ref: "/ga4gh/schema/vrs/2.x/json/Allele"
-- $ref: "/ga4gh/schema/gks-common/1.x/data-types/json/IRI"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Allele"
+- $ref: "/ga4gh/schema/gks-common/1.0.0-ballot.2024-08.1/data-types/json/IRI"
description: >-
-The `VRS Allele <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#allele>`_
-object that is congruent with (projects to the same codons) as alleles on other protein reference
+The `Allele <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/MolecularVariation/Allele.html#>`_
+object that is congruent with (projects to the same codons) as alleles on other protein reference
sequences.
required:
- definingContext

CanonicalAllele:
-maturity: draft
inherits: CategoricalVariation
+maturity: draft
description: >-
-A canonical allele is defined by an `Allele <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#variation>`
-that is representative of a collection of congruent Alleles, each of which depict the same nucleic acid
+A canonical allele is defined by an `Allele <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/MolecularVariation/Allele.html#>`_
+that is representative of a collection of congruent Alleles, each of which depict the same nucleic acid
change on different underlying reference sequences. Congruent representations of an Allele often exist
across different genome assemblies and associated cDNA transcript representations.
type: object
@@ -84,25 +85,25 @@ $defs:
description: 'MUST be "CanonicalAllele"'
definingContext:
oneOf:
-- $ref: "/ga4gh/schema/vrs/2.x/json/Allele"
-- $ref: "/ga4gh/schema/gks-common/1.x/data-types/json/IRI"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Allele"
+- $ref: "/ga4gh/schema/gks-common/1.0.0-ballot.2024-08.1/data-types/json/IRI"
description: >-
-The `VRS Allele <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#allele>`_
+The `Allele <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/MolecularVariation/Allele.html#>`_
object that is congruent with variants on alternate reference sequences.
required:
- definingContext

CategoricalCnv:
-maturity: draft
type: object
inherits: CategoricalVariation
+maturity: draft
description: >-
-A categorical variation domain is defined first by a sequence derived from a canonical `Location
-<https://vrs.ga4gh.org/en/2.0/terms_and_model.html#Location>`_ , which is representative of
-a collection of congruent Locations. The change or count of this sequence is also described, either
-by a numeric value (e.g. "3 or more copies") or categorical representation (e.g. "high-level gain").
-Categorical CNVs may optionally be defined by rules specifying the location match characteristics for
-member CNVs.
+A categorical variation domain is defined first by a sequence derived from a canonical `SequenceLocation
+<https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/LocationAndReference/SequenceLocation.html>`_ ,
+which is representative of a collection of congruent Locations. The change or count of this sequence is
+also described, either by a numeric value (e.g. "3 or more copies") or categorical representation
+(e.g. "high-level gain"). Categorical CNVs may optionally be defined by rules specifying the location
+match characteristics for member CNVs.
properties:
type:
extends: type
@@ -111,43 +112,43 @@ $defs:
description: 'MUST be "CategoricalCnv"'
location:
oneOf:
-- $ref: "/ga4gh/schema/vrs/2.x/json/SequenceLocation"
-- $ref: "/ga4gh/schema/gks-common/1.x/data-types/json/IRI"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/SequenceLocation"
+- $ref: "/ga4gh/schema/gks-common/1.0.0-ballot.2024-08.1/data-types/json/IRI"
description: >-
-A `VRS Location <https://vrs.ga4gh.org/en/2.x/concepts/location/SequenceLocation.html>`_
-object that represents a sequence derived from that location, and is congruent with locations
+A `SequenceLocation <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/LocationAndReference/SequenceLocation.html>`_
+object that represents a sequence derived from that location, and is congruent with locations
on alternate reference sequences.
locationMatchCharacteristic:
type: string
enum: ['exact', 'partial', 'subinterval', 'superinterval']
description: >-
-The characteristics of a valid match between a contextual CNV location (the query) and the
-Categorical CNV location (the domain), when both query and domain are represented on the same
-reference sequence. An `exact` match requires the location of the query and domain to be identical.
+The characteristics of a valid match between a contextual CNV location (the query) and the
+Categorical CNV location (the domain), when both query and domain are represented on the same
+reference sequence. An `exact` match requires the location of the query and domain to be identical.
A `subinterval` match requires the query to be a subinterval of the domain. A `superinterval` match
requires the query to be a superinterval of the domain. A `partial` match requires at least 1 residue
of overlap between the query and domain.
copyChange:
type: string
enum: [ "EFO:0030069", "EFO:0020073", "EFO:0030068", "EFO:0030067", "EFO:0030064", "EFO:0030070", "EFO:0030071", "EFO:0030072" ]
description: >-
-A representation of the change in copies of a sequence in a system. MUST be one of "EFO:0030069" (complete
-genomic loss), "EFO:0020073" (high-level loss), "EFO:0030068" (low-level loss), "EFO:0030067" (loss),
-"EFO:0030064" (regional base ploidy), "EFO:0030070" (gain), "EFO:0030071" (low-level gain), "EFO:0030072"
+A representation of the change in copies of a sequence in a system. MUST be one of "EFO:0030069" (complete
+genomic loss), "EFO:0020073" (high-level loss), "EFO:0030068" (low-level loss), "EFO:0030067" (loss),
+"EFO:0030064" (regional base ploidy), "EFO:0030070" (gain), "EFO:0030071" (low-level gain), "EFO:0030072"
(high-level gain).
copies:
oneOf:
- type: integer
-- $ref: "/ga4gh/schema/vrs/2.x/json/Range"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Range"
description: >-
The integral number of copies of the subject in a system.
required:
- location

DescribedVariation:
-maturity: draft
type: object
inherits: CategoricalVariation
+maturity: draft
description: >-
Some categorical variation concepts are supported by custom nomenclatures or text-descriptive
representations for which a categorical variation model does not exist. DescribedVariation is
@@ -162,12 +163,12 @@ $defs:
label:
extends: label
description: >-
-A primary label for the categorical variation. This required property should provide a
+A primary label for the categorical variation. This required property should provide a
short and descriptive textual representation of the concept.
description:
extends: description
description: >-
-A textual description of the domain of variation that should match the categorical
+A textual description of the domain of variation that should match the categorical
variation entity.
required:
- label
@@ -221,12 +222,12 @@ $defs:


NumberCount:
-maturity: draft
# ga4ghDigest:
# keys:
# - numberCount
type: object
# inherits: QuantityVariance
+maturity: draft
description: >-
The absolute count of a discrete assayable unit (e.g. chromosome, gene, or sequence).
properties:
@@ -239,19 +240,19 @@ $defs:
count:
oneOf:
- type: integer
-- $ref: "/ga4gh/schema/vrs/2.x/json/Range"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Range"
description: >-
The integral quantity or quantity range of the subject in a system
required: [ "count" ]


NumberChange:
-maturity: draft
# ga4ghDigest:
# keys:
# - numberChange
# prefix: CX
type: object
+maturity: draft
description: >-
A quantitative assessment of a unit within a system (e.g. genome, cell,
etc.) relative to a baseline quantity.
@@ -265,7 +266,7 @@ $defs:
change:
oneOf:
- type: integer
-- $ref: "/ga4gh/schema/vrs/2.x/json/Range"
+- $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Range"
- copyChange:
type: string
enum: [ "EFO:0030069", "EFO:0020073", "EFO:0030068", "EFO:0030067", "EFO:0030064", "EFO:0030070",
@@ -278,9 +279,9 @@ $defs:
required: [ "change" ]

QuantityVariance:
-maturity: draft
type: object
# inherits: NumberCount
+maturity: draft
description: >-
The Quantity Variance class captures one axis of variation in the generalized model of categorical
variation. It is used to model quantitative measure changes in a given biological level of
@@ -322,7 +323,7 @@ $defs:
# copies:
# oneOf:
# - type: integer
-# - $ref: "/ga4gh/schema/vrs/2.x/json/Range"
+# - $ref: "/ga4gh/schema/vrs/2.0.0-ballot.2024-08.1/json/Range"
# description: >-
# The integral number of copies of the subject in a system.
# required:
@@ -340,6 +341,7 @@ $defs:

CategoricalVariant:
inherits: gks.core-im:DomainEntity
+maturity: draft
description: >-
A top-level representation of a categorically-defined domain for variation across one or multiple biological levels in which individual contextual variants may be members of the domain.
# type: object
12 changes: 6 additions & 6 deletions schema/cat-vrs/def/CanonicalAllele.rst
@@ -1,6 +1,6 @@
**Computational Definition**

-A canonical allele is defined by an `Allele <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#variation>` that is representative of a collection of congruent Alleles, each of which depict the same nucleic acid change on different underlying reference sequences. Congruent representations of an Allele often exist across different genome assemblies and associated cDNA transcript representations.
+A canonical allele is defined by an `Allele <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/MolecularVariation/Allele.html#>`_ that is representative of a collection of congruent Alleles, each of which depict the same nucleic acid change on different underlying reference sequences. Congruent representations of an Allele often exist across different genome assemblies and associated cDNA transcript representations.

**Information Model**

@@ -33,22 +33,22 @@ Some CanonicalAllele attributes are inherited from :ref:`CategoricalVariation`.
- 0..m
- Alternative name(s) for the Entity.
* - extensions
-- `Extension </ga4gh/schema/gks-common/1.x/data-types/json/Extension>`_
+- :ref:`Extension`
- 0..m
- A list of extensions to the Entity, that allow for capture of information not directly supported by elements defined in the model.
* - mappings
-- `ConceptMapping </ga4gh/schema/gks-common/1.x/data-types/json/ConceptMapping>`_
+- :ref:`ConceptMapping`
- 0..m
- A list of mappings to concepts in terminologies or code systems. Each mapping should include a coding and a relation.
* - members
-- `Variation </ga4gh/schema/vrs/2.x/json/Variation>`_ | `IRI </ga4gh/schema/gks-common/1.x/data-types/json/IRI>`_
+- :ref:`Variation` | :ref:`IRI`
- 0..m
- A non-exhaustive list of VRS variation contexts that satisfy the constraints of this categorical variant.
* - type
- string
- 1..1
- MUST be "CanonicalAllele"
* - definingContext
-- `Allele </ga4gh/schema/vrs/2.x/json/Allele>`_ | `IRI </ga4gh/schema/gks-common/1.x/data-types/json/IRI>`_
+- :ref:`Allele` | :ref:`IRI`
- 1..1
-- The `VRS Allele <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#allele>`_ object that is congruent with variants on alternate reference sequences.
+- The `Allele <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/MolecularVariation/Allele.html#>`_ object that is congruent with variants on alternate reference sequences.
18 changes: 9 additions & 9 deletions schema/cat-vrs/def/CategoricalCnv.rst
@@ -1,6 +1,6 @@
**Computational Definition**

-A categorical variation domain is defined first by a sequence derived from a canonical `Location <https://vrs.ga4gh.org/en/2.0/terms_and_model.html#Location>`_ , which is representative of a collection of congruent Locations. The change or count of this sequence is also described, either by a numeric value (e.g. "3 or more copies") or categorical representation (e.g. "high-level gain"). Categorical CNVs may optionally be defined by rules specifying the location match characteristics for member CNVs.
+A categorical variation domain is defined first by a sequence derived from a canonical `SequenceLocation <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/LocationAndReference/SequenceLocation.html>`_ , which is representative of a collection of congruent Locations. The change or count of this sequence is also described, either by a numeric value (e.g. "3 or more copies") or categorical representation (e.g. "high-level gain"). Categorical CNVs may optionally be defined by rules specifying the location match characteristics for member CNVs.

**Information Model**

@@ -33,34 +33,34 @@ Some CategoricalCnv attributes are inherited from :ref:`CategoricalVariation`.
- 0..m
- Alternative name(s) for the Entity.
* - extensions
-- `Extension </ga4gh/schema/gks-common/1.x/data-types/json/Extension>`_
+- :ref:`Extension`
- 0..m
- A list of extensions to the Entity, that allow for capture of information not directly supported by elements defined in the model.
* - mappings
-- `ConceptMapping </ga4gh/schema/gks-common/1.x/data-types/json/ConceptMapping>`_
+- :ref:`ConceptMapping`
- 0..m
- A list of mappings to concepts in terminologies or code systems. Each mapping should include a coding and a relation.
* - members
-- `Variation </ga4gh/schema/vrs/2.x/json/Variation>`_ | `IRI </ga4gh/schema/gks-common/1.x/data-types/json/IRI>`_
+- :ref:`Variation` | :ref:`IRI`
- 0..m
- A non-exhaustive list of VRS variation contexts that satisfy the constraints of this categorical variant.
* - type
- string
- 1..1
- MUST be "CategoricalCnv"
* - location
-- `SequenceLocation </ga4gh/schema/vrs/2.x/json/SequenceLocation>`_ | `IRI </ga4gh/schema/gks-common/1.x/data-types/json/IRI>`_
+- :ref:`SequenceLocation` | :ref:`IRI`
- 1..1
-- A `VRS Location <https://vrs.ga4gh.org/en/2.x/concepts/location/SequenceLocation.html>`_ object that represents a sequence derived from that location, and is congruent with locations on alternate reference sequences.
+- A `SequenceLocation <https://vrs.ga4gh.org/en/2.0.0-ballot.2024-08/concepts/LocationAndReference/SequenceLocation.html>`_ object that represents a sequence derived from that location, and is congruent with locations on alternate reference sequences.
* - locationMatchCharacteristic
- string
- 0..1
-- The characteristics of a valid match between a contextual CNV location (the query) and the Categorical CNV location (the domain), when both query and domain are represented on the same reference sequence. An `exact` match requires the location of the query and domain to be identical. A `subinterval` match requires the query to be a subinterval of the domain. A `superinterval` match requires the query to be a superinterval of the domain. A `partial` match requires at least 1 residue of overlap between the query and domain.
+- The characteristics of a valid match between a contextual CNV location (the query) and the Categorical CNV location (the domain), when both query and domain are represented on the same reference sequence. An `exact` match requires the location of the query and domain to be identical. A `subinterval` match requires the query to be a subinterval of the domain. A `superinterval` match requires the query to be a superinterval of the domain. A `partial` match requires at least 1 residue of overlap between the query and domain.
* - copyChange
- string
- 0..1
-- A representation of the change in copies of a sequence in a system. MUST be one of "EFO:0030069" (complete genomic loss), "EFO:0020073" (high-level loss), "EFO:0030068" (low-level loss), "EFO:0030067" (loss), "EFO:0030064" (regional base ploidy), "EFO:0030070" (gain), "EFO:0030071" (low-level gain), "EFO:0030072" (high-level gain).
+- A representation of the change in copies of a sequence in a system. MUST be one of "EFO:0030069" (complete genomic loss), "EFO:0020073" (high-level loss), "EFO:0030068" (low-level loss), "EFO:0030067" (loss), "EFO:0030064" (regional base ploidy), "EFO:0030070" (gain), "EFO:0030071" (low-level gain), "EFO:0030072" (high-level gain).
* - copies
-- integer | `Range </ga4gh/schema/vrs/2.x/json/Range>`_
+- integer | :ref:`Range`
- 0..1
- The integral number of copies of the subject in a system.
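
The four `locationMatchCharacteristic` values described in the CategoricalCnv definition can be read as interval predicates. The following is a minimal Python sketch for illustration only; it is not part of this commit or of any Cat-VRS library. The function name `location_matches`, the `(start, end)` tuple representation, the half-open coordinate convention, and the choice to let identical intervals also satisfy `subinterval` and `superinterval` are all assumptions of the sketch.

```python
from typing import Tuple

# Hypothetical representation: (start, end) on one reference sequence,
# half-open so that end - start is the number of residues covered.
Interval = Tuple[int, int]


def location_matches(query: Interval, domain: Interval, characteristic: str) -> bool:
    """Return True if the query location matches the domain location.

    Both intervals are assumed to lie on the same reference sequence,
    as the locationMatchCharacteristic description requires.
    """
    q_start, q_end = query
    d_start, d_end = domain

    if characteristic == "exact":
        # Query and domain locations are identical.
        return (q_start, q_end) == (d_start, d_end)
    if characteristic == "subinterval":
        # Query lies entirely within the domain (equality allowed: an assumption).
        return d_start <= q_start and q_end <= d_end
    if characteristic == "superinterval":
        # Query entirely contains the domain (equality allowed: an assumption).
        return q_start <= d_start and d_end <= q_end
    if characteristic == "partial":
        # At least one residue of overlap between query and domain.
        return min(q_end, d_end) - max(q_start, d_start) >= 1
    raise ValueError(f"unknown locationMatchCharacteristic: {characteristic!r}")
```

Under these assumptions, a query of `(18, 30)` against a domain of `(10, 20)` shares two residues and so satisfies `partial`, while a query of `(20, 30)` only abuts the domain and does not.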