We decided to start with two schemas: a minimal schema, posted now as what we should implement, and an extended schema, which is in the evaluation stage to see whether its attributes should end up in the minimal schema. Here are drafts of both for comment and revision:
Minimal seqcol schema
description: "A collection of biological sequences, defined by the GA4GH Sequence Collections standard."
$id: "/schemas/seqcol_base"
version: 0.1.0
type: object
properties:
  lengths:
    type: array
    collated: true
    description: "Number of elements, such as nucleotides or amino acids, in each sequence."
    items:
      type: integer
  names:
    type: array
    collated: true
    description: "Human-readable identifiers of each sequence (e.g. chromosome names or accessions)."
    items:
      type: string
  sequences:
    type: array
    collated: true
    description: "Digests of sequences computed using the GA4GH digest algorithm (sha512t24u)."
    items:
      type: string
  sorted_name_length_pairs:
    type: array
    description: "Sorted digests of names+lengths pairs, computed following the seqcol specification."
    items:
      type: string
required:
  - lengths
  - names
inherent:
  - lengths
  - names
  - sequences
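To make the draft concrete, here is a minimal Python sketch of what checking an object against the minimal schema could look like. It is not a reference implementation: `check_minimal` and the `example` object are hypothetical names introduced for illustration, and the sketch assumes that `collated: true` means the array lines up one-to-one with the sequences in the collection, which the draft itself does not spell out. The `sha512t24u` helper follows the GA4GH digest definition (SHA-512, truncated to 24 bytes, base64url-encoded).

```python
import base64
import hashlib

# Attribute lists copied from the draft minimal schema.
REQUIRED = ("lengths", "names")
COLLATED = ("lengths", "names", "sequences")


def sha512t24u(data: bytes) -> str:
    """GA4GH digest: SHA-512, truncated to 24 bytes, base64url-encoded."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def check_minimal(seqcol: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = [f"missing required attribute: {key}" for key in REQUIRED if key not in seqcol]

    # Assumption: collated arrays line up element by element,
    # so every collated array that is present must have the same length.
    sizes = {key: len(seqcol[key]) for key in COLLATED if key in seqcol}
    if len(set(sizes.values())) > 1:
        problems.append(f"collated arrays differ in length: {sizes}")

    # Item types from the schema: integers for lengths, strings for names.
    lengths = seqcol.get("lengths", [])
    if not all(isinstance(n, int) and not isinstance(n, bool) for n in lengths):
        problems.append("lengths must contain integers")
    if not all(isinstance(name, str) for name in seqcol.get("names", [])):
        problems.append("names must contain strings")
    return problems


# Hypothetical two-sequence collection used only for illustration.
example = {
    "lengths": [8, 4],
    "names": ["chr1", "chr2"],
    "sequences": [sha512t24u(b"ACGTACGT"), sha512t24u(b"ACGT")],
}
print(check_minimal(example))  # prints []
```

A real implementation would more likely hand the schema to a JSON Schema validator and add the collated-length check on top, since `collated` and `inherent` are not standard JSON Schema keywords.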
Extended seqcol schema
$ref: "/schemas/seqcol_base"
$id: "/schemas/seqcol_extended"
properties:
  masks:
    type: array
    collated: true
    description: "Digests of subsequence masks indicating subsequences to be excluded from an analysis, such as repeats"
    items:
      type: string
  priorities:
    type: array
    collated: true
    description: "Annotation of whether each sequence is a primary or secondary component in the collection."
    items:
      type: boolean
  topologies:
    type: array
    collated: true
    description: "Annotation of whether each sequence represents a linear or other topology."
    items:
      type: string
      enum: ["circular", "linear"]
      default: "linear"
  molecule_types:
    type: array
    collated: true
    description: "Designation of the type of molecule for each sequence, such as RNA, DNA, or protein."
    items:
      type: string
  alphabets:
    type: array
    collated: true
    description: "The set of characters actually present in each sequence"
    items:
      type: string
  alphabet_domains:
    type: array
    collated: true
    description: "The set of characters that could be included in each sequence"
    items:
      type: string
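As a rough illustration of two of the extended attributes, the Python sketch below fills in `topologies` using the schema's `"linear"` default and derives an `alphabets` entry from a raw sequence. The function names are hypothetical, and two choices are assumptions rather than anything the draft states: that a missing `topologies` array should be treated as all-linear, and that an alphabet is encoded as a sorted string of its characters.

```python
TOPOLOGIES = ("circular", "linear")  # enum from the draft extended schema
DEFAULT_TOPOLOGY = "linear"          # default from the draft extended schema


def fill_topologies(seqcol: dict) -> list[str]:
    """Return one topology per sequence, assuming 'linear' when the array is absent."""
    count = len(seqcol["names"])
    topologies = list(seqcol.get("topologies", [DEFAULT_TOPOLOGY] * count))
    bad = [t for t in topologies if t not in TOPOLOGIES]
    if bad:
        raise ValueError(f"unknown topologies: {bad}")
    # Collated attribute: assumed to need one entry per sequence.
    if len(topologies) != count:
        raise ValueError("topologies must have one entry per sequence")
    return topologies


def alphabet_of(sequence: str) -> str:
    """Characters actually present in a sequence, as a sorted string (assumed encoding)."""
    return "".join(sorted(set(sequence)))


print(fill_topologies({"names": ["chr1", "chrM"]}))  # prints ['linear', 'linear']
print(alphabet_of("ACGTNNAC"))                       # prints ACGNT
```

How alphabets should actually be serialized, and whether defaults apply to a missing array or only to individual items, are the kind of details this evaluation stage would need to settle.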