Deal with multiple CDS IDs for the same transcript #120

nsoranzo · 2019-09-25T01:11:02Z

in the gstf_preparation tool.

Biologically, a single mRNA can give rise to different CDSs (and therefore protein translations) due to alternative translational start sites. The GFF3 standard explicitly allows this: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md (look for "alternative translational start sites"). If a CDS is discontinuous, its fragments must share the same ID, so the ID can be used to group the fragments that make up each alternative CDS.
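A minimal sketch of the grouping idea, not the gstf_preparation code itself: it assumes tab-separated GFF3 lines with `ID` and `Parent` attributes on every CDS row, and the helper names `parse_attributes` and `group_cds_by_id` are made up for illustration.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Parse the GFF3 attributes column into a dict (no URL-decoding)."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def group_cds_by_id(gff3_lines):
    """Return {transcript_id: {cds_id: [(start, end), ...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for line in gff3_lines:
        # Skip blank lines, comments and directives.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = parse_attributes(cols[8])
        cds_id = attrs.get("ID")
        if cds_id is None:
            continue  # fragments without an ID cannot be grouped this way
        # A CDS row may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                grouped[parent][cds_id].append((int(cols[3]), int(cols[4])))
    return grouped


# Hypothetical example: one mRNA with two alternative CDSs of two fragments each.
example = [
    "chr1\t.\tCDS\t100\t200\t.\t+\t0\tID=cds1;Parent=mRNA1",
    "chr1\t.\tCDS\t300\t400\t.\t+\t1\tID=cds1;Parent=mRNA1",
    "chr1\t.\tCDS\t150\t200\t.\t+\t0\tID=cds2;Parent=mRNA1",
    "chr1\t.\tCDS\t300\t400\t.\t+\t1\tID=cds2;Parent=mRNA1",
]
grouped = group_cds_by_id(example)
```

Here `grouped["mRNA1"]` ends up with two keys, `cds1` and `cds2`, each holding its own list of fragment coordinates, so a transcript with more than one CDS ID is simply a transcript with more than one entry.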

Ensembl seems to enforce the "one CDS per transcript" rule in its databases, but we don't have to.

Additional problem: some GFF3 files (e.g. the one in the gstf_preparation tool help!) use different IDs for fragments of the same CDS, which I think is non-standard.
