genomeToProtein doesn't match protein if input CDS range contains a stop codon #137
Comments
AFAIK stop codons don't get translated. I therefore ignore the stop codon, because no (RNA) position within the stop codon can be mapped to an amino acid in the protein sequence (such positions would effectively lie outside the protein sequence). Does this make sense? Would you suggest an alternative behaviour?
Thanks for the response. You are correct that the stop codon is not translated. As my example shows, if the input range contains the stop codon, the codon is not excluded from the query and the whole range fails to map to the protein. In fact, if a range has even a single base outside the amino-acid coding region, it doesn't map to the protein. The big question: is it possible to automatically do a partial mapping of a genomic range to the protein, ignoring non-coding regions? Ideally, if I provide a large range covering multiple exons, I'd like to receive a list of ranges with protein coordinates (probably grouped by transcript). This would be really cool.
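To make the requested behaviour concrete, here is a minimal Python sketch of such a partial mapping. It is not ensembldb code and does not use its API. The names `partial_genome_to_protein`, `Hit` and `includes_stop` are hypothetical. It assumes 1-based inclusive genomic coordinates, the CDS segments of a single transcript as input, and that the last three CDS nucleotides are the stop codon.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    genome_start: int   # 1-based, inclusive
    genome_end: int
    protein_start: int  # 1-based amino acid index
    protein_end: int


def partial_genome_to_protein(
    query_start: int,
    query_end: int,
    cds_segments: List[Tuple[int, int]],
    strand: str = "+",
    includes_stop: bool = True,  # assumption: CDS ends with the stop codon
) -> List[Hit]:
    """Map the coding part of a genomic range to protein coordinates.

    Bases outside the CDS segments, and bases in the stop codon, are
    dropped instead of making the whole query fail.
    """
    # Walk the segments in transcription order.
    segs = sorted(cds_segments, reverse=(strand == "-"))
    total = sum(end - start + 1 for start, end in segs)
    coding_len = total - 3 if includes_stop else total

    hits = []
    offset = 0  # CDS nucleotides preceding the current segment
    for start, end in segs:
        lo, hi = max(start, query_start), min(end, query_end)
        if lo <= hi:
            # 0-based positions of the overlap within the CDS.
            if strand == "+":
                first, last = offset + (lo - start), offset + (hi - start)
            else:
                first, last = offset + (end - hi), offset + (end - lo)
            last = min(last, coding_len - 1)  # trim the stop codon
            if first <= last:
                # Convert the (possibly trimmed) overlap back to genome coordinates.
                if strand == "+":
                    g_lo, g_hi = start + (first - offset), start + (last - offset)
                else:
                    g_lo, g_hi = end - (last - offset), end - (first - offset)
                hits.append(Hit(g_lo, g_hi, first // 3 + 1, last // 3 + 1))
        offset += end - start + 1
    return hits


# Toy transcript: two 9-nt CDS segments, 5 codons plus a stop codon at 207-209.
toy_cds = [(101, 109), (201, 209)]
result = partial_genome_to_protein(105, 209, toy_cds)
```

With the toy input above, the query 105-209 yields two hits: genome 105-109 mapped to residues 2-3, and genome 201-206 mapped to residues 4-5. The intronic bases and the stop codon are silently dropped. Grouping by transcript would amount to calling the function once per transcript.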
the main use case for the If you could provide an example with input and expected output, I could start thinking about how to implement it, but I'm currently busy with other topics, so that might be more of a long-term solution. Alternatively, you could also implement that part of the code yourself (I would suggest as a new, separate function) and contribute it to
I am working on a script for partial mapping on a gene-by-gene basis. I'll think about how to wrap it into a function. Thanks for the suggestion.
Below is an example and the output. The input range was obtained by overlapping with the CDS segments of the gene.
I looked at the segment in the Genome Browser and found the stop codon.
Can genomeToProtein ignore the stop codon?
Created on 2022-06-14 by the reprex package (v2.0.0)