Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature location in a join, how to correnctly translate #85

Open
albodrug opened this issue Apr 17, 2024 · 3 comments
Open

feature location in a join, how to correnctly translate #85

albodrug opened this issue Apr 17, 2024 · 3 comments

Comments

@albodrug
Copy link

Hello,

Thanks for the great tool.

I have downloaded a .gb file in a similar way to article_examples/0_download_the_plasmid.py
I have a gene with several isoforms and each isoform is summarized in a single CDS feature with a join for positions, like this:

CDS             complement(join(754..944,1514..1784,1858..2045,2923..3103,
                     4147..4728))

When viewing the gene with dna_feature_viewer, it looks like only the first coordinate and the last coordinate are taken into account, making the CDS span from 754 to 4728, when I would like 5 chunks corresponding to the five CDS ranges specified in the join.

Is this possible with DnaFeatureViewer, or should I split these CDS manually?

Here is the picture I obtain:
Screenshot from 2024-04-17 15-46-49

Thanks!
Alex

@veghp
Copy link
Member

veghp commented Apr 19, 2024

Thanks for trying out this tool.

Indeed, join() Genbank features are translated into one feature with the first start and last end locations. In detail, .translate_record("plasmid.gb") calls dna_features_viewer.biotools.load_record() which returns features with CompoundLocations, which are then translated by BiopythonTranslatorBase.translate_record() into GraphicFeatures.

You can try creating a new set of features using a loop:

in the example plasmid file I added this to FEATURES:

     CDS             complement(join(754..944,1514..1784,1858..2045,2923..3103,
                     4147..4728))
                     /gene="test(complement(join)"

using the example in A_linear_plot.py:

from dna_features_viewer import BiopythonTranslator
class CustomTranslator(BiopythonTranslator):

    # Label fields indicates the order in which annotations fields are
    # considered to determine the feature's label
    label_fields = ["label", "note", "name", "gene"]

    def compute_feature_legend_text(self, feature):
        return feature.type
    
    def compute_feature_color(self, feature):
        return {
            "rep_origin": "yellow",
            "CDS": "#ffd383",  # light orange
            "regulatory": "red",
            "misc_recomb": "#fbf3f6",  # pink
            "misc_feature": "#d1e9f1",  # light blue
            "backbone": "darkblue",
        }[feature.type]
    
    def compute_feature_box_color(self, feature):
        return "white"
    
    def compute_feature_box_linewidth(self, feature):
        return 0

# ADDED NEW SECTION:
import copy
from dna_features_viewer import biotools

plasmid = biotools.load_record("plasmid.gb")
for feature in plasmid.features:
    if len(feature.location.parts) != 1:  # implicit check, compound locations will have multiple parts
        for part in feature.location.parts:
            new_feature = copy.deepcopy(feature)  # otherwise overwrites the original feature
            new_feature.location = copy.deepcopy(part)
            plasmid.features += [new_feature]

translator = BiopythonTranslator()
graphic_record = translator.translate_record(plasmid)  # note this is the record, not the file
ax, _ = graphic_record.plot(figure_width=13, strand_in_label_threshold=7)
graphic_record.plot_legend(ax=ax, loc=1, ncol=3, frameon=False)
ax.figure.savefig("A_linear_plot.svg", bbox_inches="tight")

A_linear_plot

@veghp
Copy link
Member

veghp commented Apr 19, 2024

There may also be room for code improvement: BiopythonTranslatorBase.translate_record() calls self.compute_filtered_features() but this is defined only in the child class BiopythonTranslator. Proposed solution is to move compute_filtered_features() to the parent class.

@albodrug
Copy link
Author

Thanks a ton for the insight! This is super helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants