Add a pre snakefile to collect NRPS hmm files, add the generated hmm to the repo, and add a rule to scan for NRPS genes #5
Looks good! Approved with a few non-critical inline comments. In general I wonder if it'd be worth writing short docstrings for the rules in `download_nrps_hmm_profiles`. This is just a thought; it seems like this snakefile is meant primarily to document how the final `nrps.hmm` file was produced, so it may not be worth it.
PR checklist
conda environments.
PR Description
This PR adds a rule to scan the input protein data for non-ribosomal peptide synthetase (NRPS) proteins. NRPSs make peptides via an assembly line of enzymatic domains. The HMM file that I added scans for domains that are commonly part of NRPS enzymes.
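For anyone reading along, here is a rough sketch of what a scan like this could look like outside of Snakemake. It assumes the rule uses HMMER's `hmmsearch` with a per-domain table (`--domtblout`), which the description above doesn't actually state. The function names, the `proteins.faa` path, and the E-value cutoff are all hypothetical and only there for illustration; this is not the rule from the PR.

```python
from collections import defaultdict


def build_hmmsearch_command(hmm_path, proteins_path, domtblout_path, threads=1):
    """Build an argument list for hmmsearch (assumed scanner; not confirmed by the PR)."""
    return [
        "hmmsearch",
        "--cpu", str(threads),
        "--domtblout", domtblout_path,  # per-domain tabular output
        "-o", "/dev/null",              # discard the human-readable report
        hmm_path,
        proteins_path,
    ]


def domains_per_protein(domtblout_lines, max_evalue=1e-5):
    """Map each protein to the set of profile names that hit it.

    In hmmsearch --domtblout output, column 1 is the target (protein) name,
    column 4 is the query (profile) name, and column 7 is the full-sequence
    E-value. The default cutoff is an arbitrary illustrative value.
    """
    hits = defaultdict(set)
    for line in domtblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        fields = line.split()
        protein, profile = fields[0], fields[3]
        if float(fields[6]) <= max_evalue:
            hits[protein].add(profile)
    return dict(hits)


# Hypothetical paths; nrps.hmm is the profile file named in this PR.
cmd = build_hmmsearch_command("nrps.hmm", "proteins.faa", "nrps.domtblout", threads=4)
```

Passing `cmd` to `subprocess.run(cmd, check=True)` and then feeding the lines of the resulting table to `domains_per_protein` would give a per-protein list of matched domains, which is one way to see whether a single transcript carries several of the domains that belong together.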
I haven't completely decided where to take this task yet. There are tools that can predict the peptide sequences produced by NRPS genes once you have candidate NRPS sequences. However, without genomic context or co-expression analysis (neither of which I'm keen to add at the moment), it's not clear to me whether a single transcript will contain all of the info we need (e.g. all of the domains that belong together) to make these predictions. I plan to test this eventually, but @borgesadair1 let me know that this is the least important module to include right now. As such, I'm including the work I've done so far and will return to this later, after I've implemented the peptide annotation/characterization steps in the peptigate workflow.
Tests conducted to confirm expected behavior
I confirmed that the `nrps` target runs on the demo files.
Documentation updates
Will be done in a future PR.