Add a pre snakefile to collect NRPS hmm files, add the generated hmm to the repo, and add a rule to scan for NRPS genes #5

Merged
merged 16 commits into from
Feb 20, 2024

taylorreiter
Member

PR checklist

  • Describe the changes you've made.
  • Describe any tests you have conducted to confirm that your changes behave as expected.
  • If you've added new software dependencies, make sure that those dependencies are included in the appropriate conda environments.
  • If you encountered bugs or features that you won't address, but should be addressed eventually, create new issues for them.

PR Description

This PR adds a rule to scan the input protein data for non-ribosomal peptide synthetase (NRPS) proteins. NRPSs make peptides via an assembly line. The HMM file that I added scans for domains that are commonly part of NRPS enzymes.
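For readers unfamiliar with HMM-based scanning, here is a minimal sketch of what such a scan could look like. The PR description does not say which tool the rule calls, so this assumes HMMER's `hmmsearch`; the function name, the input and output paths, and the output file names are hypothetical. Only the `nrps.hmm` file name comes from this PR's review discussion.

```python
import shlex
from pathlib import Path


def build_hmmsearch_command(hmm_path, proteins_path, out_dir, cpus=1):
    """Build the argv list for an hmmsearch run (assumed tool, not confirmed by the PR).

    hmm_path:      HMM profile file, e.g. the nrps.hmm added in this PR
    proteins_path: protein FASTA file to scan (hypothetical path)
    out_dir:       directory for the tabular outputs (hypothetical layout)
    """
    out_dir = Path(out_dir)
    return [
        "hmmsearch",
        "--cpu", str(cpus),
        # One row per matching sequence.
        "--tblout", str(out_dir / "nrps.tblout"),
        # One row per matching domain, useful for seeing which NRPS domains a protein has.
        "--domtblout", str(out_dir / "nrps.domtblout"),
        str(hmm_path),
        str(proteins_path),
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_hmmsearch_command("inputs/nrps.hmm", "demo/proteins.faa", "outputs/nrps", cpus=4)
    print(shlex.join(cmd))
```

The per-domain table is included because the open question in this PR is whether all of the domains that belong together appear in a single transcript; a per-domain listing is one way to inspect that.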

I haven't completely decided where to take this task yet. There are tools that can predict peptide sequences produced by NRPS genes once you have candidate NRPS sequences. However, without genomic context or co-expression analysis (neither of which I'm keen to add at the moment), it's not clear to me whether we will have all of the info we need (e.g. all of the domains that belong together) in a single transcript to predict these things. I plan to test this eventually, but @borgesadair1 let me know that this is the least important module to include right now. As such, I'm including the work I've done so far and will return to this later, after I've implemented peptide annotation/characterization steps in the peptigate workflow.

Tests conducted to confirm expected behavior

I confirmed that the nrps target runs on the demo files.

Documentation updates

Will be done in a future PR.

Member

keithchev left a comment

Looks good! Approved with a few non-critical inline comments. In general I wonder if it'd be worth writing short docstrings for the rules in download_nrps_hmm_profiles. This is just a thought; it seems like this snakefile is meant primarily to document how the final nrps.hmm file was produced, so it may not be worth it.

Inline review comments (all resolved): seven on download_nrps_hmm_profiles.snakefile, one on Snakefile.

taylorreiter merged commit 639f91e into main Feb 20, 2024
2 checks passed
taylorreiter deleted the ter/nrps branch February 20, 2024 14:58