Evaluating the results of the peptigate peptide prediction pipeline

Purpose

This repository assesses the accuracy of the peptigate pipeline by comparing peptide predictions from the human transcriptome against orthogonal data sets (ribosome profiling, peptide databases, and peptidomics mass spectrometry).

For more information, see the pub, "Predicting bioactive peptides from transcriptome assemblies with the peptigate workflow."

Installation and Setup

This repository uses conda to manage software environments and installations. Operating system-specific instructions for installing miniconda are available in the miniconda documentation. After installing conda and mamba, run the following commands to create and activate the pepeval environment.

mamba env create -n pepeval --file envs/dev.yml
conda activate pepeval

The arcadiathemeR R package isn't available via conda. After activating the conda environment, run the following R script to install it.

Rscript scripts/install_arcadiathemer.R

The notebooks can also be run using the same environment.

Overview

This repository assesses whether the peptigate pipeline predicts real peptides from the human transcriptome assembly. It does this by comparing the peptigate peptide predictions against four orthogonal data sources: ribosome profiling, peptide databases, bona fide long non-coding RNAs, and the strength of translation initiation sequences (Kozak sequences). See the README and notebook in each sub-folder for a description of the analysis and results of each comparison. Note that each notebook name is prefixed with its creation date.
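To make the idea of comparing predictions against a peptide database concrete, here is a minimal sketch that counts how many predicted peptides have an exact sequence match in a reference set. It is an illustration only, not the method used in the evaluation notebooks: the function names (`read_fasta`, `fraction_supported`) and the toy sequences are hypothetical, and a real comparison would likely use alignment-based matching rather than exact string equality.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA-formatted lines into a dict mapping record ID to sequence."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def fraction_supported(predicted, reference):
    """Return the fraction of predicted peptides found verbatim in the reference."""
    if not predicted:
        return 0.0
    reference_sequences = set(reference.values())
    hits = [name for name, seq in predicted.items() if seq in reference_sequences]
    return len(hits) / len(predicted)


# Toy, made-up sequences for illustration only (not real peptigate output).
predicted = read_fasta(StringIO(">pep1\nMKTAYIAK\n>pep2\nMLLPVQ\n"))
reference = read_fasta(StringIO(">db1\nMKTAYIAK\n>db2\nGGSGGS\n"))
print(fraction_supported(predicted, reference))
```

In this toy input, one of the two predicted peptides appears in the reference set. To apply the same sketch to files on disk, pass an open file handle to `read_fasta` in place of the `StringIO` objects.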

Description of the folder structure

LICENSE: specifies terms for re-use of the code in this repo.
README.md: describes the contents of this repo and how to interact with it.
envs/: documents conda software environments used for analyses in this repo.
evaluation/: contains code, notebooks, documentation, and results for comparing the peptigate results against orthogonal data sets.
- kozak_scores/: compares the strength of Kozak sequences (translation initiation sequences) in peptigate-predicted peptides against TransDecoder-predicted open reading frames in the human transcriptome.
- noncoding_rnas/: tests whether peptigate predicted peptides from any bona fide long non-coding RNAs.
- peptipedia/: compares the peptigate peptide predictions against Peptipedia, a large database of bioactive peptide sequences.
- riborf/: compares the sORF-encoded peptides that peptigate predicted from the human transcriptome against open reading frames predicted by the tool ribORF from over 600 human ribosome profiling data sets.
peptigate/: contains documentation of how we ran peptigate on the human RefSeq transcriptome as well as results files output by peptigate.
.github/, .vscode/, .gitignore, .pre-commit-config.yaml, Makefile, pyproject.toml: control the developer tooling and behavior of the repository.
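As background for the kozak_scores/ comparison, the sketch below classifies the sequence context around a start codon with a widely used simplified rule: a purine (A or G) at position -3 and a G at position +4, where the A of the start codon is +1. This is an assumption-laden illustration, not the scoring used in the kozak_scores/ notebook; the function name `kozak_strength` and the strong/moderate/weak labels are hypothetical choices made here.

```python
def kozak_strength(transcript, start):
    """Classify the Kozak context of the start codon at 0-based index `start`.

    Simplified rule (an assumption of this sketch): "strong" if position -3 is
    a purine and position +4 is G, "moderate" if only one holds, "weak" if
    neither holds. The A of the ATG start codon is position +1.
    """
    seq = transcript.upper().replace("U", "T")
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG start codon at the given index")
    # Positions -3 and +4 must both exist in the transcript.
    if start < 3 or start + 3 >= len(seq):
        return "undetermined"
    purine_at_minus3 = seq[start - 3] in "AG"
    g_at_plus4 = seq[start + 3] == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "moderate"
    return "weak"


# Made-up example sequences; the start codon begins at index 6 in each.
for example in ("GCCACCATGGCG", "GCCACCATGTCG", "GCCTCCATGTCG"):
    print(example, kozak_strength(example, 6))
```

A position weight matrix built from annotated start sites would give a graded score rather than three categories; the categorical rule is used here only because it is short enough to read at a glance.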

Data

This repository evaluates peptides predicted from the human RefSeq transcriptome. All peptide predictions (the results of running peptigate) are in the peptigate/ folder. Download instructions for the auxiliary files required to reproduce the results in this repository are located in the analysis-specific READMEs.

Compute Specifications

Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Big Sur ... 10.16
RAM: 64 GB

Contributing

See how we recognize feedback and contributions to our code.

Arcadia-Science/2024-peptigate-evaluation
