ProteinCartography v0.5.0

Overview

This release includes a number of minor improvements and introduces a new organization for the output directories generated by the pipeline. Because Snakemake is a file-based workflow engine, this change means that this version of the pipeline is not compatible with previous versions: the new version cannot be re-run on output directories that were generated by prior versions. Instead, the pipeline must be re-run from scratch.

New features and improvements

Reorganize the output directory to more clearly distinguish the final outputs of the pipeline from intermediate outputs. (This is a breaking change; see the Overview above.)
Merge Snakefile_ff (the "cluster" mode of the pipeline) into the main Snakefile and add a config parameter to specify whether to run the pipeline in "search" or "cluster" mode.
Update and clarify some sections of the main README.
Add developer docs.
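
For readers updating their configs, the mode selection described above could be validated along these lines. This is only a sketch: the key name `mode` and the helper `get_mode` are hypothetical and are not taken from the pipeline's code; only the two mode names, "search" and "cluster", come from these notes.

```python
# Hypothetical sketch: the config key name "mode" is an assumption,
# not necessarily the parameter name used by the pipeline.
VALID_MODES = ("search", "cluster")


def get_mode(config, key="mode"):
    """Return the normalized pipeline mode from a config dictionary."""
    if key not in config:
        raise KeyError(f"config is missing the '{key}' parameter")
    mode = str(config[key]).strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"'{key}' must be one of {VALID_MODES}, got '{mode}'")
    return mode
```

Failing early on an unrecognized value keeps a typo in the config from silently selecting the wrong workflow.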

Fixes

Generate TM scores for every input protein against every query protein (previously, some input-query protein pairs did not have a TM score due to Foldseek's filtering).
Fix a bug that may have prevented the pipeline from running when only input FASTA files (rather than PDBs) are provided.
Use unverified requests to query the ESMFold API as a workaround for ESMFold's expired SSL certificates (from external contributor @naailkhan28).
Add integration tests for the "cluster" mode of the pipeline.
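
As an illustration of the ESMFold workaround listed above, an unverified request with the `requests` library might look roughly like the sketch below. The endpoint URL, the function name `fold_sequence`, and the injectable `post` argument are assumptions made for this example and do not reflect the pipeline's actual code.

```python
# Assumed endpoint for the public ESMFold API; verify before relying on it.
ESMFOLD_URL = "https://api.esmatlas.com/foldSequence/v1/pdb/"


def fold_sequence(sequence, post=None, timeout=60):
    """Request a predicted structure (PDB text) for an amino-acid sequence.

    `post` defaults to requests.post; it is injectable so the function
    can be exercised without network access.
    """
    if post is None:
        import requests  # third-party; imported lazily

        post = requests.post
    # verify=False skips TLS certificate verification. This sidesteps an
    # expired certificate but also removes protection against spoofed hosts.
    response = post(ESMFOLD_URL, data=sequence, verify=False, timeout=timeout)
    response.raise_for_status()
    return response.text
```

With `verify=False`, `requests` emits an `InsecureRequestWarning` on each call, which is a useful reminder that the workaround should be dropped once the certificate is renewed.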