Highly significant improvement of protein sequence alignments with AlphaFold2

Data, documentation, analysis and Nextflow pipeline for the manuscript "Highly significant improvement of protein sequence alignments with AlphaFold2".

Credits

This work was carried out in the Notredame Lab at the Centre for Genomic Regulation (CRG).

The authors who contributed to the analysis and manuscript are:

Athanasios Baltzis
Leila Mansouri
Suzanne Jin
Bjorn Langer
Ionas Erb
Cedric Notredame

Notebooks

This repository contains a series of Jupyter Notebooks with the steps for replicating the analysis, tables and figures in the manuscript using R.

Pipeline and containers

The pipeline for predicting the AF2 models and producing the MSAs is built using Nextflow. It comes with a Singularity container for running AF2 (the recipe is available here) and a Docker container (available on DockerHub here).

Usage

Download the genetic databases required for AlphaFold2 using the provided script.
Download and format the database used for PSI-Coffee blast search (by default Uniref50).
Make sure you have Singularity installed on your system.
Install the Nextflow runtime by running the following command:
```
curl -fsSL get.nextflow.io | bash
```
Launch the pipeline with the following command:
```
nextflow run athbaltzis/msa-af2-nf
```
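
Before the first run, it can help to confirm that the tools from the steps above are reachable. The following is a minimal sketch, not part of the repository: `check_tool` is a hypothetical helper, and it assumes the `nextflow` launcher has been moved to a directory on your `PATH` (the install command places it in the current directory).

```shell
#!/usr/bin/env bash
# Pre-flight sketch: report whether each required tool is on PATH.
# check_tool is a hypothetical helper, not something shipped with the pipeline.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

missing=0
for tool in singularity nextflow; do
  # Count every tool that cannot be found instead of stopping at the first one.
  check_tool "$tool" || missing=$((missing + 1))
done
echo "tools missing: $missing"
```

If `nextflow` is reported as missing right after installation, running it as `./nextflow` from the download directory is an alternative to changing `PATH`.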

By default, the pipeline is executed against the provided example dataset. You can modify the input data as well as the other available parameters listed below:

`--input_fasta`

Input sequences (FASTA)

`--list`

Input lists of sequences

`--template`

Input template lists

`--pdbs`

Input experimentally determined PDB structures

`--db`

Path to the database for PSI-Coffee

`--predict`

Predict structures with AF2 [true or false (default)]

`--AF2`

Path to AF2-predicted models (used when `--predict` is false)

`--pdb_for_dssp`

Input PDB structures for secondary structure assignment
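
As an illustration of how the parameters above combine, the sketch below assembles a run that reuses existing AF2 models instead of predicting them. All paths are hypothetical placeholders, and whether each parameter expects a single file, a directory or a glob pattern is an assumption here; check `nextflow.config` for the defaults used with the example dataset.

```shell
#!/usr/bin/env bash
set -euo pipefail

# All paths below are hypothetical; replace them with your own data.
cmd=(
  nextflow run athbaltzis/msa-af2-nf
  --input_fasta "data/my_family/my_family.fasta"
  --db "/path/to/uniref50.fasta"
  --predict false
  --AF2 "data/my_family/af2_models"
)

# Print the assembled command for review.
# To actually launch the pipeline, run "${cmd[@]}" instead.
printf '%s ' "${cmd[@]}"
echo
```

Keeping the arguments in an array makes it easy to review the full command before launching and to add or remove parameters one line at a time.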
