If Git LFS is not installed, please install it using the following steps:
mkdir -p gitLFS
cd gitLFS/
wget https://github.com/git-lfs/git-lfs/releases/download/v2.9.0/git-lfs-linux-amd64-v2.9.0.tar.gz
tar -xf git-lfs-linux-amd64-v2.9.0.tar.gz
chmod 755 install.sh
sudo ./install.sh
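Before downloading the release, it may be worth checking whether Git LFS is already present. The following is a minimal sketch; `have_cmd` is a hypothetical helper name and not part of CANVAS.

```shell
# Hypothetical helper (not part of CANVAS): succeed if a command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd git-lfs; then
  # Git LFS is already installed; print its version and skip the steps above.
  git-lfs version
else
  echo "git-lfs not found; install it with the steps above"
fi
```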
Get the CANVAS project using:
git clone https://github.com/jorgeMFS/canvas.git
cd canvas/
For a correct installation, Docker and Docker Compose must be installed on the system (see https://docs.docker.com/engine/install/ubuntu/).
Then follow these instructions:
git clone https://github.com/jorgeMFS/canvas.git
cd canvas
docker-compose build
docker-compose up -d && docker exec -it canvas bash && docker-compose down
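The last command chains three steps: it starts the container in the background, opens an interactive shell in the `canvas` container, and stops the services once that shell exits successfully. The sketch below spells the same steps out one by one. The `run` and `canvas_session` helpers and the `DRY_RUN` variable are assumptions made for illustration and do not exist in the repository.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the three Docker steps shown above.
canvas_session() {
  # Start the services in the background; stop here if that fails.
  run docker-compose up -d || return 1
  # Open an interactive shell inside the container named "canvas".
  run docker exec -it canvas bash || true
  # Unlike the && chain, this also runs when the shell exits with an error.
  run docker-compose down
}

# Preview the commands without touching Docker.
(DRY_RUN=1; canvas_session)
```

Calling `canvas_session` without `DRY_RUN` set runs the real commands. One design difference from the one-liner: the services are stopped even if the interactive shell returns a non-zero exit status.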
Give execute permissions to the scripts and install the tools:
chmod +x *.sh
bash Make.sh;
To run the pipeline and obtain all the reports in the reports folder, use the following commands. Database reconstruction and feature recreation are not required for any of the other tasks. However, if you wish to recreate the feature reports, you must first perform the database reconstruction task.
To obtain the Human Herpesvirus plot run:
cd scripts || exit;
python compare_cmix_hhv.py
To obtain the Compression Benchmark plots run:
cd python || exit;
python select_best_nc_model.py;
To perform the synthetic sequence test run:
cd scripts || exit;
bash Stx_seq_test.sh;
To perform classification run the following code:
cd python || exit;
python prepare_classification.py; #recreate classification dataset
python classifier.py; #perform classifications
To perform the complete IR analysis and create:
- boxplots;
- 2d scatter plots;
- 3d scatter plots;
- top taxonomic group lists;
- Occurrence of each Genus.
Execute this code:
cd python || exit;
python ir_analysis.py; # Performs complete IR analysis
To obtain the Human Herpesvirus plot run:
cd scripts || exit;
bash Herpesvirales.sh;
If you wish to reconstruct the Viral database, run the following script:
cd scripts || exit;
bash Build_DB.sh;
To create the features for analysis and classification (very time consuming, may take several days) run:
cd scripts || exit;
bash Process_features.sh;
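As noted earlier, recreating the feature reports depends on the database reconstruction. A minimal sketch of running both steps in the required order follows; `rebuild_features` is a hypothetical wrapper, and it assumes both scripts live in the directory passed as its first argument (default `scripts`).

```shell
# Hypothetical wrapper (not part of CANVAS): rebuild the database, then the features.
rebuild_features() {
  scripts_dir="${1:-scripts}"
  # Fail early if either script is missing, before starting a multi-day job.
  for script in Build_DB.sh Process_features.sh; do
    if [ ! -f "$scripts_dir/$script" ]; then
      echo "missing: $scripts_dir/$script" >&2
      return 1
    fi
  done
  # Run in a subshell so the caller's working directory is unchanged;
  # feature creation starts only if the database build succeeds.
  ( cd "$scripts_dir" && bash Build_DB.sh && bash Process_features.sh )
}
```

From the repository root, `rebuild_features scripts` would run the two steps back to back.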
To recreate the compression reports used for benchmark (very time consuming, may take several days) run:
cd scripts || exit;
bash Compress.sh;
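Because these tasks may run for days, keeping a log of their output can make it easier to see where a run stopped. The following is only a sketch; `run_logged` and the log file name are assumptions, not part of CANVAS.

```shell
# Hypothetical helper: run a command, appending its output and exit status to a log.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
  log_file="$1"
  shift
  echo "[$(date)] start: $*" >> "$log_file"
  # Capture both stdout and stderr; remember the exit status of the command.
  "$@" >> "$log_file" 2>&1 && status=0 || status=$?
  echo "[$(date)] exit status: $status" >> "$log_file"
  return "$status"
}
```

For example, `run_logged compress.log bash Compress.sh` from inside the scripts folder would record the start time, all output, and the final exit status in `compress.log`.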
Rendering the cladograms requires a GUI application. As such, the trees must be reproduced outside Docker, on the Ubuntu host system, from the /canvas folder:
chmod +x *.sh
bash so_dependencies.sh #install Ubuntu system dependencies required for the script to run and Anaconda
conda create -n canvas python=3.6
conda activate canvas
bash Make.sh #install python libs
bash Install_programs.sh #install tools using conda
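Since `Make.sh` and `Install_programs.sh` install into whichever environment is active, it may help to confirm that the `canvas` environment is active before running them. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; `in_conda_env` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the named conda environment is currently active.
in_conda_env() {
  [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_conda_env canvas; then
  echo "conda environment 'canvas' is active"
else
  echo "conda environment 'canvas' is not active; run: conda activate canvas"
fi
```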
Afterwards, to obtain the cladogram plots run:
cd python || exit;
python phylo_tree.py;
Check out the website of this project: https://asilab.github.io/canvas/
Please cite the following if you use CANVAS:
Jorge Miguel Silva, Diogo Pratas, Tânia Caetano, Sérgio Matos, The complexity landscape of viral genomes, GigaScience, Volume 11, 2022, giac079, https://doi.org/10.1093/gigascience/giac079
@article{10.1093/gigascience/giac079,
author = {Silva, Jorge Miguel and Pratas, Diogo and Caetano, Tânia and Matos, Sérgio},
title = "{The complexity landscape of viral genomes}",
journal = {GigaScience},
volume = {11},
year = {2022},
month = {08},
issn = {2047-217X},
doi = {10.1093/gigascience/giac079},
url = {https://doi.org/10.1093/gigascience/giac079},
note = {giac079},
eprint = {https://academic.oup.com/gigascience/article-pdf/doi/10.1093/gigascience/giac079/45332144/giac079.pdf},
}
System requirements:
- Ubuntu 18.04 or higher
- Docker and docker-compose
- Anaconda
- Python 3.6
Please let us know if there are any issues.
CANVAS is released under the MIT license.