Releases: eead-csic-compbio/get_homologues

get_homologues-est

28 Aug 13:31

Updates of release 20180828 (v3.1.3):

06062018: improved ANI computation by skipping self-taxon BLAST hits
06062018: fixed POCP computation; now it is [100(C1 + C2)/(T1 + T2)]
06062018: fixed POCS computation; now it is [100(C1 + C2)/(T1 + T2)] and T1/T2 are #nr seqs
23072018: updated Grid Engine instructions in manuals
28082018: bug fixed: get_homologues-pl -X now removes previous DIAMOND results of one vs others when new genomes are added
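The bracketed formula in the POCP/POCS entries can be sketched as a small function. This is only an illustration of the arithmetic, not the code used in get_homologues; the function name is hypothetical, and it assumes C1 and C2 are the numbers of conserved sequences in each of the two genomes and T1 and T2 are their total sequence counts (non-redundant counts in the POCS case).

```python
def percent_conserved(c1, c2, t1, t2):
    """Return 100 * (C1 + C2) / (T1 + T2).

    Assumption: c1/c2 are conserved-sequence counts for genomes 1 and 2,
    t1/t2 are the total (or non-redundant) sequence counts.
    """
    total = t1 + t2
    if total <= 0:
        raise ValueError("T1 + T2 must be positive")
    return 100.0 * (c1 + c2) / total


# Made-up counts, for illustration only
print(percent_conserved(80, 90, 100, 100))  # prints 85.0
```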

get_homologues-est

24 May 11:28

These are the relevant updates of release 20180524 (v3.1.2):

  • sequences longer than $MAXSEQLENGTH are skipped (by default $MAXSEQLENGTH=25kb)
  • compare_clusters.pl -m now also produces a pangenome_genes file listing the sequence names in each cluster
  • compare_clusters.pl -m now sorts taxon names when printing pangenome matrices
  • added -P option to compute percentage of conserved proteins (POCP) in get_homologues.pl
  • added -P option to compute percentage of conserved sequences (POCS) in get_homologues-est.pl
  • simplified headers in output FASTA files of get_homologues-est down to: >first_non-blank_word [source_taxon.fna]
  • modified phyTools::check_variants_FASTA_alignment to also compute private variants if lists A & B are passed
  • annotate_cluster.pl can now take -A/-B lists of taxa to compute private variants in the aligned sequences of the cluster
  • updated descriptions of annotate_cluster.pl in manuals
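The length filter described in the first item above could look roughly like the sketch below. It is not the get_homologues-est implementation: the function name and the dictionary input are hypothetical, and "25kb" is assumed here to mean 25,000 residues.

```python
MAX_SEQ_LENGTH = 25_000  # assumption: "25kb" read as 25,000 residues


def skip_long_sequences(records, max_len=MAX_SEQ_LENGTH):
    """Split {name: sequence} into kept records and names of skipped ones.

    Sequences strictly longer than max_len are skipped.
    """
    kept, skipped = {}, []
    for name, seq in records.items():
        if len(seq) > max_len:
            skipped.append(name)  # too long: report it instead of keeping it
        else:
            kept[name] = seq
    return kept, skipped
```

Returning the skipped names as well makes it easy to warn the user about which sequences were left out.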

get_homologues-est

13 Mar 10:53

These are the relevant updates of release 20180313 (v3.1.1):

modified sort_blast_results so that it now can compress individual BLAST result files with global binary $SORTBIN
get_homologues.pl and get_homologues-est.pl now compress individual BLAST files by default ($COMPRESSBLAST=1)
added global $MAXSEQLENGTH to get_homologues-est.pl to warn of long sequences, which often cause downstream problems

binaries

01 Feb 07:09

Compressed TAR file with binaries to be downloaded after cloning the source repository. This should be done with install.pl

get_homologues-est

03 Jan 13:49

Release 20180103 (v3.1.0) ships with the following changes:

added hclustering of ANdist matrix in plot_matrix_heatmap.sh for convenient cluster delimitation at distance cutoffs of 6, 5 and 4, which correspond to ANI values of 94%, 95% and 96%, respectively
compare_clusters.pl -m now also produces a pangenome CSV file for Scoary GWAS analyses with Fisher's Exact test
updated manuals with option to compute cluster intersection matrices with parse_pangenome_matrix -x
explained transposed CSV pangenome matrix for software Scoary in manual
added option 'force' to install.pl so that it can install with no supervision
added option 'no_databases' to install.pl for building docker images
hcluster_matrix.sh now removes the invariant (core-genome) and singleton (cloud-genome) columns before computing distances
updated example figure created with plot_matrix_heatmap.sh in the manual
renamed hcluster_matrix.sh to hcluster_pangenome_matrix.sh
added links to GET_PHYLOMARKERS
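To illustrate the column filtering mentioned above, here is a minimal sketch. It is not the logic of the shell script itself: the function name is hypothetical, and it assumes a presence/absence matrix with one row per taxon and one column per cluster, where "invariant" means present in every taxon and "singleton" means present in exactly one taxon.

```python
def drop_uninformative_columns(rows):
    """Return (kept_column_indices, filtered_rows).

    rows: list of equal-length lists, one per taxon, with 0 for absence
    and a positive number for presence of a cluster.
    Columns present in all taxa or in a single taxon are dropped.
    """
    n_taxa = len(rows)
    n_cols = len(rows[0]) if rows else 0
    keep = []
    for j in range(n_cols):
        occupancy = sum(1 for row in rows if row[j] > 0)
        if 1 < occupancy < n_taxa:  # neither singleton nor invariant
            keep.append(j)
    return keep, [[row[j] for j in keep] for row in rows]
```

Such columns are the same for every pair of taxa or differ for only one taxon, so the remaining columns carry the grouping signal used for clustering.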

get_homologues-est

23 Oct 11:13

Release 20171023 (v3.0.9) ships with the following changes:

removed special characters >,<,& from cluster names in get_homologues.pl and get_homologues-est.pl
updated table of occupancy classes in the manual
added options -a and -X and improved documentation of hcluster_matrix.sh & plot_matrix_heatmap.sh
despite the increase in size, updated BLAST+ to ncbi-blast-2.6.0+ as it handles low-complexity alignments better than 2.2.27+
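As an illustration of the cluster-name clean-up listed above, the following sketch strips the three characters with a regular expression. The function name is hypothetical and this is not the code used by the scripts; it only shows the idea.

```python
import re


def sanitize_cluster_name(name):
    """Remove the characters >, < and & from a cluster name."""
    return re.sub(r"[<>&]", "", name)


print(sanitize_cluster_name("gene<1>&x"))  # prints gene1x
```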

get_homologues-est

18 Sep 12:06

Release 20170918 (v3.0.8) ships with the following bug fixes and changes:

added options -x , -c <0|1> and -f to hcluster_matrix.sh
added oneliner to transpose matrix to compare_clusters.pl
fixed parsing of filenames with -I in cases where input files are like numbers.faa
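For readers who want to transpose a pangenome matrix outside of the scripts, a minimal sketch follows. It is not the one-liner shipped with compare_clusters.pl; the function name is hypothetical, and it assumes a rectangular, tab-separated matrix held in a string.

```python
def transpose_tsv(text):
    """Transpose a rectangular tab-separated matrix given as a string."""
    rows = [line.split("\t") for line in text.splitlines() if line]
    # zip(*rows) pairs up the i-th field of every row, i.e. the columns
    return "\n".join("\t".join(col) for col in zip(*rows))


matrix = "taxon\tc1\tc2\nA\t1\t0\nB\t1\t1"
print(transpose_tsv(matrix))
```

Note that zip() silently truncates to the shortest row, so ragged input should be checked beforehand.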

Cheers

get_homologues-est

28 Aug 14:59

Release 20170828 (v3.0.7) ships with the following bug fixes and changes:

compare_clusters.pl now prints lists of genes in intersections of two sets when comparing 3 cluster sets
compare_clusters.pl -m now produces a FASTA version of the binary pangenome matrix so that fully-labelled trees can be inferred with software such as IQ-TREE
added question to FAQ section in manuals explaining a way to compute ML pangenome trees with bootstrap and aLRT support (Thanks Uriel Alonso and Ruben Sancho)
updated manuals and plot_matrix_heatmap.sh with options -r (remove column names and cell contents) and -k (set name for color key X-axis)
added options -d (max no. decimals) and -x (filter matrix with regex) to plot_matrix_heatmap.sh
added parse_pangenome_matrix.pl -x to compute cluster intersection between taxa in a pangenome matrix
fixed bug in compare_clusters.pl when .cluster_list file is not parsed, due to previous changes in find_taxa_FASTA_array_headers
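The idea of a FASTA version of the binary pangenome matrix can be sketched as follows. The exact layout written by compare_clusters.pl -m is not described here, so this assumes one record per taxon, with the taxon name as header and one 0/1 character per cluster; the function name is hypothetical.

```python
def binary_matrix_to_fasta(matrix):
    """Render {taxon: [cell, ...]} as FASTA with 0/1 strings as sequences.

    Any non-zero cell (e.g. a gene count) is written as '1'.
    """
    lines = []
    for taxon, row in matrix.items():
        lines.append(f">{taxon}")
        lines.append("".join("1" if cell else "0" for cell in row))
    return "\n".join(lines) + "\n"


print(binary_matrix_to_fasta({"A": [1, 0, 2], "B": [0, 0, 1]}), end="")
```

Keeping the taxon names as FASTA headers is what allows the tips of the resulting tree to be fully labelled.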

get_homologues-est

07 Aug 10:57

Release 20170807 (v3.0.6) ships with the following bug fixes and changes:

parse_pangenome_matrix.pl -S now takes an integer to indicate the minimum occupancy requested of clusters
improved description of annotate_cluster.pl -h
increased length of sequence names in annotate_cluster.pl
added parsimony-informative sites in headers of blunt-end clusters produced by annotate_cluster.pl -b
annotate_cluster.pl now shows unaligned cluster sequences and prints number of taxa in the alignment
annotate_cluster.pl now removes temporary files
added sub collapse_taxon_alignments to lib/phyTools.pm
added annotate_cluster.pl -c 40 to collapse alignments of sequences from same taxon
added Pfam domains to collapsed sequences
initialize compartments in parse_pangenome_matrix.pl to zero if empty before plotting (thanks Felipe Lira!)
transcripts2cds.pl & transcripts2cdsCPP.pl now print name of offending files with '+' chars
temp blastdb file closed properly in annotate_cluster.pl
corrected intergenic clusters produced with get_homologues.pl -g when using prokka-annotated GenBank files
updated get_homologues.pl -g and checked this section in the manual
extract_*_genbank subs in lib/phyTools.pm now parse LOCUS when accession is not available in GenBank files, such as those made with PROKKA

get_homologues-est

09 Jun 08:24

Release 09062017 (v3.0.5) ships with the following changes:
parse_pangenome_matrix.pl -I now supports -B to produce lists of absent genes in pan-genome matrix subsets
added message to make_nr_pangenome_matrix.pl to warn users that BLAST files must be removed with different refs
added option -r to annotate_cluster.pl so that external sequences can be used to drive cluster alignment
updated lib/ForkManager.pm to v1.19 [http://search.cpan.org/perldoc?Parallel%3A%3AForkManager]
fixed bug in get_homologues.pl -X which produced zero clusters when new sequences files were added to previous results