Releases: nextgenusfs/funannotate
funannotate v1.4.1
- bug fix for `funannotate predict` when parsing the soft-masked genome: for large genomes this was slow and used too much memory. It is now multithreaded and has a lower memory footprint. #197
- bug fix for `ncRNA`: models are now listed as full length and should no longer cause NCBI errors #195
- support multiple inputs to `--other_gff` #191
- make `augustus` use the `--softmasking=1` option
- default value for `--soft_mask` is now set to 2 kb (`2000`)
- output fasta files are now wrapped at 80 characters
- `tbl2asn` is now multithreaded on large genomes or those with more than 10000 contigs
- several updates to parsing of GenBank files to deal with unexpected formatting #196
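The 80-character wrapping of FASTA output can be illustrated with a minimal sketch. This is not funannotate's own code: `wrap_fasta_record` is a hypothetical helper written for illustration, and the default width of 80 is taken from the release note above.

```python
def wrap_fasta_record(header, sequence, width=80):
    """Return one FASTA record with the sequence wrapped at `width` characters.

    Hypothetical helper for illustration only; not part of funannotate's API.
    The returned string has no trailing newline.
    """
    # Slice the sequence into consecutive chunks of at most `width` characters.
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + header] + chunks)


if __name__ == "__main__":
    # A 200 bp toy sequence becomes three lines of 80, 80 and 40 characters.
    print(wrap_fasta_record("contig_1", "ACGT" * 50))
```

Slicing by a fixed width keeps every line except possibly the last at exactly 80 characters, which is the layout most downstream tools expect.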
funannotate v1.4.0
- support for long-read RNA-seq data: `funannotate train` and `funannotate update` can take PacBio isoSeq (`--pacbio_isoseq`), Nanopore cDNA reads (`--nanopore_cdna`), and Nanopore direct mRNA (`--nanopore_mrna`).
- fix for an important bug in transcript alignments in `funannotate predict`: in previous versions, multi-exon crick alignments were not correctly parsed into GFF3 alignments
- soft masking is now decoupled from `funannotate predict`; it is now done with `funannotate mask`. The reason for this switch is to allow more flexibility in how the assembly is soft masked, since this can be done externally with another program. This change will allow users who don't have access to RepBase to use an alternative to RepeatMasker/RepeatModeler. One alternative is RED; I wrote a wrapper for it called RedMask.
- `funannotate predict` can now run without GeneMark being installed, again to accommodate users who may be unable to use GeneMark due to licensing. Note that you can pass gene predictions from any external program to `--other_gff` and they will be handed off to Evidence Modeler.
- spaces in either strain or isolate name will be stripped #180
- default program for `funannotate clean` changed to minimap2 #176
- fix errors in partial gene models derived from using EVM script to generate proteins, this is now done internally using exact coordinates #184
- added `--soft_mask` option to `funannotate predict`, which controls the option with the same name in GeneMark. The default is `--soft_mask 5000`, which means that repeat regions less than 5 kb will be ignored for GeneMark prediction and those greater than 5 kb will be fed to GeneMark. #185
- bug fixes for `tbl` file generation. all tRNA models will be partial #184
- improvement to how data from `funannotate train` is used in prediction steps
- slight changes for clarity to `funannotate predict` flags for evidence alignments:
    --protein_evidence        Proteins to map to genome (prot1.fa prot2.fa uniprot.fa). Default: uniprot.fa
    --protein_alignments      Pre-computed exonerate protein alignments (see docs for format)
    --transcript_evidence     mRNA/ESTs to align to genome (trans1.fa ests.fa trinity.fa). Default: none
    --transcript_alignments   Pre-computed transcript alignments in GFF3 format
- added `funannotate util bam2gff3` script to convert coordinate-sorted RNA-seq BAM alignments to a GFF3-compatible alignment file.
- fix bug for input of files+weight in `funannotate predict`: the script would get hung up if you passed `--other_gff snap_alignemnts.gff3:5` #191
- allow for non-standard LocusTags; these will now be split on the last underscore #191
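Two of the #191 items above, the `file:weight` input form and splitting locus tags on the last underscore, come down to splitting a string from the right. The sketch below shows one way to do that in Python. Both function names are hypothetical, the default weight of 1 is an assumption, and none of this is taken from funannotate's source.

```python
def parse_weighted_input(arg, default_weight=1):
    """Split 'file.gff3:5' into ('file.gff3', 5).

    Hypothetical helper; the default weight of 1 is an assumption.
    If there is no ':<integer>' suffix, the whole argument is the path.
    """
    path, sep, weight = arg.rpartition(":")
    if sep and path and weight.isdigit():
        return path, int(weight)
    return arg, default_weight


def split_locus_tag(locus_tag):
    """Split a locus tag on its LAST underscore.

    'FUN_A1_000123' -> ('FUN_A1', '000123'). Tags without an underscore
    are returned unchanged with an empty suffix.
    """
    if "_" not in locus_tag:
        return locus_tag, ""
    prefix, number = locus_tag.rsplit("_", 1)
    return prefix, number


if __name__ == "__main__":
    print(parse_weighted_input("snap_alignemnts.gff3:5"))
    print(split_locus_tag("FUN_A1_000123"))
```

Splitting from the right means a prefix that itself contains underscores stays intact, which is what makes non-standard locus tags workable.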
funannotate v1.3.4
- bug fixes for sec met cluster output files and corresponding MiBIG cluster mapping
- add `tRNAscan-SE` to `funannotate check` and `predict`
- update menu with some params that were missing
funannotate v1.3.3
- bug fix for `funannotate compare` where GO enrichment was not being run in parallel after the last update
- use diamond blastp search for ortholog detection --> speed increased.
- don't run seqclean if file present
- update docker release to newest version of funannotate as well as newest version of Trinity, PASA
funannotate v1.3.2
- added several utility scripts --> accessible via the `funannotate util` submenu. This includes `funannotate util compare`, which will compare multiple annotations to a reference.
    $ funannotate util

    Usage:       funannotate util <arguments>
    version:     1.3.2

    Commands:    compare        Compare annotations to reference (GFF3 or GBK annotations)
                 tbl2gbk        Convert TBL format to GenBank format
                 gbk2parts      Convert GBK file to individual components
                 gff2proteins   Convert GFF3 + FASTA files to protein FASTA
                 gff2tbl        Convert GFF3 format to NCBI annotation table (tbl)

- bug fix for `funannotate remote` moving logfile
- bug fix for mapping proteins to genome where tmp folder wasn't being properly removed
- run GO enrichment in parallel in `funannotate compare`
- update colors in some graphs from `funannotate compare` to 24-pack Crayola colors
- add option to use `iqtree` to draw ML phylogeny in `funannotate compare`
- bug fix for `funannotate database` command where it was not displaying table correctly.
funannotate v1.3.1
- bug fix for `funannotate setup`: added missing shutil library import
funannotate v1.3.0
- bug fix for weights being set for Augustus HiQ models in `funannotate predict`
- bug fix for download_buscos function
- bug fix for `funannotate annotate` where tbl file was occasionally not being parsed correctly --> re-write of parsing function
- fix bug in antiSMASH/MiBIG parsing
- add method to try to recover from failed GeneMark run
- several bug fixes for `funannotate update` related to UTRs and multiple transcripts per locus.
- added missing dependencies to `funannotate check`
- updated code to work with PASA > v2.3 - this is an important PASA update that allows SQLite usage instead of MySQL
- improved terminal log output to tell user which files (with locations) are being re-used if they are found.
funannotate v1.2.0
- v1.2.0 now supports multiple transcripts per gene locus. The funannotate pipeline will only generate multiple transcripts per locus if given evidence in the form of RNA-seq data; this is done in the `funannotate update` command. It should also now support input with multiple transcripts.
- move installation of busco models to `funannotate setup`
- added annotation edit distance (AED) to `funannotate update` to record the changes in annotation. The PASA annotation update text file is also changed to incorporate these changes.
- accessory script `util/compare2annotations.py` can compare multiple annotations in either GFF3 or GBK format to a reference, generating summary stats as well as individual gene stats (AED per mRNA and CDS)
- added a `--drop` option to `funannotate fix` so that you can remove unwanted gene model annotations; to use it, pass a file containing locus_tag values (1 per line) to the `--drop` parameter
- fix bug in finding high-quality Augustus prediction (HiQ) models in `funannotate predict`
- `funannotate predict` will now detect if a `training` folder exists in the output directory; if it does, it will find the correct PASA, BAM, and Trinity output and use them automatically during the prediction step.
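For readers unfamiliar with annotation edit distance, the sketch below computes it using the commonly cited nucleotide-level definition, AED = 1 - (SN + SP) / 2, where SN is the fraction of the reference covered by the query and SP is the fraction of the query covered by the reference. Whether funannotate computes AED in exactly this way is an assumption. The function names and the interval representation (1-based, inclusive exon coordinates) are hypothetical.

```python
def covered_positions(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end + 1))
    return covered


def annotation_edit_distance(reference, query):
    """Return AED between two gene models given as lists of exon intervals.

    0.0 means the models cover identical bases; 1.0 means no shared bases.
    Illustrative sketch only; not funannotate's implementation.
    """
    ref = covered_positions(reference)
    qry = covered_positions(query)
    if not ref or not qry:
        return 1.0
    shared = len(ref & qry)
    sensitivity = shared / len(ref)   # share of reference bases recovered
    specificity = shared / len(qry)   # share of query bases supported
    return 1.0 - (sensitivity + specificity) / 2.0


if __name__ == "__main__":
    # The query covers only the first half of the reference exon.
    print(annotation_edit_distance([(1, 100)], [(1, 50)]))
```

Expanding intervals into position sets is simple but memory-hungry for long genes; an interval-arithmetic version would be preferable at genome scale.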
funannotate v1.1.1
- fix for braker to work on docker. For some reason (I don't know why) the symlinks that braker tries to create cause an error when run on docker. The error essentially references too many levels of symlinks. To circumvent this, I modified the `braker.pl` code to copy instead of symlink. Also fixed the `braker.pl --version` option, which was broken in the most recent release.
- Note for a "normal" system, v1.1.0 should work fine. The updated braker code was run on both docker and Mac native and runs fine on those, hopefully also working well on linux.
funannotate v1.1.0
- bumping version to 1.1.0 to highlight that v1.0.X versions have a bug in the tbl annotation file and will not pass GenBank specs. This was derived from dropping GAG from funannotate; I had the tbl spec wrong for adding transcript_id and protein_id to both CDS and mRNA features.
- fixes for `funannotate update` and properly filtering overlapping genes
- fix for `funannotate annotate` that was switching the 5' and 3' partial gene designations on crick-oriented gene models, causing them to look correct after the predict step and then become errors after the annotate step
- added Braker 2.0.3 to funannotate. This was necessary as `braker.pl --version` doesn't display the version number, so I can't enforce a version requirement. The larger issue has to do with how the different versions of braker save the output data: there are at least 3 different behaviors in the last 4 or 5 versions, which makes it impossible for funannotate to determine where output will be.