Releases: ksahlin/strobealign

v0.6

20 Feb 07:01
a8b7852

Version 0.6 fixes a crucial bug introduced in v0.5 and has two additional bug fixes that improve accuracy. It is highly recommended to update to this version.

  1. Critical fix for a bug introduced in v0.5: a coordinate bug caused rare alignments to very long reference regions, which was detrimental to speed.
  2. Identifies symmetrical hash collisions and, in those cases, tests the reverse orientation. This leads to a further slight bump in alignment accuracy over previous versions, particularly for shorter read lengths.
  3. Fixes a rare uninitialized value in the calculation of the joint alignment score S, which could cause suboptimal alignments.
  4. Fixes reporting of the template length field in SAM output when the alignment contains a deletion.
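Item 4 concerns how far an alignment extends on the reference. As a minimal sketch (not strobealign's own code), the reference span of an alignment can be derived from its CIGAR string by counting only the operations that consume reference bases according to the SAM specification. A deletion adds to that span even though it adds nothing to the read length. The function names and the sign convention for equal start positions are assumptions made for illustration.

```python
import re

# CIGAR operations that consume reference bases (SAM specification).
REF_CONSUMING = set("MDN=X")


def reference_span(cigar: str) -> int:
    """Number of reference bases covered by an alignment."""
    return sum(
        int(length)
        for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )


def template_length(pos1: int, cigar1: str, pos2: int, cigar2: str) -> int:
    """Signed template length for mate 1; mate 2 gets the negated value.

    Positions are leftmost mapped coordinates. The template runs from the
    leftmost mapped base to the rightmost mapped base of the pair.
    """
    end1 = pos1 + reference_span(cigar1)  # exclusive end on the reference
    end2 = pos2 + reference_span(cigar2)
    span = max(end1, end2) - min(pos1, pos2)
    # Assumption: the mate that starts first (or ties) gets the positive sign.
    return span if pos1 <= pos2 else -span
```

For a 98 nt read aligned as `50=2D48=`, `reference_span` returns 100, so a template length computed from the read length alone would be two bases too short.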

v0.5

16 Feb 17:43

Added features, some improvements in alignment (accuracy), and minor bugfixes.

  1. Added parameter -N [INT] to output secondary alignments.
  2. Base-level alignment parameters can now be specified on the command line with -A, -B, -E, and -O.
  3. Improved MAPQ calculation: MAPQ is now calculated from alignments (in alignment mode) instead of from seeds.
  4. Updated default base-level alignment parameters for better alignments around indels.
  5. Added quality values and the AS:i and NM:i tags to SAM output.

See INDEL/SNV calling benchmark in README.

v0.4

16 Jan 12:08
  • Implemented bitpacking of reference ID and strobe offset, which reduces peak memory by about 15-20%. Both values are stored in one int: 24 bits for ref_id (up to 2^24 = 16,777,216 unique references) and 8 bits for the strobe offset (so the maximum seed length is 255+k).
  • max_sites and max_score_droppoff are now parameters instead of being hardcoded.
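The bitpacking described above can be sketched as follows. This is an illustration of the 24 + 8 bit split only. The bit order (reference ID in the high bits) and the names `pack` and `unpack` are assumptions, not taken from the strobealign source.

```python
REF_ID_BITS = 24   # up to 2^24 = 16,777,216 references
OFFSET_BITS = 8    # strobe offsets from 0 to 255

MAX_REF_ID = (1 << REF_ID_BITS) - 1
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def pack(ref_id: int, offset: int) -> int:
    """Store ref_id and strobe offset in a single 32-bit value."""
    if not 0 <= ref_id <= MAX_REF_ID:
        raise ValueError("ref_id does not fit in 24 bits")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("strobe offset does not fit in 8 bits")
    # Assumed layout: ref_id in the upper 24 bits, offset in the lower 8.
    return (ref_id << OFFSET_BITS) | offset


def unpack(packed: int) -> tuple[int, int]:
    """Recover (ref_id, offset) from a packed value."""
    return packed >> OFFSET_BITS, packed & OFFSET_MASK
```

Because the offset field holds at most 255, a seed whose second strobe starts more than 255 positions after the first cannot be represented, which is where the 255+k limit on seed length comes from.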

v0.3

13 Jan 10:37
  • ~10% faster indexing by skipping some unnecessary computations of unique seeds.
  • Several improvements to base-level alignments using ssw. Fixed known bugs that occurred occasionally with the flag, the CIGAR string, and the NM tag in the SAM file.
  • Changed to reporting Eq/X (=/X) CIGAR strings instead of M.
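To illustrate the difference between the two CIGAR styles, here is a small sketch that builds an =/X CIGAR from a gapped pairwise alignment and collapses it back to the M form. It is not the code path used by strobealign or ssw. The function names and the use of `-` as the gap character are assumptions.

```python
import re
from itertools import groupby


def extended_cigar(ref_aln: str, read_aln: str) -> str:
    """Build an =/X CIGAR from two equal-length gapped strings ('-' = gap)."""
    ops = []
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-":
            ops.append("I")      # base present only in the read
        elif read_base == "-":
            ops.append("D")      # base present only in the reference
        elif ref_base == read_base:
            ops.append("=")      # sequence match
        else:
            ops.append("X")      # sequence mismatch
    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


def collapse_to_m(cigar: str) -> str:
    """Merge = and X into M, as in the older CIGAR style."""
    merged = []
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        op = "M" if op in "=X" else op
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)
        else:
            merged.append([int(length), op])
    return "".join(f"{length}{op}" for length, op in merged)
```

The =/X form keeps mismatch positions visible in the CIGAR itself, while the M form hides them: `3=1X4=` and `8=` both collapse to `8M`.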

v0.2.1

09 Jan 10:37
  1. Introduced a maximum seed size constraint when sampling seeds. It is only active in the few regions where syncmers are sparsely sampled.
  2. Parameter -r can now take any integer value.

v0.2

30 Dec 12:48

Important bugfix [1] and added ssw for rescue alignment [2] since ksw is only for extension.

These fixes improve accuracy in paired-end alignment mode compared to v0.1. I also observe a further speed increase (~15-20%) on all my test data sets when aligning to hg38.

v0.1

27 Dec 11:02

Major update of strobealign. This version improves accuracy (and the number of aligned reads) for reads of around 100-125 nt, and it is also faster than older versions for these lengths. Most notable changes:

  • Algorithm changes

    • Strobes are now hashed with xxhash instead of not being hashed at all. This gives better pseudorandom hash values for linking.
    • Strobes are linked using bitcount( (h_1 ^ h_2) ^ q), which skews the seed length distribution towards shorter seeds in the window. This improves detection of candidate mapping locations, particularly for shorter reads (100nt).
  • Parameters

    • Added the option to customize the sampling window of the second strobe with -l and -u.
    • Added a parameter -r [INT] for approximate read length (default 150). This makes strobealign customize the parameters -l, -u, and -k.
  • Also cuts the reference accessions at first space, which fixes issue #4
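The linking rule in the algorithm changes above can be sketched as follows. The expression `bitcount((h_1 ^ h_2) ^ q)` is used exactly as written in the notes. Everything else is an assumption for illustration: that the candidate with the smallest bit count wins, that ties go to the candidate closest to the first strobe, and the names `pick_second_strobe`, `w_min`, and `w_max` for the window bounds set with -l and -u.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def pick_second_strobe(hashes, i, w_min, w_max, q):
    """Return the index of the strobe linked to the strobe at index i.

    hashes: hash values of the candidate strobes, in sequence order
    w_min, w_max: window of allowed distances from i (inclusive)
    q: integer parameter entering the bit-count score
    Returns None when the window holds no candidate.
    """
    h1 = hashes[i]
    best_index, best_score = None, None
    last = min(i + w_max, len(hashes) - 1)
    for j in range(i + w_min, last + 1):
        score = popcount((h1 ^ hashes[j]) ^ q)
        # Strict '<' keeps the earliest candidate on ties (assumption).
        if best_score is None or score < best_score:
            best_index, best_score = j, score
    return best_index
```

A bit count over 64-bit values can take only 65 distinct values, so ties are frequent. Under the tie-breaking assumed here, frequent ties resolved in favor of the nearest candidate would be one way to obtain the skew towards shorter seeds mentioned above.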

v0.0.3.2

30 Nov 22:03

Fixes both known bugs listed in the v0.0.3 release.

v0.0.3

03 Nov 13:50

Version 0.0.3

  1. Has paired-end alignment mode
  2. Implements a rescue mode both in SE and PE alignment modes (described in preprint).
  3. Changed to symmetrical strobemer hash values due to inversions (described in preprint).
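As a hedged illustration of item 3, a strobemer hash is symmetrical when it gives the same value regardless of the order in which the two strobe hashes are combined. The sketch below uses wrapping 64-bit addition, which is one simple commutative choice. The actual combination used by strobealign is described in the preprint and may differ, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def symmetric_hash(h1: int, h2: int) -> int:
    """Combine two strobe hashes so that the order does not matter.

    Assumption: wrapping 64-bit addition, chosen only because it is
    commutative. It is not claimed to be strobealign's formula.
    """
    return (h1 + h2) & MASK64


def asymmetric_hash(h1: int, h2: int) -> int:
    """An order-dependent combination, shown only for contrast."""
    return (h1 * 3 + h2) & MASK64
```

With a symmetrical hash, the strobe pairs (a, b) and (b, a) produce the same value, so the hash alone does not reveal in which order the strobes occurred.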

Known bugs:

  • Negative SAM coordinate bug in single-end alignment mode. Observed once in 150M simulated reads.
  • Segfault in paired-end mapping mode (never in alignment mode). Observed three times in 150M simulated reads, for the shortest reads (100nt).

v0.0.2

27 Sep 04:03

StrobeAlign is now parallelized with OpenMP and can read fastq and gzipped fastq files with kseqpp.

TODO

  • PE-alignment mode and joint scoring
  • Separate creation and storage of reference index