Releases: ksahlin/strobealign
v0.6
Version 0.6 fixes a crucial bug introduced in v0.5 and has two additional bug fixes that improve accuracy. It is highly recommended to update to this version.
- Crucial fix for a bug introduced in v0.5: a coordinate bug caused rare alignments to very long reference regions, which was detrimental to speed.
- Symmetrical hash collisions are now identified, and in those cases the reverse orientation is tested. This leads to a further slight bump in alignment accuracy over previous versions, particularly for shorter read lengths.
- Fix to a rare uninitialized joint alignment score S calculation that would cause suboptimal alignment.
- Fixes reporting of the template length field in SAM output if the alignment contains a deletion.
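The template length fix can be illustrated with a minimal sketch of why deletions matter. It is not strobealign's code: `reference_span` and `template_length` are hypothetical helpers. The only facts relied on are from the SAM specification, where the CIGAR operations M, D, N, = and X consume reference bases. A deletion makes the alignment cover more reference bases than the read has, so a template length derived from read length alone comes out too short.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")


def reference_span(cigar: str) -> int:
    """Number of reference bases covered by an alignment with this CIGAR."""
    return sum(
        int(length)
        for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )


def template_length(pos1: int, cigar1: str, pos2: int, cigar2: str) -> int:
    """Unsigned distance from the leftmost to the rightmost mapped base
    of a read pair, given 1-based leftmost positions (hypothetical helper)."""
    end1 = pos1 + reference_span(cigar1) - 1
    end2 = pos2 + reference_span(cigar2) - 1
    return max(end1, end2) - min(pos1, pos2) + 1


# The second read is 100 bases long but spans 102 reference bases
# because of the 2-base deletion.
print(reference_span("50=2D50="))
print(template_length(100, "100=", 300, "50=2D50="))
```

Using the read length (100) instead of the reference span (102) for the second mate would give 300 rather than 302 in this example.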
v0.5
Added features, some improvements in alignment (accuracy), and minor bugfixes.
- Added parameter -N [INT] to output secondary alignments.
- Base level alignment parameters can now be specified from the command line with -A -B -E -O.
- Improved MAPQ calculation: MAPQ values are now computed from alignments (in alignment mode) instead of from seeds.
- Updated default base-level alignment parameters for better alignments around indels.
- Added quality values and the AS:i and NM:i tags to SAM output.
See INDEL/SNV calling benchmark in README.
v0.4
- Implemented bitpacking of reference ID and strobe offset, which reduces peak memory by about 15-20%. One int now holds both values: 24 bits for ref_id (can handle up to 2^24 = 16,777,216 unique references) and 8 bits for the strobe offset (meaning 255+k as maximum seed length).
- Made max_sites and max_score_droppoff parameters instead of hardcoded values.
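The bitpacking described above can be sketched as follows. This is only an illustration of the 24 + 8 bit split. Placing ref_id in the high bits is an assumption, and the names `pack` and `unpack` are hypothetical; the notes do not state the actual layout.

```python
REF_ID_BITS = 24   # from the release notes
OFFSET_BITS = 8    # from the release notes
MAX_REF_ID = (1 << REF_ID_BITS) - 1   # 16,777,215, i.e. 2^24 distinct ids
MAX_OFFSET = (1 << OFFSET_BITS) - 1   # 255


def pack(ref_id: int, offset: int) -> int:
    """Pack ref_id and strobe offset into one 32-bit value.

    Assumed layout: ref_id in the high 24 bits, offset in the low 8 bits.
    """
    if not 0 <= ref_id <= MAX_REF_ID:
        raise ValueError("ref_id does not fit in 24 bits")
    if not 0 <= offset <= MAX_OFFSET:
        raise ValueError("offset does not fit in 8 bits")
    return (ref_id << OFFSET_BITS) | offset


def unpack(packed: int) -> tuple[int, int]:
    """Recover (ref_id, offset) from a packed 32-bit value."""
    return packed >> OFFSET_BITS, packed & MAX_OFFSET


packed = pack(5, 200)
print(hex(packed), unpack(packed))
```

Storing one 32-bit value instead of two separate integers per entry is where the memory saving comes from. The cost is the hard limits on the number of references and on the strobe offset.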
v0.3
- ~10% faster indexing by skipping some unnecessary computations of unique seeds.
- Several improvements to base level alignments using ssw. Fixed known bugs that occasionally affected the flag, the CIGAR string, and the NM tag in the SAM file.
- Changed to reporting =/X (sequence match/mismatch) CIGAR strings instead of M.
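To show the difference between the two CIGAR styles, here is a minimal sketch that derives an =/X CIGAR from two gapped alignment rows. The function `extended_cigar` and its input format ('-' for a gap) are assumptions for illustration and not strobealign's or ssw's API. The operation meanings (= match, X mismatch, I insertion, D deletion) follow the SAM specification.

```python
from itertools import groupby


def extended_cigar(ref_aln: str, read_aln: str) -> str:
    """Build a CIGAR with =/X operations from two equal-length gapped rows."""
    if len(ref_aln) != len(read_aln):
        raise ValueError("alignment rows must have equal length")
    ops = []
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            raise ValueError("column with gaps in both rows")
        if ref_base == "-":
            ops.append("I")      # base present only in the read
        elif read_base == "-":
            ops.append("D")      # base present only in the reference
        elif ref_base == read_base:
            ops.append("=")      # sequence match
        else:
            ops.append("X")      # sequence mismatch
    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(group))}{op}" for op, group in groupby(ops))


print(extended_cigar("ACGTAC-GT", "ACCT-CAGT"))
```

With M-style reporting, the runs of = and X would be merged into a single M operation, so the mismatch positions could no longer be read off the CIGAR alone.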
v0.2.1
v0.2
v0.1
Major update of strobealign. This version improves accuracy (and the number of aligned reads) for reads of length 100-125 nt, and it is also faster than older versions for these lengths. Most notable changes:
- Algorithm changes
  - Using xxhash instead of no hash for strobes. Gives a better pseudorandom generation of hashes for linking.
  - Linking strobes using bitcount( (h_1 ^ h_2) ^ q), which creates a skewed seed length distribution towards shorter seeds in the window. This improves mapping candidate detection, particularly for shorter reads (100nt).
- Parameters
  - Adding the option to customize the sampling window of the second strobe with -l and -u.
  - Adding a parameter -r [INT] for approximate read length (default 150). This will make strobealign customize parameters -l, -u, and -k.
- Also cuts the reference accessions at first space, which fixes issue #4
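The strobe-linking rule listed under the algorithm changes can be sketched as below, taking the expression bitcount((h_1 ^ h_2) ^ q) literally. Several things here are assumptions: q is not defined in these notes and is treated as an opaque integer constant, the candidate with the smallest bit count is assumed to win, and ties are assumed to go to the earliest position in the window. The function name `pick_second_strobe` is hypothetical.

```python
def pick_second_strobe(h1: int, window_hashes: list[int], q: int) -> int:
    """Return the index in the window of the strobe to link with h1.

    Assumption: the candidate minimizing bitcount((h1 ^ h2) ^ q) is chosen,
    and on ties the earliest candidate is kept.
    """
    if not window_hashes:
        raise ValueError("empty sampling window")
    best_index = 0
    best_score = None
    for i, h2 in enumerate(window_hashes):
        score = bin((h1 ^ h2) ^ q).count("1")   # population count
        if best_score is None or score < best_score:
            best_index, best_score = i, score
    return best_index


# Toy hash values; real strobe hashes would come from xxhash.
print(pick_second_strobe(0b1010, [0b0101, 0b1010, 0b1011], q=0))
```

A bit count takes only a small number of distinct values, so ties are common. If ties go to the earliest candidate, as assumed here, selection is biased towards the start of the window, which is one way the skew towards shorter seeds could arise.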
v0.0.3.2
v0.0.3
Version 0.0.3
- Has paired-end alignment mode
- Implements a rescue mode both in SE and PE alignment modes (described in preprint).
- Changed to symmetrical strobemer hash values due to inversions (described in preprint).
Known bugs:
- Negative SAM coordinate bug in single-end alignment mode. Observed once in 150M simulated reads.
- Segfault in paired-end mapping mode (never in alignment mode). Observed for the shortest reads (100nt) three times in 150M simulated reads.