Releases: KamilSJaron/smudgeplot
Arched
- local aggregation for subtracting hetmers containing sequencing errors
- fishnet algorithm for 1n coverage fit
Known problems:
- Quantification of higher-ploidy smudges does not reflect their higher coverage variance well, which leads to "overspilling" of kmer pairs into other smudges.
- Sometimes, for species with a strong tetraploid signal and a weak diploid signal, the fit is better when the 4n smudge is assigned as 2n. This is a recognisable pattern and a clear limitation of the optimisation criteria we use at the moment. We are working on fixing this.
- We removed explicit ploidy predictions; users need to make a joint interpretation of smudgeplot, GenomeScope and prior knowledge about the species to determine the most sensible ploidy. An explicit ploidy level is our preferred solution to both problems listed above, so hopefully this functionality will return.
Oriel
The big changes are:
- the search for kmer pairs will be within both the canonical and non-canonical k-mer sets (Gene demonstrated that it makes a difference);
- the tool will support the FastK kmer counter only;
- the backend by Gene is parallelized and massively faster;
- the intermediate file will be a flat file with the 2d histogram with cov1, cov2, freq columns (as opposed to a list of coverages of pairs cov1 cov2);
- at least for now WE LOSE the ability to extract sequences of the kmers in the pair; this functionality will hopefully be restored at some point, together with functionality to assess the quality of an assembly;
- the smudge detection algorithm is under revision and a new version will be released on 18th of October 2024.
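The change of intermediate format listed above can be illustrated with a minimal Python sketch. It collapses a list of per-pair coverages (the older representation) into a flat 2d histogram with cov1, cov2, freq columns. The function name, the tab separator and the example values are assumptions made for illustration; this is not the actual smudgeplot or FastK code.

```python
from collections import Counter


def pairs_to_histogram(pairs):
    """Collapse (cov1, cov2) pairs into sorted (cov1, cov2, freq) rows."""
    counts = Counter(pairs)
    return [(c1, c2, n) for (c1, c2), n in sorted(counts.items())]


# Hypothetical coverages of four kmer pairs (old list-style format).
pairs = [(10, 12), (10, 12), (11, 30), (10, 12)]
rows = pairs_to_histogram(pairs)

# Write the rows as a flat text table; the tab separator is an assumption.
lines = ["\t".join(map(str, row)) for row in rows]
print("\n".join(lines))
```

The histogram form grows with the number of distinct coverage combinations rather than with the number of kmer pairs, which is presumably why it is the more compact intermediate.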
Double-hung with curtains
- fixed an issue with L and U being too close to each other. Smudgeplot simply plots the data that are fed to it, even if the result is harder to interpret (this is aligned with the "honest data reporting" philosophy of smudgeplot, but might cause more confusion; perhaps we will add some more warnings in the future)
Double-Hung
Added a new feature, smudgeplot extract, for extracting kmer pairs from a rectangle of the smudgeplot.
Great thanks to @zhenzhenyang-psu !
Documentation: https://github.com/KamilSJaron/smudgeplot/wiki/smudgeplot-extract
For usage see smudgeplot.py extract -h
Still Single Hung
This release is just to get it to Zenodo, otherwise identical to v0.2.2.
Single-Hung
This version:
- updates the annotation algorithm for higher ploidy levels based on simulations
- encourages the use of our KMC, which speeds up the search for kmer pairs a lot
- adds new warnings for mismatching estimates of 1n coverage by different approaches (this was silent before)
- changes the terminology: instead of "estimated ploidy" we say "proposed ploidy", as there is no model explicitly tested
Single-Hung
- fixed logging (now it's directed to err stream)
- an estimate of ploidy based on all smudges of that ploidy (instead of the ploidy of the brightest smudge)
- smudgeplot interface uses the .py suffix to meet community standards
Single-Hung
This version is using the same computational backend as the previous version (0.1.3), but it's wrapped in a single interface that is expected to be kept in future:
smudgeplot <task> <arguments>
Further adjustments:
- improved algorithm for placing smudges on the plot for higher ploidy levels than 4
- alternative algorithm for extracting kmers available (--middle in the hetkmers task)
I had no idea how to name the release, so I have decided to name individual versions of smudgeplot after types of windows, so let's start simple: Single-Hung it is. Hopefully, it will be a good enough name to carry all the smudges.
beta3 - More modest algorithm for guessing 1n coverage, no default filtering
- no quantile filtering by default; a new parameter -q sets up the filter, and the flag --no-quantile-filtering was removed (set -q 0.99 to get the previously default filter)
- the algorithm for peak identification was failing if AAB was absent but AAAABB was present. This problem should now be resolved by considering both the diploid and triploid 1n estimates.
- fixed the installation instructions
- minor fixes
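As a rough illustration of what a quantile filter such as -q 0.99 does, the sketch below drops kmer pairs whose total coverage exceeds the given quantile. Filtering on the sum of the two coverages, the function name and the nearest-rank quantile definition are all assumptions made for illustration, not a description of smudgeplot's implementation.

```python
import math


def quantile_filter(pairs, q):
    """Keep pairs whose total coverage is at or below the q-th quantile."""
    totals = sorted(c1 + c2 for c1, c2 in pairs)
    # Nearest-rank quantile: the value at position ceil(q * n) in sorted order.
    rank = max(1, math.ceil(q * len(totals)))
    threshold = totals[rank - 1]
    return [(c1, c2) for c1, c2 in pairs if c1 + c2 <= threshold]


# Hypothetical data: 99 ordinary pairs and one extreme high-coverage pair.
pairs = [(10, 10 + i % 5) for i in range(99)] + [(500, 900)]
kept = quantile_filter(pairs, 0.99)
print(len(pairs), len(kept))
```

With no filter by default, such extreme pairs stay in the plot unless the user asks for them to be removed.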
beta2 release
- switch to colorblind friendly palette
- parameter nbins is autoscaling by default and fixed if defined by user
- added an option to disable quantile filtering (--no_quantile_filt flag)
- improved interface & README