Samtools + libdeflate outperforms sambamba on a single thread #485
Hello,

I recently heard about `sambamba` and its performance gains over `samtools`, and was excited to compare it to `samtools + zlib` and `samtools + libdeflate` (I had also heard that `libdeflate` really improves `samtools` performance).

I compared all three configurations, and you can see my full post here: Samtools sort: most efficient memory and thread settings for many samples on a cluster
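For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the thread and memory grid might be built and timed. It is not the script used for the linked post. The binary names, file names, and the helpers `build_sort_command` and `time_command` are hypothetical. It also assumes that `samtools + zlib` and `samtools + libdeflate` are two separately compiled `samtools` binaries, since the compression library is chosen at build time. One detail to keep in mind: `samtools sort -m` is a per-thread limit, while `sambamba sort -m` is an approximate total limit.

```python
import shutil
import subprocess
import time


def build_sort_command(tool, binary, threads, mem_per_thread_gb, in_bam, out_bam):
    """Return the argv list for one sort run (nothing is executed here)."""
    if tool == "samtools":
        # samtools sort: -@ sets the thread count, -m is memory PER THREAD.
        return [binary, "sort", "-@", str(threads),
                "-m", f"{mem_per_thread_gb}G", "-o", out_bam, in_bam]
    if tool == "sambamba":
        # sambamba sort: -t sets the thread count, -m is the TOTAL memory limit,
        # so it is scaled here to keep the per-thread budget comparable.
        return [binary, "sort", "-t", str(threads),
                "-m", f"{mem_per_thread_gb * threads}G", "-o", out_bam, in_bam]
    raise ValueError(f"unknown tool: {tool}")


def time_command(argv):
    """Run argv and return wall-clock seconds, or None if the binary is missing."""
    if shutil.which(argv[0]) is None:
        return None
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start
```

Looping `build_sort_command` over the thread counts and memory settings of interest, and passing each result to `time_command`, would give one wall-clock measurement per grid cell for each of the three configurations.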
In short, I compared overall performance (measured by time) at different CPU and memory options. I was impressed that `sambamba` outperforms the other two in pretty much every configuration. There were two things I wanted to share directly that may be of interest:

1. `samtools + libdeflate` outperforms `sambamba` on a single thread, which suggests `sambamba` could be optimized even more at the compression steps (Fig. 1). You can compare `sambamba` (red) and `samtools + libdeflate` (purple) at 1 CPU on the far left of Fig. 1. I'm not sure what `sambamba` uses for compression, though. I'm guessing it doesn't use `libdeflate`; otherwise I suspect it would have suffered from the same poor CPU utilization that `samtools + libdeflate` suffered from with additional threads. If `sambamba` is using `zlib`, however, I suspect you could really push the limits for manipulating `.bam` files.
2. `sambamba` does the best at utilizing allotted CPUs, but it also eventually flattens out. This is obviously a classic computer science problem, but I thought you might like to see where `sambamba` flattens out. TBH, I doubt there's much incentive to optimize CPU usage any higher than 9 CPUs anyway, but who knows? `samtools + libdeflate` flattens out very quickly and cannot utilize allotted CPUs as well as the other two configurations (Fig. 2). I assume this boils down to `libdeflate`, but maybe it's more complicated than that. I reported this on the `libdeflate` GitHub page so they can look into it.

And thank you for your work. We need more efficient tools like `sambamba`!

Figure 1: Realtime vs. CPU and Mem Per Thread for `samtools + zlib`, `samtools + libdeflate` (Lsamtools), and `sambamba`
Figure 2: Requested CPUs vs. CPU utilization for `samtools + zlib`, `samtools + libdeflate` (Lsamtools), and `sambamba`
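For readers unfamiliar with the quantity on the y-axis of Figure 2, one common way to compute CPU utilization is to divide the CPU time consumed by the CPU time that was available. This is an assumed definition for illustration only; the linked post may derive the number differently (for example, from the cluster scheduler's accounting). The function name and the numbers in the usage note are made up.

```python
def cpu_utilization(user_s, sys_s, wall_s, requested_cpus):
    """Fraction of the requested CPU capacity that a job actually used.

    user_s + sys_s is the CPU time consumed across all threads;
    wall_s * requested_cpus is the CPU time that was available.
    """
    if wall_s <= 0 or requested_cpus <= 0:
        raise ValueError("wall_s and requested_cpus must be positive")
    return (user_s + sys_s) / (wall_s * requested_cpus)
```

With this definition, a hypothetical job that requested 4 CPUs, ran for 50 s of wall-clock time, and consumed 90 s of user plus 10 s of system time would have a utilization of 0.5. A tool that "flattens out" shows this ratio falling as more CPUs are requested.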