nsqd: Use Klaus Post's compression libraries #1484
At a glance, I don't see any fundamental problem with improving performance by swapping out the dependency. Should we also expose the other compression algorithms?
I think that's an excellent idea. zstd in particular has strong appeal. It's especially well suited to NSQ, since a common pattern is for a topic to carry messages with a single, well-defined schema: variations on a theme. Dictionaries (as generated by zstd's training mode) would be a natural fit for that kind of traffic.
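A minimal sketch of what that could look like with github.com/klauspost/compress/zstd, assuming its encoder/decoder option API (`zstd.WithEncoderDict` / `zstd.WithDecoderDicts`); the `roundTrip` helper and sample message are illustrative only, and a real dictionary would be trained offline (e.g. with `zstd --train`) over representative topic messages:

```go
package main

import (
	"fmt"
	"log"

	"github.com/klauspost/compress/zstd"
)

// roundTrip compresses and decompresses one message, optionally with a
// shared zstd dictionary. A real dictionary would be trained offline and
// distributed to both producers and consumers.
func roundTrip(msg, dict []byte) ([]byte, error) {
	encOpts := []zstd.EOption{zstd.WithEncoderLevel(zstd.SpeedDefault)}
	decOpts := []zstd.DOption{}
	if len(dict) > 0 {
		encOpts = append(encOpts, zstd.WithEncoderDict(dict))
		decOpts = append(decOpts, zstd.WithDecoderDicts(dict))
	}

	enc, err := zstd.NewWriter(nil, encOpts...)
	if err != nil {
		return nil, err
	}
	defer enc.Close()

	dec, err := zstd.NewReader(nil, decOpts...)
	if err != nil {
		return nil, err
	}
	defer dec.Close()

	compressed := enc.EncodeAll(msg, nil)
	return dec.DecodeAll(compressed, nil)
}

func main() {
	msg := []byte(`{"event":"page_view","user_id":42}`)
	out, err := roundTrip(msg, nil) // nil dict: plain zstd round trip
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```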
Fabulous. I'll put together a PR for the dependency swap.
Hmm, I think our original testing must have been flawed when looking at Snappy. The Klaus Post version of Snappy seems to be slower than the Google version, and the Klaus Post Deflate doesn't reach the speed of Snappy at any level. This is what I'm seeing when replacing Snappy and Deflate in both nsqd and the Go NSQ library.
There's also an added complication: the Klaus Post Deflate compresses slightly less at most levels.
I still think it's worth replacing the Deflate library, but the motivation is much weaker than I previously believed. WDYT?
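For reference, a rough sketch of the kind of comparison involved (not the actual benchmark harness from the PR): it exercises stdlib flate, Klaus Post's flate, and Google's snappy block encoder at level 3, over a payload loaded from a hypothetical testdata/messages.json:

```go
package compressbench

import (
	"bytes"
	"compress/flate"
	"io"
	"os"
	"testing"

	"github.com/golang/snappy"
	kpflate "github.com/klauspost/compress/flate"
)

// sample is the payload shared by all benchmarks; the path is a placeholder
// for whatever representative dump of topic messages you test against.
var sample = func() []byte {
	b, err := os.ReadFile("testdata/messages.json")
	if err != nil {
		panic(err)
	}
	return b
}()

// benchDeflate measures one deflate implementation at the given level.
func benchDeflate(b *testing.B, newWriter func(io.Writer, int) (io.WriteCloser, error), level int) {
	b.SetBytes(int64(len(sample)))
	for i := 0; i < b.N; i++ {
		var buf bytes.Buffer
		w, err := newWriter(&buf, level)
		if err != nil {
			b.Fatal(err)
		}
		if _, err := w.Write(sample); err != nil {
			b.Fatal(err)
		}
		if err := w.Close(); err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkStdFlateL3(b *testing.B) {
	benchDeflate(b, func(w io.Writer, l int) (io.WriteCloser, error) { return flate.NewWriter(w, l) }, 3)
}

func BenchmarkKPFlateL3(b *testing.B) {
	benchDeflate(b, func(w io.Writer, l int) (io.WriteCloser, error) { return kpflate.NewWriter(w, l) }, 3)
}

func BenchmarkGoogleSnappy(b *testing.B) {
	b.SetBytes(int64(len(sample)))
	for i := 0; i < b.N; i++ {
		_ = snappy.Encode(nil, sample)
	}
}
```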
Meh, doesn't seem worth it? It sounds like we're saying "just use Snappy"? We should still land all the benchmark code improvements (I've pushed a few more up to your PR) and nsqio/go-nsq#362, though.
We would quite like to use compression with NSQ to save on data transfer costs, but the CPU impact is higher than we'd like. Our experiments show that Klaus Post's compression libraries perform much better than the standard library Deflate and Google's Snappy. The sweet spot appears to be level 3 flate, which compresses our traffic to about 25% of its original size while incurring a CPU cost comparable to Snappy.
Would there be any interest in taking a PR that makes this change?
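For illustration, a minimal sketch of the proposed swap, assuming klauspost/compress/flate keeps the standard library's `NewWriter(w, level)` signature (it is designed as a drop-in replacement); the repeated JSON payload is just a stand-in for real topic traffic:

```go
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/klauspost/compress/flate"
)

func main() {
	// Repetitive JSON stands in for topic messages that share a schema.
	msg := bytes.Repeat([]byte(`{"event":"page_view","user_id":42}`), 1000)

	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, 3) // level 3: the "sweet spot" described above
	if err != nil {
		log.Fatal(err)
	}
	if _, err := w.Write(msg); err != nil {
		log.Fatal(err)
	}
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("original: %d bytes, compressed: %d bytes (%.1f%%)\n",
		len(msg), buf.Len(), 100*float64(buf.Len())/float64(len(msg)))
}
```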