nsqd: enable compression/decompression pipeline from producer to consumer #1148

Closed
andyxning opened this issue Mar 20, 2019 · 5 comments

Comments

@andyxning
Member

andyxning commented Mar 20, 2019

NSQ currently supports snappy compression and decompression for TCP-protocol-based producers and consumers, but this has some shortcomings:

  • The memory and disk capacity nsqd uses cannot be reduced, since all messages are stored exactly as nsqd receives them, i.e., as the original messages.
  • The inbound network bandwidth cannot be reduced, since messages are not compressed before they are sent.
  • Compression is only available for TCP-protocol-based producers and consumers.

I have made a proposal for this: https://docs.google.com/document/d/107yDH6pN8-b_i22AfvPkzdDU2yneOC-1lcUllpUmN4w/edit#heading=h.tla1oogr3wac

/cc @mreiferson @ploxiln

@andyxning
Member Author

@ploxiln This will rely on your redesign of the nsq storage backend format.

@mreiferson
Member

Hi @andyxning, thanks for this proposal.

I'm trying to understand the motivation here. Is it to reduce the on-disk and in-memory footprint? Have you considered compressing the data in your producer or consumer instead? Remember, to nsqd the messages are just bytes...
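
A minimal sketch of what client-side compression could look like, assuming a one-byte tag in front of each message body so the consumer can tell compressed from uncompressed payloads. The names `encode_body` and `decode_body`, the tag values, and the `min_size` threshold are all hypothetical and not part of NSQ or any NSQ client library; nsqd would simply store and forward the resulting bytes.

```python
import zlib

# Hypothetical one-byte tags; nsqd never inspects them, only the clients do.
TAG_RAW = b"\x00"
TAG_ZLIB = b"\x01"


def encode_body(payload: bytes, min_size: int = 128) -> bytes:
    """Producer side: compress the payload when that actually saves space."""
    if len(payload) >= min_size:
        packed = zlib.compress(payload)
        if len(packed) < len(payload):
            return TAG_ZLIB + packed
    # Small or incompressible payloads are sent unchanged behind the raw tag.
    return TAG_RAW + payload


def decode_body(body: bytes) -> bytes:
    """Consumer side: undo encode_body based on the leading tag byte."""
    tag, data = body[:1], body[1:]
    if tag == TAG_ZLIB:
        return zlib.decompress(data)
    if tag == TAG_RAW:
        return data
    raise ValueError(f"unknown body tag: {tag!r}")
```

The producer would publish `encode_body(payload)` and the consumer would call `decode_body` on each message body it receives. With this approach the bytes held in nsqd's memory and disk queues are the compressed ones, at the cost of every producer and consumer agreeing on the framing.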

@ploxiln
Member

ploxiln commented Mar 22, 2019

the inbound network bandwidth cannot be reduced, since messages are not compressed before they are sent

The network bandwidth is reduced; that's the only thing that snappy or deflate compression of the TCP protocol does.

the memory and disk capacity nsqd used can not be decreased

This is true: the messages are not compressed in memory or on disk, only "over the wire".

they are just available for TCP protocol based producer and consumer

A TCP-protocol consumer is the only kind of consumer. For the producer, the HTTP /pub request could use Content-Encoding: gzip ... I'm not 100% sure if that works with the nsqd HTTP endpoint today, but if not, it would be an easy stand-alone change to make it work.
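
As a rough illustration of the HTTP idea, the sketch below only builds a gzip-encoded `POST /pub` request with the Python standard library and does not send it. The helper name `build_gzip_pub_request` is hypothetical, and whether nsqd actually decodes `Content-Encoding: gzip` on this endpoint is exactly the open question above, so treat this as an assumption rather than documented behavior.

```python
import gzip
import urllib.parse
import urllib.request


def build_gzip_pub_request(nsqd_http_addr: str, topic: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST /pub request whose body is gzip-compressed."""
    query = urllib.parse.urlencode({"topic": topic})
    url = f"http://{nsqd_http_addr}/pub?{query}"
    return urllib.request.Request(
        url,
        data=gzip.compress(body),
        # Assumption: the server honors this header and decompresses the body.
        headers={"Content-Encoding": "gzip"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` against a running nsqd would send it. If the endpoint does not decode the header, the message would presumably be stored as the gzip bytes themselves, and consumers would have to decompress it.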

So what remains is some way for messages to be compressed in memory and disk queues. As @mreiferson says the producer and consumer can be made to use any message format, including compressed, so that's an easy thing a user can do today. I suppose it would be a nice additional feature to optionally have nsqd "take care of it" for the common case of text messages ... and it would be nice to avoid double-compressing messages over the wire (without giving up compression of control frames). I haven't read through and digested the proposal yet ...
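
To make the double-compression concern concrete, here is a small sketch that compresses a synthetic JSON payload once and then compresses the result again. It uses `zlib` (deflate) as a stand-in for whatever codecs would be involved, and the payload generator `sample_payload` is purely hypothetical test data, not anything NSQ produces.

```python
import json
import random
import zlib


def sample_payload(n: int = 500, seed: int = 0) -> bytes:
    """Deterministic, text-like JSON payload used only for this illustration."""
    rng = random.Random(seed)
    events = ["login", "logout", "purchase", "view", "click"]
    records = [
        {"id": rng.randrange(10**9), "event": rng.choice(events), "value": rng.random()}
        for _ in range(n)
    ]
    return json.dumps(records).encode("utf-8")


def compression_sizes(payload: bytes) -> tuple[int, int, int]:
    """Return (raw size, size after one pass, size after a second pass)."""
    once = zlib.compress(payload)
    twice = zlib.compress(once)  # second pass over already-compressed bytes
    return len(payload), len(once), len(twice)
```

Calling `compression_sizes(sample_payload())` should show a clear reduction after the first pass and essentially none after the second, which is why compressing already-compressed message bodies again on the wire mostly costs CPU.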

@andyxning
Member Author

@mreiferson @ploxiln PTAL.

@mreiferson
Member

Let's continue this conversation in #1149.


3 participants