nsqd: per-topic message IDs #741
Conversation
interesting. thoughts off the top of my head: This would mean per-topic message IDs are not unique across nsqd restarts (no timestamp or worker component, and not persisted), and it means they are less likely to be unique from the client perspective across multiple nsqd nodes (we didn't guarantee this already, but it's nice for logging). We do persist message IDs in the disk-backed queue, so it seems per-topic message IDs could get out of sync and overlap w/ this sequence-based message ID generation, no? It seems reasonable that a client should be able to expect the same message ID for a given message from a given nsqd node every time it gets that message, so i like that we persist the message ID. One approach is we could narrow the definition of a message ID to make it more of an ACK id, where its only purpose is an identifier to FIN/ACK/TOUCH a message, and it's not an ID for the message itself. (i'm not sure i like this, just putting it out there)
Practically speaking that's all it really is, and I regret treating them any differently. Agreed that we need to account for the edge cases you mentioned, but they're pretty easy once you've decided on the above. By logging I assume you mean in the […]. Let me get some benchmarks here so we have some numbers to make decisions.
Benchmark added and […]. NOTE: I think at […]
I share @jehiah's concerns about this approach. It most noticeably affects nsqd instances which persist every message to disk, and long-running work queues, but I believe it could impact in-memory queues also. Scenario 1: […]
Scenario 2: […]
I think there was always an edge-case problem with using the wall clock to generate the ID, since clocks change and cannot be relied upon to be monotonic. There's code to guard against this, but it could make publishing to nsqd unavailable until time proceeds (ever have NTP send you back in time 5 minutes?) - this could actually be another […]. One possible solution could be to persist the sequence with the metadata. This comes with the same issues as Scenario 1 above for message IDs sent out between persists, when nsqd restarts without the ability to sync. Two ideas around using persisted IDs: […]
My suggestion is a hybrid: start with 1 and work towards 2 if it's feasible, but keep the time component. If NTP and fsync both work against you, it's just not your day. Thoughts?
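To make the wall-clock concern above concrete, here is a minimal sketch of the kind of timestamp/node/sequence ("snowflake"-style) generator being discussed, including the backwards-clock guard that can stall publishing. The field widths, names, and error handling are illustrative assumptions, not nsqd's actual guid.go:

```go
package ids

import (
	"errors"
	"sync"
	"time"
)

const (
	sequenceBits = 12
	nodeBits     = 10
	sequenceMask = (1 << sequenceBits) - 1
)

var errBackwardsClock = errors.New("clock moved backwards; refusing to generate ID")

// generator packs a millisecond timestamp, node ID, and per-millisecond
// sequence into one int64. This is a sketch, not nsqd's real code.
type generator struct {
	mu       sync.Mutex
	nodeID   int64
	lastTS   int64 // last millisecond timestamp an ID was issued for
	sequence int64
}

func (g *generator) NextID() (int64, error) {
	g.mu.Lock()
	defer g.mu.Unlock()

	ts := time.Now().UnixNano() / int64(time.Millisecond)
	if ts < g.lastTS {
		// The guard discussed above: if NTP steps the clock backwards,
		// ID generation (and therefore publishing) fails until wall time
		// catches back up to the last issued timestamp.
		return 0, errBackwardsClock
	}
	if ts == g.lastTS {
		g.sequence = (g.sequence + 1) & sequenceMask
		if g.sequence == 0 {
			// sequence space for this millisecond is exhausted; spin to the next
			for ts <= g.lastTS {
				ts = time.Now().UnixNano() / int64(time.Millisecond)
			}
		}
	} else {
		g.sequence = 0
	}
	g.lastTS = ts
	return ts<<(sequenceBits+nodeBits) | g.nodeID<<sequenceBits | g.sequence, nil
}
```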
@judwhite w/r/t scenario 1, it's interesting, but message IDs are only valid on the TCP connection they were received over, so we shouldn't have to worry about cross-connection ID re-use (i'm 90% sure on this). w/r/t number two, it's easy to just re-number the message IDs when you read them from the backend, because the ID wouldn't be about the message contents; it's just an ephemeral ID when you send it to the client. My biggest concern is that because we expose the ID to the client, people might have mistakenly used it as an ID for the message contents, and this would break those semantics. =(
Yea, but that was always a bad idea. #625 would, in theory, make the IDs actually useful with stronger guarantees.
Thinking about the planned resilience work... I think this can still work, but chiming in to be sure. As long as nsqds are replicating peers' backlogs independently, if an nsqd node goes down, those peers can still agree on the set of messages that need to be recovered. They can treat that whole node-specific backlog as just a list of messages and re-ID them as part of recovery.
@jehiah, question: to which TCP connection are you referring, the one from nsqd to the consumer? Is that channel-specific? Does that mean, even in just a simple scenario with one nsqd, when a message is requeued, if it's sent to a different consumer (or even the same one after reconnecting), it may get a new message ID? Might two consumers, on the same topic and different channels, get different message IDs for the same message? If these are possible, in a recovery scenario, how will we be able to tell which channels a message has already been FIN'd for? On recovery, we can just duplicate messages back to channels that already got it, so maybe that's a non-issue. The only other thing I can see is that it perhaps makes it harder to track when we don't need to retain a message's contents anymore, because the IDs in the REQ won't necessarily match up to the original message. (Do we store the message content multiple times, per channel?)
I think the outcome of this PR would be that we would officially declare that NSQ's message IDs are an internal implementation detail. It's possible for us to later provide stronger guarantees around message IDs, especially in light of #625, or not.
Right now (regardless of this PR), NSQ's message IDs are client-specific. Only the client that received the message, only on the original connection that received it, can respond to it. This means that message IDs as used by […]. Technically, because of this property, we could generate the IDs on a per-client basis too, but I think generating them at the topic level makes more sense, especially in light of potential work in #625.
We've always advised users to implement their own domain-level message IDs and put them in the message body. This PR is simply "the nail in the coffin".
@jehiah I checked Channel.FinishMessage; you're right, you can only FIN a message ID which is in-flight for that connection.
I worry it may be too late; changing message IDs to be ephemeral risks breaking some users, advised or not 😃 The ID is exported by go-nsq, and weakening the current guarantees (even if not in writing) will probably cause issues for some people, for example tracking retries by message ID. Would you be open to helping move #625 forward, and strengthening the guarantees instead?
Some alternative ideas to reduce contention and improve performance: […]
Not following this one?
Thought about this; I need to see if we can modify the algorithm to support concurrent operations so that we don't need to add a goroutine per topic in exchange.
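For context, this is roughly the "goroutine per topic" pattern being referred to, as I understand it: a pump goroutine pre-fills a buffered channel of IDs so the publish hot path only does a channel receive. It builds on the generator sketched above, and all names here are hypothetical:

```go
package ids

import "time"

// idPump pre-generates IDs into a buffered channel; the cost of doing
// per-topic generation this way is one such goroutine per topic.
func idPump(gen *generator, out chan<- int64, exit <-chan struct{}) {
	for {
		id, err := gen.NextID()
		if err != nil {
			// e.g. the backwards-clock guard fired; back off briefly and retry
			time.Sleep(time.Millisecond)
			continue
		}
		select {
		case out <- id:
		case <-exit:
			return
		}
	}
}
```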
The idea is, you can avoid calling […]. If on a slow topic lastnanos is actually from hundreds of seconds or even hours or days ago, that's fine, the message IDs will still be unique. If nsqd is restarted, lastnanos will jump forward to the current time, and that's also fine. EDIT: the theoretical member name would more accurately be called "lastmillis" or probably "lastTS". Also, to get the same reduction in calls to […]. EDIT 2: the best value for N is probably […]
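Under the same assumptions as the sketch above, here is one reading of that suggestion: keep issuing IDs against the cached timestamp and only consult time.Now() when the per-timestamp sequence wraps (i.e. once every sequenceMask+1 messages). This is a sketch of the idea, not a claim about what the PR implements:

```go
// NextIDLazyClock is a variant of the generator above: the wall clock is
// read only when the sequence wraps, so slow topics reuse an old (but still
// unique) timestamp and fast topics simply advance it by one step.
func (g *generator) NextIDLazyClock() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()

	g.sequence = (g.sequence + 1) & sequenceMask
	if g.sequence == 0 {
		ts := time.Now().UnixNano() / int64(time.Millisecond)
		if ts <= g.lastTS {
			// Very fast topic (or a clock step backwards): just bump the cached
			// timestamp; IDs stay unique because the sequence restarted at zero.
			ts = g.lastTS + 1
		}
		g.lastTS = ts
	}
	return g.lastTS<<(sequenceBits+nodeBits) | g.nodeID<<sequenceBits | g.sequence
}
```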
@ploxiln I tried prototyping your idea, and so far I haven't found a way to do it safely without a mutex, since there are two variables involved (base time + offset) - maybe you could show me?
I probably should have reordered my bullet points in my first post; my proposal for the timestamp only makes sense assuming nsqd keeps basically the same ID structure/algorithm (but perhaps switches to independent instances for each topic to reduce contention). The same algorithm would indeed require something like a mutex, because of the treatment of the timestamp. (A mutex around a fast section might still be noticeably faster than a go channel.) I'm just throwing out an idea; if the maintainers of nsq want to change to just an incrementing int64 sequence per topic, I won't argue, I'm really not very familiar with nsq protocol internals.
@judwhite we should just bench the simple path, which is the current algorithm in […]. If that shows better scalability characteristics than […]
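One possible shape for that benchmark, reusing the mutex-guarded generator sketched earlier and comparing it against a bare atomic counter; these are stand-in types and names, not nsqd's real guid.go:

```go
package ids

import (
	"sync/atomic"
	"testing"
)

// BenchmarkMutexGenerator exercises the mutex-guarded time/sequence sketch.
func BenchmarkMutexGenerator(b *testing.B) {
	g := &generator{nodeID: 1}
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			if _, err := g.NextID(); err != nil {
				b.Error(err) // clock moved backwards mid-benchmark
				return
			}
		}
	})
}

// BenchmarkAtomicCounter exercises a plain per-topic atomic sequence.
func BenchmarkAtomicCounter(b *testing.B) {
	var seq uint64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			atomic.AddUint64(&seq, 1)
		}
	})
}
```

Running with something like `go test -bench . -cpu 1,4,8` would show how each behaves as parallelism increases.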
In light of #367 and #838 landing, which will be a "backwards incompatible" release due to removal of deprecated features, I'd like to revisit this before we stamp […]. My preference, in order: […]
I'll probably give in to (2) if (1) makes everyone's lives difficult. Speak up!
Last commit restores the time/node-based ID generator.
Only nominally slower, but still much better scaling characteristics vs […]
I would go for approach (2). Either approach, by being per-topic, improves scalability. But there's another trade-off: approach (2) avoids changing ID behavior too drastically all at once, so pretty much all tools and consumers will be fine. Approach (1) rips off the bandaid and produces duplicated IDs pretty quickly - from separate nsqds, and from the same nsqd if it restarts after not too many messages. (But not within the same TCP connection, of course.)
thumbsup on (2). Also, this reminds me of a tangential topic about […]
No, we don't want it, but I think we need to keep it in order to preserve existing behavior. Given that, I'd vote to rename it […]
which parts? We don't promise or guarantee worker ID uniqueness in a cluster, and we are not promising message ID uniqueness across nsqd or topics. We also don't need it for naming of the […]
In order to preserve what some people are using message IDs for, continuing to provide […]
Fair, a properly managed […] (also, as i'm thinking through this, it'd probably be nice to switch from […])
If worker-id is removed, then it seems like you might as well do the switch to just an atomically incremented uint64, since without worker-id you're pretty likely to get duplicate IDs in the same topic, at the same time, from two separate nsqds. Here's another idea - switch to an incrementing uint64, but with a random initial value, so low-numbered IDs are not repeated so soon from the same nsqd when it restarts twice.
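A tiny sketch of that randomized starting point, assuming a plain per-topic uint64 counter; the names are hypothetical, and the raw counter would still need to be encoded into the 16-byte wire-format message ID:

```go
package ids

import (
	"crypto/rand"
	"encoding/binary"
	"sync/atomic"
)

// topicIDGen is a per-topic counter seeded randomly at startup so two runs
// of the same nsqd don't immediately re-issue the same low-numbered IDs.
type topicIDGen struct {
	seq uint64
}

func newTopicIDGen() (*topicIDGen, error) {
	var buf [8]byte
	if _, err := rand.Read(buf[:]); err != nil {
		return nil, err
	}
	return &topicIDGen{seq: binary.BigEndian.Uint64(buf[:])}, nil
}

func (g *topicIDGen) Next() uint64 {
	return atomic.AddUint64(&g.seq, 1)
}
```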
Agreed with @ploxiln; I think, given our choice to maintain compatibility, we must continue to provide […]
👍 let's land that separately though. Otherwise, this should be RTM.
(Pushing up rebased branches I've had laying around locally for ages)
This modifies message ID generation such that each topic
maintains a monotonic/atomic counter to generate
message IDs, a more scalable and performant approach.
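For reference, a hedged sketch of what the per-topic counter described in this commit message could look like, assuming the sequence is hex-encoded into a 16-byte message ID; the struct and function names are illustrative, not the actual diff:

```go
package ids

import (
	"encoding/binary"
	"encoding/hex"
	"sync/atomic"
)

const msgIDLength = 16

type messageID [msgIDLength]byte

// topic holds one independent counter per topic, removing the single
// process-wide point of contention for ID generation.
type topic struct {
	messageSequence uint64 // accessed atomically
}

func (t *topic) nextMessageID() messageID {
	seq := atomic.AddUint64(&t.messageSequence, 1)

	var raw [8]byte
	binary.BigEndian.PutUint64(raw[:], seq)

	var id messageID
	hex.Encode(id[:], raw[:]) // 8 raw bytes -> 16 hex characters
	return id
}
```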