Support writing to Pubsub with ordering key; Add PubsubMessage SchemaCoder #31608
Confirmed that ordering key is preserved with both direct runner and dataflow runner |
Thanks @ahmedabu98! At first glance, this approach seems massively preferable to the set of bespoke coders that already exist, and those future ones that might need to exist later. I’d be happy to take a closer look next week! |
Assigning reviewers. If you would like to opt out of this review, comment R: @damondouglas for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
if (getNeedsOrderingKey()) {
  pubsubMessages.setCoder(PubsubMessageSchemaCoder.getSchemaCoder());
} else {
  pubsubMessages.setCoder(new PubsubMessageWithTopicCoder());
}
This fork is required to not break update compatibility
@iht and I have been looking into this today for a Dataflow customer and we came across a few details that seem to be missing on this PR:
The issue we're working on is time-sensitive so we're trying to wrap up our patches today.
A nice to have would be enabling users to customize the output sharding range based on ordering keys. Given the fact that throughput per ordering key is capped to 1 MBps (docs) I'd almost be inclined to say the ordering key should replace the output shard entirely. @ahmedabu98 I'm happy to share our changes in a bit and I'll set up a PR against the source branch of this PR. |
.../java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
@sjvanrossum thank you for these insights, I'd be happy to take a look at your PR. I'm not familiar with the internal implementation and how it relates to this one, but it looks like we'd need changes there too.
@scwhittle or @reuvenlax may be able to shed a light on Dataflow's implementation and the complexity of changes needed to accommodate this feature. Context: |
The DataflowRunner overrides the pubsub write transform using org.apache.beam.runners.dataflow.DataflowRunner.StreamingPubsubIOWrite so org.apache.beam.runners.dataflow.worker.PubsubSink is used. It would be nice to prevent using the ordering key for now with the DataflowRunner unless the experiment to use the beam implementation is present. To add support for it to Dataflow, it appears that if PUBSUB_SERIALIZED_ATTRIBUTES_FN is set, that maps bytes to PubsubMessage which already includes the ordering key. But for the ordering key to be respected for publishing, additional changes would be needed in the dataflow service backend. Currently it looks like it would just be dropped but if it was respected the service would also need to be updated to ensure batching doesn't occur across ordering keys.
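The requirement that batching must not occur across ordering keys can be illustrated with a minimal, stdlib-only sketch. This is not code from this PR or from the Dataflow backend: `OrderingKeyBatcher`, `Msg`, and `maxBatchSize` are hypothetical names, and real batching would also have to respect byte-size limits.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: split a bundle into batches that never mix ordering keys.
final class OrderingKeyBatcher {
  static final class Msg {
    final String orderingKey;
    final String data;

    Msg(String orderingKey, String data) {
      this.orderingKey = orderingKey == null ? "" : orderingKey;
      this.data = data;
    }
  }

  // Returns batches in which every message shares one ordering key.
  // Messages with the same key keep their relative order across batches.
  static List<List<Msg>> batch(List<Msg> bundle, int maxBatchSize) {
    Map<String, List<Msg>> open = new LinkedHashMap<>();
    List<List<Msg>> closed = new ArrayList<>();
    for (Msg m : bundle) {
      List<Msg> current = open.computeIfAbsent(m.orderingKey, k -> new ArrayList<>());
      current.add(m);
      if (current.size() >= maxBatchSize) {
        // A full batch is closed immediately, so later batches for the
        // same key always come after it in the returned list.
        closed.add(current);
        open.remove(m.orderingKey);
      }
    }
    closed.addAll(open.values());
    return closed;
  }

  public static void main(String[] args) {
    List<Msg> bundle =
        List.of(new Msg("a", "1"), new Msg("b", "2"), new Msg("a", "3"), new Msg("a", "4"));
    for (List<Msg> b : batch(bundle, 2)) {
      System.out.println(b.get(0).orderingKey + " -> " + b.size() + " message(s)");
    }
  }
}
```

Messages without an ordering key are treated here as sharing the empty key, which is a simplification made for the sketch.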
Are you considering producing to a single ordering key from multiple distinct grouped-by keys in parallel? Doesn't that defeat the purpose of the ordering provided? I'm also not sure it would increase the throughput beyond the 1 MB/s per-ordering-key limit. An alternative would be grouping by a partitioning of the ordering keys (via deterministic hash buckets, for example) and then batching just within a bundle.
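The deterministic hash-bucket alternative could look roughly like the sketch below. It is a stdlib-only illustration, not the code in this PR: `OrderingKeyShards` and `shardFor` are hypothetical names, and CRC32 merely stands in for whichever hash function the sink would actually use.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ThreadLocalRandom;
import java.util.zip.CRC32;

// Hypothetical sketch: map an ordering key to one of numShards buckets.
final class OrderingKeyShards {
  static int shardFor(String orderingKey, int numShards) {
    if (orderingKey == null || orderingKey.isEmpty()) {
      // No ordering key: any shard will do, so spread load randomly.
      return ThreadLocalRandom.current().nextInt(numShards);
    }
    CRC32 crc = new CRC32();
    crc.update(orderingKey.getBytes(StandardCharsets.UTF_8));
    // getValue() is a non-negative long, so the remainder is in [0, numShards).
    return (int) (crc.getValue() % numShards);
  }

  public static void main(String[] args) {
    System.out.println(shardFor("customer-42", 8));
    System.out.println(shardFor("customer-42", 8)); // same shard as the line above
  }
}
```

All messages with the same ordering key land in the same bucket, so a single writer sees them, while the number of distinct grouping keys stays bounded by `numShards`.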
Agreed, I'll throw an exception when the
Agreed, I'll create a new bug for this to continue this discussion internally.
The initial patch I wrote concatenated topic and ordering key and left output shards unchanged.
Reminder, please take a look at this pr: @damondouglas @shunping |
Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment R: @robertwb for label java. Available commands:
Highlighting this here as well, while trying to retrofit ordering keys onto the existing sinks I thought of rewriting the sink using While writing that sink I stumbled on some issues regarding message size validation as documented in #31800.
My thoughts on fixing the validation issue is to introduce a Coincidentally, the revised batching mechanism I had imagined turns out to be very close to the implementation found in Google Cloud Pub/Sub Client for Java (https://github.com/googleapis/java-pubsub/blob/main/google-cloud-pubsub/src/main/java/com/google/cloud/pubsub/v1/Publisher.java) and would live in @ahmedabu98 the fixes to the batching mechanism should address the comments you had raised on ahmedabu98#427 about my use of variable assignments in the condition of an if statement so I'll get those commits added to that PR. |
I saw @scwhittle @egalpin already entered some ideas. Do you plan to finish the review in the near future? If not available I can do a first pass. I see this new feature is guarded by a flag so won't affect existing uses if the flag is not set. So the current change looks fairly safe to get in. |
I'll have the batching fix added to ahmedabu98#427 before US business hours start tomorrow and I'll defer the rest to separate PRs. 👍 |
Reminder, please take a look at this pr: @damondouglas @damondouglas |
Friendly ping @scwhittle @sjvanrossum |
Sorry for the delay
...oogle-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
.../java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
K key = keyFunction.apply(ThreadLocalRandom.current().nextInt(numShards), topic);
@Nullable String orderingKey = message.getOrderingKey();
int shard =
    Strings.isNullOrEmpty(orderingKey)
Can we use the new logic only if the sink is configured to care about ordering keys, to avoid changing the batching for existing cases where a key is set?
That makes sense. Reverting the strict validation on ordering keys requires a change here as well to retain the existing behavior.
See response in the comment below. I think it makes sense to revert this new logic
It seems we may want to group by ordering key so that we can do better at sequencing publishes to the ordering key, if that prevents Pub/Sub ingestion issues.
However, to fix the issue that this new sharding is used even if we are not publishing the ordering key, perhaps we should instead clear the ordering key in the verification DoFn if we don't want to publish it? That could be done before we include the ordering key in message size validation etc., which doesn't make sense if the key will later be ignored.
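A rough sketch of that idea, clearing the ordering key before any size accounting when the sink is not configured to publish it, might look as follows. Everything here is hypothetical and stdlib-only: `OrderingKeyValidation`, `Msg`, `normalize`, and `sizeForValidation` are illustrative names, and the size formula is deliberately simplified (it ignores attributes and any per-message overhead).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: drop the ordering key up front if it will not be published.
final class OrderingKeyValidation {
  static final class Msg {
    final byte[] payload;
    final String orderingKey;

    Msg(byte[] payload, String orderingKey) {
      this.payload = payload;
      this.orderingKey = orderingKey == null ? "" : orderingKey;
    }
  }

  // Clears the ordering key when the sink will not publish it, so later
  // steps (sharding, size validation) never see a key that would be ignored.
  static Msg normalize(Msg m, boolean publishWithOrderingKey) {
    return publishWithOrderingKey ? m : new Msg(m.payload, "");
  }

  // Simplified size accounting: payload bytes plus UTF-8 bytes of the key.
  static int sizeForValidation(Msg m) {
    return m.payload.length + m.orderingKey.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    Msg m = new Msg(new byte[10], "key-1");
    System.out.println(sizeForValidation(normalize(m, true)));
    System.out.println(sizeForValidation(normalize(m, false)));
  }
}
```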
int shard =
    Strings.isNullOrEmpty(orderingKey)
        ? ThreadLocalRandom.current().nextInt(numShards)
        : Hashing.murmur3_32_fixed().hashString(orderingKey, StandardCharsets.UTF_8).asInt();
In general, using a huge number of keys hurts performance. It seems like it would be better to still limit to numShards but pick the shard deterministically. For now, the user can still increase numShards to a very high value if needed for performance.
If we need more publishing parallelism, that seems like it should be done below by just adding publishBatch to a scheduled executor and then joining, not via key parallelism.
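Submitting batches to an executor and then joining could be sketched as below. This is an assumption-laden illustration rather than the sink's code: `ParallelPublish`, `publishAll`, and the stand-in `publishBatch` are hypothetical, a fixed thread pool is used in place of a scheduled executor, and the sketch assumes no two concurrently submitted batches share an ordering key (otherwise ordering could be lost).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: publish batches in parallel, then wait for all of them.
final class ParallelPublish {
  // Stand-in for a real publish call; returns the number of messages "published".
  static int publishBatch(List<String> batch) {
    return batch.size();
  }

  static int publishAll(List<List<String>> batches, int parallelism) {
    ExecutorService executor = Executors.newFixedThreadPool(parallelism);
    try {
      List<CompletableFuture<Integer>> pending = new ArrayList<>();
      for (List<String> batch : batches) {
        pending.add(CompletableFuture.supplyAsync(() -> publishBatch(batch), executor));
      }
      int published = 0;
      for (CompletableFuture<Integer> f : pending) {
        // join() blocks until the batch completes and rethrows failures unchecked.
        published += f.join();
      }
      return published;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(publishAll(List.of(List.of("a", "b"), List.of("c")), 2));
  }
}
```

Parallelism here comes from the executor, so the number of grouping keys can stay capped at `numShards`.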
Makes sense. I see we already group messages into different batches based on ordering key down in WriterFn. In other words, downstream steps don't rely on bundles being grouped by ordering key.
Do we necessarily need to shard by ordering key here? I wonder if we can revert this section and keep to numShards
CC @sjvanrossum
Agreed, that was the intention based on an earlier comment of yours, but I guess it slipped my mind to apply bucketing.
@ahmedabu98 a deterministic shard number based on the ordering key ensures that the writers only operate on distinct key ranges, thus avoiding the creation of multiple buffers per ordering key with a potentially small number of batched elements per bundle. Volume skew across keys is as much a problem for Beam as it is for Pub/Sub and does not fit well with the feature, since throughput is capped at 1 MB/s per ordering key.
One thing that does concern me is ending up with a skewed distribution of keys across buckets, but that could be fixed by tweaking the hashing or bucketing algorithm. The suggestion below switches the hashing algorithm to 64-bit FarmHash since consistentHash will pad or shrink the provided hash code to long.
-         : Hashing.murmur3_32_fixed().hashString(orderingKey, StandardCharsets.UTF_8).asInt();
+         : Hashing.consistentHash(
+             Hashing.farmHashFingerprint64().hashString(orderingKey, StandardCharsets.UTF_8),
+             numShards);
…ub_orderingkey_write
@scwhittle I just realized that the Would it make sense to change the client implementation to the official client library or should we duplicate that functionality by extending the Beam clients to handle this? |
I think using the pubsub provided client would be good to do and was wondering why we weren't using it. I'm guessing it might not have been available when the original Beam implementation was done. Perhaps this could go in with some minimal backoff sleeping to retry such errors and that can be done separately? |
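A minimal backoff-and-retry wrapper of the kind suggested here could look like the sketch below. It is only an illustration under stated assumptions: `BackoffRetry` and `callWithBackoff` are hypothetical names, the sketch retries on any `RuntimeException`, and a real sink would restrict retries to errors known to be retryable and would likely add jitter.

```java
import java.util.function.Supplier;

// Hypothetical sketch: retry a call with exponentially growing sleeps.
final class BackoffRetry {
  static <T> T callWithBackoff(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
    long delayMillis = initialDelayMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.get();
      } catch (RuntimeException e) {
        if (attempt >= maxAttempts) {
          throw e; // Out of attempts: surface the last failure.
        }
        sleep(delayMillis);
        delayMillis *= 2; // Double the wait before the next attempt.
      }
    }
  }

  private static void sleep(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while backing off", e);
    }
  }

  public static void main(String[] args) {
    int[] calls = {0};
    String result =
        callWithBackoff(
            () -> {
              if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
              }
              return "published";
            },
            5,
            10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```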
Assigning new set of reviewers because Pr has gone too long without review. If you would like to opt out of this review, comment R: @kennknowles for label java. Available commands:
Reminder, please take a look at this pr: @kennknowles @damondouglas |
Fixes #21162
I wasn't able to use the existing PubsubMessageWithAttributesAndMessageIdCoder because it doesn't encode/decode the message's topic, which is needed for dynamic destinations. There are already a number of existing coders (6) developed over the years. Every time a new feature/parameter is added to PubsubMessage, we need to make a new coder and fork the code to maintain update compatibility. To mitigate this for the future, this PR introduces a SchemaCoder for PubsubMessage. SchemaCoder allows us to evolve the schema over time, so hopefully new features can be added in the future without breaking update compatibility.
Note that PubsubMessage's default coder is PubsubMessageWithAttributesCoder, which can't be updated without breaking backwards compatibility (see #23525). Wherever PubsubMessages are created in a pipeline, we would have to manually override the coder to PubsubMessageSchemaCoder.getSchemaCoder() or the ordering key will get lost.