Python: CombineGlobally() needs to be able to produce expected output when processing multiple windows simultaneously #24683

Closed
1 of 15 tasks
BjornPrime opened this issue Dec 15, 2022 · 3 comments
Labels: done & done, new feature, P2, python

Comments

@BjornPrime
Contributor

What would you like to happen?

CombineGlobally() is supposed to reduce a PCollection to a single value. When given multiple windows to process, it previously failed because of an assert statement that limited the length of its output to one. I replaced that assert with a more descriptive error message in #24435. A comprehensive solution is still needed for the case where the transform handles multiple windows at once (and thus produces more than one combined value), since this case is expected to arise from fairly run-of-the-mill use of the transform in streaming pipelines.

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@liferoad
Collaborator

.take-issue

@robertwb
Contributor

robertwb commented Jun 9, 2023

We would need to define what it means to combine globally with defaults for non-trivial windowing. (Presumably that'd be an empty value for every empty window, so we'd need some way to enumerate windows.) If one wants to see combined values only when there is data, using without_defaults() should be advised.
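
For illustration, the difference between the two behaviours can be sketched in plain Python. This is not Beam code and does not reflect how the SDK implements CombineGlobally(); the helper names (`fixed_window_start`, `combine_per_window`, `combine_per_window_with_defaults`) and the fixed-window model are assumptions made up for this sketch. It only shows why emitting a default for empty windows requires some way to enumerate windows, while the "only when there is data" behaviour does not.

```python
from collections import defaultdict


def fixed_window_start(timestamp, size):
    """Start of the fixed window of width `size` that contains `timestamp`."""
    return timestamp - (timestamp % size)


def combine_per_window(timestamped_values, size, combine_fn=sum):
    """Combine values per window; only windows that hold data produce output."""
    buckets = defaultdict(list)
    for value, timestamp in timestamped_values:
        buckets[fixed_window_start(timestamp, size)].append(value)
    return {start: combine_fn(values) for start, values in sorted(buckets.items())}


def combine_per_window_with_defaults(
        timestamped_values, size, first_start, last_start, combine_fn=sum):
    """Also emit combine_fn([]) for empty windows.

    The caller has to supply the range of windows to enumerate
    (first_start..last_start); the data alone cannot reveal empty windows.
    """
    combined = combine_per_window(timestamped_values, size, combine_fn)
    return {
        start: combined.get(start, combine_fn([]))
        for start in range(first_start, last_start + size, size)
    }


# Hypothetical (value, timestamp) pairs; no element falls in window [10, 20).
events = [(1, 1), (2, 5), (4, 25)]

without_defaults = combine_per_window(events, size=10)
with_defaults = combine_per_window_with_defaults(events, 10, 0, 20)

print(without_defaults)  # {0: 3, 20: 4}
print(with_defaults)     # {0: 3, 10: 0, 20: 4}
```

In the first result the empty window simply has no entry, which corresponds to the "combined values only when there is data" case. The second result needs the explicit window range as extra input, which is the enumeration problem mentioned above.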

@liferoad
Collaborator

liferoad commented Jul 7, 2023

fixed by #26922

@github-actions github-actions bot added this to the 2.51.0 Release milestone Aug 11, 2023
@tvalentyn tvalentyn added the done & done Issue has been reviewed after it was closed for verification, followups, etc. label Aug 21, 2023