Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: runtime_type_check breaks stateful processing with timers #27167

Closed
1 of 15 tasks
OwlyCode opened this issue Jun 19, 2023 · 2 comments · Fixed by #27646
Closed
1 of 15 tasks

[Bug]: runtime_type_check breaks stateful processing with timers #27167

OwlyCode opened this issue Jun 19, 2023 · 2 comments · Fixed by #27646
Labels
bug done & done Issue has been reviewed after it was closed for verification, followups, etc. P2 python types

Comments

@OwlyCode
Copy link

OwlyCode commented Jun 19, 2023

What happened?

When executing a pipeline with the DirectRunner, using --runtime_type_check True, it is impossible to use timers like in the examples provided by https://beam.apache.org/blog/timely-processing/

It will end up in the following error:

  File "apache_beam/runners/common.py", line 291, in apache_beam.runners.common.DoFnSignature.__init__
  File "apache_beam/runners/common.py", line 320, in apache_beam.runners.common.DoFnSignature._validate
  File "apache_beam/runners/common.py", line 378, in apache_beam.runners.common.DoFnSignature._validate_stateful_dofn
  File "/usr/local/lib/python3.11/site-packages/apache_beam/transforms/userstate.py", line 303, in validate_stateful_dofn
    if timer_spec._attached_callback != getattr(dofn, method_name, None).__func__:
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute '__func__'. Did you mean: '__doc__'?

I did some prints to understand what is happening, it seems like the runtime type check wraps the DoFn with a apache_beam.typehints.typecheck.OutputCheckWrapperDoFn, preventing the following code to work:

        # apache_beam/transforms/userstate.py
        method_name = timer_spec._attached_callback.__name__
        if timer_spec._attached_callback != getattr(dofn, method_name, None).__func__:
           # ...

Because the annotated method is not found on the wrapper.

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@tvalentyn
Copy link
Contributor

Thanks for reporting @OwlyCode ! sounds like we could try to modify the wrapper to preserve the timer_spec. Would you be interested in contributing a fix or, or adding a unit test that demonstrates the failure (which we can skip and re-enable once resolved)?

@OwlyCode
Copy link
Author

I'm viewing this just now. It seems that @sadovnychyi has already a fix on the way. Thank you for your time!

sadovnychyi added a commit to sadovnychyi/beam that referenced this issue Aug 5, 2023
@github-actions github-actions bot added this to the 2.50.0 Release milestone Aug 10, 2023
@tvalentyn tvalentyn added the done & done Issue has been reviewed after it was closed for verification, followups, etc. label Aug 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug done & done Issue has been reviewed after it was closed for verification, followups, etc. P2 python types
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants