
[Bug]: The DataflowRunner behavior is changed when removing the runner v1 code #28399

Closed
1 of 15 tasks
liferoad opened this issue Sep 11, 2023 · 5 comments
Labels: bug, done & done (issue has been reviewed after it was closed for verification, followups, etc.), P2, python

Comments

@liferoad (Collaborator)

What happened?

The attached notebook is broken now due to #27196.
Dataflow_Word_Count.ipynb.zip

The issue is that p = beam.Pipeline(InteractiveRunner()) is used to create the pipeline, but later, when switching to DataflowRunner with DataflowRunner().run_pipeline(p, options=options), DataflowRunner treats the Dataflow job as a legacy-runner job and disables Runner V2.

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@liferoad (Collaborator, Author)

@robertwb FYI

@robertwb (Contributor)

DataflowRunner().run_pipeline(p, options=options) should work just as well as before. Could you clarify exactly what is going wrong?

@tvalentyn (Contributor) commented Sep 15, 2023

Repro:

import logging

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.runners import DataflowRunner

def run():
  # PipelineOptions() with no flags picks up sys.argv, so the
  # command-line arguments below are applied here.
  options = PipelineOptions()
  # The pipeline is constructed without a runner; the runner is
  # only supplied later via run_pipeline().
  p = beam.Pipeline()
  p | beam.Create([1])
  DataflowRunner().run_pipeline(p, options=options).wait_until_finish()


if __name__ == '__main__':
  logging.getLogger().setLevel(logging.INFO)
  run()

python pipeline.py  --project google.com:clouddfe --temp_location=gs://clouddfe-valentyn --staging_location=gs://clouddfe-valentyn --region us-central1

fails with

ERROR:apache_beam.runners.dataflow.dataflow_runner:2023-09-15T18:10:32.288Z: JOB_MESSAGE_ERROR: Runnable workflow has no steps specified.
INFO:apache_beam.runners.dataflow.dataflow_runner:Job 2023-09-15_11_10_28-12637281573769828865 is in state JOB_STATE_FAILED

@liferoad (Collaborator, Author)

I could not figure out what causes this issue. I am wondering whether we need to call _check_and_add_missing_options somewhere.

@robertwb (Contributor)

This boils down to https://s.apache.org/no-beam-pipeline :).

It looks like manually setting these flags after construction may be sufficient. Right now we hijack apply() on every PTransform to set them, which it would be good to avoid if we can. It should also be possible to fix this service-side so the client doesn't need to pass these arguments.
