[Bug]: Concurrency issue when using addLogicalTypeConversions on avro model data #28279

Closed
1 of 15 tasks
RustedBones opened this issue Sep 1, 2023 · 0 comments


RustedBones commented Sep 1, 2023

What happened?

Since Beam 2.49 (after #26320), the following exception can occur:

java.lang.NullPointerException
	at org.apache.avro.generic.GenericData.addLogicalTypeConversion(GenericData.java:116)
	at org.apache.beam.sdk.extensions.avro.schemas.utils.AvroUtils.addLogicalTypeConversions(AvroUtils.java:169)
	at org.apache.beam.sdk.extensions.avro.io.AvroDatumFactory$ReflectDatumFactory.apply(AvroDatumFactory.java:188)
	at org.apache.beam.sdk.extensions.avro.io.AvroIO$Sink.open(AvroIO.java:2126)
	at org.apache.beam.sdk.extensions.smb.FileOperations$Writer.prepareWrite(FileOperations.java:219)
	at org.apache.beam.sdk.extensions.smb.FileOperations$Writer.access$100(FileOperations.java:206)
	at org.apache.beam.sdk.extensions.smb.FileOperations.createWriter(FileOperations.java:133)

The exception is thrown either in AvroIO or in AvroCoder.
It seems to be caused by concurrent access to the same GenericData instance: when the datum reader and writer are initialized, there is a high chance that Avro returns its singleton instance, which several threads then modify at the same time.

The ReflectDatumFactory should avoid concurrent access to Avro's model data.
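
One possible way to avoid the shared state, sketched with the standard library only: give every factory call its own model instance and register the conversions on that instance, so the singleton is never mutated. The `Model` class, the `SHARED` field, and the conversion names below are hypothetical stand-ins for Avro's model data, not Avro or Beam classes, and this is not necessarily how the issue was fixed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ModelIsolationSketch {

  /** Hypothetical stand-in for Avro model data: a mutable registry that is not thread-safe. */
  static final class Model {
    private final Map<String, String> conversions = new HashMap<>();

    void addConversion(String logicalTypeName, String conversionName) {
      conversions.put(logicalTypeName, conversionName);
    }

    int conversionCount() {
      return conversions.size();
    }
  }

  /** Plays the role of the singleton; the sketch never mutates it. */
  static final Model SHARED = new Model();

  /** Builds a fresh model per call, so no two threads ever write to the same registry. */
  static Model newModelWithConversions() {
    Model model = new Model();
    model.addConversion("decimal", "DecimalConversion");
    model.addConversion("timestamp-millis", "TimestampMillisConversion");
    return model;
  }

  /** Runs the factory from several threads at once and collects the resulting models. */
  static List<Model> buildConcurrently(int tasks) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      Callable<Model> task = ModelIsolationSketch::newModelWithConversions;
      List<Future<Model>> futures = new ArrayList<>();
      for (int i = 0; i < tasks; i++) {
        futures.add(pool.submit(task));
      }
      List<Model> models = new ArrayList<>();
      for (Future<Model> future : futures) {
        models.add(future.get());
      }
      return models;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<Model> models = buildConcurrently(100);
    System.out.println(models.size() + " models built; conversions on shared model: "
        + SHARED.conversionCount());
  }
}
```

The trade-off is one extra allocation per reader or writer in exchange for not needing any lock. Guarding every mutation of the shared instance with a lock would be the alternative, but it only helps if all readers of that instance take the same lock.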

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner