The Python SDK is missing some portable types (e.g. Date, DateTime, Time). We should add support for them to make the cross-language experience smoother.
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner
Google recommends the STORAGE_WRITE_API method in their Dataflow Best Practices, and that method requires passing the table's schema to this transform through the schema argument. Many of our BigQuery tables have a DATE or DATETIME column, and those types aren't supported yet in Python schemas, so we can't use this method.
As of Beam 2.60.0, we haven't found a workaround. For example, declaring our DATE columns as TIMESTAMP in the Python schema seems to fail, either when Beam actually writes to BigQuery or at some point while the Java code is running its own conversion. If anyone knows a workaround, I'd appreciate it.
As a side note: why does STORAGE_WRITE_API require specifying a schema in advance, while STREAMING_INSERT does not?
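To see which tables are affected, one option is to scan each table schema for the column types mentioned in this thread. This is only a rough sketch: `find_unsupported_columns`, `UNSUPPORTED_TYPES`, and the sample schema are hypothetical names made up for illustration, the schema is assumed to be a dict in the BigQuery JSON layout (`fields` entries with `name`, `type`, `mode`), and the type set covers only DATE, DATETIME and TIME as named here. It does not call Beam and is not a workaround.

```python
# Assumption: these are the only problematic types, taken from this thread.
UNSUPPORTED_TYPES = {"DATE", "DATETIME", "TIME"}


def find_unsupported_columns(schema):
    """Return (column path, type) pairs for columns in UNSUPPORTED_TYPES.

    `schema` is assumed to be a dict shaped like a BigQuery JSON schema:
    {"fields": [{"name": ..., "type": ..., "mode": ...}, ...]}.
    """
    found = []

    def walk(fields, prefix):
        for field in fields:
            path = prefix + field["name"]
            field_type = field["type"].upper()
            if field_type in UNSUPPORTED_TYPES:
                found.append((path, field_type))
            # Nested RECORD columns carry their own "fields" list.
            walk(field.get("fields", []), path + ".")

    walk(schema.get("fields", []), "")
    return found


# Hypothetical table schema used only to exercise the helper.
table_schema = {
    "fields": [
        {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
        {"name": "created_on", "type": "DATE", "mode": "NULLABLE"},
        {
            "name": "event",
            "type": "RECORD",
            "mode": "NULLABLE",
            "fields": [
                {"name": "at", "type": "DATETIME", "mode": "NULLABLE"},
                {"name": "label", "type": "STRING", "mode": "NULLABLE"},
            ],
        },
    ]
}

blocked = find_unsupported_columns(table_schema)
```

Here `blocked` lists the columns that would currently prevent passing this schema from Python.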
What needs to happen?
Beam portable schemas include primitive and more complex types (represented as logical types). Some of these types are supported in the Python SDK:
beam/sdks/python/apache_beam/typehints/schemas.py
Lines 23 to 41 in 99202b2
When necessary, Python classes are created to represent a portable type. For example, see Timestamp below:
beam/sdks/python/apache_beam/utils/timestamp.py
Line 45 in 99202b2
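As a rough illustration of what supporting a Date type could involve, the sketch below converts between Python's `datetime.date` and an integer representation. It is not Beam code: the class name `DateAsDays` and its method names are hypothetical, and the choice of "days since 1970-01-01" as the underlying representation is an assumption that would need to be checked against the portable type's actual definition.

```python
from datetime import date, timedelta

# Assumption: the portable representation is a count of days since this epoch.
EPOCH = date(1970, 1, 1)


class DateAsDays:
    """Hypothetical converter between datetime.date and an integer day count."""

    @staticmethod
    def to_representation(value: date) -> int:
        # Whole days elapsed since the epoch; negative for earlier dates.
        return (value - EPOCH).days

    @staticmethod
    def to_language(days: int) -> date:
        # Inverse of to_representation.
        return EPOCH + timedelta(days=days)


encoded = DateAsDays.to_representation(date(2024, 1, 1))
decoded = DateAsDays.to_language(encoded)
```

A real implementation would also need to be registered with the SDK's schema machinery so that the type is recognized across languages; that part is left out here.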