What happened?
When users don't explicitly set a timestamp on their records, the Python Bigtable client defaults the timestamp to -1, which Bigtable handles by attaching the system time at ingestion. The connector mishandles these rows: instead of passing the -1 timestamp along, it drops it. When the records reach the underlying Java IO, it sees no explicit timestamp. Unlike the Python client, the Java Bigtable client defaults timestamps to 0, which Bigtable interprets as epoch time.
As a result, cells are written with epoch time instead of the current time.
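The chain of defaults described above can be sketched as a small, self-contained simulation. This is not the connector's code: `buggy_convert`, `java_side_timestamp`, `stored_time`, and the dict-based mutation are hypothetical stand-ins, and only the -1 and 0 defaults come from this report.

```python
from datetime import datetime, timezone

SERVER_TIME = -1  # value the Python client uses for an unset timestamp (per this report)
JAVA_DEFAULT = 0  # value the Java client uses for an unset timestamp (per this report)


def buggy_convert(mutation: dict) -> dict:
    """Hypothetical stand-in for the connector: drops the -1 sentinel."""
    out = dict(mutation)
    if out.get("timestamp_micros") == SERVER_TIME:
        del out["timestamp_micros"]
    return out


def java_side_timestamp(mutation: dict) -> int:
    """Hypothetical stand-in for the Java IO: a missing timestamp becomes 0."""
    return mutation.get("timestamp_micros", JAVA_DEFAULT)


def stored_time(timestamp_micros: int, now: datetime) -> datetime:
    """Toy model of the server: -1 means 'use ingestion time'."""
    if timestamp_micros == SERVER_TIME:
        return now
    return datetime.fromtimestamp(timestamp_micros / 1_000_000, tz=timezone.utc)


now = datetime(2024, 1, 1, tzinfo=timezone.utc)  # arbitrary ingestion time
mutation = {"value": b"v1", "timestamp_micros": SERVER_TIME}

# What the user expects: the -1 sentinel reaches the server.
expected = stored_time(mutation["timestamp_micros"], now)
# What the bug produces: the sentinel is dropped, then defaulted to 0.
actual = stored_time(java_side_timestamp(buggy_convert(mutation)), now)

print(expected.isoformat())  # 2024-01-01T00:00:00+00:00
print(actual.isoformat())    # 1970-01-01T00:00:00+00:00
```

In this model, forwarding the -1 value unchanged would be enough for the cell to receive the ingestion time.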
This can affect users in two ways:
Users can set a garbage collection policy that cleans up old records in their table. Records written without an explicit timestamp will appear to be very old (1970-01-01) and will be garbage collected.
Bigtable keeps the history of a cell in a table. When users write to a cell multiple times, this bug causes the cell history to be overwritten, because every write uses the same timestamp (epoch time).
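Both effects can be illustrated with a toy model of a single cell, assuming one stored version per timestamp and a simple age-based cleanup. The `write` and `gc_max_age` helpers are hypothetical and only approximate the behaviour described above. They are not Bigtable APIs.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def write(cell: dict, timestamp: datetime, value: bytes) -> None:
    """Toy cell: one version per timestamp, so equal timestamps overwrite."""
    cell[timestamp] = value


def gc_max_age(cell: dict, now: datetime, max_age: timedelta) -> dict:
    """Toy age-based cleanup: keep only versions no older than max_age."""
    return {ts: v for ts, v in cell.items() if now - ts <= max_age}


now = datetime(2024, 1, 1, tzinfo=timezone.utc)  # arbitrary "current" time

# Intended behaviour: each write gets its own ingestion time.
good = {}
write(good, now, b"v1")
write(good, now + timedelta(seconds=1), b"v2")

# Buggy behaviour: every write lands on epoch time.
bad = {}
write(bad, EPOCH, b"v1")
write(bad, EPOCH, b"v2")

print(len(good), len(bad))  # 2 1
print(gc_max_age(bad, now, timedelta(days=30)))  # {}
```

With distinct timestamps both versions survive. With epoch timestamps only the last write remains, and a 30-day age limit then removes it as well.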
Issue Priority
Priority: 1 (data loss / total loss of function)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner