Fix tableExists() method #32510

NimzyMaina · 2024-09-19T19:57:57Z

Fixes the tableExists() method, which currently prevents a Spanner Change Stream consumer from recovering after a restart.


Fix for issue apache#32509

github-actions · 2024-09-19T22:05:53Z

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".

github-actions · 2024-09-20T17:35:12Z

Assigning reviewers. If you would like to opt out of this review, comment "assign to next reviewer":

R: @kennknowles for label java.
R: @damondouglas for label io.
R: @nielm for label spanner.

Available commands:

stop reviewer notifications - opt out of the automated review tooling
remind me after tests pass - tag the comment author after tests pass
waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

github-actions · 2024-09-28T12:13:58Z

Reminder, please take a look at this PR: @kennknowles @damondouglas @nielm

dedocibula · 2024-09-28T22:02:27Z

...src/main/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/dao/PartitionMetadataDao.java

-            + "WHERE t.table_catalog = '' AND "
-            + "t.table_schema = '' AND "
-            + "t.table_name = '"
+            + "WHERE t.table_name = '"


Instead of removing the filtering altogether, can you fork the code depending on this.isPostgres() (see getPartition below for an example)?

For GoogleSQL (the else branch), you can leave the query as is.
For Postgres, simply remove the t.table_catalog filter and keep only t.table_schema = 'public'.
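
A minimal sketch of the suggested fork, for illustration only. The class name, the build method, and the SELECT prefix are assumptions (only the WHERE clause appears in the diff above); the real method would branch on this.isPostgres() inside PartitionMetadataDao rather than take a boolean argument.

```java
/** Sketch of a dialect-aware existence-check query; all names here are hypothetical. */
final class TableExistsQuerySketch {

  // Assumed SELECT prefix; the diff in this thread shows only the WHERE clause.
  private static final String SELECT_PREFIX =
      "SELECT t.table_name FROM information_schema.tables AS t ";

  /** Builds the existence-check SQL for the given dialect and metadata table name. */
  static String build(boolean isPostgres, String tableName) {
    if (isPostgres) {
      // PostgreSQL dialect: drop the table_catalog filter and match the default schema.
      return SELECT_PREFIX
          + "WHERE t.table_schema = 'public' AND "
          + "t.table_name = '" + tableName + "'";
    }
    // GoogleSQL dialect: keep the original filters unchanged.
    return SELECT_PREFIX
        + "WHERE t.table_catalog = '' AND "
        + "t.table_schema = '' AND "
        + "t.table_name = '" + tableName + "'";
  }

  public static void main(String[] args) {
    // Print both variants so the difference in the WHERE clause is visible.
    System.out.println(build(true, "my_metadata_table"));
    System.out.println(build(false, "my_metadata_table"));
  }
}
```

The string concatenation mirrors the style of the existing query in the diff; it is not a recommendation for building SQL from untrusted input.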

@dedocibula

Okay. If we take that approach, we need a way to specify the metadata table schema name in the options, since "public" is only the default. Because this is the Postgres dialect, someone can specify a custom table_schema. What are your thoughts?

Right, what you are referring to are named schemas (https://cloud.google.com/spanner/docs/named-schemas). I believe that can be addressed in a separate issue, as it has to be handled and tested for both dialects. I would keep the scope of this fix to the Postgres regression.

Today's Cloud Spanner Postgres syntax allows creating "table" or "public"."table"; both are added to the default (public) schema. Anything else, such as "schema"."table", requires creating a named schema, so my proposal should be sufficient to unblock this use case.

@dedocibula I'm not sure how to handle the tests given the fork. Please advise.

@dedocibula please advise

Oh sorry, I missed this. There seem to be two test files where you could add this:

https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/dao/PartitionMetadataDaoTest.java
This has mostly unit tests, which currently run only under the GoogleSQL dialect (see setUp). We could probably add parallel tests here for Postgres. That said, the actual engine evaluating these is mocked out, so the only thing that comes to mind is to add a verification that the transaction is invoked with working SQL - partial example

https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/it/SpannerChangeStreamPostgresIT.java
This is an end-to-end integration test that actually runs a pipeline. It should be possible to add another test case that runs two pipelines in sequence using the same parameters, although for this type of change that might be a bit excessive. I would suggest starting with the first one.
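
The first option (verifying that the query is issued with the expected SQL) could look roughly like the sketch below. It is self-contained and uses a hand-rolled recording executor in place of the mocked transaction; FakeDao, capturedSqlFor, and the SELECT prefix are hypothetical stand-ins, not the real PartitionMetadataDao or its test fixtures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Sketch of a unit-level check that each dialect issues the expected SQL. */
final class TableExistsSqlVerificationSketch {

  /** Hypothetical stand-in for the DAO: sends its query to an injected executor. */
  static final class FakeDao {
    private final boolean isPostgres;
    private final String tableName;
    private final Function<String, Boolean> executor;

    FakeDao(boolean isPostgres, String tableName, Function<String, Boolean> executor) {
      this.isPostgres = isPostgres;
      this.tableName = tableName;
      this.executor = executor;
    }

    boolean tableExists() {
      // Dialect-specific filters, as proposed in the review comment.
      String filter =
          isPostgres
              ? "t.table_schema = 'public' AND "
              : "t.table_catalog = '' AND t.table_schema = '' AND ";
      // Assumed SELECT prefix; only the WHERE clause is shown in the diff.
      return executor.apply(
          "SELECT t.table_name FROM information_schema.tables AS t WHERE "
              + filter
              + "t.table_name = '" + tableName + "'");
    }
  }

  /** Runs tableExists() once and returns every SQL string the executor received. */
  static List<String> capturedSqlFor(boolean isPostgres) {
    List<String> captured = new ArrayList<>();
    FakeDao dao =
        new FakeDao(
            isPostgres,
            "my_metadata_table",
            sql -> {
              captured.add(sql); // record the statement instead of running it
              return true;
            });
    dao.tableExists();
    return captured;
  }

  public static void main(String[] args) {
    String sql = capturedSqlFor(true).get(0);
    // The Postgres query should filter on the public schema and not on table_catalog.
    if (!sql.contains("t.table_schema = 'public'") || sql.contains("table_catalog")) {
      throw new AssertionError("Unexpected Postgres SQL: " + sql);
    }
    System.out.println("Postgres SQL verified: " + sql);
  }
}
```

In the actual test class, the mocked transaction would play the role of the recording executor, and the assertion would be expressed with whatever verification style the existing tests already use.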

github-actions · 2024-10-04T12:14:13Z

Assigning a new set of reviewers because the PR has gone too long without review. If you would like to opt out of this review, comment "assign to next reviewer":

R: @Abacn for label java.
R: @johnjcasey for label io.
R: @nielm for label spanner.


github-actions · 2024-10-15T12:14:32Z

Reminder, please take a look at this PR: @Abacn @johnjcasey @nielm

github-actions · 2024-10-18T12:14:16Z

Assigning a new set of reviewers because the PR has gone too long without review. If you would like to opt out of this review, comment "assign to next reviewer":

R: @damondouglas for label java.
R: @chamikaramj for label io.
R: @nielm for label spanner.


github-actions · 2024-10-25T12:14:24Z

Reminder, please take a look at this PR: @damondouglas @chamikaramj @nielm

nielm · 2024-10-25T12:34:50Z

waiting on author

Fix table exists method

24496eb

Fix for issue apache#32509

github-actions bot added java io gcp spanner labels Sep 19, 2024

spotlessApply changes

d2c49c6

Fix tests

3417ce4

github-actions bot added the Next Action: Reviewers label Sep 20, 2024

github-actions bot added the slow-review label Sep 28, 2024

dedocibula reviewed Sep 28, 2024


github-actions bot removed the slow-review label Oct 4, 2024

github-actions bot added the slow-review label Oct 15, 2024

github-actions bot removed the slow-review label Oct 18, 2024

github-actions bot added the slow-review label Oct 25, 2024

github-actions bot added Next Action: Author and removed Next Action: Reviewers slow-review labels Oct 25, 2024


dedocibula Sep 28, 2024

NimzyMaina Sep 29, 2024

dedocibula Sep 30, 2024

NimzyMaina Oct 1, 2024

NimzyMaina Oct 7, 2024

dedocibula Oct 7, 2024
