Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tableExists() method #32510

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -82,9 +82,7 @@ public class PartitionMetadataDao {
public boolean tableExists() {
final String checkTableExistsStmt =
"SELECT t.table_name FROM information_schema.tables AS t "
+ "WHERE t.table_catalog = '' AND "
+ "t.table_schema = '' AND "
+ "t.table_name = '"
+ "WHERE t.table_name = '"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of removing the filtering altogether can you fork to code depending on this.isPostgres() (see getPartition below for an example)?

For GoogleSQL (else) you can leave the query as is.
For Postgres simply remove t.table_catalog and only keep t.table_schema = "public"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedocibula

Okay. If we go down that approach, then we need a way of specifying the metadata table schema name into the options as "public" is just the default one. Someone can specify a custom table_schema as this is the Postgres Dialect. What are your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, what you are referring to are named schemas (https://cloud.google.com/spanner/docs/named-schemas). I believe that can be addressed in a separate issue as it has to be handled for both dialects and tested. I would keep the scope of this fix to the Postgres regression.

Today's Cloud Spanner Postgres syntax will allow to create "table" or "public"."table" -> both will be added to default/public schema. Anything else such as "schema"."table" will require named schema creation so my proposal should be sufficient to unblock this use case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedocibula not sure what to do with the tests due to the fork. Please guide on that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dedocibula please advice

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, missed this. So it seems there are two test files in which you could add this:

  1. https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/dao/PartitionMetadataDaoTest.java
    This has mostly unit tests which currently only run under GoogleSQL dialect (see setUp). We could probably ask parallel tests here for Postgres, that said the actual engine evaluating these is mocked out so the only thing that comes to mind is to add a verification that the transaction is invoked with a working SQL - partial example

  2. https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/changestreams/it/SpannerChangeStreamPostgresIT.java
    This is e2e integration test which will actually run a pipeline. It should be possible to add another test case that runs two pipelines in sequence using the same parameters although I feel like for this type of change it might be bit excessive. I would suggest starting with the first one

+ metadataTableName
+ "'";
try (ResultSet queryResultSet =
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -523,7 +523,7 @@ private ResultSet mockGetParentPartition(Timestamp startTimestamp, Timestamp aft
private void mockTableExists() {
Statement tableExistsStatement =
Statement.of(
"SELECT t.table_name FROM information_schema.tables AS t WHERE t.table_catalog = '' AND t.table_schema = '' AND t.table_name = 'my-metadata-table'");
"SELECT t.table_name FROM information_schema.tables AS t WHERE t.table_name = 'my-metadata-table'");
ResultSetMetadata tableExistsResultSetMetadata =
ResultSetMetadata.newBuilder()
.setRowType(
Expand Down
Loading