E2e bigquerymultitable #1277

Open
wants to merge 3 commits into
base: develop
36 changes: 31 additions & 5 deletions .github/workflows/e2e.yml
@@ -1,4 +1,4 @@
- # Copyright © 2021 Cask Data, Inc.
+ # Copyright © 2021-2023 Cask Data, Inc.
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
@@ -31,7 +31,7 @@ jobs:
# 3) For PRs that are labeled as build and
# - It's a code change
# - A build label was just added
- # A bit complex, but prevents builds when other labels are manipulated
+ # A bit complex but prevents builds when other labels are manipulated
if: >
github.event_name == 'workflow_dispatch'
|| github.event_name == 'push'
@@ -40,7 +40,7 @@ jobs:
)
strategy:
matrix:
- tests: [bigquery, common, gcs, pubsub, spanner, gcscreate, gcsdelete, gcsmove, bigqueryexecute, gcscopy]
+ tests: [bigquerymultitable]
fail-fast: false
steps:
# Pinned 1.0.0 version
@@ -59,22 +59,48 @@ jobs:
- name: Checkout e2e test repo
uses: actions/checkout@v3
with:
- repository: cdapio/cdap-e2e-tests
+ repository: Vipinofficial11/cdap-e2e-tests
  path: e2e
+ ref: testBQ
- name: Cache
uses: actions/cache@v3
with:
path: ~/.m2/repository
key: ${{ runner.os }}-maven-${{ github.workflow }}-${{ hashFiles('**/pom.xml') }}
restore-keys: |
${{ runner.os }}-maven-${{ github.workflow }}

+ - name: Get Secrets from GCP Secret Manager
+ id: secrets
+ uses: 'google-github-actions/get-secretmanager-secrets@v0'
+ with:
+ secrets: |-
+ MYSQL_HOST:cdapio-github-builds/MYSQL_HOST
+ MYSQL_USERNAME:cdapio-github-builds/MYSQL_USERNAME
+ MYSQL_PASSWORD:cdapio-github-builds/MYSQL_PASSWORD
+ MYSQL_PORT:cdapio-github-builds/MYSQL_PORT
+ BQMT_CONNECTION_STRING:cdapio-github-builds/BQMT_CONNECTION_STRING

- name: Run required e2e tests
if: github.event_name != 'workflow_dispatch' && github.event_name != 'push' && steps.filter.outputs.e2e-test == 'false'
run: python3 e2e/src/main/scripts/run_e2e_test.py --testRunner **/${{ matrix.tests }}/**/TestRunnerRequired.java
+ env:
+ MYSQL_HOST: ${{ steps.secrets.outputs.MYSQL_HOST }}
+ MYSQL_USERNAME: ${{ steps.secrets.outputs.MYSQL_USERNAME }}
+ MYSQL_PASSWORD: ${{ steps.secrets.outputs.MYSQL_PASSWORD }}
+ MYSQL_PORT: ${{ steps.secrets.outputs.MYSQL_PORT }}
+ BQMT_CONNECTION_STRING: ${{ steps.secrets.outputs.BQMT_CONNECTION_STRING }}

- name: Run all e2e tests
if: github.event_name == 'workflow_dispatch' || github.event_name == 'push' || steps.filter.outputs.e2e-test == 'true'
run: python3 e2e/src/main/scripts/run_e2e_test.py --testRunner **/${{ matrix.tests }}/**/TestRunner.java
- - name: Upload debug files
+ env:
+ MYSQL_HOST: ${{ steps.secrets.outputs.MYSQL_HOST }}
+ MYSQL_USERNAME: ${{ steps.secrets.outputs.MYSQL_USERNAME }}
+ MYSQL_PASSWORD: ${{ steps.secrets.outputs.MYSQL_PASSWORD }}
+ MYSQL_PORT: ${{ steps.secrets.outputs.MYSQL_PORT }}
+ BQMT_CONNECTION_STRING: ${{ steps.secrets.outputs.BQMT_CONNECTION_STRING }}
+ - name: Upload debug files
uses: actions/upload-artifact@v3
if: always()
with:
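
The two `if:` conditions on the test steps in this hunk decide which runner class a job executes. A minimal sketch of that decision in Python, assuming the event name and the value of `steps.filter.outputs.e2e-test` are passed in as plain strings; the helper name `select_runner` is hypothetical and is not part of the workflow:

```python
from typing import Optional


def select_runner(event_name: str, e2e_test_output: str) -> Optional[str]:
    """Mirror the two step conditions from e2e.yml (hypothetical helper)."""
    manual_or_push = event_name in ("workflow_dispatch", "push")
    # "Run all e2e tests": manual dispatch, push, or the filter reported 'true'.
    if manual_or_push or e2e_test_output == "true":
        return "TestRunner.java"
    # "Run required e2e tests": any other event with the filter reporting 'false'.
    if e2e_test_output == "false":
        return "TestRunnerRequired.java"
    # Any other filter value (for example an empty output) matches neither step.
    return None
```

Read this way, the conditions are mutually exclusive, and an event whose filter output is neither `'true'` nor `'false'` would appear to run neither test step.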
6 changes: 6 additions & 0 deletions pom.xml
@@ -1247,6 +1247,12 @@
<version>0.4.0-SNAPSHOT</version>
<scope>test</scope>
</dependency>
+ <dependency>
+ <groupId>mysql</groupId>
+ <artifactId>mysql-connector-java</artifactId>
+ <version>8.0.25</version>
+ <scope>test</scope>
+ </dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
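
The MySQL connector is added in test scope alongside the `MYSQL_*` variables that the workflow exports, which suggests the tests open a JDBC connection to a MySQL instance. How the test code consumes those variables is not shown in this diff, so the following is only a sketch of one plausible way to assemble a connection URL from them. The helper name `mysql_jdbc_url` and the database name are hypothetical; the `jdbc:mysql://host:port/database` shape is the standard MySQL Connector/J URL form.

```python
from typing import Mapping


def mysql_jdbc_url(env: Mapping[str, str], database: str) -> str:
    """Build a MySQL JDBC URL from workflow-style variables (hypothetical helper)."""
    # Fail early with a clear message if the workflow did not export a value.
    missing = [key for key in ("MYSQL_HOST", "MYSQL_PORT") if not env.get(key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return f"jdbc:mysql://{env['MYSQL_HOST']}:{env['MYSQL_PORT']}/{database}"
```

In a real run the mapping would typically be `os.environ`; the user name and password would be passed to the driver separately rather than embedded in the URL.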
@@ -0,0 +1,79 @@
# Copyright © 2023 Cask Data, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.

@BQMT_SINK
Feature: BigQueryMultiTable sink - Validate BigQueryMultiTable sink plugin error scenarios

@BQMT_Required
Scenario Outline: Verify BigQueryMultiTable Sink properties validation errors for mandatory fields
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQueryMultiTable" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
Then Click on the Validate button
Then Validate mandatory property error for "<property>"
Examples:
| property |
| dataset |

Scenario: Verify BQMT Sink properties validation errors for incorrect value of chunk size
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQueryMultiTable" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
And Enter input plugin property: "referenceName" with value: "Reference"
And Replace input plugin property: "project" with value: "projectId"
And Enter input plugin property: "dataset" with value: "dataset"
Then Override Service account details if set in environment variables
Then Enter input plugin property: "gcsChunkSize" with value: "bqmtInvalidChunkSize"
Then Click on the Validate button
Then Verify that the Plugin Property: "gcsChunkSize" is displaying an in-line error message: "errorMessageIncorrectBQMTChunkSize"

@BQMT_Required
Scenario: Verify BQMT Sink properties validation errors for incorrect dataset
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQueryMultiTable" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
And Enter input plugin property: "referenceName" with value: "Reference"
And Replace input plugin property: "project" with value: "projectId"
Then Override Service account details if set in environment variables
Then Enter input plugin property: "dataset" with value: "bqmtInvalidSinkDataset"
Then Click on the Validate button
Then Verify that the Plugin Property: "dataset" is displaying an in-line error message: "errorMessageIncorrectBQMTDataset"

Scenario: Verify BQMT Sink properties validation errors for incorrect reference name
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQueryMultiTable" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
And Replace input plugin property: "project" with value: "projectId"
And Enter input plugin property: "dataset" with value: "dataset"
Then Override Service account details if set in environment variables
Then Enter input plugin property: "referenceName" with value: "bqmtInvalidSinkReferenceName"
Then Click on the Validate button
Then Verify that the Plugin Property: "referenceName" is displaying an in-line error message: "errorMessageIncorrectBQMTReferenceName"

Scenario: Verify BQMT Sink properties validation errors for incorrect value of temporary bucket name
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQueryMultiTable" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
And Enter input plugin property: "referenceName" with value: "Reference"
And Replace input plugin property: "project" with value: "projectId"
And Enter input plugin property: "dataset" with value: "dataset"
Then Override Service account details if set in environment variables
Then Enter input plugin property: "bucket" with value: "bqmtInvalidTemporaryBucket"
Then Click on the Validate button
Then Verify that the Plugin Property: "bucket" is displaying an in-line error message: "errorMessageIncorrectBQMTBucketName"
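
The first scenario in the feature file above is a `Scenario Outline`: each row of its `Examples` table produces one concrete scenario, with the `<property>` placeholder replaced by the row's value. A rough illustration of that expansion in Python follows; `expand_outline` is a hypothetical helper written for this note, not part of Cucumber or the e2e framework.

```python
import re
from typing import Dict, List


def expand_outline(steps: List[str], rows: List[Dict[str, str]]) -> List[List[str]]:
    """Return one concrete list of steps per Examples row (hypothetical helper)."""
    def fill(step: str, row: Dict[str, str]) -> str:
        # Replace each <name> placeholder with the row's value for that column.
        return re.sub(r"<(\w+)>", lambda match: row[match.group(1)], step)

    return [[fill(step, row) for step in steps] for row in rows]


steps = [
    "Then Click on the Validate button",
    'Then Validate mandatory property error for "<property>"',
]
scenarios = expand_outline(steps, [{"property": "dataset"}])
```

Adding further rows to the `Examples` table would therefore add further mandatory-field checks without duplicating the steps.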
@@ -0,0 +1,131 @@
# Copyright © 2023 Cask Data, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.

@BQMT_SINK
Feature: BigQueryMultiTable sink - Verification of Multiple Database Tables to BigQueryMultiTable successful data transfer using macros

@MULTIPLEDATABASETABLE_SOURCE_TEST @BQMT_Required
Scenario: Verify data is getting transferred from Multiple Database Tables to BQMT sink with all datatypes using macros
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "Multiple Database Tables" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery Multi Table" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "Multiple Database Tables"
Then Replace input plugin property: "referenceName" with value: "ref"
Then Enter input plugin property: "connectionString" with value: "connectionString" for Credentials and Authorization related fields
Then Replace input plugin property: "jdbcPluginName" with value: "mysql"
Then Replace input plugin property: "user" with value: "user" for Credentials and Authorization related fields
Then Replace input plugin property: "password" with value: "pass" for Credentials and Authorization related fields
And Select radio button plugin property: "dataSelectionMode" with value: "sql-statements"
Then Click on the Add Button of the property: "sqlStatements" with value:
| selectQuery |
Then Validate "Multiple Database Tables" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
And Enter input plugin property: "referenceName" with value: "Reference"
Then Click on the Macro button of Property: "projectId" and set the value to: "bqProjectId"
Then Click on the Macro button of Property: "datasetProjectId" and set the value to: "bqDatasetProjectId"
Then Click on the Macro button of Property: "serviceAccountType" and set the value to: "serviceAccountType"
Then Click on the Macro button of Property: "serviceAccountFilePath" and set the value to: "serviceAccount"
Then Click on the Macro button of Property: "dataset" and set the value to: "bqDataset"
Then Click plugin property: "truncateTable"
Then Click plugin property: "allowSchema"
Then Validate "BigQuery Multi Table" plugin properties
And Close the Plugin Properties page
Then Connect plugins: "Multiple Database Tables" and "BigQuery Multi Table" to establish connection
Then Save the pipeline
Then Preview and run the pipeline
Then Enter runtime argument value "projectId" for key "bqProjectId"
Then Enter runtime argument value "projectId" for key "bqDatasetProjectId"
Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
Then Enter runtime argument value "dataset" for key "bqDataset"
Then Run the preview of pipeline with runtime arguments
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Enter runtime argument value "projectId" for key "bqProjectId"
Then Enter runtime argument value "projectId" for key "bqDatasetProjectId"
Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
Then Enter runtime argument value "dataset" for key "bqDataset"
Then Run the Pipeline in Runtime with runtime arguments
Then Wait till pipeline is in running state
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the values of records transferred to BQMT sink is equal to the value from source MultiDatabase table
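
The scenario above sets several sink properties to macros and then supplies a runtime argument for each macro key, once for the preview run and once for the deployed run. CDAP macros use the `${key}` form, and a simplified picture of the substitution is sketched below. This is plain string replacement written for illustration; it does not attempt to model everything CDAP's macro evaluation may support, and the argument values are placeholders.

```python
import re
from typing import Dict


def resolve_macros(properties: Dict[str, str], runtime_args: Dict[str, str]) -> Dict[str, str]:
    """Substitute ${key} macros in property values (illustrative only)."""
    def substitute(value: str) -> str:
        # A KeyError here signals a macro without a matching runtime argument.
        return re.sub(r"\$\{([^}]+)\}", lambda match: runtime_args[match.group(1)], value)

    return {name: substitute(value) for name, value in properties.items()}


sink_properties = {"projectId": "${bqProjectId}", "dataset": "${bqDataset}"}
runtime_args = {"bqProjectId": "my-project", "bqDataset": "my_dataset"}  # placeholder values
resolved = resolve_macros(sink_properties, runtime_args)
```

A macro key with no runtime argument raises an error in this sketch, which is why the scenario enters a value for every key before each run.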

@MULTIPLEDATABASETABLE_SOURCE_TEST @BQMT_Required
Scenario: Verify data is getting transferred from Multiple Database Tables to BQMT sink with split field using macros
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "Multiple Database Tables" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery Multi Table" from the plugins list as: "Sink"
Then Navigate to the properties page of plugin: "Multiple Database Tables"
Then Replace input plugin property: "referenceName" with value: "ref"
Then Enter input plugin property: "connectionString" with value: "connectionString" for Credentials and Authorization related fields
Then Replace input plugin property: "jdbcPluginName" with value: "mysql"
Then Replace input plugin property: "user" with value: "user" for Credentials and Authorization related fields
Then Replace input plugin property: "password" with value: "pass" for Credentials and Authorization related fields
And Select radio button plugin property: "dataSelectionMode" with value: "sql-statements"
Then Click on the Add Button of the property: "sqlStatements" with value:
| selectQuery |
Then Validate "Multiple Database Tables" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery Multi Table"
And Enter input plugin property: "referenceName" with value: "Reference"
Then Click on the Macro button of Property: "projectId" and set the value to: "bqProjectId"
Then Click on the Macro button of Property: "datasetProjectId" and set the value to: "bqDatasetProjectId"
Then Click on the Macro button of Property: "serviceAccountType" and set the value to: "serviceAccountType"
Then Click on the Macro button of Property: "serviceAccountFilePath" and set the value to: "serviceAccount"
Then Click on the Macro button of Property: "dataset" and set the value to: "bqDataset"
Then Click on the Macro button of Property: "SplitField" and set the value to: "bqmtSplitField"
Then Click plugin property: "truncateTable"
Then Click plugin property: "allowSchema"
Then Validate "BigQuery Multi Table" plugin properties
And Close the Plugin Properties page
Then Connect plugins: "Multiple Database Tables" and "BigQuery Multi Table" to establish connection
Then Save the pipeline
Then Preview and run the pipeline
Then Enter runtime argument value "projectId" for key "bqProjectId"
Then Enter runtime argument value "projectId" for key "bqDatasetProjectId"
Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
Then Enter runtime argument value "dataset" for key "bqDataset"
Then Enter runtime argument value "splitField" for key "bqmtSplitField"
Then Run the preview of pipeline with runtime arguments
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Enter runtime argument value "projectId" for key "bqProjectId"
Then Enter runtime argument value "projectId" for key "bqDatasetProjectId"
Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
Then Enter runtime argument value "dataset" for key "bqDataset"
Then Enter runtime argument value "splitField" for key "bqmtSplitField"
Then Run the Pipeline in Runtime with runtime arguments
Then Wait till pipeline is in running state
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the values of records transferred to BQMT sink is equal to the value from source MultiDatabase table
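
Both scenarios end with a step that checks the records written to the BQMT sink against the source tables. The implementation of that step is not part of this diff, so the sketch below only illustrates one order-insensitive way to compare two record sets. The helper name `same_records` is hypothetical, and the approach assumes every column value is hashable.

```python
from collections import Counter
from typing import Iterable, Mapping


def same_records(source: Iterable[Mapping[str, object]],
                 sink: Iterable[Mapping[str, object]]) -> bool:
    """Compare two record sets as multisets, ignoring row and column order."""
    def freeze(row: Mapping[str, object]) -> tuple:
        # Sort by column name so that column order does not affect equality.
        return tuple(sorted(row.items()))

    return Counter(map(freeze, source)) == Counter(map(freeze, sink))
```

Comparing multisets rather than lists matters here because neither the source query nor a BigQuery read guarantees row order, while duplicates still need to be counted.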