This documentation describes the process to configure and deploy Data Mesh into your organization. For the purposes of this guide, it is suggested to keep all default values in place to allow for a smoother and consistent experience.
This guide assumes you have already configured and deployed the Foundations Project Version 4.1.
Data Mesh has been developed with GitHub and Cloud Build connections, and assumes your foundations deployment follows the same process.
At the time of this writing, foundations does not support GitHub and Cloud Build connections, so a customized module is provided for this example.
If you have not deployed foundations, refer to 0-bootstrap/README-GitHub.md in the official repository.
It contains the steps needed to deploy Foundations from scratch.
Once foundations is successfully deployed, you can begin the deployment process for Data Mesh, starting from Dependencies and working through the steps in sequence. This includes the additions to the foundations project that are needed to deploy Data Mesh.
You must have the Service Account User role (roles/iam.serviceAccountUser) on the Terraform Service Account created in the foundation Seed Project.
This Terraform Service Account has the permissions to deploy step 4-projects of the foundation:
sa-terraform-proj@<SEED_PROJECT_ID>.iam.gserviceaccount.com
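As a small sketch of how the address above is formed, the helper below substitutes a seed project ID into the pattern. The function name `projects_sa_email` and the example project ID are assumptions made for illustration; they do not come from the foundation code.

```shell
# Hypothetical helper: build the projects-step Terraform service account
# email from a seed project ID, following the pattern shown above.
projects_sa_email() {
  printf 'sa-terraform-proj@%s.iam.gserviceaccount.com\n' "$1"
}

# Example with a made-up seed project ID.
projects_sa_email "example-seed-project"
```

The resulting address is the value you would supply when granting roles/iam.serviceAccountUser, for example with `gcloud iam service-accounts add-iam-policy-binding`.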
Install the following dependencies:
- Google Cloud SDK version 400.0.0 or later.
- Terraform version 1.5.7 or later.
- Tinkey version 1.11.0 or later.
- Java version 11 or later.
- jq version 1.6 or later.
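Before continuing, it may help to confirm that the tools listed above are on your PATH. The sketch below only checks for presence, not for minimum versions, and the binary names (`gcloud`, `terraform`, `tinkey`, `java`, `jq`) are assumed to be the default ones installed by each tool.

```shell
# Print the name of every given tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Assumed binary names for the dependencies listed above.
required_tools="gcloud terraform tinkey java jq"

# Report anything that still needs to be installed (prints nothing if all are present).
missing_tools $required_tools
```

Each tool's own version command can then be used to compare against the minimum versions listed above.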
To configure Application Default Credentials, run:
gcloud auth application-default login
For these instructions we assume that:
- The foundation was deployed using GitHub Actions.
- Every repository should be on the plan branch, and terraform init should be executed in each one.
- The following layout should exist in your local environment, since you will need to make changes in these steps. If you do not have this layout, check out the source repositories for the foundation steps following this layout:
gcp-bootstrap
gcp-environments
gcp-networks
gcp-org
gcp-projects
- Also check out the repository for this code, gcp-data-mesh-foundations, at the same level.
The final layout should look as follows:
gcp-data-mesh-foundations
gcp-bootstrap
gcp-environments
gcp-networks
gcp-org
gcp-projects
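The layout above can be verified with a short check. This is a minimal sketch that assumes all six repositories are cloned side by side under one base directory; the function name `missing_repos` is hypothetical.

```shell
# Print each expected repository directory that is missing under the base directory.
missing_repos() {
  base_dir="${1:-.}"
  for repo in gcp-data-mesh-foundations gcp-bootstrap gcp-environments \
              gcp-networks gcp-org gcp-projects; do
    [ -d "${base_dir}/${repo}" ] || printf '%s\n' "$repo"
  done
}

# Check the current directory; no output means the layout is complete.
missing_repos .
```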
Update serviceusage_allow_basic_apis.yaml to include the following APIs:
- "firestore.googleapis.com"
- "orgpolicy.googleapis.com"
- "cloudtasks.googleapis.com"
- "bigqueryconnection.googleapis.com"
- "bigquerydatapolicy.googleapis.com"
- "bigquerydatatransfer.googleapis.com"
- "composer.googleapis.com"
- "containerregistry.googleapis.com"
- "datacatalog.googleapis.com"
- "dataflow.googleapis.com"
- "dataplex.googleapis.com"
- "datalineage.googleapis.com"
- "dlp.googleapis.com"
- "resourcesettings.googleapis.com"
- "dataform.googleapis.com"
- "datapipelines.googleapis.com"
Update the local copy in each repository:
gcp-bootstrap/policy-library/policies/constraints/serviceusage_allow_basic_apis.yaml
gcp-environments/policy-library/policies/constraints/serviceusage_allow_basic_apis.yaml
gcp-networks/policy-library/policies/constraints/serviceusage_allow_basic_apis.yaml
gcp-org/policy-library/policies/constraints/serviceusage_allow_basic_apis.yaml
gcp-projects/policy-library/policies/constraints/serviceusage_allow_basic_apis.yaml
If your policy constraints are being pulled from a different location or repository, update this file in its correct location. Commit and push the code on any repository changed.
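After editing the constraint files, a quick search can confirm that an API was not missed in one of the copies. The sketch below is a plain substring check, not a YAML parser, and the function name `missing_apis` is hypothetical.

```shell
# Print each API (arguments 2..n) that does not appear anywhere in the file (argument 1).
missing_apis() {
  constraint_file="$1"
  shift
  for api in "$@"; do
    grep -qF -- "$api" "$constraint_file" || printf '%s\n' "$api"
  done
}
```

For example, `missing_apis gcp-org/policy-library/policies/constraints/serviceusage_allow_basic_apis.yaml dataplex.googleapis.com dlp.googleapis.com` prints nothing when both entries are present.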
- Add the role Organization Policy Administrator (roles/orgpolicy.policyAdmin) to the Terraform Service Account created for the projects step, sa-terraform-proj@<SEED_PROJECT_ID>.iam.gserviceaccount.com. Organization policies at the project level are needed for Data Mesh to function. In addition, copy in the extra variables needed to deploy:
cd gcp-bootstrap
cp ../gcp-data-mesh-foundations/0-bootstrap/iam-datamesh.tf ./envs/shared/iam-datamesh.tf
cp ../gcp-data-mesh-foundations/0-bootstrap/variables-datamesh.tf ./envs/shared/variables-datamesh.tf
- This process will also need a Classic personal access token:
  - This token will need access to create and modify the project-specific repositories created in step 4-projects.
  - Permissions:
    - Repositories: Full control
    - User: read:user
    - Admin(Org): read and write
    - Admin(Repo_hook): read and write
    - Admin:(Org_Hook)
- To avoid saving gh_token and github_app_infra_token in plain text in the terraform.tfvars file, export the GitHub fine-grained access token and the GitHub application infrastructure token (Classic token) as environment variables:
export TF_VAR_gh_token="YOUR-FINE-GRAINED-ACCESS-TOKEN"
export TF_VAR_github_app_infra_token="YOUR-CLASSIC-ACCESS-TOKEN"
- In the github.tf file, located in gcp-bootstrap/envs/shared, update the common_secrets local variable to include the GitHub App Infra Token:
common_secrets = {
  "PROJECT_ID" : module.gh_cicd.project_id,
  "WIF_PROVIDER_NAME" : module.gh_oidc.provider_name,
  "TF_BACKEND" : module.seed_bootstrap.gcs_bucket_tfstate,
  "TF_VAR_gh_token" : var.gh_token,
  "TF_VAR_github_app_infra_token" : var.github_app_infra_token,
}
- Update the GitHub workflow files with the latest versions required for the Data Mesh setup. Note that the github-*.yaml files differ slightly from those in Foundations Project Version 4.1: an additional env variable, "TF_VAR_github_app_infra_token", has been added to enable access to a GitHub secret. This classic token is propagated to your current GitHub Foundation repositories and is needed in later steps to create repositories when you reach the 4-projects stage. For this Data Mesh example foundation, it is recommended to retain this configuration for a smoother deployment process.
cp ../gcp-data-mesh-foundations/build/github-*.yaml ./.github/workflows
- Run this step manually so that TF_VAR_gh_token and TF_VAR_github_app_infra_token are taken from your environment and are not set in the terraform.tfvars file.
cd envs/shared
git checkout plan
terraform init
terraform plan -input=false -out bootstrap.tfplan
- Once terraform plan has finished, review the output and apply the changes.
terraform apply bootstrap.tfplan
- Push your changes to your gcp-bootstrap repository.
cd ../..
git add .
git commit -m 'add required data-mesh iam role'
git push
- Submit a PR in gcp-bootstrap from plan to production. This will trigger a terraform plan. Allow the plan to complete. Once complete, merge the PR to production. This action will not change your state file; however, it will ensure that your codebase is up to date.
- Change directory out of this folder.
cd ..
- Proceed to the next step, 1-org.
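The 0-bootstrap steps above rely on the two tokens being supplied through TF_VAR_ environment variables rather than terraform.tfvars. The following sketch checks a tfvars file for accidental assignments before you plan; the function name `tokens_absent_from_tfvars` is hypothetical, and the pattern only matches simple `name = value` lines.

```shell
# Succeed only if neither token variable is assigned in the given tfvars file.
tokens_absent_from_tfvars() {
  tfvars_file="$1"
  # A missing file trivially contains no tokens.
  [ -f "$tfvars_file" ] || return 0
  ! grep -Eq '^[[:space:]]*(gh_token|github_app_infra_token)[[:space:]]*=' "$tfvars_file"
}
```

A call such as `tokens_absent_from_tfvars gcp-bootstrap/envs/shared/terraform.tfvars || echo "remove tokens from terraform.tfvars"` could be run before `terraform plan`.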
This step creates a secret to hold the GitHub token that will be used in step 5-app-infra.
- Copy the following files to the envs/shared folder in your gcp-org repository or 1-org folder:
  - gcp-data-mesh-foundations/1-org/envs/shared/iam-datamesh.tf
  - gcp-data-mesh-foundations/1-org/envs/shared/keys-datamesh.tf
  - gcp-data-mesh-foundations/1-org/envs/shared/outputs-datamesh.tf
  - gcp-data-mesh-foundations/1-org/envs/shared/variables-datamesh.tf
  - gcp-data-mesh-foundations/1-org/envs/shared/secrets-datamesh.tf
  - gcp-data-mesh-foundations/1-org/envs/shared/remote-datamesh.tf
cd gcp-org
git checkout plan
cp ../gcp-data-mesh-foundations/1-org/envs/shared/iam-datamesh.tf ./envs/shared/iam-data-mesh.tf
cp ../gcp-data-mesh-foundations/1-org/envs/shared/keys-datamesh.tf ./envs/shared/keys-datamesh.tf
cp ../gcp-data-mesh-foundations/1-org/envs/shared/outputs-datamesh.tf ./envs/shared/outputs-datamesh.tf
cp ../gcp-data-mesh-foundations/1-org/envs/shared/variables-datamesh.tf ./envs/shared/variables-datamesh.tf
cp ../gcp-data-mesh-foundations/1-org/envs/shared/secrets-datamesh.tf ./envs/shared/secrets-datamesh.tf
cp ../gcp-data-mesh-foundations/1-org/envs/shared/remote-datamesh.tf ./envs/shared/remote-datamesh.tf
- Overwrite the GitHub workflow files with the current files needed for Data Mesh.
cp ../gcp-data-mesh-foundations/build/github-*.yaml ./.github/workflows
- Add a new environment, common, to the environments local in the file gcp-org/envs/shared/projects.tf:
environments = {
  "development" : "d",
  "nonproduction" : "n",
  "production" : "p",
  "common" : "c"
}
- Push your changes to your gcp-org repository.
git add .
git commit -m 'add required data-mesh configuration'
git push
- Submit a PR from plan to production. Allow the plan to complete. Once complete, merge the PR to production. Allow your terraform apply to complete.
- Change directory out of this folder.
cd ..
- Change into the gcp-environments folder.
cd gcp-environments
git checkout plan
- Copy the following files to the respective environment folders in your gcp-environments repository or 2-environments folder:
  - envs/development/outputs-datamesh.tf
  - envs/nonproduction/outputs-datamesh.tf
  - envs/production/outputs-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/envs/development/outputs-datamesh.tf ./envs/development/outputs-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/envs/nonproduction/outputs-datamesh.tf ./envs/nonproduction/outputs-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/envs/production/outputs-datamesh.tf ./envs/production/outputs-datamesh.tf
- Overwrite the GitHub workflow files with the current files needed for Data Mesh. You should still be inside the gcp-environments folder from the earlier step.
cp ../gcp-data-mesh-foundations/build/github-*.yaml ./.github/workflows
- Additionally, add the following files to the modules/env_baseline folder in your gcp-environments repository or 2-environments folder:
  - kms-datamesh.tf
  - remote-datamesh.tf
  - variables-datamesh.tf
  - outputs-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/modules/env_baseline/kms-datamesh.tf ./modules/env_baseline/kms-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/modules/env_baseline/remote-datamesh.tf ./modules/env_baseline/remote-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/modules/env_baseline/variables-datamesh.tf ./modules/env_baseline/variables-datamesh.tf
cp ../gcp-data-mesh-foundations/2-environments/modules/env_baseline/outputs-datamesh.tf ./modules/env_baseline/outputs-datamesh.tf
git add .
git commit -m 'add required data-mesh configuration'
git push
- Create a PR in gcp-environments from plan to development. This will trigger a terraform plan. Allow the plan to complete. Once complete, merge the PR to development.
- Repeat the above steps for nonproduction. Create a PR from development to nonproduction. Allow the plan to complete. Once complete, merge the PR to nonproduction.
- Repeat the above steps for production. Create a PR from nonproduction to production. Allow the plan to complete. Once complete, merge the PR to production.
Wait for the build pipeline for the development branch to run successfully.
Proceed to the next step, 3-networks.
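The branch promotion order used above (and again in the later steps) can be summarized with a short sketch. It only prints the source and target branch of each PR; the function name `promotion_pairs` is hypothetical.

```shell
# Print the source and target branch of each promotion PR, in order.
promotion_pairs() {
  previous="plan"
  for target in development nonproduction production; do
    printf '%s -> %s\n' "$previous" "$target"
    previous="$target"
  done
}

promotion_pairs
```

If you happen to use the GitHub CLI, each pair corresponds to the `--head` and `--base` arguments of `gh pr create`; that tool is not required by this guide.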
- Copy the following files from envs/shared to the respective environment folders in your gcp-networks repository or 3-networks-dual-svpc folder:
  - envs/shared/main-datamesh.tf
  - envs/shared/outputs-datamesh.tf
  - envs/shared/variables-datamesh.tf
cd gcp-networks
git checkout plan
cp ../gcp-data-mesh-foundations/3-networks/envs/shared/main-datamesh.tf ./envs/shared/main-datamesh.tf
cp ../gcp-data-mesh-foundations/3-networks/envs/shared/outputs-datamesh.tf ./envs/shared/outputs-datamesh.tf
cp ../gcp-data-mesh-foundations/3-networks/envs/shared/variables-datamesh.tf ./envs/shared/variables-datamesh.tf
- In modules/restricted_shared_vpc/service_control.tf, change the version of module "regular_service_perimeter" from ~> 6.0 to ~> 5.2.
- Additionally, add the following file to modules/restricted_shared_vpc:
  - dns-datamesh.tf
cp ../gcp-data-mesh-foundations/3-networks-dual-svpc/modules/restricted_shared_vpc/dns-datamesh.tf ./modules/restricted_shared_vpc/dns-datamesh.tf
- If the VPC-SC perimeter has not been enforced yet, add the following line to the modules/base_env/main.tf file under module "restricted_shared_vpc":
enforce_vpcsc = true
- Overwrite the GitHub workflow files with the current files needed for Data Mesh.
cp ../gcp-data-mesh-foundations/build/github-*.yaml ./.github/workflows
- In the file shared.auto.tfvars, located in the root of the gcp-networks repository or 3-networks-dual-svpc folder, include the following empty lists. These will be filled in during the later steps for 5-app-infra.
ingress_policies = [ ]
egress_policies = [ ]
- As a temporary measure, a set of egress rules will be added to common.auto.tfvars. This file will be removed once step 4-projects is completed. These rules must exist for the terraform apply commands in step 4-projects to succeed without encountering VPC-SC service perimeter errors. The instructions following 4-projects present a more robust solution for handling VPC-SC ingress and egress rules for each environment.
Execute the following commands:
terraform -chdir="envs/shared" init
terraform -chdir="envs/development" init
terraform -chdir="envs/nonproduction" init
terraform -chdir="envs/production" init
terraform -chdir="../gcp-environments/envs/production" init
terraform -chdir="../gcp-environments/envs/nonproduction" init
terraform -chdir="../gcp-environments/envs/development" init
export common_kms_project_number=$(terraform -chdir="../gcp-org/envs/shared" output -raw common_kms_project_number)
export dev_kms_project_number=$(terraform -chdir="../gcp-environments/envs/development" output -raw env_kms_project_number)
export nonprod_kms_project_number=$(terraform -chdir="../gcp-environments/envs/nonproduction" output -raw env_kms_project_number)
export prod_kms_project_number=$(terraform -chdir="../gcp-environments/envs/production" output -raw env_kms_project_number)
echo "common_kms_project_number = ${common_kms_project_number}"
echo "dev_kms_project_number = ${dev_kms_project_number}"
echo "nonprod_kms_project_number = ${nonprod_kms_project_number}"
echo "prod_kms_project_number = ${prod_kms_project_number}"
sed -i'' -e "s/COMMON_KMS_PROJECT_NUMBER/${common_kms_project_number}/" ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars
sed -i'' -e "s/DEV_KMS_PROJECT_NUMBER/${dev_kms_project_number}/" ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars
sed -i'' -e "s/NONPROD_KMS_PROJECT_NUMBER/${nonprod_kms_project_number}/" ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars
sed -i'' -e "s/PROD_KMS_PROJECT_NUMBER/${prod_kms_project_number}/" ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars
Verify that the following file has been updated:
cat ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars
- Copy the egress_policies variable from the file ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars and place it in your gcp-networks common.auto.tfvars file.
- After these changes are complete, push the code to the repository:
git add .
git commit -m 'add required data-mesh configuration'
git push
cd ..
- Create a PR in gcp-networks from plan to development. This will trigger a terraform plan. Allow the plan to complete. Once complete, merge the PR to development.
- Repeat the above steps for nonproduction. Create a PR from development to nonproduction. Allow the plan to complete. Once complete, merge the PR to nonproduction.
- Repeat the above steps for production. Create a PR from nonproduction to production. Allow the plan to complete. Once complete, merge the PR to production.
- Once all pipelines have executed successfully, proceed to the next step, 4-projects.
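The 3-networks steps above substitute four KMS project numbers into temp_vpcsc/common.auto.tfvars with sed. As a complement to reading the file by eye, the sketch below lists any placeholder that was left unreplaced. The function name `remaining_kms_placeholders` is hypothetical; the placeholder names are the ones used in the sed commands above.

```shell
# Print any *_KMS_PROJECT_NUMBER placeholders still present in the given file.
remaining_kms_placeholders() {
  # grep exits non-zero when nothing matches; that is the success case here.
  { grep -oE '(COMMON|DEV|NONPROD|PROD)_KMS_PROJECT_NUMBER' "$1" || true; } | sort -u
}
```

Running `remaining_kms_placeholders ../gcp-data-mesh-foundations/temp_vpcsc/common.auto.tfvars` from gcp-networks should print nothing once every substitution has succeeded. An empty terraform output would still remove the placeholder, so the echo lines above remain worth checking.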
If adding to an existing foundation, execute the following steps:
- In your organization, create the following groups:
"cdmc-conf-data-viewer@[your-domain-here]"
"cdmc-data-viewer@[your-domain-here]"
"cdmc-encrypted-data-viewer@[your-domain-here]"
"cdmc-fine-grained-data-viewer@[your-domain-here]"
"cdmc-masked-data-viewer@[your-domain-here]"
- Update your 4-projects folder or gcp-projects repository with the business_unit_4 folder.
cd gcp-projects
git checkout plan
cp -RT ../gcp-data-mesh-foundations/4-projects/business_unit_4/ ./business_unit_4
- Update common.auto.tfvars in gcp-projects with the groups created in Step 1. This will be a map containing the following:
cat <<EOF >> common.auto.tfvars
consumer_groups = {
  confidential_data_viewer = "cdmc-conf-data-viewer@[your-domain-here]"
  non_confidential_data_viewer = "cdmc-data-viewer@[your-domain-here]"
  non_confidential_encrypted_data_viewer = "cdmc-encrypted-data-viewer@[your-domain-here]"
  non_confidential_fine_grained_data_viewer = "cdmc-fine-grained-data-viewer@[your-domain-here]"
  non_confidential_masked_data_viewer = "cdmc-masked-data-viewer@[your-domain-here]"
}
EOF
export domain_name="[your-domain-here]"
sed -i'' -e "s/\[your-domain-here\]/${domain_name}/g" common.auto.tfvars
- Update development.auto.tfvars, nonproduction.auto.tfvars, and production.auto.tfvars by adding a variable create_resource_locations_policy set to false:
echo "create_resource_locations_policy = false" >> development.auto.tfvars
echo "create_resource_locations_policy = false" >> nonproduction.auto.tfvars
echo "create_resource_locations_policy = false" >> production.auto.tfvars
- Overwrite the GitHub workflow files with the current files needed for Data Mesh.
cp ../gcp-data-mesh-foundations/build/github-*.yaml ./.github/workflows
- Include the following modules, located in the modules directory:
  - artifacts
  - data_consumer
  - data_domain
  - data_governance
  - data_mesh
  - github_cloudbuild
  - kms
  - service_catalog
  - tf_cloudbuild_workspace
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/artifacts/ ./modules/artifacts
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/data_consumer/ ./modules/data_consumer
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/data_domain/ ./modules/data_domain
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/data_governance/ ./modules/data_governance
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/data_mesh/ ./modules/data_mesh
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/github_cloudbuild/ ./modules/github_cloudbuild
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/kms/ ./modules/kms
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/service_catalog/ ./modules/service_catalog
cp -RT ../gcp-data-mesh-foundations/4-projects/modules/tf_cloudbuild_workspace/ ./modules/tf_cloudbuild_workspace
- Update each backend.tf file under business_unit_4 with the name of the bucket that holds your terraform states.
terraform -chdir="../gcp-bootstrap/envs/shared/" init
export backend_bucket=$(terraform -chdir="../gcp-bootstrap/envs/shared/" output -raw gcs_bucket_tfstate)
echo "backend_bucket = ${backend_bucket}"
for i in $(find . -name 'backend.tf'); do sed -i'' -e "s/UPDATE_ME/${backend_bucket}/" "$i"; done
- This data mesh example uses GitHub with Cloud Build via Cloud Build repositories (2nd gen) connections. You can either have terraform create your repositories, or create them yourself using any other method your organization requires. The variable create_repositories is set to true by default in shared.auto.tfvars. To skip automatic creation of the repositories, set create_repositories to false in the shared.auto.tfvars file. For the purposes of this example, it is recommended to keep the default value of true.
- To run this example, you will need two things:
  - A GitHub Application Installation ID. The installation ID is the ID of your Cloud Build GitHub app, and it can be found in the URL of that app. For example, in the URL https://github.com/settings/installations/1234567, the installation ID is the numerical value 1234567. Once acquired, set github_app_installation_id in shared.auto.tfvars to this value.
  - A Classic personal access token. This token was created in the 0-bootstrap process of this document.
- In shared.auto.tfvars, gh_common_project_repos is set to create the following repositories:
  - artifacts
  - data-governance
  - ingest
  - non-confidential
  - confidential
  - consumer
  - service-catalog
- Update your current gcp-projects/shared.auto.tfvars to include the following variables, replacing update-me with your GitHub organization name and 00000000 with your installation ID.
github_app_installation_id = 00000000
create_repositories = true
gh_common_project_repos = {
  owner = "update-me",
  project_repos = {
    artifacts = "gcp-dm-bu4-prj-artifacts",
    data-governance = "gcp-dm-bu4-prj-data-governance",
    domain-1-ingest = "gcp-dm-bu4-prj-domain-1-ingest",
    domain-1-non-conf = "gcp-dm-bu4-prj-domain-1-non-conf",
    domain-1-conf = "gcp-dm-bu4-prj-domain-1-conf",
    consumer-1 = "gcp-dm-bu4-prj-consumer-1",
    service-catalog = "gcp-dm-bu4-prj-service-catalog",
  }
}
gh_artifact_repos = {
  owner = "update-me",
  artifact_project_repos = {
    artifact-repo = "gcp-dm-bu4-artifact-publish"
    service-catalog = "gcp-dm-bu4-service-catalog-solutions"
  }
}
- ingest, non-confidential and confidential are set for domain-1, which is the placeholder data domain name for the purposes of this example. Each additional data domain you wish to add must be added to the gh_common_project_repos object.
To support this example, the artifacts project needs a secondary repository that holds the Dockerfiles and Python packages that the pipelines build for the data mesh.
You will find the information for this repository under gh_artifact_repos in shared.auto.tfvars.
- Data Domains and Consumer projects are defined in each main.tf file in their respective environments. The variables are defined as:
variable "data_domains" {
  description = "values for data domains"
  type = list(object(
    {
      name                  = string
      ingestion_apis        = optional(list(string), [])
      non-confidential_apis = optional(list(string), [])
      confidential_apis     = optional(list(string), [])
    }
  ))
}
variable "consumers_projects" {
  description = "values for consumers projects"
  type = list(object(
    {
      name = string
      apis = optional(list(string), [])
    }
  ))
}
It is advisable to keep these at their defaults for the purposes of this example.
- Much like foundations, it is imperative that the shared folder is manually planned and applied, because the environment deployments rely heavily on the outputs of shared. Once ready, apply your terraform code:
export GOOGLE_IMPERSONATE_SERVICE_ACCOUNT=$(terraform -chdir="../gcp-bootstrap/envs/shared/" output -raw projects_step_terraform_service_account_email)
echo ${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}
export CLOUD_BUILD_PROJECT_ID=$(terraform -chdir="../gcp-bootstrap/envs/shared/" output -raw cicd_project_id)
echo ${CLOUD_BUILD_PROJECT_ID}
gcloud auth application-default login --impersonate-service-account=${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}
export TF_VAR_github_app_infra_token="YOUR-CLASSIC-ACCESS-TOKEN"
echo $TF_VAR_github_app_infra_token
./tf-wrapper.sh init shared
./tf-wrapper.sh plan shared
./tf-wrapper.sh validate shared $(pwd)/policy-library ${CLOUD_BUILD_PROJECT_ID}
./tf-wrapper.sh apply shared
- Unset your impersonated service account.
gcloud auth application-default login
unset GOOGLE_IMPERSONATE_SERVICE_ACCOUNT
- Push your code to the repository.
git add .
git commit -m 'add required data-mesh configuration'
git push
- Create a PR in gcp-projects from plan to development. This will trigger a terraform plan. Allow the plan to complete.
**N.B.:** The terraform plan may take upwards of 15 minutes to complete. It may seem that it has stalled, but due to the resources being gathered and the amount of data being processed, it will take some time.
- Once the plan has completed successfully, merge your PR to development.
- Repeat the above steps for nonproduction. Create a PR from development to nonproduction. Allow the plan to complete. Once complete, merge the PR to nonproduction.
**N.B.:** The terraform plan may take upwards of 15 minutes to complete. It may seem that it has stalled, but due to the resources being gathered and the amount of data being processed, it will take some time.
- Repeat the above steps for production. Create a PR from nonproduction to production. Allow the plan to complete. Once complete, merge the PR to production.
**N.B.:** The terraform plan may take upwards of 15 minutes to complete. It may seem that it has stalled, but due to the resources being gathered and the amount of data being processed, it will take some time.
- In 5-app-infra/4-data-governance, the CDMC Tag Engine will be deployed. For Tag Engine to run properly, organization policies must be created at the project level. If the policy is created on an initial terraform apply, the validation mechanism (gcloud terraform vet) fails with the error: Error converting TF resource to CAI: Invalid parent address() for an asset. This happens because the targeted projects have not been created yet. To work around this, development.auto.tfvars, nonproduction.auto.tfvars, and production.auto.tfvars contain a variable, create_resource_locations_policy, which is currently set to false. Once the steps above have been executed and the projects exist, setting this variable to true allows the policy to be created without the validation error.
- Set the variable create_resource_locations_policy to true in development.auto.tfvars, nonproduction.auto.tfvars, and production.auto.tfvars.
- Push your code to the repository.
git add .
git commit -m 'add required data-mesh configuration'
git push
- Once you have set the variables to true, you must run through each environment deployment again. Start by submitting a PR from plan to the development branch. Once that has been merged, submit a PR to the nonproduction branch. Once that has been merged, submit a PR to the production branch.
- Once all of the PRs have been merged and all terraform code has been applied, change directory out of the folder.
cd ..
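The 4-projects steps above first append create_resource_locations_policy = false to each environment tfvars file and later ask you to set it to true. One way to make that change without hand-editing is sketched below. It prints the modified contents to standard output instead of editing in place, which sidesteps the differences between GNU and BSD `sed -i`; the function name `enable_locations_policy` is hypothetical.

```shell
# Print the contents of a tfvars file with create_resource_locations_policy
# switched from false to true; every other line is passed through unchanged.
enable_locations_policy() {
  sed -e 's/^\([[:space:]]*create_resource_locations_policy[[:space:]]*=[[:space:]]*\)false/\1true/' "$1"
}
```

For each of the three files, the output could be written to a temporary file and moved back, for example `enable_locations_policy development.auto.tfvars > tmp.tfvars && mv tmp.tfvars development.auto.tfvars`, run from the gcp-projects folder before committing.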
Once 4-projects has been updated and deployed, follow the instructions in each subfolder of the 5-app-infra folder.
Each deployment step must be done in numerical order:
- 0-vpc-sc: instructions for updating all service perimeters that are used in the data mesh project
- 1-tag-engine-oauth: instructions for configuring OAuth for the tag engine
- 2-artifacts-project: instructions for deploying the artifacts project, which will be used to configure pipelines to build Docker containers
- 3-artifact-publish: the repository contents for publishing artifacts (Dockerfiles and Python packages) to the artifacts project
- 4-data-governance: instructions for deploying the data governance project
- 5-service-catalog-project: instructions for deploying the service catalog project
- 6-service-catalog-solutions: instructions for deploying the service catalog solutions for the interactive environment
- 7-data-domain-1-non-confidential: instructions for deploying the non-confidential data project
- 8-data-domain-1-ingest: instructions for deploying the data ingestion project
- 9-data-domain-1-confidential: instructions for deploying the confidential data project
- 10-run-cdmc-engines: instructions for running the CDMC engines
- 11-consumer-1: instructions for deploying the Consumer project
- 12-adding-additional-data: instructions for adding additional data domains and/or datasets to an existing data domain
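Because the 5-app-infra subfolder names start with numbers above 9, a plain alphabetical listing places 10-run-cdmc-engines before 2-artifacts-project. The sketch below lists the subfolders in the intended numerical order. It assumes the directory contains only the numbered step folders, and the function name `ordered_steps` is hypothetical.

```shell
# List the entries of a directory sorted by their leading number.
ordered_steps() {
  ls "$1" | sort -n
}
```

For example, `ordered_steps ../gcp-data-mesh-foundations/5-app-infra` run from one of the foundation repositories would print the steps in the order they should be followed.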