
5-app-infra

These instructions are part of expanding and deploying the Data Mesh architecture. Please follow each part in sequence.

0-vpc-sc	Runs a local terraform plan that outputs the necessary configurations for your Service Perimeter.
1-tag-engine-oauth	Instructions on how to configure OAuth, which Data Mesh's Tag Engine requires.
2-artifacts-project	Sets up a repository structure and instructions on deploying the artifacts project.
3-artifact-publish	A repository structure containing Dockerfiles and python packages that will be used for building and publishing artifacts
4-data-governance	A repository structure containing instructions on deploying the data governance project
5-service-catalog-project	A repository structure containing instructions on deploying the service catalog project
6-service-catalog-solutions	Instructions on how to configure Service Catalog
7-data-domain-1-nonconfidential	A repository structure containing instructions on deploying the non-confidential data project
8-data-domain-1-ingest	A repository structure containing instructions on deploying the ingest project
9-data-domain-1-confidential	A repository structure containing instructions on deploying the confidential data project
10-run-cdmc-engines	Instructions on how to run the CDMC engines
11-consumer-1	A repository structure containing instructions on deploying the Consumer project
12-adding-additional-data	Instructions on how to add additional data domains and/or datasets to an existing data domain

Consumer Project

Repository for consumer-1

Deploying gcp-dm-bu4-prj-consumer-1

Clone the repository that was created in 4-projects:

git clone git@github.com:[git-owner-name]/gcp-dm-bu4-prj-consumer-1.git bu4-prj-consumer-1

Change into the bu4-prj-consumer-1 directory:
```
cd bu4-prj-consumer-1
```

Seed the repository if it has not been initialized yet.

git commit --allow-empty -m 'repository seed'
git push --set-upstream origin main

git checkout -b production
git push --set-upstream origin production

git checkout -b nonproduction
git push --set-upstream origin nonproduction

git checkout -b development
git push --set-upstream origin development

git checkout -b plan

Copy the policy library and build files from the foundation repository into the new repo.

cp -RT ../gcp-data-mesh-foundations/policy-library/ ./policy-library
cp ../gcp-data-mesh-foundations/build/cloudbuild-connection-tf-* .
cp ../gcp-data-mesh-foundations/build/tf-wrapper.sh .
chmod 755 ./tf-wrapper.sh

Copy the contents of 11-consumer-1 into the new repo.

cp -R ../gcp-data-mesh-foundations/5-app-infra/11-consumer-1/* .

Update the backend.tf files with the backend bucket created in the 4-projects step.

export backend_bucket=$(terraform -chdir="../gcp-projects/business_unit_4/shared" output  -json state_buckets | jq -r '."consumer-1"')
for i in `find . -name 'backend.tf'`; do sed -i'' -e "s/UPDATE_APP_INFRA_BUCKET/${backend_bucket}/" $i; done

Update remote_state_bucket in common.auto.tfvars

export remote_state_bucket=$(terraform -chdir="../gcp-bootstrap/envs/shared" output -raw projects_gcs_bucket_tfstate)
sed -i'' -e "s/REMOTE_STATE_BUCKET/${remote_state_bucket}/" common.auto.tfvars
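
Before committing, it can help to confirm that the sed commands above left no placeholder behind. The following is a minimal sketch and not part of the repository: the function name find_unreplaced_placeholders is hypothetical, and it only checks for the two placeholder strings named in the steps above (UPDATE_APP_INFRA_BUCKET and REMOTE_STATE_BUCKET) in backend.tf and common.auto.tfvars files.

```python
from pathlib import Path

# Placeholder strings and file names taken from the steps above.
PLACEHOLDERS = ("UPDATE_APP_INFRA_BUCKET", "REMOTE_STATE_BUCKET")
TARGET_NAMES = ("backend.tf", "common.auto.tfvars")


def find_unreplaced_placeholders(root="."):
    """Return (path, placeholder) pairs for every placeholder still present."""
    leftovers = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.name in TARGET_NAMES:
            text = path.read_text(encoding="utf-8")
            for placeholder in PLACEHOLDERS:
                if placeholder in text:
                    leftovers.append((str(path), placeholder))
    return leftovers
```

Calling find_unreplaced_placeholders(".") from the bu4-prj-consumer-1 directory should return an empty list once both substitutions have succeeded. A non-empty result usually means one of the terraform output commands returned an empty value.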

Commit Changes

git add .
git commit -m 'Initialize consumer-1 repo'

Push your plan branch
```
git push --set-upstream origin plan
```
Create a pull request from plan to development in your GitHub repository.
Observe the plan in Cloud Build at: https://console.cloud.google.com/cloud-build/builds;region=us-central1?hl=en&project=[prj-c-bu4-infra-gh-cb-ID-HERE]
Merge development into nonproduction and observe the terraform apply in Cloud Build at: https://console.cloud.google.com/cloud-build/builds;region=us-central1?hl=en&project=[prj-c-bu4-infra-gh-cb-ID-HERE]
Merge nonproduction into production and observe the terraform apply in Cloud Build at: https://console.cloud.google.com/cloud-build/builds;region=us-central1?hl=en&project=[prj-c-bu4-infra-gh-cb-ID-HERE]
Once done, cd out of this folder:
```
cd ..
```

Data Access Management API

In 4-data-governance, a Data Access API was created to manage data access permissions. The sections below describe the groups and roles it manages and how to use it.

Defined User Groups

cdmc-conf-data-viewer
cdmc-data-viewer
cdmc-masked-data-viewer
cdmc-fine-grained-data-viewer
cdmc-encrypted-data-viewer

The above user groups are created at the organization level. Only users in these groups can access the data, according to their respective roles in the organization.

These groups must have the Data Access Management Service Account and the Approvers as owners. The Service Account ownership must be given through the Google Cloud Console IAM & Admin.

Select the organization in the Select a Resource dropdown, and navigate to IAM & Admin in the Cloud Console.
On the right menu, click Groups.
Get the Data Access Management API service account from your Governance Project. The service account is data-access-management@DATA_GOVERNANCE_PROJECT_ID.iam.gserviceaccount.com. Replace DATA_GOVERNANCE_PROJECT_ID with your Governance Project ID.
Finally, add the Data Access Management API service account as an OWNER of each group.

Consumers Group and Roles

Data Viewers: Users who can access non-confidential data.
- BigQuery Data Viewer - roles/bigquery.dataViewer
- BigQuery Job User - roles/bigquery.jobUser
Encrypted Data Viewers: Users who can access non-confidential data with sensitive encrypted data.
- Cloud KMS CryptoKey Decrypter Via Delegation - roles/cloudkms.cryptoKeyDecrypterViaDelegation
Fine-Grained Data Viewers: Users who can access data protected by column-level access control.
- Fine-Grained Reader - roles/datacatalog.categoryFineGrainedReader
Masked Data Viewers: Users who can access non-confidential data with sensitive data masked.
- Masked Reader - roles/bigquerydatapolicy.maskedReader
Confidential Data Viewers: Users who can access confidential data.
- BigQuery Data Viewer - roles/bigquery.dataViewer
- BigQuery Job User - roles/bigquery.jobUser
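
The group-to-role list above can be written down as a small lookup table, which is handy when deciding which role to request. This is an illustrative sketch only: pairing each heading with a cdmc-* group is an assumption based on the matching names, and GROUP_ROLES and groups_granting are hypothetical names that do not exist in the repository.

```python
# Roles per consumer group, copied from the list above.
# Assumption: each heading maps to the cdmc-* group with the matching name.
GROUP_ROLES = {
    "cdmc-data-viewer": [
        "roles/bigquery.dataViewer",
        "roles/bigquery.jobUser",
    ],
    "cdmc-encrypted-data-viewer": [
        "roles/cloudkms.cryptoKeyDecrypterViaDelegation",
    ],
    "cdmc-fine-grained-data-viewer": [
        "roles/datacatalog.categoryFineGrainedReader",
    ],
    "cdmc-masked-data-viewer": [
        "roles/bigquerydatapolicy.maskedReader",
    ],
    "cdmc-conf-data-viewer": [
        "roles/bigquery.dataViewer",
        "roles/bigquery.jobUser",
    ],
}


def groups_granting(role):
    """Return the sorted names of all groups whose role list contains `role`."""
    return sorted(group for group, roles in GROUP_ROLES.items() if role in roles)
```

For example, groups_granting("roles/bigquerydatapolicy.maskedReader") returns only the masked data viewer group, while the two BigQuery roles are shared by the confidential and non-confidential viewer groups.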

Requester Endpoints

Navigate to the Data Governance Project in the Google Cloud Console, and go to Cloud Run.
Identify Data Access Management API: Locate the API that begins with data-access-management-api-. Each API is designed to manage access for a specific dataset ingested into your data domain.
Copy the URL: For each data-access-management-api- service, click the Copy to clipboard button next to the URL link.

Export the variable DATA_ACCESS_MANAGEMENT_API with the URL from the previous step, and run the following command in the terminal to request a specific role:

curl \
    --location "${DATA_ACCESS_MANAGEMENT_API}/v1/permission-requests/users" \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer $(gcloud auth print-identity-token)" \
    --data '{"roles": ["roles/bigquerydatapolicy.maskedReader"]}'
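
The same request can be assembled from Python, which may be convenient in scripts. The sketch below mirrors the curl command above under a few assumptions: build_role_request is a hypothetical helper name, the identity token is passed in as a string (for example the output of gcloud auth print-identity-token), and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_role_request(api_url, identity_token, roles):
    """Build the POST request that asks for the given roles.

    api_url is the Cloud Run URL exported as DATA_ACCESS_MANAGEMENT_API.
    """
    body = json.dumps({"roles": list(roles)}).encode("utf-8")
    return urllib.request.Request(
        f"{api_url.rstrip('/')}/v1/permission-requests/users",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {identity_token}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request. The response format is not described in this README, so inspect it before relying on any particular field.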

Approver Endpoints

Navigate to Cloud Run: Access the Google Cloud Console, and go to Cloud Run.
Identify Data Access Management API: Locate the API that begins with data-access-management-api-. Each API is designed to manage access for a specific dataset ingested into your data environment.
Copy the URL: For each data-access-management-api- service, click the Copy to clipboard button next to the URL link.
Export the variable DATA_ACCESS_MANAGEMENT_API with the URL from the previous step, and run the following command in the terminal to list all permission requests:
```
curl -X GET \
    --location "${DATA_ACCESS_MANAGEMENT_API}/v1/permission-requests/" \
    --header "Authorization: Bearer $(gcloud auth print-identity-token)"
```
Export DATA_ACCESS_MANAGEMENT_API with the URL from the previous step and REQUEST_ID with the ID of the permission request, then run the following command in the terminal to approve a specific request:
```
curl -X PUT \
    --location "${DATA_ACCESS_MANAGEMENT_API}/v1/permission-requests/${REQUEST_ID}/approve" \
    --header "Authorization: Bearer $(gcloud auth print-identity-token)"
```
Export DATA_ACCESS_MANAGEMENT_API with the URL from the previous step and REQUEST_ID with the ID of the permission request, then run the following command in the terminal to deny a specific request:
```
curl -X PUT \
    --location "${DATA_ACCESS_MANAGEMENT_API}/v1/permission-requests/${REQUEST_ID}/deny" \
    --header "Authorization: Bearer $(gcloud auth print-identity-token)"
```
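
The three approver calls above differ only in HTTP method and path, so they can be expressed through one small helper. This is a sketch under assumptions, not code from the repository: build_approver_request and its action argument are hypothetical, the token is supplied by the caller, and the request is built but not sent.

```python
import urllib.request


def build_approver_request(api_url, identity_token, action, request_id=None):
    """Build a list, approve, or deny request for the Data Access Management API."""
    base = f"{api_url.rstrip('/')}/v1/permission-requests"
    if action == "list":
        method, url = "GET", f"{base}/"
    elif action in ("approve", "deny"):
        if not request_id:
            raise ValueError(f"request_id is required for action '{action}'")
        method, url = "PUT", f"{base}/{request_id}/{action}"
    else:
        raise ValueError(f"unknown action: {action}")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {identity_token}"},
        method=method,
    )
```

A typical flow would build a "list" request first, read the request IDs from the response, and then build an "approve" or "deny" request for each ID that has been reviewed.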

Requirements

To be included in a consumer group, you must first submit a request to the Data Access Management API and wait for approval from a group owner. For detailed instructions on how to request membership, refer to the Data Access Management API section above.

Once your request is approved, you will be granted access to the groups and their associated permissions.

- cdmc-conf-data-viewer

Highest level of data access. Users added to this group can directly access confidential data that is stored in raw format in the confidential project.

Example Query:

SELECT * FROM `<confidential_project_id>.<dataset_id>.<table_id>` LIMIT 10;

- cdmc-data-viewer

Lowest level of data access. Users added to this group can access raw, non-sensitive data stored in the non-confidential project. They can query the de-identified fields, but they have no access to the masked fields, so they must use SELECT * EXCEPT to leave the masked field out of their queries. Data in the de-identified fields is shown in encrypted form.

Example Query:

SELECT * EXCEPT(Card_Holders_Name) FROM `<non_confidential_project_id>.<dataset_id>.<table_id>` LIMIT 10;

- cdmc-masked-data-viewer

Users added to this group have access similar to the cdmc-data-viewer group, except that they can also query the masked field. The values of the masked field are, however, displayed in encrypted format, like the de-identified fields.

Example Query:

SELECT * FROM `<non_confidential_project_id>.<dataset_id>.<table_id>` LIMIT 10;

- cdmc-fine-grained-data-viewer

Users added to this group have access similar to the cdmc-masked-data-viewer group. The difference is that they can see the raw value of the masked field. The values in the de-identified fields are still displayed as encrypted.

Example Query:

SELECT * FROM `<non_confidential_project_id>.<dataset_id>.<table_id>` LIMIT 10;

- cdmc-encrypted-data-viewer

Users added to this group have access similar to the cdmc-data-viewer group, except that they can query and view the de-identified field data in raw format. They still have no access to the masked fields, so the masked field must be left out of their queries (for example with SELECT * EXCEPT). To re-identify the data in de-identified fields, users first retrieve the wrapped key as bytes and use it together with the KMS key name. The wrapped key and the KMS key must be the same ones that were used to de-identify the data in the respective fields. The following Python script retrieves the wrapped key in binary format. This key can then be used in the query, as shown in the example below.

Python Script:

python ./get_wrapped_key_bytes.py --wrapped_key projects/<project_id>/secrets/<secret_name>/versions/<version>

Sample output:

%> python ./get_wrapped_key_bytes.py --wrapped_key projects/<project_id>/secrets/<secret_name>/versions/<version>
b'\n$\x00<e|5\x9c"\xab?\xac\'o\xa5\xeb\xb8\xee4\xf0\xb9&+v&\x1d\xdd:\x85\x11\xd0.\xe3\x9b\xeby\x8c\xc0\x12A\x00sa;\xd8\xe0>\x99\x13\xc4\xc1\xa6\xacn\xfa\xaa\xef\xb0\xa1\xd1\n\n\xa7\x91\xb6\xd8\x02\x9cE\xc5\xad\xebfZ\xfe\xe82\xcc*c>\xef\x0f\xb4$\xdek\x95\x8bu\t\xa9\xe2\xf2<\\\x0bI\x1aw66\\m7'

Example Query:

CREATE TEMP FUNCTION decrypt_data(encodedText STRING)
RETURNS STRING
AS (
  DLP_DETERMINISTIC_DECRYPT(
    DLP_KEY_CHAIN(
      "gcp-kms://projects/<project_id>/locations/<location>/keyRings/<keyRings>/cryptoKeys/<cryptoKeys>",
      <wrapped_key_from_script_output>
    ),
    encodedText,
    ''
  )
);

SELECT
  Card_Type_Code,
  Issuing_Bank,
  Card_Number,
  decrypt_data(Card_Number) AS decrypted_Card_Number,
  Card_PIN,
  decrypt_data(Card_PIN) AS decrypted_Card_PIN,
  Credit_Limit
FROM
  `<non_confidential_project_id>.<dataset_id>.<table_id>`;
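
Pasting an escaped bytes literal such as the sample output into a query is easy to get wrong. One possible alternative, sketched here as an assumption and not as what get_wrapped_key_bytes.py does, is to base64-encode the wrapped key and let BigQuery decode it with FROM_BASE64, which returns BYTES. The helper name to_bigquery_bytes_expr is hypothetical.

```python
import base64


def to_bigquery_bytes_expr(wrapped_key: bytes) -> str:
    """Render wrapped-key bytes as a BigQuery expression that evaluates to BYTES."""
    encoded = base64.b64encode(wrapped_key).decode("ascii")
    # The base64 alphabet contains no quote characters, so single quotes are safe.
    return f"FROM_BASE64('{encoded}')"
```

The returned string can be substituted for <wrapped_key_from_script_output> in the DLP_KEY_CHAIN call. Treat the wrapped key as sensitive material and avoid storing the rendered query in shared locations.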
