BigQuery Data Graph Public Beta #7173
Open
stayseesong wants to merge 2 commits into develop from DOC-1001
---
title: BigQuery Data Graph Setup
beta: true
plan: unify
hidden: true
redirect_from:
  - '/unify/linked-profiles/setup-guides/BigQuery-setup'
---

> info ""
> BigQuery for Data Graph is in beta and Segment is actively working on this feature. Some functionality may change before it becomes generally available. This feature is governed by Segment’s [First Access and Beta Preview Terms](https://www.twilio.com/en-us/legal/tos){:target="_blank"}.

This page explains how to connect your BigQuery data warehouse to Segment for use with the [Data Graph](/docs/unify/data-graph/data-graph/).

## Step 1: Roles and permissions

> warning ""
> You need to be an account admin to set up the Segment BigQuery connector, and you need write permissions for the `__segment_reverse_etl` dataset.

To set the roles and permissions:
1. Navigate to **IAM & Admin > Service Accounts** in BigQuery.
2. Click **+ Create Service Account** to create a new service account.
3. Enter your **Service account name** and a description of what the account will do.
4. Click **Create and Continue**.
5. In the **Grant this service account access to project** section, add the *[BigQuery User](https://cloud.google.com/bigquery/docs/access-control#bigquery.user){:target="_blank"}* role.
6. Click **Continue**, then click **Done**.
7. Search for the service account you just created.
8. From your service account, click the three dots under **Actions** and select **Manage keys**.
9. Click **Add Key > Create new key**.
10. In the pop-up window, select **JSON** for the key type, and click **Create**. The file downloads automatically.
11. Copy all the content in the JSON file you created in the previous step, and save it for Step 5.

## Step 2: Grant read-only access for the Data Graph

Grant the [BigQuery Data Viewer](https://cloud.google.com/bigquery/docs/access-control#bigquery.dataViewer){:target="_blank"} role to the service account at the project level. If Profiles Sync runs in a separate project, make sure to grant read-only access to that project as well.

To grant read-only access for the Data Graph:
1. Navigate to **IAM & Admin > IAM** in BigQuery.
2. Search for the service account you just created.
3. From your service account, click the **Edit principal** pencil icon.
4. Click **ADD ANOTHER ROLE**.
5. Select the **BigQuery Data Viewer** role.
6. Click **Save**.

## *(Optional)* Step 3: Restrict read-only access

If you want to restrict access to specific datasets, grant the BigQuery Data Viewer role on those datasets to the service account. Make sure to grant read-only access to the Profiles Sync dataset.

To restrict read-only access:
1. In the Explorer pane in BigQuery, expand your project and select a dataset.
2. Navigate to **Sharing > Permissions**.
3. Click **Add Principal**.
4. Enter your service account in the **New principals** section.
5. Select the **BigQuery Data Viewer** role in the **Select a role** section.
6. Click **Save**.

You can also run the following command:

```
GRANT `roles/bigquery.dataViewer` ON SCHEMA `YOUR_DATASET_NAME` TO "serviceAccount:<YOUR SERVICE ACCOUNT EMAIL>";
```

## Step 4: Validate permissions

1. Navigate to **IAM & Admin > Service Accounts** in BigQuery.
2. Search for the service account you just created.
3. From your service account, click the three dots under **Actions** and select **Manage permissions**.
4. Click **View Access**, then click **Continue**.
5. Select the box labeled **List resources within resource(s) matching your query**.
6. Click **Analyze**, then click **Run query**.
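
As an additional check, you can authenticate as the service account and run a small read against the Profiles Sync dataset. This is a sketch rather than part of the official steps; `my-gcp-project` and `profiles_sync_dataset` are placeholder names for your own project and dataset.

```
-- Run while authenticated as the service account. If the grants from Steps 2
-- and 3 are in place, listing the tables in the Profiles Sync dataset succeeds.
-- `my-gcp-project` and `profiles_sync_dataset` are placeholders.
SELECT table_name, table_type, creation_time
FROM `my-gcp-project.profiles_sync_dataset`.INFORMATION_SCHEMA.TABLES
ORDER BY creation_time DESC;
```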

## Step 5: Connect your warehouse to Segment

1. Navigate to **Unify > Data Graph** in Segment. This should be a Unify space with Profiles Sync already set up.
2. Click **Connect warehouse**.
3. Select **BigQuery** as your warehouse type.
4. Enter your warehouse credentials. Segment requires the following settings to connect to your BigQuery warehouse:
    * **Service Account Credentials**: JSON credentials for a GCP service account that has BigQuery read/write access. This is the credential you created in Step 1.
    * **Data Location**: the primary location of your data. This can be either a region or a multi-region.
5. Test your connection, then click **Save**.
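
If you aren't sure which data location to enter, you can list your datasets and their locations. This is a sketch; `my-gcp-project` is a placeholder, and the `region-us` qualifier is an assumption to replace with the region or multi-region you expect your data to be in.

```
-- Lists each dataset visible in the queried region along with its location
-- value (for example, `US` or `us-central1`). `my-gcp-project` and `region-us`
-- are placeholders.
SELECT schema_name, location
FROM `my-gcp-project`.`region-us`.INFORMATION_SCHEMA.SCHEMATA;
```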

## Update user access for the Segment Reverse ETL dataset

If you previously ran Segment Reverse ETL in the project you're configuring as the Segment connection project, a Segment-managed `__segment_reverse_etl` dataset already exists, and you need to give the new Segment user access to it.

If you run into an error in the Segment app indicating that the user doesn't have sufficient privileges on an existing `__segment_reverse_etl` dataset, grant the [BigQuery Data Editor](https://cloud.google.com/bigquery/docs/access-control#bigquery.dataEditor){:target="_blank"} role on the `__segment_reverse_etl` dataset to the service account. Note that the `__segment_reverse_etl` dataset is hidden in the console. Run the following SQL command:

```
GRANT `roles/bigquery.dataEditor` ON SCHEMA `__segment_reverse_etl` TO "serviceAccount:<YOUR SERVICE ACCOUNT EMAIL>";
```
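
Because the `__segment_reverse_etl` dataset is hidden in the BigQuery console, you may want to confirm it exists before granting access. This is a sketch; `my-gcp-project` and the `region-us` qualifier are placeholders for your own project and data location.

```
-- Returns a row if the hidden `__segment_reverse_etl` dataset exists in the
-- queried region; an empty result means it doesn't exist there or isn't
-- visible to the account running the query.
SELECT schema_name, location
FROM `my-gcp-project`.`region-us`.INFORMATION_SCHEMA.SCHEMATA
WHERE schema_name = '__segment_reverse_etl';
```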

---
title: Redshift Data Graph Setup
beta: true
plan: unify
hidden: true

Review comment on the Reverse ETL paragraph:

> Something about this paragraph confuses me. So they only need to do this if they run into this error, right? Could we maybe put "if you run into an error..." first? I read this initially as... you have to do this.