The Importance of Retaining Full Control over Your Data #
In any data-related process involving third parties, it is crucial for your organization to retain full control over your data. This ensures data security, long-term accessibility, and compliance with internal policies.
At Maya Insights, we have consistently advocated for this approach. Every organization that places its trust in us has the option to store all the data processed by our data model in its own cloud or local storage. Especially in recent months, we have evangelized this approach to the dozens of companies that have contacted us to back up their Universal Analytics (or “GA3”) data ahead of its final sunset on June 30th.
Step-by-Step Guide to Setting up a BigQuery Data Destination #
Setting up your destination involves three main steps:
- creating a service account to handle the data transfer from the data source to your cloud,
- creating a dataset, and
- sharing the service account key with your external service provider.
It is important to stress some administrative points before we start:
- First, to avoid the data storage limitations that come with the Free Tier of BigQuery, you need to set up a billing account. If you are using BigQuery exclusively to store static tables, like those our clients retrieve when backing up their UA data, the ongoing cost will be virtually zero.
- Moreover, you need to set the dataset id according to your service provider’s specifications. In our case, the naming convention we follow is prj_xxx_db, where “xxx” corresponds to an id assigned to your dataset by Maya’s internal servers.
- Before you begin, make sure your account has access to your organization’s Google Cloud project with at least Service Account Admin and BigQuery Admin rights.
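If you would like to verify the last two points programmatically before continuing, the sketch below checks that billing is enabled and that your account holds the relevant permissions. It is a minimal sketch, assuming Application Default Credentials and the google-api-python-client package; the project ID is an illustrative placeholder.

```python
# Minimal sketch: pre-flight checks for the administrative points above.
# Assumes Application Default Credentials; PROJECT_ID is a placeholder.
from googleapiclient import discovery

PROJECT_ID = "your-gcp-project"

# 1. Billing must be enabled so the dataset is not subject to Free Tier limits.
billing = discovery.build("cloudbilling", "v1")
info = billing.projects().getBillingInfo(name=f"projects/{PROJECT_ID}").execute()
print("Billing enabled:", info.get("billingEnabled", False))

# 2. Your account needs rights to create service accounts and datasets.
crm = discovery.build("cloudresourcemanager", "v1")
needed = ["iam.serviceAccounts.create", "bigquery.datasets.create"]
granted = crm.projects().testIamPermissions(
    resource=PROJECT_ID, body={"permissions": needed}
).execute().get("permissions", [])
print("Missing permissions:", sorted(set(needed) - set(granted)))
```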
Below are detailed instructions to help you through the process:
✅ Step A: Create a Service Account #
- Navigate to the “Service Accounts” page and click on “Create Service Account”
Note: If you do not have a BigQuery project, you will need to create one before proceeding. You can assign it a name of your choice.
- Fill in the service account details by giving the account a name of your choice.
- Grant this service account access to the project: Assign it the roles of BigQuery Job User and BigQuery Read Session User.
- Grant users access to this service account: no input is required here. Just click “Done”.
- Back on the Service Accounts page, select your project and copy the email of the service account you just created (the Email column); you will need it in Step B.
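If your team prefers to script its cloud setup, the sketch below shows roughly how the same service account could be created and granted the two roles from the list above, using Google’s API client for Python. It is a minimal sketch, assuming Application Default Credentials with sufficient rights; the project ID and the maya-export account name are illustrative placeholders, not values prescribed by Maya.

```python
# Minimal sketch: create a service account and grant it the two project roles
# from Step A. Assumes Application Default Credentials with IAM admin rights;
# PROJECT_ID and the "maya-export" account ID are illustrative placeholders.
from googleapiclient import discovery

PROJECT_ID = "your-gcp-project"

iam = discovery.build("iam", "v1")
crm = discovery.build("cloudresourcemanager", "v1")

# Create the service account (console: "Create Service Account").
account = iam.projects().serviceAccounts().create(
    name=f"projects/{PROJECT_ID}",
    body={
        "accountId": "maya-export",  # placeholder name of your choice
        "serviceAccount": {"displayName": "Maya data export"},
    },
).execute()
sa_email = account["email"]  # keep this for the dataset-sharing step

# Grant the project-level roles from Step A.
policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()
bindings = policy.setdefault("bindings", [])
for role in ("roles/bigquery.jobUser", "roles/bigquery.readSessionUser"):
    bindings.append({"role": role, "members": [f"serviceAccount:{sa_email}"]})
crm.projects().setIamPolicy(resource=PROJECT_ID, body={"policy": policy}).execute()

print("Service account email:", sa_email)
```

The console flow above is all you need; this alternative is only for teams that automate their infrastructure.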
✅ Step B: Using the navigation menu, go to BigQuery → BigQuery Studio #
Note: You may be prompted to enable the BigQuery API in order to proceed. If that’s the case, click on enable.
- In the “Explorer” panel, select the project in which you want to create your dataset.
- Expand the “More Actions” option and click “Create Dataset” (as shown below).
- In the “dataset_id” field, it is critical that you insert the dataset ID designated by Maya. Select the region/multi-region where you want to save your data; preferably, choose a location close to your physical location or to where your data is already stored. Click on “Create Dataset”.
- Click on the created dataset in the Explorer panel. Click Sharing → Permissions → Add Principal, then add the service account email you copied at the end of Step A in the “New principals” field, assign it the BigQuery Admin role, and click Save.
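The dataset creation and sharing in Step B can also be scripted with the google-cloud-bigquery client library. This is a minimal sketch rather than our official tooling: the project ID, the EU location, and the service account email are illustrative placeholders, and the dataset-level OWNER entry is our closest approximation of the “BigQuery Admin” grant made in the console.

```python
# Minimal sketch: create the dataset and share it with the service account
# from Step A, using the google-cloud-bigquery client library.
# PROJECT_ID, SA_EMAIL and the location are illustrative placeholders;
# use the dataset ID designated by Maya and a region of your choice.
from google.cloud import bigquery

PROJECT_ID = "your-gcp-project"
DATASET_ID = "prj_xxx_db"  # the ID designated by Maya
SA_EMAIL = "maya-export@your-gcp-project.iam.gserviceaccount.com"

client = bigquery.Client(project=PROJECT_ID)

# Create the dataset in the region of your choice (console: "Create Dataset").
dataset = bigquery.Dataset(f"{PROJECT_ID}.{DATASET_ID}")
dataset.location = "EU"  # pick a region close to you or to your existing data
dataset = client.create_dataset(dataset, exists_ok=True)

# Share the dataset with the service account (console: Sharing -> Permissions).
# Dataset ACLs use legacy roles; OWNER is used here as the closest stand-in
# for the "BigQuery Admin" grant made in the console.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(role="OWNER", entity_type="userByEmail", entity_id=SA_EMAIL)
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])

print(f"Dataset {dataset.full_dataset_id} created and shared with {SA_EMAIL}")
```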
⚠️ Please note that once a data source is set, the destination can’t be altered.