GCP
BigQuery
Google
Loading CSV data into BigQuery
2023/04/29 01:24:29
0
1230
Goal:
Ingest CSV data into tables inside of BigQuery
Required roles and permissions:
Permissions to load data into BigQuery
IAM permissions:
bigquery.tables.create
bigquery.tables.updateData
bigquery.tables.update
bigquery.jobs.create
predefined IAM roles:
roles/bigquery.dataEditor
roles/bigquery.dataOwner
roles/bigquery.admin
(includes thebigquery.jobs.create
permission)bigquery.user
(includes thebigquery.jobs.create
permission)bigquery.jobUser
(includes thebigquery.jobs.create
permission)
Permissions to load data from Cloud Storage
IAM permissions:
storage.buckets.get
storage.objects.get
storage.objects.list
(required if you are using a URI wildcard)
Task 1. Create a new dataset to store tables
- In the BigQuery console, click on the View actions icon () next to your Project ID and click Create dataset.
- Set the Dataset ID to movie. Leave the other fields at their default values.
- Click Create dataset.
You’ll now see the movie dataset under your project name.
Task 2. Ingest a new dataset from a CSV
- In the BigQuery Console, click View actions icon next to nyctaxi dataset then click Create table.
- Specify the following table options:
Source:
- Create table from: Upload
- Select file: Browse the file you downloaded locally earlier
- File format: CSV
Destination:
- Table name: rating
Leave all other settings at default.
Schema:
- Check Auto Detect (tip: Not seeing the checkbox? Ensure the file format is CSV and not Avro)
Advanced Options:
- Leave at default values
3. Click Create Table.
You should now see the rating table below the movie dataset.
Task 3. Ingest a new dataset from Google Cloud Storage
Now, try to load another subset of the same rating data that is available on Cloud Storage. And this time, let’s use the CLI tool to do it.
- In your Cloud Shell, run the following command:
bq load \
--source_format=CSV \
--autodetect \
--noreplace \
movie.rating1 \
gs://cloud-sql-postgres-samples/rating.csv
You should now see the rating1 table below the movie dataset.