How to Load a Large CSV File into BigQuery
Welcome to Notes from BenDesk: Ben is our resident Freshdesk captain and manager of all help@ inquiries. We're bringing you interesting inquiries from his inbox each month to help share learnings across our community.
Question of the Month: I’m having trouble loading a large CSV file into BigQuery and am hitting a file size limit. Is there a way to upload a large CSV file, and if so, can you share any instructions?
BenDesk Answer: Yes, there is! BigQuery can load large CSV files as tables. To do so, upload the CSV to a Google Cloud Storage (GCS) bucket, then create a table from that file in your BigQuery project. Below you’ll find a step-by-step guide to help you with this process:
Step 1: Create a Google Cloud Storage bucket.
Google Cloud Storage buckets are essentially containers for holding and organizing your data. Google provides a detailed guide on how to create a bucket here.
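If you prefer to script this step rather than use the console, here is a minimal sketch. It assumes the google-cloud-storage Python client library is installed and that your credentials are already configured. The function name, bucket name, and location are placeholders for illustration and are not part of the console walkthrough.

```python
def create_bucket(bucket_name, location="US", client=None):
    """Create a Cloud Storage bucket and return whatever the client returns.

    `bucket_name` and `location` are placeholder values; bucket names must
    be globally unique.
    """
    if client is None:
        # Imported lazily so the sketch can be loaded without the
        # google-cloud-storage package installed.
        from google.cloud import storage

        client = storage.Client()  # uses your default credentials and project
    return client.create_bucket(bucket_name, location=location)
```

For example, create_bucket("example-bendesk-bucket") would create a bucket in the US multi-region, assuming that name is still available.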
Step 2: Upload your file to your newly created GCS bucket, then create a dataset.
From within your BigQuery project:
Click the three dots (the actions menu) next to your project and select "Create dataset".
Enter the Dataset ID (dataset name) and set the data location. Once complete, click "Create Dataset".
Click the actions menu next to the new dataset and select "Create table".
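The upload and the dataset creation can also be sketched in Python. This assumes the google-cloud-storage and google-cloud-bigquery client libraries are installed. The function names, file paths, and dataset ID are hypothetical, so swap in your own.

```python
def upload_csv(bucket_name, local_path, blob_name, client=None):
    """Upload a local CSV file to a bucket and return its gs:// URI."""
    if client is None:
        # Lazy import: only needed when talking to the real service.
        from google.cloud import storage

        client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path)
    return f"gs://{bucket_name}/{blob_name}"


def create_dataset(dataset_id, location="US"):
    """Create a BigQuery dataset in the client's default project."""
    from google.cloud import bigquery

    client = bigquery.Client()
    dataset = bigquery.Dataset(f"{client.project}.{dataset_id}")
    dataset.location = location  # the "data location" from the console form
    # exists_ok=True avoids an error if the dataset was already created.
    return client.create_dataset(dataset, exists_ok=True)
```

The URI returned by upload_csv is the same gs:// path you would otherwise pick with "Browse" in the next step.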
Step 3: Create the table
Select "Google Cloud Storage" in the "Create Table From" drop-down.
Select "Browse" and select your file from the bucket directory.
Select the file format (in this case, CSV) and enter the table name.
Select "Auto Detect" for the schema. You can also specify field names and data types manually, either by entering each field in the provided boxes or by editing the schema as text.
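For reference, the same table creation can be sketched as a load job with the google-cloud-bigquery Python client (assumed to be installed). The helper names and the (name, type) schema format below are illustrative choices. Passing no schema mirrors the "Auto Detect" option, while passing one mirrors entering the fields manually.

```python
def full_table_id(project, dataset_id, table_name):
    """Build the 'project.dataset.table' ID that BigQuery load jobs expect."""
    return f"{project}.{dataset_id}.{table_name}"


def load_csv_from_gcs(uri, table_id, schema=None):
    """Load a CSV from Cloud Storage into a BigQuery table.

    `schema` is an optional list of (column_name, type) pairs, for example
    [("name", "STRING"), ("age", "INTEGER")]. When omitted, BigQuery is
    asked to auto-detect the schema.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.CSV)
    if schema:
        job_config.schema = [
            bigquery.SchemaField(name, field_type) for name, field_type in schema
        ]
    else:
        job_config.autodetect = True
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    return load_job.result()  # waits for the job; raises if the load fails
```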
Step 4: Advanced options (Optional)
Set the field delimiter if your file uses something other than a comma (for example, a tab or a pipe).
If the file includes a header row, click the drop-down next to Advanced Options and set the number of header rows to skip to 1. With schema auto-detection on, BigQuery will use the header row to label columns in the table.
Select "Create Table" to complete the process.
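If you are not sure which delimiter a file uses, you can check a small sample locally before loading. The sketch below uses Python's standard csv module to guess the delimiter, then builds the two matching load options. The function names and the candidate delimiter list are assumptions for illustration; field_delimiter and skip_leading_rows are the corresponding settings in the google-cloud-bigquery client's LoadJobConfig.

```python
import csv


def sniff_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter from the first few lines of a CSV file."""
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


def csv_load_options(delimiter=",", header_rows=0):
    """Return load options matching the console's advanced settings."""
    return {
        "field_delimiter": delimiter,      # the delimiter setting
        "skip_leading_rows": header_rows,  # number of header rows to skip
    }
```

These options can be passed straight into a load job configuration, for example bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.CSV, autodetect=True, **csv_load_options(";", 1)).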
Step 5: Confirm your new table
Once the load job completes, confirm that your new table appears in the appropriate project and dataset. After that, you can start using your data!
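You can also run this check from code. Here is a minimal sketch, again assuming the google-cloud-bigquery client library. The function name and the table ID in the example are hypothetical.

```python
def summarize_table(table_id, client=None):
    """Return the row count and column names of a loaded table."""
    if client is None:
        # Lazy import: only needed when querying the real service.
        from google.cloud import bigquery

        client = bigquery.Client()
    table = client.get_table(table_id)  # raises NotFound if the table is missing
    return {
        "rows": table.num_rows,
        "columns": [field.name for field in table.schema],
    }
```

Calling summarize_table("my-project.my_dataset.my_table") and comparing the row count against your source file is a quick way to confirm the load picked up everything.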
For more information on loading tables into BigQuery, including screenshots, please check out our help article here.