In the last tutorial we set up our BigQuery account. The problem is, it’s kind of useless without any real data there. Sure we have some interesting public datasets to play with but that’s not real work.
We will use Fivetran to automatically load the data we need into BigQuery. This means we can then query the data with SQL and do some super cool analytics :D
If you want to learn SQL, then well, don’t we have the course for you: supercooldata.com
It suits people in different areas such as product, marketing, SEO, finance, sales, customer support etc.
Fivetran is awesome because you just click a few buttons and data magically appears and syncs to your BigQuery account.
This all happens automatically and you don’t need to know Python or any crazy stuff to get it going.
Fivetran has a huge amount of connectors to almost every major software tool your business uses like Stripe, Mailchimp, Hubspot, Zendesk, Facebook Ads, Google Ads etc.
You get the point, you can now take all the data from the tools you use, load them into BigQuery and schedule reports. How cool is that?
We also prepared an easy to follow video (this article covers the second part of the video from 3 minutes onwards)
- Navigate to www.fivetran.com and click the “Free Trial” button
- Create an account by following the prompts
- Select “Google Analytics” for source
- Select “BigQuery” for destination
- Click the “Set Up” button
- Follow the instructions on the right in the “Setup Guide” section
- Choose “3 months” to make the process faster (you can repeat with full data next time)
- Click “Sync Specific Accounts” to make the process quicker
- Choose the Google Analytics profiles/views that you want to sync
- Configure your first Google Analytics report as shown after clicking “+ Add report”:
- Click “OK”
- For the first run, keep one report. You can add more later with whatever dimensions and metrics you want to be loaded to BigQuery.
- NOTE: Each report needs to have the “date” field included
- Click “Save & Test”
- Click “CONTINUE”
- If you followed the BigQuery setup tutorial, your project name should be “My First Project”. This is NOT the value that Fivetran needs. Navigate to Google Cloud Console: https://console.cloud.google.com
- Your project ID is visible here:
- Copy paste your Project ID from Google Cloud Console into the Fivetran Project ID field.
- Copy the bold email in the Fivetran form. We need to give Fivetran permission to load data to your BigQuery account.
- Navigate to https://console.cloud.google.com/iam-admin/iam
- Click “ADD” to add a new member which is the Fivetran service user that will load data to BigQuery on your behalf
- Paste the email address given by Fivetran into the “New members” field
- Select “BigQuery User” as the role to provide Fivetran
- Click “SAVE”
- Go back to the Fivetran window and click “SAVE & TEST”
- Click “Review Schema”
- Click “Start Syncing”
- This may take some time as Fivetran loads Google Analytics data into your BigQuery account
- After the Sync is complete, you can check the data is in BigQuery next to the public data you accessed in the last tutorial. I named this report “test_123”.
- You can now query this data in BigQuery by pasting the query below into your editor and clicking “Run”. Change “test_123” to what you named the table in the Fivetran set up.
SELECT * FROM google_analytics.test_123 LIMIT 1000;
No more manually exporting data from Google Analytics and getting it into Excel to do your analysis. Fivetran loads it automatically while you sleep.
Fivetran will now sync your data by default every 6 hours so you don’t need to manually run any Google Analytics reports:
You can change the frequency really easily from the set up tab for each connector.
Okay you might be thinking, wait a sec, Google Analytics is great but I use Amplitude/Mixpanel/Heap or some other privacy invasive tool to track and stalk my users.
Well, you’re in luck… Fivetran has so many connectors, it will probably have what you use.
It took me less than 10 minutes to set up Google Analytics, Google Search Console and Mailchimp automatic data loads to BigQuery. All this data is updated every 6 hours and I don’t need to touch it again… it just works.
These are all the tables I’ve synced with Fivetran in my BigQuery account.
I didn’t hire any data engineers to load this in for me and take several months to define the tables, talk to business stakeholders about what columns they need, argue about column and table names and then to write the Python code to do it.
I just clicked a couple of buttons and here it is! It’s all so…clean.
This is what my connections dashboard looks like in Fivetran:
Okay that’s great, I’ve got my real data in my BigQuery account.
What do I do with that?
Now you can write your own SQL queries to do your own analysis.
But…uhh… what is ess-queue-elll?
start sales pitch
If only there was some way you could easily and quickly skill-up in SQL to learn how to analyse data, hmm… check out supercooldata.com :)
end sales pitch
BigQuery has a nice feature that lets you schedule those queries so if you’ve got a lot of data, you don’t need to wait for your queries to run so your charts are up to date.
Also, BigQuery integrates with DataStudio (not my favourite tool, but it does the job) quite well so you can also make some plots from the data that’s returned from those queries.
That’s it, BigQuery is setup, Fivetran connectors are setup and your real data is now available for analysis.
You can take a break, go back on Slack and feel productive while more data is loading.
Now we just need to write some SQL queries and setup some awesome looking charts - we will show you how to do that later.