It has been wonderful to watch the world pay more and more attention to data over the last few years, and especially to see more people and businesses moving towards data-driven decision making. What concerns me most alongside this, though, is the lack of questioning and the high level of trust placed in numbers whose production people may not fully understand.
What I’m asking is: can you really trust your numbers? Whether it is the forecast temperature, a forecast share value or the revenue your website is said to have made last month, which ones can you trust? Do you know how they are calculated? You know that when you stand on your bathroom scales the number depends on the machine working properly and on you not wearing a heavy coat, but how accurate is your web analytics data? What level of variance is actually acceptable for each number? And what can you do about it? Well, when it comes to the trust and accuracy of Google Analytics, I am here to help!
I have been auditing and improving data collection in Google Analytics for more years than I now care to admit, and if you’ve got a problem with your data, I’ve probably seen it a few times before! So I’m going to take you through some of the top things to look out for, why they happen and how you can resolve them, because there’s no point knowing what’s broken without any clue how to fix it, right? Let’s get your GA working better for you, so you can make your website work better for you too. Because at the end of the day, that’s what pays the bills and buys the toys!
We’re going to start at the very beginning: Account, Property and View Set Up, plus two key reports that tie into this. If I tried to include everything in an audit you might be here all week! If you can see there are some major concerns, or you get stuck working through this, do get in touch to talk about how we could help.
Account Set Up
Best practice would be one Google Analytics Account per business. However, we know some businesses have been set up with different accounts for different parts of the business, for whatever reason. The key thing is that by moving them into one account you have all the properties in one place, which is easier to manage. This doesn’t do anything for accuracy though, so let’s skip to properties.
The property is where the UA code is set, aka the tracking ID. This is the unique number that goes in the code so that it knows to send the data to this property. There are a number of settings specific to properties which you should be aware of:
If you have a very busy website you will want to review how many hits you are receiving. This is because the Google Analytics free version is limited to 10 million hits per month per property, before it may stop recording additional hits and may restrict your access to the reports. Scary huh?
If you reach the 10 million hit mark, or come close to it, on a regular basis you will need to consider the Google Analytics 360 suite. It offers a premium level of service including unsampled reports, which for businesses seeing over 10 million hits per month is going to be a must-have item! There are plenty more benefits to this platform too, so get in touch if you want to know more about it.
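A rough way to see how close a property sits to that limit is to project a month of hits from recent daily totals. This is only a minimal sketch: the 10 million figure comes from the text above, while the daily numbers and the function names are made up for illustration.

```python
# Free-version monthly hit limit per property, as quoted above.
MONTHLY_HIT_LIMIT = 10_000_000


def projected_monthly_hits(daily_hits, days_in_month=30):
    """Project a month's hits from the average of the daily totals supplied."""
    if not daily_hits:
        raise ValueError("need at least one day of hit counts")
    average_per_day = sum(daily_hits) / len(daily_hits)
    return average_per_day * days_in_month


def share_of_limit(daily_hits, days_in_month=30):
    """Projected hits as a fraction of the monthly limit (1.0 means at the limit)."""
    return projected_monthly_hits(daily_hits, days_in_month) / MONTHLY_HIT_LIMIT


# Hypothetical daily hit totals for one week (pageviews, events and so on).
sample_week = [310_000, 295_000, 330_000, 340_000, 305_000, 280_000, 275_000]
print(f"Projected monthly hits: {projected_monthly_hits(sample_week):,.0f}")
print(f"Share of limit: {share_of_limit(sample_week):.1%}")
```

If the share creeps towards 1.0 month after month, that is the point to review what is generating the hits or to look at the paid platform.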
Under the tracking information section there are a few key things to be aware of.
Here’s a summary of what they do:
Tracking Code: This shows you the code you can use to tag your site with GA. For accuracy and ease of management, though, we would strongly recommend using Google Tag Manager rather than inserting a new snippet in the code of the page for every tag, which is what this screen suggests.
Data Collection: Turning this on allows GA to collect additional data which it uses for advertising, such as remarketing and demographic reporting. If you turn this on you must inform your visitors that this is an additional way they are being tracked.
Data Retention: This allows you to keep user level data for a chosen period, to help you comply with GDPR. Without wanting to divert this whole post to a new topic, my two favourite blog posts on this setting are one from Jeff Sauer and one from Brian Clifton, both of whom are well-respected authorities in the analytics world who I’m pleased to say I’ve had the pleasure of chatting to at conferences over the years.
User-ID: If you have a site on which users can sign in, you can use this functionality to add a User ID to each session of data in GA where available, on the condition your users have agreed to this and you follow the relevant privacy rules. This User ID can allow for cross-device reports to be populated, as well as offline stitching of data to correlate to your CRM database. This data can be highly beneficial but do be careful to follow the rules.
Session Settings: This sets how long a session and a campaign last, which by default is 30 minutes and 6 months respectively. This matters for two reasons. First, it explains why some long sessions are just users not doing anything for most of that time. Second, it explains why you might still be getting visitors coming in from a campaign that ended 3 months ago: if the campaign cookie was set back then and is still available when they return direct to the site, with no new campaign information, then that campaign is set again as their campaign / source.
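To make the two timeouts concrete, here is a minimal sketch of the logic. It is not how Google Analytics is implemented: it only models the inactivity timeout (GA also breaks sessions for other reasons), and it approximates "6 months" as 183 days, which is an assumption made for this example.

```python
from datetime import date, datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # default session timeout described above
CAMPAIGN_TIMEOUT = timedelta(days=183)   # "6 months", approximated for this sketch


def split_into_sessions(hit_times, timeout=SESSION_TIMEOUT):
    """Group hit timestamps into sessions, starting a new one after `timeout` of inactivity."""
    sessions = []
    for hit_time in sorted(hit_times):
        if sessions and hit_time - sessions[-1][-1] < timeout:
            sessions[-1].append(hit_time)  # still inside the current session
        else:
            sessions.append([hit_time])    # gap too long (or first hit): new session
    return sessions


def campaign_still_credited(campaign_set_on, returned_on, timeout=CAMPAIGN_TIMEOUT):
    """True if a direct return visit would still be credited to the earlier campaign."""
    return returned_on - campaign_set_on <= timeout


# Hypothetical hits from one user: a 35 minute gap splits them into two sessions.
hits = [
    datetime(2019, 1, 1, 9, 0),
    datetime(2019, 1, 1, 9, 10),
    datetime(2019, 1, 1, 9, 45),
    datetime(2019, 1, 1, 9, 50),
]
print(len(split_into_sessions(hits)), "sessions")
print(campaign_still_credited(date(2019, 1, 1), date(2019, 4, 1)))
```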
Organic Search Sources: If you’re not happy that Google Analytics’ list of organic search engines covers everything you would like it to and are seeing some come in under referral traffic, then update the list here to correctly classify search engines with your logic.
Referral Exclusion List: With the Universal Analytics tracking library (as opposed to the legacy classic method), if a user arrives via a referral mid-session, a new session starts with that referral shown as the traffic source. You might think, “what’s the problem?”, but think about how common payment journeys work and you may realise the problem: PayPal is often the best performing traffic source, yet has no marketing activity. So, you need to exclude your own domain and any third party sites which users visit during their session on your site, so that the referral is ignored and treated as direct. However, do not add spam domains and the like in here. Those need a view filter, as otherwise they will still count as sessions and just be hidden in the direct pot.
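The decision described above can be sketched as follows. The domains are hypothetical, the function names are made up, and the matching rule (exact domain or any subdomain) is a simplification for illustration rather than a description of exactly how Google Analytics matches entries.

```python
from urllib.parse import urlparse

# Hypothetical list: your own domain plus a payment provider users pass through.
REFERRAL_EXCLUSIONS = ["example.com", "paypal.com"]


def is_excluded_referrer(referrer_url, exclusions=REFERRAL_EXCLUSIONS):
    """True if the referrer's hostname is an excluded domain or one of its subdomains."""
    host = (urlparse(referrer_url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in exclusions)


def source_for_hit(referrer_url, current_source):
    """Return the traffic source to credit for a hit that arrives mid-session."""
    if not referrer_url or is_excluded_referrer(referrer_url):
        return current_source               # referral ignored, session source kept
    return urlparse(referrer_url).hostname  # new session credited to the referrer


# A user who arrived from a paid campaign returns from the payment page.
print(source_for_hit("https://www.paypal.com/checkout", "google"))
# The same user arriving from a site that is not on the list.
print(source_for_hit("https://blog.partner.org/review", "google"))
```

Without the exclusion, the first call would hand the sale to the payment provider instead of the campaign that actually brought the user in.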
Search Term Exclusion List: If you wish to class any brand terms as direct traffic or such like, you can add the terms to ignore in here.
The key things to ensure you have right here are:
- Put the correct domain in the settings (in order for you to preview pages in behaviour reports)
- Leave the “Default Page” setting empty
- Add parameters as exclusions if they do not change the content of the page (aka, tracking tags)
- Ensure you have the right currency for your reporting
- Exclude Bots
- Link Google Ads and Search Console correctly
- Set up internal Site Search tracking
- Channel Groupings
If you’re into your marketing (aka, at least 50% of Google Analytics users!) then you want to be measuring traffic and sales against the right marketing channels. For those who run custom marketing campaigns (again, I’m going to bet on at least 50% of GA users), you might have noticed that the marketing you’re doing doesn’t always fit perfectly into the Default Channel Groupings that Google Analytics has built for you. So what do you do? Well, you review your data and customise the channel groupings accordingly of course!
It isn’t actually as hard as it sounds. Well, unless the data coming in is an unorganised mess, although based on experience that is probably the case in, guess what, at least 50% of accounts! So don’t feel bad, just make some time to go through your reports and understand how you should be grouping your marketing data in order for it to fit your marketing budgets.
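As a way of thinking about custom channel groupings, here is a minimal sketch of an ordered rule list where the first matching rule wins. The channel names and the rules themselves are hypothetical and are not Google Analytics’ own definitions, so adjust them to match how your marketing budgets are split.

```python
# Hypothetical rules, checked in order; the first rule that matches wins.
CHANNEL_RULES = [
    ("Paid Search", lambda source, medium: medium in {"cpc", "ppc", "paidsearch"}),
    ("Email", lambda source, medium: medium == "email"),
    ("Affiliates", lambda source, medium: medium == "affiliate"),
    ("Organic Search", lambda source, medium: medium == "organic"),
    ("Referral", lambda source, medium: medium == "referral"),
    ("Direct", lambda source, medium: source == "(direct)" and medium == "(none)"),
]


def channel_for(source, medium):
    """Return the first channel whose rule matches, or '(Other)' if none do."""
    source, medium = source.strip().lower(), medium.strip().lower()
    for channel, matches in CHANNEL_RULES:
        if matches(source, medium):
            return channel
    return "(Other)"


print(channel_for("Newsletter", "Email"))  # inconsistent casing still lands in Email
print(channel_for("partner-site", "banner"))  # no rule matches, so it falls to (Other)
```

Anything that lands in "(Other)" is a prompt either to add a rule or to tidy up the campaign tagging that produced it.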
It is also really important that you put good data in here, so please check out our previous post on UTM tracking to help you tidy up the data and stop having to create channel groupings based on mucky information.
Filters are the final part of the “what can go wrong in the admin area” section of this guide, and boy can these go wrong! However, don’t panic. Mostly, if you filter data in or out that you shouldn’t, you spot it soon enough, so there is not likely to be a major problem here. It is, however, very useful to review filters regularly and just make sure you have what you expect in there.
The most important thing to remember when using Filters is:
Always have a RAW DATA view available with NO filters in it so that you have backup data available for troubleshooting and any occasions where you might accidentally mess up the main reporting view.
Filters can be used for so many different things. Here are just a few recommended filters to put on Google Analytics views:
- Exclude company IP addresses
- Make all internal search terms lowercase
- Include only data for your hostnames (domains)
- Exclude third party companies’ IP addresses
- Make campaign, source and even medium lower case if required
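To show how filters like these behave when applied one after another, here is a minimal sketch. It is not GA’s filter engine: the hit fields, the IP range and the domain are all hypothetical, and the raw list is deliberately left untouched to mirror the RAW DATA view.

```python
import ipaddress
import re

INTERNAL_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # hypothetical office range
ALLOWED_HOSTNAME = re.compile(r"^(www\.)?example\.com$")      # hypothetical domain
LOWERCASE_FIELDS = ("search_term", "campaign", "source", "medium")


def apply_view_filters(hit):
    """Return a cleaned copy of the hit, or None if a filter excludes it."""
    ip = ipaddress.ip_address(hit["ip"])
    if any(ip in network for network in INTERNAL_NETWORKS):
        return None  # exclude company / third party IP addresses
    if not ALLOWED_HOSTNAME.match(hit["hostname"]):
        return None  # include only our own hostnames
    cleaned = dict(hit)
    for field in LOWERCASE_FIELDS:
        if cleaned.get(field):
            cleaned[field] = cleaned[field].lower()
    return cleaned


raw_view = [
    {"ip": "203.0.113.7", "hostname": "www.example.com", "campaign": "Spring"},
    {"ip": "198.51.100.4", "hostname": "spam-site.xyz", "campaign": "x"},
    {"ip": "198.51.100.9", "hostname": "www.example.com", "campaign": "Spring_Sale"},
]
reporting_view = []
for hit in raw_view:
    filtered = apply_view_filters(hit)
    if filtered is not None:
        reporting_view.append(filtered)

print(len(raw_view), "raw hits,", len(reporting_view), "hits after filters")
```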
My top tip when using GA filters, even after all these years, is to test complicated ones out in a test view first. And when applying filters, ALWAYS have real time reports open to check you’ve not accidentally excluded everything instead of just a small thing. We’ve all been there…
So, now that we have gone over the core areas in the Admin area, let’s actually start taking a look at the Google Analytics reports to see how we can review and improve the accuracy here!
Important Data to Check for
Having just discussed filtering for your domain only, one of the key reports that people do not even know exists, let alone appreciate the importance of, is the Hostname Report. This report shows the data for your site broken down by hostname, aka domain name.
Now you might think, “why do I need this? I only have one domain there won’t be anything else here!”. Unfortunately, a lot of accounts we review actually have some unexpected results here and a few companies we have worked with have actually had to take legal action to shut down duplicate companies that are 100% ripping their website and products off. These companies have ignorantly not only copied the whole website, but left the Google Analytics tracking code from the original site in place, meaning the main company can see the data for the copy site. This has happened 3 times in the last 3 years, so although it is rare, it is still happening enough that you should be aware of it.
So how do you check which domains are tracking in your Google Analytics report? Well, here is the journey you need to click through from any starting point in the reports:
Audience > Technology > Network > Hostname (from the link above the table)
Now that you’ve got here once – Bookmark the link!
This image shows some prime examples of why to tidy up your data – only 81% of all Users are on the main domain!
(not set) in the hostname report suggests that hits have been sent without a pageview before them, which mostly happens through errant event tracking. That is not surprising here, as we use this site for a lot of random tests in GTM!
Secondly, you see a lot of spammy looking domains which generate traffic, but the bounce rate, pages per session and average session duration all being either 1 or 0 suggests bot activity rather than human users being reported in Google Analytics. This is why you would want to put an “include only hostname” filter on the one domain you wish to report on, or use a regex combination for multiple domains.
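For the multiple domain case, a regex along these lines is one option. This is a minimal sketch that builds and tests the pattern in Python, with hypothetical domains. Check the resulting pattern against your own hostname report in a test view before relying on it.

```python
import re


def hostname_include_pattern(domains):
    """Build an anchored regex matching each domain, with or without a www. prefix."""
    alternatives = "|".join(re.escape(domain) for domain in domains)
    return rf"^(www\.)?({alternatives})$"


# Hypothetical business domains to keep in the reporting view.
pattern = hostname_include_pattern(["example.com", "example.co.uk"])
compiled = re.compile(pattern)

for hostname in ["www.example.com", "example.co.uk", "staging.example.com", "example.com.spam-site.xyz"]:
    print(hostname, "kept" if compiled.match(hostname) else "dropped")
```

The `^` and `$` anchors matter: without them, a spam hostname that merely contains your domain name would slip through.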
In an ideal scenario you will only see your main business domains here, with occasional translate service pages for international users needing another language. No development / staging sites, no spam, no unknowns.
This hostname report is great for seeing what % of your data is actually data you want to report on! Think about the improvements to conversion rate when this includes only what you are targeting!
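If you export the Hostname report, working out that percentage takes a few lines. This is a minimal sketch and the user counts below are invented purely for illustration.

```python
def hostname_shares(users_by_hostname):
    """Return each hostname's share of total users as a fraction between 0 and 1."""
    total = sum(users_by_hostname.values())
    if total == 0:
        return {}
    return {host: users / total for host, users in users_by_hostname.items()}


# Hypothetical export of the Hostname report (hostname -> users).
report = {
    "www.example.com": 8100,
    "(not set)": 700,
    "spam-site.xyz": 900,
    "staging.example.com": 300,
}

for host, share in sorted(hostname_shares(report).items(), key=lambda item: -item[1]):
    print(f"{host}: {share:.1%}")
```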
Personal Data – Do not breach any laws!
There’s the ICO Cookie Law, GDPR, new laws coming in shortly and there’s also the Google Analytics Terms and Conditions which have always been in place. These all warn against or forbid the collection of personally identifiable information in Google Analytics.
Ensure your developers, marketers, directors and everyone involved in the website functionality know what cannot be collected, what is ok and how you need to explain it and get acceptance from your users before tracking it. I’m not offering any legal advice here, but I am telling you to be aware and be careful.
If you collect personally identifiable information in GA, Google can delete your whole account.
Personally identifiable information may include: names, emails, addresses, post codes, etc.
If you collect data that you don’t have permission to collect, whether personal or not, you could be liable for very hefty fines.
Review what you need to do and search your Google Analytics reports to ensure nothing is accidentally slipping through.
As best practice, we also recommend using the Custom Task in Google Tag Manager to ensure any personal data that is required for functionality on the site is removed before being processed by Google Analytics. Obviously, though, the best thing to do is not risk tracking any of it in the first place! If you do find some data, there’s a new Data Deletion tool in GA, but let’s save that for another blog post.
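To give a feel for the kind of check and clean-up involved, here is a minimal sketch that looks for email-like strings in page paths and redacts them. It is written in Python purely to show the logic (a Custom Task in Google Tag Manager would be written in JavaScript), it only covers email addresses, and the pattern is a simple approximation rather than a complete PII detector.

```python
import re

# Matches email-like strings, including ones where "@" is URL-encoded as "%40".
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+(@|%40)[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text, placeholder="[REDACTED_EMAIL]"):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


def find_pii_candidates(page_paths):
    """Return the page paths that appear to contain an email address."""
    return [path for path in page_paths if EMAIL_PATTERN.search(path)]


# Hypothetical page paths, as they might appear in a pages report export.
paths = [
    "/thanks?email=jane.doe@example.com&step=2",
    "/reset?user=sam%40example.org",
    "/products/blue-widget",
]
print(find_pii_candidates(paths))
print([redact_emails(path) for path in paths])
```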
So there we have it, the core account settings, property settings and top reports to check for fundamental account issues in Google Analytics. Make a note to come back and check these on a regular basis – particularly PII and hostname reports! And if you need some help doing all of this yourself, our friendly team would be happy to help, just get in touch.