7 easy ways to screw up your Google Analytics data

1. Don’t actively maintain your Exclude URL query parameters list

Google has this view setting as ‘optional’ but if you don’t set it up correctly and maintain it, then you can make a real mess of your data.
The setting tells GA whether to treat each value of the query string parameter as a separate URL (ie a separate page).
The rule of thumb is:

  • when the URL parameter determines which content is served (eg ?pageId=34), you want it treated as a separate URL and you leave it out of this setting
  • when the URL parameter is about something else an application needs (eg ?sessionID=798787), you do not want each variant treated as a new page, and so you put it in the box for Exclude URL Query Parameters
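
To make the rule of thumb concrete, here is a minimal Python sketch of what the exclusion does to page URLs. It is not GA's actual implementation: the function name strip_excluded_params and the sample URLs are made up for illustration, and it matches parameter names exactly (check how GA treats case before relying on that).

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_excluded_params(url, excluded):
    """Return url with the named query parameters removed (exact name match)."""
    excluded = set(excluded)
    parts = urlsplit(url)
    # Keep every parameter whose name is not on the exclusion list.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in excluded
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical page URLs: pageId selects content, sessionID does not.
pages = [
    "/product?pageId=34&sessionID=798787",
    "/product?pageId=34&sessionID=112233",
    "/product?pageId=35&sessionID=798787",
]

# With sessionID excluded, three raw URLs collapse to two real pages.
distinct = {strip_excluded_params(p, ["sessionID"]) for p in pages}
# distinct == {"/product?pageId=34", "/product?pageId=35"}
```

Without the exclusion, every session would produce its own "page" in the report, even though only pageId changes what the visitor sees.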

Getting this wrong can really mess with your reporting:

  • you can end up with one page being treated as thousands of separate pages, with the pageviews and other metrics split amongst them. This can be difficult, sometimes impossible, to piece back together in your reporting.
  • It can also run you up against the “(other)” problem: when there are too many different values to be presented on a report – say too many different URLs on the page report – GA will bundle some of them up and call them “(other)”, which is not informative. See https://support.google.com/analytics/answer/1009671?hl=en

One of the problems with this is that it’s usually developers who decide what URL parameters their page will use, and when they add or change parameters, they don’t naturally see the need to inform whoever is maintaining GA.

On larger sites, with many teams working on them, keeping up with new URL parameters can be a big problem.

How to manage this:

  • Process: Try to get the development processes set up so that you (the GA person) know about the query string parameters before they go live – ideally a list from the developers saying what each one does. Failing that, get access to the development site so you can figure this out yourself.
  • Monitoring: keep an eye on query strings going into your data – check regularly for new ones that should not be there. One way to check is to go into the Behaviour|Site Content|All Pages report and filter for pages with a query string (contains ?)
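
If you export the page list from that report, the check for new parameters can be scripted. The following is a rough sketch, assuming you already have the page paths as a list of strings and a set of parameter names you know about; the function names and sample data are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_param_names(pages):
    """Count how often each query parameter name appears across the pages."""
    counts = Counter()
    for page in pages:
        query = urlsplit(page).query
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts


def unexpected_params(pages, known):
    """Return parameter names seen in the data that are not in the known set."""
    return sorted(set(count_param_names(pages)) - set(known))


# Hypothetical export from the All Pages report.
pages = [
    "/search?q=shoes&sessionID=1",
    "/product?pageId=34",
    "/product?pageId=35&trk=abc",
]
known = {"q", "pageId"}

new_names = unexpected_params(pages, known)
# new_names == ["sessionID", "trk"]
```

Anything in new_names is a parameter to ask the developers about: either it belongs in the exclusion list, or it belongs in your list of known content parameters.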

Speaking of checking for query string parameters: look out for personally identifiable information in query strings.
Capturing personal information into Google Analytics is against the terms of service – it can get your GA shut down (see https://support.google.com/analytics/answer/2795983?hl=en).

So, while I’m checking for query string parameters, I run a specific check for any email addresses that might be making it in there, by looking at the All Pages report with an advanced filter of page matches this regex \?.*@.+\..+
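
The same regex can be run over an exported page list. This is a small sketch, assuming the pages are available as plain strings; the sample paths are invented. It uses a partial match (search), which is how I understand the "matches regex" filter to behave.

```python
import re

# The regex from the advanced filter: a "?" followed, somewhere later,
# by something shaped like name@domain.tld.
EMAIL_IN_QUERY = re.compile(r"\?.*@.+\..+")


def pages_with_possible_email(pages):
    """Return the pages whose query string looks like it contains an email."""
    return [page for page in pages if EMAIL_IN_QUERY.search(page)]


# Hypothetical page paths.
pages = [
    "/signup?email=jane@example.com",
    "/product?pageId=34",
    "/signup?email=jane%40example.com",
]

suspects = pages_with_possible_email(pages)
# suspects == ["/signup?email=jane@example.com"]
```

Note that the third path slips through: when the @ is URL-encoded as %40 the regex does not see it, so it is worth running a second check for %40 as well.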

If anything shows up you need to deal with it urgently:

  • Add the parameter name to URL exclusions – for every view, including your unfiltered view (you do have an unfiltered view?). Note that this is not a complete answer: it keeps the information out of your views, but it does not stop the information being sent to GA, and stopping it at the source is what the terms of service require.
  • Ask the developers to stop passing personal information in query strings – there are lots of good reasons not to. This information typically ends up in server logs and is passed in the referrer header when requesting other resources the page uses. So, having it in the URL potentially exposes it to a lot of systems.
  • If you are using GTM or another good tag manager, you can often stop these parameters reaching GA by overriding the location value that the tag sends with a variable that has the offending parameters stripped out.
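
The clean-up logic for that last option might look like the sketch below. It is written in Python purely to illustrate the idea; in a tag manager you would write the equivalent in whatever the tool supports. The function name drop_email_params and the deliberately loose email pattern are my own assumptions, not anything GA or GTM provides.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Loose "looks like an email" pattern, for illustration only.
EMAIL_LIKE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def drop_email_params(url):
    """Return url without any query parameter whose value looks like an email."""
    parts = urlsplit(url)
    # parse_qsl decodes percent-encoding, so jane%40example.com is caught too.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not EMAIL_LIKE.search(value)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


clean = drop_email_params(
    "https://example.com/signup?email=jane%40example.com&plan=pro"
)
# clean == "https://example.com/signup?plan=pro"
```

Dropping by value rather than by name means a newly added parameter carrying an email is caught even before anyone tells you its name, though it will not catch other kinds of personal information.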

2. Don’t actively maintain referral exclusions

Some bits of GA setup you can set and forget. We used to think of referral exclusion as one of these – you just needed to add the domains of your own sites (where there was cross-domain tracking) and your payment gateway, and that was it, unless something changed.
Lately we’ve seen more and more payments that are referred through card verification sites, which seem to come with a wide variety of URLs.

Unless all those sites are in your referrer exclusion list, they appear as referrals on your confirmation page.
Worse, if your goals and ecommerce tracking are fired on that confirmation page (which is usual), your conversions can be attributed to those referrals.
The last thing you want is a lot of transactions attributed to a referral from a card verification page – ugh.

Keep an eye on your referrers, especially any coming in to your confirmation pages!
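
One way to keep that eye out is to script the comparison between the referrers landing on your confirmation page and your exclusion list. This is a sketch under assumptions: the rows are (landing page, referrer domain) pairs from an export, all names and domains are invented, and an exclusion entry is taken to cover the domain and its subdomains, which may not be exactly how GA matches.

```python
def unexcluded_referrers(rows, confirmation_path, exclusions):
    """Return referrer domains landing on the confirmation page
    that are not covered by the exclusion list."""

    def is_excluded(domain):
        # Assumption: an entry covers the domain itself and any subdomain.
        return any(domain == e or domain.endswith("." + e) for e in exclusions)

    return sorted(
        {
            referrer
            for page, referrer in rows
            if page.startswith(confirmation_path) and not is_excluded(referrer)
        }
    )


# Hypothetical export: (landing page, referrer domain).
rows = [
    ("/checkout/confirmation", "secure.cardcheck.example"),
    ("/checkout/confirmation", "pay.gateway.example"),
    ("/blog/post", "news.example"),
]

missing = unexcluded_referrers(rows, "/checkout/confirmation", ["gateway.example"])
# missing == ["secure.cardcheck.example"]
```

Anything in missing is a candidate for the exclusion list, once you have confirmed it really is part of the payment flow.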

3. Run on multiple domains without cross-domain tracking

You can give yourself a real reporting and attribution headache by running your tag across multiple domain names without correctly configuring cross-domain tracking.
GA runs on first party cookies. First party means that the domain they are tied to is the domain of your site, and browsers don’t share cookies from one domain to another.

If you don’t correctly set up cross domain tracking, users simply moving from one part of the site to another (on a different domain) start a new session and a new user in GA. This messes up your session and user counts, but worse, it means you often can’t track conversions back to their source.

4. Use tagged links internally

Google provides a tool to tag links with a query string that enables you to attribute traffic to a source. There’s a temptation to use tagged links within the site to see which links are being most clicked. Do this if you want to make a real mess of your data.

Arriving at a page with a tagged URL (ie from a tagged link) causes GA to start a new session, which is attributed to the source named in the tag rather than to wherever the visitor really came from. This can confuse your data beyond repair, and make it impossible to track conversions back to their actual source.

If you want to see what links on the site are most used, use event tracking.
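
If you suspect tagged links have already crept in, a quick audit of a page's links will find them. The sketch below assumes you have already collected the href values from a page (by whatever means) and treats any parameter starting with utm_ as a campaign tag; the function name and URLs are hypothetical, and a site running on several domains would need a list of hosts rather than a single one.

```python
from urllib.parse import urljoin, urlsplit, parse_qsl


def tagged_internal_links(page_url, hrefs):
    """Return the hrefs that point at the page's own host and carry utm_ tags."""
    site_host = urlsplit(page_url).hostname
    flagged = []
    for href in hrefs:
        # Resolve relative links against the page they appear on.
        parts = urlsplit(urljoin(page_url, href))
        has_utm = any(
            name.lower().startswith("utm_")
            for name, _value in parse_qsl(parts.query, keep_blank_values=True)
        )
        if parts.hostname == site_host and has_utm:
            flagged.append(href)
    return flagged


# Hypothetical links found on the home page.
hrefs = [
    "/sale?utm_source=homepage&utm_medium=banner",
    "/about",
    "https://partner.example.org/?utm_source=example",
]

bad_links = tagged_internal_links("https://www.example.com/home", hrefs)
# bad_links == ["/sale?utm_source=homepage&utm_medium=banner"]
```

The tagged link to the partner site is left alone: tagging outbound links affects the partner's data, not yours.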

5. Use referral exclusion for referrer spam

Most of us get some referrer spam – spurious sessions in GA referred by dodgy sites only in the hope that you’ll see their names among your referrers and check them out. Pretty desperate stuff, and you don’t want to count it as real traffic.

Since you don’t want to count referrals from these sites, it makes sense to put them in your referrer exclusion list, doesn’t it? Well, no. Your referrer exclusion list is for sites where you want to count the traffic, but not as a new session (eg your payment gateway). Referrer spam, you want to filter entirely. Here are some good references on how.

6. Don’t bother with tag management and a data layer

We see a lot of sites that are still running the old GA tag – unable to take advantage of the neat features of universal analytics and the analytics.js library. Why? It’s usually because making the change is a big deal. They may have an ecommerce integration, they may have a lot of event tracking.
If this were well set up in a tag management system, with communication between the site and the tags mediated by a data layer, making such an upgrade would be so much easier. Adding a new analytics system, or new event tracking, or integrating split testing, would be so much easier.

7. Don’t look out for bots

Developers, hosting companies, even marketers seem to have a growing fondness for bots. Bots scooping up the site, running the tags, and screwing up your data.

GA has a setting for blocking known bots, but Google doesn’t know anything about that special hand-coded custom bot that your hosts use to monitor uptime.

Keep an eye out and try to filter them before the data goes south. Remember data processing happens once only in GA, so if it’s not filtered by then, it’s in your data for good.

I speak from experience in saying: having to use a segment to get rid of that bot that ran your funnel 500 times a day for 3 months, and explaining the segment to your stakeholders, is a serious pain!
