Our analytics and conversion manager, Richard Chapman, offers his advice on how to handle non-human referral traffic and how to stop it from skewing your website data.
Have you been seeing some unusual amounts of referral traffic in your Google Analytics account?
This may be a case of referral spam. It occurs when your site receives non-human referral traffic from spam programs or spam bots; this make-believe traffic is recorded in your GA account, skewing your data and causing you reporting issues.
At ClickThrough, we have seen referral spam affect several of the accounts we work on, with spam commonly coming from Russia, Kazakhstan and even further afield. This bot traffic can sometimes be spotted easily, because the referring domain is an obvious spam site. However, it isn’t always so straightforward. Look out for referral sources with a 100% bounce rate, or no bounce rate at all: the likelihood is that they are spam sites. If you’re still not sure, simply visit the site and you’ll soon see whether or not it is bringing you legitimate referral traffic.
TIP: Make sure you’ve got some heavy-duty anti-malware installed before you click around these sites!
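If you export your referral report, the bounce-rate check described above can be roughed out in a few lines of Python. This is only a sketch: the rows, the domain names and the `suspicious_referrers` helper are made up for illustration, and a flagged source still deserves a manual look before you treat it as spam.

```python
# Hypothetical export rows: (referral source, sessions, bounces).
rows = [
    ("spamsite.example", 40, 40),
    ("partner.example", 120, 54),
    ("ghostreferrer.example", 25, 0),
]

def suspicious_referrers(rows):
    """Return sources whose bounce rate is exactly 100% or exactly 0%."""
    flagged = []
    for source, sessions, bounces in rows:
        if sessions == 0:
            continue  # no sessions, so there is no bounce rate to judge
        bounce_rate = bounces / sessions
        if bounce_rate in (0.0, 1.0):
            flagged.append(source)
    return flagged

print(suspicious_referrers(rows))
```

Low-traffic genuine referrers can also show 0% or 100%, so treat the output as a shortlist rather than a verdict.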
Now you’ve identified the ghost spam, what’s next?
Do not use the referral exclusion list
First things first: a lot has been written about excluding referral spam, and not all of it has been correct.
DO NOT use the referral exclusion list in Google Analytics. Why, I hear you ask? Google states that the Referral Exclusion List is used “to exclude traffic from a third-party shopping cart to prevent customers from being counted in a new session and as a referral when they return to your order confirmation page after checking out on the third-party site”.
Granted, this is rather confusing. You may assume that this means Google will exclude the visit from GA data. However, what actually happens is that Google Analytics tries to connect the return visit to the previous source and medium, stopping it from being recognised as referral traffic. Because ghost spam has no previous source/medium, the visit is set as direct traffic instead. In essence, you are moving one lot of bad traffic from one source/medium to another, so it’s still skewing your data in Google Analytics.
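To make that behaviour concrete, here is a toy model of it in Python. It is a deliberate simplification of what is described above, not Google Analytics’ actual attribution logic; the `attributed_source` function, the `DIRECT` label and the domains are all assumptions invented for the example.

```python
# Toy model only: this is NOT Google Analytics' real attribution logic.
DIRECT = "(direct) / (none)"

def attributed_source(referrer, previous_source, exclusion_list):
    """Return the source/medium a session is credited to in this toy model."""
    if referrer in exclusion_list:
        # Excluded referrers are not dropped; credit falls back to the
        # previous source, or to direct traffic when there is none.
        return previous_source or DIRECT
    return f"{referrer} / referral"

# Hypothetical domains on the exclusion list.
exclusions = {"spamsite.example", "checkout.example"}

# Ghost spam has no earlier visit, so it resurfaces as direct traffic.
print(attributed_source("spamsite.example", None, exclusions))

# A returning shopper keeps the source of their earlier visit.
print(attributed_source("checkout.example", "google / organic", exclusions))
```

Either way the session is still counted, which is why the exclusion list cannot clean spam out of your reports.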
Get rid of ghost spam, the right way
So now we’ve established that the referral exclusion list will not help when it comes to this pesky ghost spam, how do you go about eliminating it from your GA data?
This “traffic” must be filtered out of each view, rather than excluded using the referral exclusion option at property level.
To filter it out, just do the following:
- Create a new filter at view level and call it “Referrer Spam”
- Set the type to custom and select “Exclude”
- Set the field to campaign source
- Enter the referral spam domain into the filter pattern field
- Click save
This will now filter out the spam traffic from those sources. If you need to add more sources to the Filter Pattern field, place a pipe (|) between the domains and remember to escape each dot with a backslash (\.).
The result is a regular expression matching the spam websites that need removing from Analytics (keep a copy of this handy on your clipboard or in a text file for later). It’s always best to get a web developer to check the regular expression.
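If you keep your spam domains in a list, a short Python sketch like the one below can build the pattern for you, escaping the dots and joining the domains with pipes. The two domains are placeholders, not real spam referrers, and `build_filter_pattern` is a name invented for this example.

```python
import re

# Placeholder domains; substitute the spam referrers you actually see.
spam_domains = ["spamsite.example", "ghostreferrer.example"]

def build_filter_pattern(domains):
    """Escape each domain (so dots are literal) and join them with pipes."""
    return "|".join(re.escape(domain) for domain in domains)

pattern = build_filter_pattern(spam_domains)
print(pattern)

# Quick sanity check: the escaped dot must not match any other character.
print(bool(re.search(pattern, "spamsite.example")))   # should match
print(bool(re.search(pattern, "spamsiteXexample")))   # should not match
```

Note that `re.escape` also escapes other special characters such as hyphens, so it is still worth having the final pattern checked before pasting it into the Filter Pattern field.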
It’s also a good idea to make sure the “filter known bots and spiders” option is ticked in the view options.
Tip: Filters do not work retrospectively, so this is only effective from the time of creation and can take 24 hours to kick in. To remove the unwanted traffic from existing reports, you can use a custom segment.
Creating a custom segment
Creating a custom segment to remove spam data from reports can be a bit of a minefield, so here is our step-by-step guide:
- Simply open your reporting view in GA, click “Add Segment”, then select “New Segment” (the big red CTA). Name this new segment “No spam” or something similar and then select Advanced Conditions in the left-hand navigation.
- In the filter options select “sessions” and “exclude”
- Select “Source” and “matches regex” in the two drop-downs
- Paste the regular expression you created earlier (see, I said keep it handy for later) into the box.
Avoiding classic Blue Peter one liners (here’s one I made earlier), it should look like this:
Then simply save and apply the segment. This segment will now remove your ghost spam from your reports leaving you with clean data.
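If you want to see what the segment is doing, or apply the same clean-up to exported data, the exclusion can be mimicked in Python. This is a sketch under assumptions: the sessions and the pattern are invented, and `search()` is used on the assumption that “matches regex” matches anywhere in the source rather than requiring a full match.

```python
import re

# Hypothetical pattern and sessions, for illustration only.
pattern = re.compile(r"spamsite\.example|ghostreferrer\.example")

sessions = [
    {"source": "google", "pageviews": 3},
    {"source": "spamsite.example", "pageviews": 1},
    {"source": "www.ghostreferrer.example", "pageviews": 1},
    {"source": "partner.example", "pageviews": 5},
]

def exclude_spam(sessions, pattern):
    """Keep only the sessions whose source does not match the spam pattern."""
    # search() looks for the pattern anywhere in the source string.
    return [s for s in sessions if not pattern.search(s["source"])]

clean = exclude_spam(sessions, pattern)
print([s["source"] for s in clean])
```

Because the match is partial, subdomains such as the `www.` variant above are caught without needing their own entry in the pattern.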
Checking your referral traffic regularly can help you to identify and eliminate any ghost spam that may be affecting your reports. The likelihood is that where you’ve excluded one spam site, hundreds more will spring up in its place, so the necessary data cleansing in your Google Analytics account won’t stop any time soon. Let’s hope Google have a plan of action in the works…
Struggling to exclude referral spam in your GA account? Our Google Analytics expert can offer support and training for you and your team. Take a closer look at our Google Analytics Consultancy service for more information.