Technical SEO is a hugely important factor in the online success of your business. But how can you be sure that your web developer implements best-practice SEO in your site build? Here are four quick checks you can carry out to make sure your site is SEO compliant and protect your business online.

This checklist is adapted from our FREE eBook – Technical SEO Best Practices: A Marketers’ Guide.

Marketing managers, we know how it is. It’s a difficult job to juggle KPIs, budgets, stakeholder relationships and a million other considerations.

So when you’re managing a new website build, you need to count on your developer to get everything right without constant intervention.

But while many web developers pride themselves on putting together clean, attractive designs, they don’t always grasp how to implement SEO properly.

And getting SEO right is really important. In fact, it can make or break your business online. We’ve seen SEO slip-ups made by well-regarded development agencies that have crippled clients’ search engine rankings.

While complicated mistakes might require SEO specialism to identify, there are a few checks you can carry out yourself to root out common technical SEO mistakes – and get your web developers to fix them, fast.

Who can use this SEO checklist?

Put simply, if you’re in charge of managing a new website build, this list could help you.

You might be:

  • Thinking about bringing in developers to redevelop your website, and you want to know what problems need fixing (and whether they can fix them).
  • Engaged with a developer, and you want to make sure they’re working to SEO best practice.
  • Worried that your brand new site is suffering from technical SEO failures.

[Editor’s note: for even more do-it-yourself SEO checks, download your FREE eBook Technical SEO Best Practices: A Marketers’ Guide.]

Check 1: Duplicate Content

'Duplicate Original' stamp

Source: woodleywonderworks at Flickr.

Why it’s important

You should ensure your on-site content is as unique as possible. Google says:

“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar…”

…adding:

“Google tries hard to index and show pages with distinct information.”

The implication here is that Google tries not to index pages without distinct information.

If your content is duplicated within your domain, or from another website, the likelihood is that this is not malicious – in other words, it hasn’t been done deliberately in an effort to manipulate search rankings.

However, if Google does decide your content duplication is malicious, it may decide not to show that particular page in search results, or worse, put a penalty on your site.

The best-practice stance is ‘better safe than sorry’. Make sure all substantial blocks of content (i.e. descriptive copy that doesn’t feature in menus or other unavoidable repetitions) are unique across your site and across the web in general.

Why web developers get it wrong

Developers often don’t grasp the potential problems duplicate content can cause. Sometimes, they’ll take copy from a directory (if it’s a business description) or a competitor’s site (if it’s a product description) and use it on your site unaltered. In some cases, this copy might have been meant as placeholder copy, but through communication errors, it ends up still being in place when your site goes live.

Alternatively, there may already be duplicate content issues on your site, which your web developer maintains when switching to the rebuild.

How to check for duplicate content

This one’s easy – just Google it.

If your site is already live, this technique will determine whether your content is duplicated within your site or on other websites. If your site isn’t yet live, it should only bring up other websites.

Choose a few pieces of content to check across several pages of your website. Take a sentence from each of these blocks of copy and paste it into Google, surrounded by quotation marks (“), as follows:

This content is not duplicated.

If all goes well, you should see a message like the one above, showing that there are no other instances of this sentence found online.

If the content is duplicated, you’ll see other web pages featuring your copy in the search results.
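If you have a lot of sentences to check, the search can be prepared in bulk. The following is a minimal Python sketch, assuming Google's standard search address with a `q` parameter; the function name and the sample sentence are our own inventions for illustration.

```python
from urllib.parse import quote_plus


def exact_match_search_url(sentence: str) -> str:
    """Build a Google search URL that looks for the sentence as an exact phrase."""
    # Tidy up stray whitespace, then wrap the sentence in straight double
    # quotes so the search engine treats it as a single phrase.
    phrase = '"' + " ".join(sentence.split()) + '"'
    return "https://www.google.com/search?q=" + quote_plus(phrase)


# Hypothetical sentence taken from a product page.
print(exact_match_search_url("We bake vegan cupcakes fresh every morning."))
```

Open the printed address in your browser and read the results exactly as described above.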

Be aware that this test isn’t completely fool-proof. Google may be able to determine whether content is essentially duplicated even when a few words are changed or the order of the sentence is shuffled. This technique won’t account for this, so be sure to be thorough.

Need a more thorough way to check for duplicate content? There are several tools available that attempt to provide a better approximation of the way Google (presumably) looks for duplication.
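To give a feel for how near-duplicate detection can work, here is a rough Python sketch that compares two blocks of copy by the three-word sequences they share (a technique often called 'shingling'). This is only an illustration under our own assumptions. It is not how Google works, and the function names and sample sentences are invented.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ('shingles') in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical copy: the second sentence changes a single word.
ours = "Our vegan cupcakes are baked fresh every morning using organic flour."
theirs = "Our vegan cupcakes are baked fresh each morning using organic flour."
print(similarity(ours, theirs))
```

An exact-phrase search for the first sentence would not find the second, yet the two still score 0.5 here, which is a strong hint that one was adapted from the other.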

Check 2: Heading Tags

Newspaper headlines

Source: m01229 at Flickr.

Why they’re important

Heading tags are HTML elements that help search engines make sense of the content on a page. They look like this: <h1>Heading 1 Tag Text Goes Here</h1>, <h2>Heading 2 Tag Text Goes Here</h2>. The numbering carries on through <h3>, <h4>…

Heading tags remain an important ranking factor for Google. So the information you put within your heading tags needs to give Google the best possible clues as to the content of the page. The most important ‘clues’ need to go in the <h1> tag, lesser information needs to go in the <h2> tags, and so on.

Take for example a page with the following heading tags:

  • ‘<h1>What’s it all about?</h1>’
  • ‘<h2>Here’s some more stuff to look at</h2>’

We, as people with brains, cannot understand the content of the page using these as clues. Neither can search engines.

If, however, we changed the heading tags to read:

  • ‘<h1>About our vegan cupcakes</h1>’
  • ‘<h2>More info on vegan baking</h2>’

It makes a lot more sense. The latter version has the kind of clarity you should be aiming for in your own heading tags if you want to ensure your site is compliant with best-practice SEO.

Why web developers get it wrong

Probably the most common reason for the misuse of heading tags is that they’re seen as a stylistic tool. Design-focussed web developers often use ‘<h1>’ and ‘<h2>’ as shorthand for ‘big bold heading’, and ‘slightly smaller subheading’.

Now, in the way many webpages are laid out, this will make sense. But design should not be the driving force behind heading tag choice – their primary function is to give search engines clues as to the content and context of a page.

This design-led approach can lead to heading tags being used simply for emphasis (‘Buy one, get one free!’, ‘We’re not like them, we’re the BEST’), or worse, used in menus and other elements that are repeated across multiple pages.

Let us stress there is nothing wrong with using different typefaces, larger text or font effects for emphasis. This kind of approach leads to a clean, attractive, user-friendly design. However, these emphases should not be achieved with heading tags – rather, developers should use CSS styling to achieve their aims.

If the emphasised text is contextually important, then heading tags should be used.

How to check your heading tags

Go to a page on your website (it’s a good idea to start with the homepage), and view the page’s source in your browser.

If you’re using Chrome or Firefox, the keyboard shortcut is Ctrl + U. On a Mac, it’s Cmd + Option + U in Chrome and Cmd + U in Firefox.

Now press Ctrl/Cmd + F to start a ‘find’ command, and type ‘<h1’ in the search bar (omitting the quotation marks).

Heading tag example

An example from our homepage…

This will direct you to the first <h1> tag on the page. Check the text that’s contained between the opening and closing <h1> tags. Is it relevant? Is it descriptive? Is it an acceptable length for a title?

Now press Enter to move to the next <h1> tag. Find one? Bad news – the consensus is that it’s best practice to have just one <h1> tag on a page.

Repeat the process with ‘<h2’ to find <h2> tags on the page. Use the same criteria to judge the suitability of these tags. There can be multiple <h2> tags on a page.

Repeat the process on at least a couple of other pages. Look out for duplicate heading tags used on every page.
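If you'd rather not scan the source by eye, the same check can be roughly automated. The sketch below uses Python's built-in HTML parser to pull out the <h1> and <h2> text and flag pages with no <h1> or more than one. The class name, function name and sample page are our own, and it can't judge whether a heading is relevant or descriptive. That part still needs a human.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h1> and <h2> element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = {"h1": [], "h2": []}
        self._current = None  # heading tag we are currently inside, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.headings:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join the collected text and tidy up the whitespace.
            text = " ".join("".join(self._buffer).split())
            self.headings[tag].append(text)
            self._current = None


def audit_headings(html: str) -> list:
    """Return plain-English warnings about the page's heading tags."""
    collector = HeadingCollector()
    collector.feed(html)
    h1s = collector.headings["h1"]
    warnings = []
    if not h1s:
        warnings.append("No <h1> tag found.")
    elif len(h1s) > 1:
        warnings.append(f"{len(h1s)} <h1> tags found; best practice is one.")
    for tag, texts in collector.headings.items():
        for text in texts:
            if not text:
                warnings.append(f"Empty <{tag}> tag found.")
    return warnings


# Hypothetical page source with a heading misused for emphasis.
sample = """
<h1>About our vegan cupcakes</h1>
<h2>More info on vegan baking</h2>
<h1>Buy one, get one free!</h1>
"""
print(audit_headings(sample))
```

Paste a page's source in place of the sample to try it on your own site. An empty list means none of these basic problems were found.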

Check 3: Canonicalization

Why it’s important

In layman’s terms, canonicalization means ‘pointing search engines and users to the correct version of your site’.

If you’re wondering why there could be two versions of your website, remember that the all-singing, all-dancing, user-friendly web is still built on technology pioneered 30 years ago. Small differences between two URLs mean that they’re read as being two different URLs entirely.

Lots of things can cause issues with canonicalization (such as inconsistencies in capitalisation), but the issue with the most widespread impact is the use of ‘www’ at the start of your website’s URLs.

As part of your initial talks with your developer, you should have been asked whether you want your website to include a ‘www’ in its URLs or not.

It doesn’t matter whether you choose to include this or not (it’s a holdover from the early days of the World Wide Web that’s become standard, but doesn’t have any real bearing on your site). However, it’s important that you make a decision, and that you stick to it.

Your web developer should then implement canonicalization so users and robots are redirected to the ‘correct’ version of your site if they try to visit the ‘incorrect’ version.
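In practice your developer would normally set this up as a permanent redirect in the web server's configuration, not in application code. Purely to illustrate the rule such a redirect applies, here is a small Python sketch. The function name is ours, and it assumes your site lives on a single hostname with no other subdomains.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, use_www: bool = True) -> str:
    """Return the preferred version of a URL, with or without 'www.'."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lower-cased
    bare = host[4:] if host.startswith("www.") else host
    wanted = "www." + bare if use_www else bare
    if parts.port:
        wanted += f":{parts.port}"
    # Keep the rest of the address as it was; an empty path becomes "/".
    return urlunsplit(
        (parts.scheme, wanted, parts.path or "/", parts.query, parts.fragment)
    )


print(canonical_url("http://example.com/cupcakes"))
print(canonical_url("http://WWW.Example.com", use_www=False))
```

Whichever version a visitor asks for, the result is always the one address you've chosen as 'correct'.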

To see how this works in practice, try visiting the following URLs, and notice how you’re redirected to the ‘opposite’ version:

http://bbc.co.uk/

http://www.twitter.com/

How can this cause problems if it’s not properly implemented? Well, links pass PageRank strength to your page. If somebody links to the incorrect version of your website, or your internal links are inconsistent, the PageRank will be passed to the wrong place.

You can’t avoid this happening with external links, but canonicalization ensures that when it does happen, the PageRank strength is redirected to the right place – as well as ensuring that search engines know which version of a page to display, and don’t mistakenly believe that you’re maliciously duplicating content.

Why web developers get it wrong

Most web developers have a fairly good grasp of canonicalization, but many fail to implement it properly. As a rough estimate, around 60 per cent of the sites we audit have canonicalization improperly implemented, whilst around 10 per cent fail to implement it at all.

We’re not sure why this is. Again, it might be because web developers sometimes don’t understand the importance of canonicalization for SEO, so fail to be thorough in their implementation.

When testing for canonicalization, be sure to be thorough. Often, we find no issues with the main sections of a site, but see problems with auxiliary sections like blogs.

How to check for canonicalization

This test looks for canonicalization of the ‘www’/‘non-www’ version of your site. However, be aware that many more things can cause canonicalization issues. A full technical SEO audit is required to root out and fix these issues.

That said, it’s very easy to do a quick/non-exhaustive check for this. As with the BBC and Twitter examples linked above, you simply have to attempt to visit the ‘incorrect’ version of your site, and check if you’re redirected to the correct version.

For example, if you’ve decided to include a ‘www’, you should type something like the following into the address bar:

http://example.com/

If you don’t want the ‘www’, type something like this:

http://www.example.com/

Now check the address bar. If you’ve been redirected to the correct URL, canonicalization is implemented for this page. If not, then you need to speak to your web developer to get it sorted.

Repeat this process with a number of pages across your site, removing or including the ‘www’ depending on the version you’ve deemed canon.
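Checking many pages by hand gets tedious, so here is a hedged Python sketch of the same test. It requests a URL, follows any redirects, and reports whether the final address uses the hostname you expect. The function names are our own and the example.com addresses are placeholders. It won't spot redirects performed by JavaScript in the browser, and a page that returns an error (such as a 404) will raise an exception, not return a result.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def lands_on_host(final_url: str, expected_host: str) -> bool:
    """True if the URL we ended up on uses the expected hostname."""
    return (urlsplit(final_url).hostname or "") == expected_host.lower()


def check_redirect(start_url: str, expected_host: str, timeout: float = 10.0) -> bool:
    """Request start_url, follow redirects, and check the final hostname."""
    # urlopen follows HTTP redirects automatically; geturl() gives the
    # address of the page that was finally retrieved.
    with urlopen(start_url, timeout=timeout) as response:
        return lands_on_host(response.geturl(), expected_host)
```

For a site that has chosen the ‘www’ version, calling check_redirect("http://example.com/", "www.example.com") with your own domain in place of example.com should return True. Run it against pages from each section of the site, including auxiliary areas like the blog.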

Check 4: Test Server Indexing

Boxes marked 'test'.

Source: DaveBleasdale at Flickr.

Why it’s important

Your test server is a test server for a reason. It’s a place where experiments can be carried out on your new site before they’re carried over to your existing domain, safe from the eyes of users and, more importantly, search engines.

This is important because if search engines index your test server, you may face problems later on.

Here’s a scenario: your site may have been live in some form on your test server for months before it moves to its correct domain, all the while having its content read by search engines.

When these search engines see your new site pop up on its correct domain, they may see it as duplicating the content of an entire site it’s previously indexed. Bad news.

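To prevent this, your test server should have a properly formatted robots.txt file that tells search engines not to crawl any of the pages therein. Strictly speaking, robots.txt blocks crawling, not indexing, so password-protecting the test server is a more reliable safeguard if your developer can arrange it.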

And then, when your site goes live, it’s equally important that this robots.txt file is changed so search engines can see your site. Otherwise, you won’t appear in any search results!
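As an illustration, the sketch below shows what the two versions of the file might look like, and uses Python's built-in robots.txt parser to confirm how a crawler would read them. The file contents are a common minimal pattern, not taken from any particular site, and the function name and example addresses are our own.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents for a test server: block every crawler from every path.
TEST_SERVER_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Assumed contents for the live site: an empty Disallow blocks nothing.
LIVE_SITE_ROBOTS = """\
User-agent: *
Disallow:
"""


def allows_crawling(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Report whether the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(allows_crawling(TEST_SERVER_ROBOTS, "http://uat.example.com/index.html"))
print(allows_crawling(LIVE_SITE_ROBOTS, "http://www.example.com/index.html"))
```

The first check should come back False (blocked) and the second True (allowed). If your live site gives False, the test server's file has probably been carried over by mistake.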

Why web developers get it wrong

Again, there’s no particular reason for this happening apart from a misunderstanding of possible implications. But issues with test server indexing, or problems caused by improperly implemented robots.txt files can be very serious, so if you do find problems, make sure your web developer fixes things as soon as possible. (If possible!)

How to check whether your test server is indexed

As with our duplicate content check, Google provides all the tools you need to test whether your test server is indexed.

Visit your test server and grab the URL from the address bar. (This will usually be a subdomain of your ‘real’ domain, e.g. ‘http://uat.example.com/’.)

Ignore anything apart from the URL for the domain itself. For example, you would want to delete ‘/index.html’ from the following: ‘http://uat.example.com/index.html’

Now, go to Google and type ‘site:’, then paste the test server domain URL after it, with no spaces. This will tell Google to search only within that site’s domain, and return any pages it has indexed within that site.

Site search example

Like this…
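The trimming step described above can also be sketched in a few lines of Python. This version assumes the bare domain, without the ‘http://’ prefix or any page path, is what you want after ‘site:’; the function name is our own.

```python
from urllib.parse import urlsplit


def site_query(url: str) -> str:
    """Turn a page URL into a 'site:' search query for its domain."""
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"No domain found in {url!r}; include http:// or https://")
    return "site:" + host


print(site_query("http://uat.example.com/index.html"))
```

This prints site:uat.example.com, ready to paste into Google.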

If anything shows up, it means your test server has been indexed, and you should contact your web developers to get it sorted out as soon as possible.

However, if your test server is very new, it may not have been in place long enough for Google’s spiders to pick it up. In any case, you should speak to your web developers to ensure they have formatted the robots.txt file in a way that blocks the test server domain.

Want more advice? Click the banner below to download your free technical SEO eBook now!

About the author:

ClickThrough is a digital marketing agency, providing search engine optimisation, pay per click management, conversion optimisation, web development and content marketing services.