Do you have an XML sitemap? Many webmasters aren’t aware of what XML sitemaps are or why they are needed. Tom Williams is here to provide more information.

What is an XML Sitemap?

Put simply, an XML sitemap is a file in which webmasters are able to list all web pages on their site, informing search engines about the structure of their content.

Search engine crawlers will read this file to aid their understanding of your content, allowing for more efficient crawling of your website.

This file also allows webmasters to indicate additional information about each web page:

  • When the page was last updated
  • How often the page changes
  • How important it is in relation to other pages on the site

Why Do You Need an XML Sitemap?

Generally speaking, if a website’s pages are properly linked to, web crawlers can usually discover most of the site without an XML sitemap. Even so, an XML sitemap can further improve the crawling process.

There are certain situations in which using an XML sitemap is more important:

  • If you have a relatively new website which has very few external links to it – Web crawlers discover the web by following links from one page to another. As a result, crawlers will struggle to find your website if there are very few links pointing to it. A sitemap gives crawlers a direct list of your pages, so they can be found more easily.
  • If you have a really large website – Crawlers only fetch a limited number of pages from a site in a given period, so if your website has a large number of pages, recently updated pages may be overlooked. A sitemap helps to flag these pages.
  • If your website has a number of pages that are isolated or not well linked to – If some important pages aren’t naturally referenced or linked to from elsewhere on the site, a sitemap helps to make sure they aren’t missed.

How To Create an XML Sitemap 

Okay, now that you know what an XML sitemap is and why it’s used, let’s look at how to create one.

The first step is to decide which pages should be included within your sitemap. Getting a list of pages on your website can be achieved by using crawling software such as Screaming Frog.

The next step is to put these pages into the relevant format. Sitemaps are typically formatted as follows:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/example/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

The example above uses all three optional tags: ‘Last Modified’ (<lastmod>), ‘Change Frequency’ (<changefreq>) and ‘Priority’ (<priority>). Only <loc> is required for each URL.
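For anything beyond a handful of pages, the file is usually generated rather than typed by hand. The following is a minimal Python sketch of that idea, using only the standard library. The function name build_sitemap and the dictionary layout are assumptions made for illustration, not part of the sitemap protocol, and the xml_declaration argument needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as bytes) for a list of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq" and
    "priority" are optional, mirroring the optional tags.
    (Hypothetical helper, named for illustration only.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # ElementTree escapes characters such as "&" in URLs automatically.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


pages = [
    {"loc": "http://www.example.com/", "lastmod": "2005-01-01",
     "changefreq": "monthly", "priority": 0.8},
    {"loc": "http://www.example.com/example/", "changefreq": "weekly"},
]
sitemap_xml = build_sitemap(pages)
```

Writing sitemap_xml to a file named sitemap.xml (in binary mode) would give a file in the same format as the example above.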

If a website has more than one sitemap, a sitemap index should be used:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml.gz</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml.gz</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>

It is recommended that the sitemap file is placed at the root directory of your web server. If your server is at example.com, then your sitemap should be at http://example.com/sitemap.xml.

A sitemap must be UTF-8 encoded, and each sitemap file can’t include more than 50,000 URLs or be larger than 50MB (52,428,800 bytes) uncompressed. Older versions of the protocol set this limit at 10MB.
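For a site that goes over the 50,000 URL limit, the URL list has to be split across several sitemap files, with a sitemap index pointing at each one. Here is a rough Python sketch of that split. The helper names, the 120,000-page site and the sitemapN.xml file naming are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (as bytes) listing each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# Hypothetical site with 120,000 pages, which needs three sitemap files.
urls = [f"http://www.example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(urls)

# The sitemapN.xml naming scheme is an assumption, not a requirement.
index_xml = build_sitemap_index(
    [f"http://www.example.com/sitemap{n}.xml" for n in range(1, len(chunks) + 1)]
)
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be submitted. This sketch checks the URL count only; the file size limit would need a separate check on the generated files.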

Sitemap creation can be done manually, but there are various third-party tools that can generate one for you – Google provides a list.

Submit Your Sitemap to Google & Monitor for Errors

Having now created an XML sitemap, you can submit it to Google, which lets Google crawl and process the sitemap much faster than if the crawler had to find the file on its own.

This can be done via the ‘Sitemaps’ report within Google’s Search Console.

This report also identifies any errors with your sitemap.
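It can also save time to catch obvious mistakes before submitting. The Python sketch below is one possible local pre-check, not a substitute for the Search Console report and not how Google validates sitemaps. The function name check_sitemap and the particular checks are assumptions chosen for illustration, and the date pattern is deliberately loose.

```python
import re
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}
# Loose check only: a YYYY-MM-DD date, optionally followed by a time part.
LASTMOD_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}(T.+)?$")


def check_sitemap(xml_text):
    """Return a list of human-readable problems found in a sitemap."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    urls = root.findall("sm:url", NS)
    if len(urls) > 50_000:
        problems.append("more than 50,000 URLs in one file")

    for i, url in enumerate(urls, start=1):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc.startswith(("http://", "https://")):
            problems.append(f"entry {i}: missing or non-absolute <loc>")

        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is not None and not LASTMOD_PATTERN.match(lastmod.strip()):
            problems.append(f"entry {i}: unexpected <lastmod> format")

        changefreq = url.findtext("sm:changefreq", namespaces=NS)
        if changefreq is not None and changefreq.strip() not in CHANGEFREQ_VALUES:
            problems.append(f"entry {i}: unknown <changefreq> value")

        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"entry {i}: <priority> must be 0.0 to 1.0")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee the sitemap is error-free.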

It is also best practice to insert the following line into a website’s robots.txt file:

Sitemap: http://example.com/sitemap_location.xml

This is just another way to ensure Google and other search engines are able to find your sitemap easily.
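To illustrate how a crawler might pick this line up, here is a short Python sketch using the standard library’s robots.txt parser. The robots.txt content is a made-up example (the /admin/ rule is purely hypothetical), and site_maps() needs Python 3.8 or later. Real search engines use their own parsers, so treat this as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, including the Sitemap line.
robots_txt = """\
User-agent: *
Disallow: /admin/
Sitemap: http://example.com/sitemap_location.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
```

Running the same check against your own robots.txt is a quick way to confirm the Sitemap line is being read as intended.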

Need help implementing an XML sitemap for your website? Contact our technical SEO team today.

About the author:

Tom joined ClickThrough in 2011. Since then, he has developed an expertise in the technical side of search engine optimisation. He’s Google Analytics-qualified, and in his current role as digital and technical Executive, carries out monthly SEO activities and provides technical consultancy for several of the company’s largest accounts.