What is a Sitemap? All You Need To Know

    What is a sitemap?

    A sitemap is a file where you can list the web pages of your site to tell search engines about the organization of your site content. Search engine web crawlers like Googlebot read this file to more intelligently crawl your site.

    A sitemap outlines the structure of your website’s pages, offering a hierarchical, organized overview that helps search engines discover, index, and understand all the content on your website.

    It’s like a map of your site’s content, making it easier for search engines to find and index every page on your site.

    Sitemaps are particularly helpful for larger websites with many pages, new websites with just a few external links to them, or websites with rich media content.

    While not mandatory, having a sitemap can be a significant boost to your site’s search engine optimization (SEO), ensuring that all the pages you want to be indexed are known to search engines.

    NOTE: Having a URL in the sitemap DOES NOT guarantee that Google will index it. It does, however, make URL discovery easier.

    The image below shows the XML sitemap for my website's blog posts, which should make the concept of a sitemap more concrete. We will discuss XML sitemaps in more detail later.

    For now, you can visit it here: https://ankitchauhanseo.com/post-sitemap1.xml

    XML sitemap of my website's blog posts

    The importance of having a sitemap

    Here are the key benefits of having a sitemap:

    Improved Search Engine Crawling

    Sitemaps act as a guide for search engine crawlers, helping them to find and understand the structure and content of your website. This is particularly important for websites that are large, have a complex structure, or are new with few external links.

    A sitemap helps search engines discover and index all your important pages.

    Facilitates Content Discovery

    For websites with deep content, numerous pages, or not yet well-linked content, a sitemap facilitates the discovery of these pages by search engines. This is especially crucial for ensuring that newly added content is found and indexed quickly.

    Support for Specialized Content

    Sitemaps allow the inclusion of metadata for specific types of content such as videos, images, and news articles. This metadata can help search engines understand and index your content more effectively, potentially enhancing its presentation in search results.

    Improved User Navigation

    HTML sitemaps are aimed at users rather than crawlers. They contribute to a better user experience by providing a clear, accessible overview of a site’s content, which helps users understand the site structure and find the information they need.

    Prioritization and Change Frequency

    Sitemaps allow you to provide additional information about your site to search engines, such as the relative importance of certain pages and how frequently they are updated. This helps search engines to prioritize their crawling according to the significance and freshness of the content.

     

    How do you decide if your website needs a sitemap?

    Here are some factors to consider when determining the necessity of a sitemap for your website:

    Website Size

    If your website is large and contains hundreds or thousands of pages, search engines are likely to miss some of your content without a sitemap. A sitemap helps search engines discover and index all your website’s pages.

    Website Structure

    Websites with complex structures or deep pages that are not easily discoverable by following links from the homepage or main pages may require a sitemap. A sitemap can help search engines navigate and understand the structure of your site more effectively.

    New or Recently Updated Websites

    If your website is new or you frequently add or update content, a sitemap can be crucial. It informs search engines about these changes, ensuring that the new or updated content is discovered and indexed promptly.

    Rich Media Content

    If your site includes a lot of rich media content (like images, videos, or infographics), especially if it’s not well-linked or embedded in text content, a sitemap can help improve the discovery and indexing of this content.

    Use of Non-HTML Content

    If your site contains a lot of content in formats like PDFs or other non-HTML formats, a sitemap can make this content more accessible to search engines.

    Lack of External Links

    For new websites or sites that do not have many external links pointing to them, a sitemap can be beneficial. It compensates for the lack of external discovery routes to the site’s content.

     

    What are the different types of sitemaps?

    | Type of Sitemap | Purpose | Content | Benefit |
    |---|---|---|---|
    | XML Sitemaps | Primarily used by search engines. | Lists the URLs of a site along with metadata about each URL. | Ensures that search engines are aware of all the pages on your site. |
    | HTML Sitemaps | Designed for human visitors to help them navigate the website. | An organized, usually hierarchical representation of the site’s pages and structure. | Enhances user experience by providing a clear overview of the site’s content and structure. |
    | Image Sitemaps | To help search engines discover all the images hosted on your site. | Lists the images on your site, allowing you to include additional details. | Can improve the visibility of your images in search engines. |
    | Video Sitemaps | To inform search engines about video content on your site. | Includes metadata about the videos hosted on your site. | Enhances the visibility and indexing of your video content in search results. |
    | News Sitemaps | Specifically for news websites. | Lists articles on your news website, providing details like the publication date, title, and keywords. | Helps to increase the visibility of your news articles in search engines. |
    | Mobile Sitemaps | For sites that have separate URLs for mobile content. | Lists the URLs of mobile-specific versions of your site. | Ensures that search engines can discover and serve your mobile-optimized content. |
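
    To make the image sitemap row more concrete, here is a minimal Python sketch that nests image entries under a page URL using Google's image sitemap namespace. The page and image URLs are made up for illustration, and the function name build_image_sitemap is my own, not part of any library.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the output uses <urlset> and <image:image>
# instead of auto-generated ns0/ns1 prefixes.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs shown on that page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page and image URLs, for illustration only.
xml_text = build_image_sitemap({
    "https://www.example.com/gallery": [
        "https://www.example.com/images/photo1.jpg",
        "https://www.example.com/images/photo2.jpg",
    ],
})
print(xml_text)
```

    Each image belongs to the page it appears on, so the image entries sit inside that page's <url> element rather than in a separate list.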

     

    How to build a sitemap?

    Inventory Your Website’s Content:

      • List all the URLs of your website that you want search engines to crawl and index. This includes all your main pages and the individual posts or pages that they link to.
      • Decide which pages are important and should be included. You might want to exclude certain pages that are not beneficial for SEO, such as duplicate content pages or pages with private information.

    Choose a Sitemap Generator:

      • For small websites, you might create the sitemap manually. However, for most websites, it’s more efficient to use a sitemap generator tool. There are many tools available, both free and paid, that can crawl your website and automatically generate a sitemap for you. Examples include:

    Creating the Sitemap:

      • If using a sitemap generator tool, follow the specific instructions for that tool. It typically involves entering your website URL and letting the tool crawl your site.
      • If creating manually, you’ll need to format your URLs in XML format according to the protocol defined on sitemaps.org. Here’s a simple example of what part of an XML sitemap might look like:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
            <loc>http://www.example.com/</loc>
            <lastmod>2023-01-01</lastmod>
            <changefreq>daily</changefreq>
            <priority>1.0</priority>
        </url>
        <url>
            <loc>http://www.example.com/about</loc>
            <changefreq>monthly</changefreq>
            <priority>0.5</priority>
        </url>
        <!-- Additional URLs go here -->
    </urlset>
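
    If you would rather not type the XML by hand, the same two-URL example can be produced with a few lines of Python. This is only a sketch using the standard library; build_sitemap is a name I picked, and a real site would pull its entries from a database or CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)  # emit <urlset xmlns="..."> without a prefix


def build_sitemap(entries):
    """Build sitemap XML from dicts holding 'loc' plus optional metadata tags."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # <loc> is mandatory, so a missing key raises KeyError here.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = entry["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(entry[tag])
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2023-01-01",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.example.com/about",
     "changefreq": "monthly", "priority": "0.5"},
])
print(sitemap_xml)
```

    Building the tree with a library instead of string concatenation means special characters in URLs are escaped for you.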

     

    I have an SEO plugin called SEOPress installed on my website, which builds the XML sitemap for me in a click. See the image below:

    XML sitemap activation in SEOPress

     

     

    What are the best practices to follow for sitemaps?

    Proper Sitemap formatting:

      • Ensure your sitemap is in XML format as per the standard defined by sitemaps.org.
      • Begin with an opening <urlset> tag and end with a closing </urlset> tag.
      • Specify the namespace within the <urlset> tag to adhere to the protocol standard.
      • Use UTF-8 encoding and ensure that all data values are correctly escaped.
      • Include a <url> entry for each URL and a <loc> child entry under each <url> parent tag.
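      • Other tags like <lastmod>, <changefreq>, and <priority> are optional. They can give search engines useful metadata, but how much weight they carry varies: Google, for example, documents that it ignores <changefreq> and <priority> and uses <lastmod> only when it is consistently accurate.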
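
    The formatting rules above can be checked automatically. Below is a minimal Python sketch, not a full validator: it only tests that the file is well-formed XML, that the root is <urlset> in the sitemaps.org namespace, and that every <url> has a non-empty <loc>. The function name check_sitemap and the sample data are my own.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_bytes):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # Unescaped characters such as a bare '&' end up here.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    for position, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {position} has no <loc>")
    return problems


good = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://www.example.com/?a=1&amp;b=2</loc></url>"
    b"</urlset>"
)
# Same file with the ampersand left unescaped.
unescaped = good.replace(b"&amp;", b"&")
# Same file with the <loc> swapped for a <lastmod>.
missing_loc = good.replace(
    b"<loc>https://www.example.com/?a=1&amp;b=2</loc>",
    b"<lastmod>2023-01-01</lastmod>",
)

for sample in (good, unescaped, missing_loc):
    print(check_sitemap(sample))
```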

    Content and Structure:

      • Include only canonical URLs. Avoid listing duplicate URLs or any URL that redirects or has a canonical tag pointing to a different page.
      • Prioritize important content. Use the <priority> tag to indicate the relative importance of pages on your site.
      • Update your <lastmod> tag only when significant changes are made to ensure that search engines crawl and index your site efficiently.
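
    As a rough illustration of the canonical-only rule, here is a Python sketch that filters a crawl result down to sitemap candidates. The Page record and its fields (status, canonical) are hypothetical; in practice this data would come from your crawler or CMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    status: int                      # HTTP status the URL returns
    canonical: Optional[str] = None  # target of rel="canonical", if any


def sitemap_candidates(pages):
    """Keep pages that return 200 and are their own canonical, without duplicates."""
    seen = set()
    kept = []
    for page in pages:
        if page.status != 200:
            continue  # redirects (3xx) and errors do not belong in a sitemap
        if page.canonical is not None and page.canonical != page.url:
            continue  # canonical tag points to a different page
        if page.url in seen:
            continue  # duplicate listing
        seen.add(page.url)
        kept.append(page.url)
    return kept


# Hypothetical crawl data, for illustration only.
crawl = [
    Page("https://www.example.com/", 200, "https://www.example.com/"),
    Page("https://www.example.com/old", 301),
    Page("https://www.example.com/shoes?sort=price", 200,
         "https://www.example.com/shoes"),
    Page("https://www.example.com/shoes", 200),
    Page("https://www.example.com/shoes", 200),
]
print(sitemap_candidates(crawl))
```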

    Sitemap Size and Indexing:

      • Keep individual sitemap files limited to 50MB (uncompressed) and 50,000 URLs. If your site is larger, use multiple sitemap files and compile them into a sitemap index file.
      • Ensure the sitemap index file is also correctly formatted, starting with <sitemapindex> and ending with </sitemapindex>.
      • For large or frequently updated sites, consider dynamically generating your sitemap to reflect changes automatically.
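
    A sitemap index file is just another small XML document that points at your individual sitemaps. Here is a minimal Python sketch that writes one; the sitemap file names are hypothetical and build_sitemap_index is my own helper name.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)  # emit the namespace without a prefix


def build_sitemap_index(sitemap_urls):
    """Wrap a list of sitemap file URLs in a <sitemapindex> document."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical sitemap file names, for illustration only.
index_xml = build_sitemap_index([
    "https://www.example.com/sitemap-posts.xml",
    "https://www.example.com/sitemap-pages.xml",
])
print(index_xml)
```

    You then submit only the index file, and search engines follow it to the individual sitemaps.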

    Submission and Accessibility:

      • Submit your sitemap to search engines via their respective webmaster tools, like Google Search Console and Bing Webmaster Tools.
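      • Include the sitemap location in your robots.txt file using a Sitemap: line. Submitting a sitemap through a direct HTTP "ping" request was once common, but Google has deprecated its ping endpoint, so do not rely on that method there.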
      • Ensure your sitemap is accessible by placing it in the root directory and checking that it’s not blocked by robots.txt or other security measures.
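
    You can sanity-check the robots.txt part with Python's built-in robots.txt parser. This sketch parses an inline, made-up robots.txt rather than fetching a real one; it lists the declared sitemaps and flags any that the same file would block.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
robots_txt = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns None when no Sitemap: line exists (Python 3.8+).
sitemap_urls = parser.site_maps() or []

# A sitemap that robots.txt itself disallows cannot be crawled.
blocked = [url for url in sitemap_urls if not parser.can_fetch("*", url)]

print("Declared sitemaps:", sitemap_urls)
print("Blocked sitemaps:", blocked)
```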

    Regular Updates and Maintenance:

      • Regularly update your sitemap to reflect new or removed pages and significant content updates.
      • Monitor the sitemap for errors and warnings in search engine webmaster tools and address them promptly to ensure optimal crawling and indexing.

     

    What are some limitations of sitemaps and how do you tackle them?

    Sitemap creation comes with certain limitations, primarily regarding size and URL count. However, these constraints can be managed effectively with the right approach:

    Size Limitation:

      • Each sitemap file must not exceed 50MB when uncompressed. You can compress the file with gzip to save bandwidth, but compression does not raise the limit: the sitemap must still be no larger than 50MB once uncompressed.
      • Tackling Size Limitation: If your sitemap approaches the size limit, split it into multiple files and list them in a sitemap index file. Compressing each file with gzip is still worthwhile to reduce transfer size.
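
    Here is a minimal Python sketch of gzip compression with a size check, assuming the sitemaps.org rule that the 50MB limit is measured on the uncompressed file. The function name compress_sitemap and its limit parameter are my own.

```python
import gzip

MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024  # 50MB = 52,428,800 bytes


def compress_sitemap(xml_bytes, limit=MAX_UNCOMPRESSED_BYTES):
    """Gzip a sitemap, refusing input whose uncompressed size is over the limit."""
    if len(xml_bytes) > limit:
        raise ValueError("sitemap is too large uncompressed; split it instead")
    return gzip.compress(xml_bytes)


# A small made-up sitemap body, repeated to give gzip something to compress.
entry = b"<url><loc>https://www.example.com/page</loc></url>"
sample = (
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    + entry * 1000
    + b"</urlset>"
)

compressed = compress_sitemap(sample)
print(len(sample), "bytes uncompressed,", len(compressed), "bytes gzipped")
```

    The check runs before compression on purpose: a file that is too large uncompressed stays too large no matter how well it compresses.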

    URL Count Limitation:

      • A single sitemap file can contain a maximum of 50,000 URLs. This is a hard limit set to ensure that search engine crawlers can process the file efficiently.
      • Tackling URL Count Limitation: If your website has more than 50,000 URLs, you should create multiple sitemap files. Then, use a sitemap index file to list all the individual sitemap files. This index file itself must be submitted to search engines and will guide them to the individual sitemaps.
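
    Splitting a long URL list into sitemap-sized chunks is simple slicing. A minimal Python sketch, with made-up page URLs and a hypothetical sitemap-N.xml naming scheme:

```python
MAX_URLS_PER_SITEMAP = 50_000


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into consecutive chunks of at most chunk_size items."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


# 120,000 made-up URLs, for illustration only.
urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = split_urls(urls)

# One sitemap file per chunk; these names would go into the sitemap index.
sitemap_files = [
    f"https://www.example.com/sitemap-{i}.xml"
    for i in range(1, len(chunks) + 1)
]
print([len(chunk) for chunk in chunks])
print(sitemap_files)
```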

    Complexity and Management:

      • For very large and complex websites, managing a sitemap can become cumbersome, especially if the site content changes frequently.
      • Tackling Complexity: Implement dynamic sitemap generation. This means your sitemap is automatically updated when pages are added, updated, or removed from your site. Many content management systems (CMS) and e-commerce platforms offer plugins or modules to handle this automatically.

    Ensuring Freshness:

      • Search engines prefer fresh content. If your site updates frequently but your sitemap doesn’t reflect these changes, you may not be leveraging the full potential of your sitemap.
      • Tackling Freshness: Regularly regenerate and resubmit your sitemap, especially after major content updates. Automate this process if possible, especially for large sites where content changes are frequent.
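
    One way to automate this is to compare page modification times against the time the sitemap was last generated. The sketch below assumes you can get timezone-aware timestamps for your pages; both function names and the sample dates are hypothetical.

```python
from datetime import datetime, timezone


def lastmod_value(modified_at):
    """Format a timezone-aware datetime as a W3C Datetime string for <lastmod>."""
    utc_time = modified_at.astimezone(timezone.utc)
    return utc_time.strftime("%Y-%m-%dT%H:%M:%S+00:00")


def needs_regeneration(page_modified_times, sitemap_generated_at):
    """True if any page changed after the sitemap was last generated."""
    return any(t > sitemap_generated_at for t in page_modified_times)


# Made-up timestamps, for illustration only.
generated_at = datetime(2023, 1, 1, 0, 0, tzinfo=timezone.utc)
page_times = [
    datetime(2022, 12, 30, 9, 15, tzinfo=timezone.utc),
    datetime(2023, 1, 2, 12, 30, tzinfo=timezone.utc),
]

print(needs_regeneration(page_times, generated_at))
print(lastmod_value(max(page_times)))
```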

     

    Frequently Asked Questions about Google Sitemap

    How do you submit your sitemap to Google?

    • Google Search Console Setup:
      • Before submitting your sitemap, make sure you have verified ownership of your site in Google Search Console (GSC). If you haven’t done this yet, you’ll need to sign up for GSC and verify your website. Google provides several verification methods, such as uploading a file to your server, adding a meta tag to your site’s home page, or using your Google Analytics or Google Tag Manager account.
    • Accessing Sitemaps Report:
      • Once your site is verified, log in to your Google Search Console account.
      • Select the property (website) you want to manage.
      • On the left sidebar, click on ‘Sitemaps’ under the ‘Indexing’ section (labelled ‘Index’ in older versions of the interface).
    • Submitting Your Sitemap:
      • In the ‘Add a new sitemap’ section, enter the URL of your sitemap. This URL is the exact location where your sitemap file is hosted on your site (e.g., https://www.example.com/sitemap.xml).
      • Click the ‘Submit’ button to submit your sitemap.
    • Monitoring Your Sitemap:
      • After submission, Google will process your sitemap and you can monitor the status in the same ‘Sitemaps’ section in GSC.
      • The report will show if the sitemap was successfully processed, how many URLs were submitted, and how many were indexed. It will also highlight if there were any issues or errors that you need to address.

    See the image below which shows the Google Search Console dashboard of my website. In the image, you can see where the XML sitemap is submitted and monitored.

    XML sitemap submission in Google Search Console

     

    How to delete a sitemap via Google Search Console?

    • Log in to Google Search Console:
      • Open Google Search Console and select the property (website) for which you want to remove the sitemap.
    • Access the Sitemaps Report:
      • In the GSC dashboard, navigate to the ‘Sitemaps’ section. You’ll find this in the ‘Indexing’ section (labelled ‘Index’ in older versions of the interface) on the left-hand sidebar.
    • Remove the Sitemap:
      • Find the sitemap you wish to remove. Click on it to see the details.
      • Open the more options menu (three dots) at the top right of the sitemap details page and select ‘Remove sitemap’.
      • Removing a sitemap from the report does not stop Google from fetching the file if it can still find it. To stop that as well, delete the sitemap file from your site or make it inaccessible (e.g., via a 404 or 410 HTTP status code), and remove any reference to it from robots.txt.

    The image below, from my Google Search Console dashboard, shows exactly where you can delete a sitemap.

    Deleting a sitemap in Google Search Console

    Ankit Chauhan is an SEO Consultant and Researcher. Having more than 5 years of extensive experience in SEO, Ankit loves to share his SEO expertise with the community through his blog. Ankit Chauhan is a big-time SEO nerd with an obsession for search engines and how they work. Ankit loves to read Google patents about search engines and conduct SEO experiments in his free time.