Sitemap
An XML file that lists all important pages on a website, helping search engines discover and crawl content more efficiently.
What is a Sitemap?
A sitemap is a file, typically in XML format, that provides search engines with a list of pages on your website that you want to be crawled and indexed. Search engines can discover pages by following links, but a sitemap makes it more likely that all important pages are found, especially those that are poorly linked internally or newly published. It does not guarantee crawling or indexing. Sitemaps can also include metadata about each URL, such as when it was last modified (lastmod), how often it changes (changefreq), and its relative priority (priority). Search engines treat these values as hints; Google, for example, has stated that it ignores changefreq and priority.
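To make the structure concrete, here is a minimal sketch that builds a sitemap with Python's standard library. The function name build_sitemap, the dictionary layout of each entry, and the example.com URLs are assumptions made for illustration; only the element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of entry dicts.

    Each entry needs a "loc" key; "lastmod", "changefreq", and
    "priority" are optional. This entry layout is hypothetical.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs and dates, not real data.
    print(build_sitemap([
        {"loc": "https://example.com/", "lastmod": "2024-01-15"},
        {"loc": "https://example.com/blog/new-post",
         "lastmod": "2024-02-01", "changefreq": "weekly", "priority": 0.8},
    ]))
```

ElementTree escapes characters such as & in URLs automatically, which matters because a sitemap must be valid XML.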
XML sitemaps can be submitted to search engines through their webmaster tools (Google Search Console, Bing Webmaster Tools) and can also be referenced in your robots.txt file with a Sitemap: line. A single sitemap can contain up to 50,000 URLs and must be no larger than 50 MB uncompressed. For larger sites, you can create a sitemap index file that references multiple individual sitemaps. Best practices include listing only canonical, indexable pages (exclude noindexed, redirected, or error pages), keeping the lastmod date accurate, and updating the sitemap whenever content is added or removed.
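The 50,000-URL limit and the sitemap index can be sketched as follows. This is an illustrative outline, not a complete generator: the helper names chunk_urls and build_sitemap_index and the sitemap-N.xml file naming are assumptions, and the sketch does not check the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # 120,000 placeholder URLs need three child sitemaps.
    urls = [f"https://example.com/page-{i}" for i in range(120_000)]
    chunks = chunk_urls(urls)
    # Hypothetical naming scheme for the child sitemap files.
    children = [f"https://example.com/sitemap-{n}.xml"
                for n in range(1, len(chunks) + 1)]
    print(build_sitemap_index(children))
```

The index file itself can then be referenced from robots.txt with a line such as "Sitemap: https://example.com/sitemap_index.xml" (placeholder URL).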
In addition to XML sitemaps for search engines, an HTML sitemap is a user-facing page that lists and links to all important pages on your site. HTML sitemaps are less critical for SEO than XML sitemaps, but they can improve user navigation and give search engines additional links to follow. For specialized content, you can also create image sitemaps, video sitemaps, and news sitemaps that provide additional metadata specific to those content types.
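As one example of a specialized sitemap, the sketch below builds an image sitemap using Google's image extension namespace, which adds image:image and image:loc elements under each url entry. The function name build_image_sitemap, the page-to-images mapping, and the example.com URLs are assumptions for illustration; video and news sitemaps follow the same pattern with their own namespaces and tags.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the core namespace as the default and the extension as "image:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """Return image sitemap XML for a {page_url: [image_url, ...]} mapping."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder page and image URLs.
    print(build_image_sitemap({
        "https://example.com/gallery": [
            "https://example.com/img/one.jpg",
            "https://example.com/img/two.jpg",
        ],
    }))
```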