An XML sitemap is a structured file listing the URLs on a site that the owner wants search engines to discover. The format is defined at sitemaps.org and supported by all major search engines. Each entry requires the URL (loc) and may also carry a last-modified date (lastmod), change frequency (changefreq), and priority. Google has stated that it ignores changefreq and priority and primarily uses loc and lastmod.
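As a rough illustration of the format, the sketch below builds a minimal sitemap containing only loc and lastmod using Python's standard library. The function name build_sitemap and the example.com URLs are placeholders invented for this example, not part of any particular site's tooling.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as UTF-8 bytes.

    entries: iterable of (url, lastmod) pairs, where lastmod is a
    W3C date string such as "2024-01-15", or None to omit the element.
    (Hypothetical helper for illustration.)
    """
    # Serialise the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about", None),
    ])
    print(xml_bytes.decode("utf-8"))
```

changefreq and priority are left out deliberately, since they add file size without influencing how Google treats the entry.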
A sitemap is not a ranking factor. It is a discovery and prioritisation aid. Submitting a URL in a sitemap does not guarantee indexing; Google still applies its own quality filters and decides what to include. What a sitemap does well is signal which URLs the site considers important and surface lastmod values that let Google focus recrawls on recently updated pages.
The XML sitemap protocol caps each individual file at 50 MB uncompressed or 50,000 URLs, whichever limit is reached first. Larger sites use a sitemap index (a sitemap of sitemaps) to organise multiple sub-sitemaps. For very large sites, split sitemaps by section (products, articles, glossary, etc.) so coverage reports in Search Console expose indexation per content type.
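A minimal sketch of the splitting approach, assuming URLs are already grouped by section in a dictionary. The helper names (chunk, plan_sitemaps, build_index) and the https://example.com/sitemaps base path are hypothetical. The sketch enforces only the 50,000-URL limit; a real generator would also need to check the 50 MB size limit on each file.

```python
import xml.etree.ElementTree as ET
from itertools import islice

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemap protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield successive lists of at most `size` URLs."""
    it = iter(urls)
    while batch := list(islice(it, size)):
        yield batch


def plan_sitemaps(urls_by_section, base="https://example.com/sitemaps"):
    """Map each sub-sitemap URL to the page URLs it should contain.

    Each section gets its own numbered files, so indexation can be
    reviewed per content type. `base` is an assumed location.
    """
    plan = {}
    for section, urls in urls_by_section.items():
        for n, batch in enumerate(chunk(urls), start=1):
            plan[f"{base}/{section}-{n}.xml"] = batch
    return plan


def build_index(sitemap_urls):
    """Return a sitemap index (UTF-8 bytes) listing the sub-sitemaps."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

With 120,000 product URLs and a single glossary URL, plan_sitemaps would produce three product files and one glossary file, and build_index would list all four.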
Critical hygiene: every URL in the sitemap should return 200, be allowed in robots.txt, be indexable (no robots meta tag noindex), and self-canonicalise (no canonical tag pointing elsewhere). Sitemap URLs that redirect, return 404, or point to non-canonical content reduce trust in the entire sitemap and waste crawl budget.
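The four hygiene checks can be sketched as a small audit function. This is an illustrative outline, not a complete crawler: it assumes the status code, noindex flag, and canonical URL for each page have already been collected by some other process, and the PageInfo record and sitemap_problems function are names made up for this example. The robots.txt check uses Python's standard urllib.robotparser.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib import robotparser


@dataclass
class PageInfo:
    """Facts about one sitemap URL, gathered elsewhere (hypothetical record)."""
    url: str
    status: int               # HTTP status code returned for the URL
    noindex: bool             # True if a robots meta tag says noindex
    canonical: Optional[str]  # canonical tag target, or None if absent


def sitemap_problems(page: PageInfo,
                     robots: robotparser.RobotFileParser,
                     user_agent: str = "Googlebot") -> List[str]:
    """Return reasons why this URL should not be listed in the sitemap."""
    problems = []
    if page.status != 200:
        problems.append(f"returns {page.status}, expected 200")
    if not robots.can_fetch(user_agent, page.url):
        problems.append("blocked by robots.txt")
    if page.noindex:
        problems.append("marked noindex")
    if page.canonical and page.canonical != page.url:
        problems.append(f"canonical points to {page.canonical}")
    return problems


if __name__ == "__main__":
    robots = robotparser.RobotFileParser()
    robots.parse(["User-agent: *", "Disallow: /private/"])
    page = PageInfo("https://example.com/private/b", 301, True,
                    "https://example.com/a")
    for problem in sitemap_problems(page, robots):
        print(problem)
```

An empty list means the URL passes all four checks. The canonical comparison here is an exact string match, so a real audit would likely normalise trailing slashes and letter case first.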
The sitemap location should be declared in robots.txt (Sitemap: https://example.com/sitemap.xml) and submitted in Search Console. The two methods reinforce each other; do both.
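To confirm that a robots.txt file actually declares a sitemap, one option is to parse it with Python's standard urllib.robotparser, whose site_maps() method (available from Python 3.8) returns the declared sitemap URLs. The wrapper name declared_sitemaps and the sample robots.txt content are assumptions for the example.

```python
from urllib import robotparser


def declared_sitemaps(robots_txt: str) -> list:
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Allow: /\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(declared_sitemaps(sample))
```

An empty result means the file has no Sitemap: line, so the sitemap is being announced through Search Console alone, if at all.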
For multilingual sites, hreflang can be expressed in the sitemap rather than in per-page HTML: each <url> entry includes one xhtml:link child per language alternate, including one that references the entry's own URL. This is often easier than maintaining HTML head tags across thousands of pages.
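A sketch of generating hreflang annotations in a sitemap, assuming each page's language versions are available as a mapping from hreflang code to URL. The function name build_hreflang_sitemap and the example URLs are illustrative. Every language version gets its own url entry, and each entry lists all alternates, including itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(groups):
    """Return sitemap XML (UTF-8 bytes) with xhtml:link alternates.

    groups: list of dicts, each mapping an hreflang code to the URL of
    that language version of one page. (Hypothetical input shape.)
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for alternates in groups:
        for loc in alternates.values():
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
            # List every language version, including this entry's own URL.
            for lang, href in alternates.items():
                ET.SubElement(
                    url,
                    f"{{{XHTML_NS}}}link",
                    rel="alternate",
                    hreflang=lang,
                    href=href,
                )
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    pages = [{
        "en": "https://example.com/en/pricing",
        "de": "https://example.com/de/preise",
    }]
    print(build_hreflang_sitemap(pages).decode("utf-8"))
```

Generating the annotations from one mapping keeps the alternates reciprocal by construction, which is harder to guarantee when each page's head tags are edited separately.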
Update lastmod values when content meaningfully changes. Updating lastmod on every page every day, regardless of changes, devalues the signal — Google has stated this kind of "fresh-everything" pattern is ignored. Be honest about which pages have actually changed.
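One way to keep lastmod honest is to change it only when a page's content hash changes. The sketch below assumes a simple in-memory dictionary as the store and a hypothetical refresh_lastmod function; hashing the full content is a stand-in for "meaningful change", and in practice the hash would probably be taken over the main content only, with templates, timestamps, and other boilerplate stripped out.

```python
import hashlib
from datetime import date


def refresh_lastmod(url, content, store, today=None):
    """Return the lastmod for `url`, updating it only if content changed.

    store: dict mapping url -> {"hash": str, "lastmod": str}; an assumed
    stand-in for whatever persistence the site generator uses.
    today: ISO date string, injectable for testing; defaults to the
    current date.
    """
    today = today or date.today().isoformat()
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    previous = store.get(url)
    if previous is None or previous["hash"] != digest:
        # New page or changed content: record the new hash and date.
        store[url] = {"hash": digest, "lastmod": today}
    return store[url]["lastmod"]


if __name__ == "__main__":
    store = {}
    url = "https://example.com/glossary/xml-sitemap"
    print(refresh_lastmod(url, "first version", store, today="2024-01-01"))
    print(refresh_lastmod(url, "first version", store, today="2024-02-01"))
    print(refresh_lastmod(url, "edited version", store, today="2024-03-01"))
```

Regenerating the sitemap daily is then harmless, because unchanged pages keep their earlier lastmod value.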
The SEOlvl sitemap is generated dynamically at /sitemap.xml and now includes all glossary URLs alongside the leaderboard, directory, and profile entries.