# Sitemap

## What is a Sitemap?

`sitemap.xml` is an XML file that lists the public URLs of a site, each with optional metadata: last modified date (`lastmod`), update frequency (`changefreq`), and priority (`priority`).

The format is defined by the Sitemaps protocol at sitemaps.org and is supported by the major search engines. The conventional path is `/sitemap.xml`. Declare the sitemap's absolute URL in [robots.txt](/glossary/robots-txt) with the `Sitemap:` directive so that crawlers can find it wherever it lives.

## Why does a site need a Sitemap?

Without a sitemap, AI bots and search engines can only discover pages by following links. A page that nothing links to stays invisible to them.

For [agent-readiness](/glossary/agent-readiness) this matters most for the glossary (`/glossary/*`) and documentation: bots need these pages, yet the pages often have no incoming links.

## How to configure a Sitemap?

Most CMSs and site frameworks generate sitemaps automatically or through a plugin:
- **WordPress:** Yoast SEO or Rank Math; the sitemap is enabled by default
- **Astro:** `@astrojs/sitemap` integration
- **Next.js:** `app/sitemap.ts` or the `next-sitemap` package

Minimal example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-05-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
```

A single sitemap file may contain at most 50,000 URLs and must not exceed 50 MB uncompressed. Larger sites should use a `<sitemapindex>` that links to separate sitemap files, for example one per site section.
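As a rough illustration of that split, the sketch below chunks a URL list at the 50,000 limit and builds the index document with Python's standard library. The helper names (`chunk_urls`, `build_sitemap_index`) and the `sitemap-N.xml` file naming are assumptions made for this example, not part of any particular generator.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit from the Sitemaps protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of page URLs into groups that each fit in one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document (as a string) pointing at each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Hypothetical usage: 120,000 pages end up in three child sitemaps.
pages = [f"https://example.com/page/{n}" for n in range(120_000)]
chunks = chunk_urls(pages)
index_xml = build_sitemap_index(
    [f"https://example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))]
)
```

Each chunk would then be written out as its own `<urlset>` file at the URL listed in the index.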

## How do we check the Sitemap?

The scanner resolves the sitemap URL from `robots.txt` (the `Sitemap:` directive). If the directive is absent, it falls back to `/sitemap.xml`.
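The resolution rule can be sketched as follows. This is a minimal illustration, not the scanner's actual code: the function name `resolve_sitemap_url` is hypothetical, and taking the first `Sitemap:` line when several are present is an assumption.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_sitemap_url(site_root: str, robots_txt: Optional[str]) -> str:
    """Return the first Sitemap: URL found in robots.txt, else <site_root>/sitemap.xml."""
    if robots_txt:
        for raw_line in robots_txt.splitlines():
            # Drop trailing comments and surrounding whitespace.
            line = raw_line.split("#", 1)[0].strip()
            # Split on the first colon only, because the URL itself contains colons.
            name, sep, value = line.partition(":")
            # Field names in robots.txt are matched case-insensitively.
            if sep and name.strip().lower() == "sitemap" and value.strip():
                return value.strip()
    # No directive (or no robots.txt at all): fall back to the default path.
    return urljoin(site_root, "/sitemap.xml")
```

For instance, a `robots.txt` containing `Sitemap: https://example.com/sitemap-index.xml` resolves to that URL, while one with only `User-agent` and `Disallow` rules resolves to `https://example.com/sitemap.xml`.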

Check sequence:

1. **Resolve URL** — take the address from robots.txt or `/sitemap.xml`
2. **HTTP 200** — the file is accessible
3. **Valid XML** — parses without errors
4. **Root element** — `<urlset>` (regular sitemap) or `<sitemapindex>`
5. **Presence of `<lastmod>`** — at least one URL with a date

The result is graded: **1.0** if at least one `<lastmod>` is present, **0.6** if the sitemap is valid but has no dates. The status is **fail** when the HTTP response is not 200 or the XML is invalid.
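The check sequence and scoring can be approximated with a short sketch, assuming the response has already been fetched. The function name `score_sitemap`, the use of `None` to represent **fail**, and treating an unexpected root element as a failure are assumptions of this example; the text above only specifies failure for non-200 responses and invalid XML.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def score_sitemap(status_code: int, body: bytes) -> Optional[float]:
    """Return 1.0, 0.6, or None (fail) for a fetched sitemap response."""
    # Step 2: the file must be accessible.
    if status_code != 200:
        return None
    # Step 3: the body must parse as XML.
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    # Step 4: root element check (failing here is an assumption of this sketch).
    if local_name(root.tag) not in ("urlset", "sitemapindex"):
        return None
    # Step 5: look for at least one non-empty <lastmod>.
    has_lastmod = any(
        local_name(el.tag) == "lastmod" and (el.text or "").strip()
        for el in root.iter()
    )
    return 1.0 if has_lastmod else 0.6
```

Passing the minimal example from earlier on this page with status 200 yields `1.0`; the same document with the `<lastmod>` line removed yields `0.6`.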

