Discoverability

llms.txt

A structured markdown file at the site root with instructions and navigation for LLM agents.

What is llms.txt?

llms.txt is a Markdown file at /llms.txt that a site publishes specifically for language models (LLMs). It helps AI agents quickly understand what the site is about, find important pages, and get context without parsing all the HTML.

The standard was proposed by Jeremy Howard (founder of Answer.AI, creator of fast.ai) in 2024. Most modern sites contain HTML, JavaScript, and ad blocks — “noise” that LLMs have to parse to get to useful content. llms.txt provides a clean entry point: a single file with ordered links and descriptions.

By location it resembles robots.txt; by purpose it resembles a README.md: it doesn’t restrict — it guides.

Why does a site need llms.txt?

llms.txt solves three problems at once:

  1. GEO (generative engine optimization) accuracy. When an AI assistant cites your site in an answer, structured context gives it a reliable summary to draw on, which makes outdated or misattributed descriptions less likely.

  2. Agentic workflows. An agent executing a task (“find the API documentation,” “compare terms”) navigates via llms.txt directly — without crawling the entire site.

  3. Control over representation. You decide what the agent learns about you first: the product, documentation, contact info, terms.

How to implement llms.txt?

A minimal llms.txt per the llmstxt.org spec:

# Agent Ready Scanner

> A free public scanner for AI-readiness across 23 open standards.

## Documentation
- [About](/about)
- [FAQ](/faq)
- [Standards Glossary](/glossary)
- [Implementation Plans](/tiers)

## API
- [Scanner Documentation](/scanner)

File structure:

  • # H1 — product or site name (required)
  • > blockquote — single-line tagline (recommended)
  • ## Section — sections with Markdown links (at least one)
  • ## Documentation, ## Examples, ## Optional — recommended section names
  • ## Optional section — less important links that an LLM may skip when context is limited
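To make the structure concrete, here is a minimal sketch of how an agent might read such a file. The type names (LlmsTxt, LlmsSection, LlmsLink) and the functions parseLlmsTxt and requiredLinks are hypothetical and chosen for illustration; the spec does not define a data model, and a production parser would likely use a real Markdown library rather than line matching.

```typescript
// Hypothetical data model for illustration; not defined by the spec.
interface LlmsLink {
  title: string;
  url: string;
  note?: string;
}

interface LlmsSection {
  name: string;
  links: LlmsLink[];
}

interface LlmsTxt {
  title: string | null;
  summary: string | null;
  sections: LlmsSection[];
}

function parseLlmsTxt(text: string): LlmsTxt {
  const doc: LlmsTxt = { title: null, summary: null, sections: [] };
  let current: LlmsSection | null = null;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("## ")) {
      // Each H2 opens a new section of links.
      current = { name: line.slice(3).trim(), links: [] };
      doc.sections.push(current);
    } else if (line.startsWith("# ") && doc.title === null) {
      // The first H1 is the site or product name.
      doc.title = line.slice(2).trim();
    } else if (line.startsWith("> ") && doc.summary === null && current === null) {
      // A blockquote before the first section is the tagline.
      doc.summary = line.slice(2).trim();
    } else if (current !== null) {
      // Matches "- [Title](url)" with an optional ": note" suffix.
      const m = /^[-*]\s+\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.+))?$/.exec(line);
      if (m) {
        const link: LlmsLink = { title: m[1], url: m[2] };
        if (m[3]) link.note = m[3];
        current.links.push(link);
      }
    }
  }
  return doc;
}

// Links to read first: everything outside the "## Optional" section.
function requiredLinks(doc: LlmsTxt): LlmsLink[] {
  const result: LlmsLink[] = [];
  for (const section of doc.sections) {
    if (section.name.toLowerCase() !== "optional") {
      result.push(...section.links);
    }
  }
  return result;
}
```

An agent with a small context budget could call requiredLinks first and fetch the ## Optional links only if room remains.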

Technical requirements:

  • Path: /llms.txt at the site root (the spec also allows a subpath, but the root is the location our scanner requests)
  • Serve with Content-Type: text/plain or text/markdown
  • Encoding: UTF-8
  • Size: reasonable (< 100 KB — the LLM reads it in full)

For static sites (Astro, Next.js, Hugo):

Create a public/llms.txt file (static/llms.txt in Hugo, where public/ is the build output directory) and it will be served at /llms.txt automatically.

For dynamic sites:

If the links change over time (e.g., in a blog), create an endpoint that generates the file on the fly.

// Example for Astro (src/pages/llms.txt.ts)
export async function GET() {
  const content = `# My Site\n\n> Description\n\n## Documentation\n- [Docs](/docs)\n`;
  return new Response(content, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' }
  });
}
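The endpoint above returns a hard-coded string. When the links come from a data source, the content could be assembled by a small helper like the sketch below. The Post interface, the renderLlmsTxt function, and the "## Blog" section name are assumptions made for illustration, not part of Astro or the llms.txt spec.

```typescript
// Hypothetical shape of a blog record; adapt it to your own data source.
interface Post {
  title: string;
  path: string;
  summary?: string;
}

// Builds the llms.txt body: H1, tagline blockquote, then one section of links.
function renderLlmsTxt(siteName: string, tagline: string, posts: Post[]): string {
  const lines: string[] = [`# ${siteName}`, "", `> ${tagline}`, "", "## Blog"];
  for (const post of posts) {
    // Square brackets in a title would break the Markdown link syntax.
    const title = post.title.replace(/[\[\]]/g, "");
    const note = post.summary ? `: ${post.summary}` : "";
    lines.push(`- [${title}](${post.path})${note}`);
  }
  return lines.join("\n") + "\n";
}
```

Inside the GET handler, the string returned by renderLlmsTxt would take the place of the content constant. Note that with an empty posts array the section has no links, so a real implementation would probably add at least one static link.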

Optional llms-full.txt:

For sites with a large amount of content, a common companion convention is to also publish llms-full.txt, an extended version with the full text of all pages. llms.txt can point to it in the ## Optional section.

How do we check llms.txt?

Our scanner performs GET /llms.txt and verifies:

  1. HTTP 200 — the file exists and is accessible
  2. Content-Type: text/plain or text/markdown
  3. Non-empty content — the file is not empty
  4. Markdown heading — at least one # Heading in the file
  5. Markdown links — at least one [text](url) link

If all 5 conditions are met — status pass. If the file is missing (HTTP 404) or empty — fail. If Content-Type is incorrect — fail with an explanation.
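The five checks can be sketched as a single function. This is an illustration of the logic described above, not the scanner's actual code: the name checkLlmsTxt, the CheckResult shape, the reason strings, and the regular expressions are all assumptions.

```typescript
type CheckStatus = "pass" | "fail";

interface CheckResult {
  status: CheckStatus;
  reason?: string;
}

// Hypothetical helper mirroring the five checks; inputs come from a GET /llms.txt response.
function checkLlmsTxt(httpStatus: number, contentType: string | null, body: string): CheckResult {
  // 1. The file must exist and be accessible.
  if (httpStatus !== 200) {
    return { status: "fail", reason: `expected HTTP 200, got ${httpStatus}` };
  }
  // 2. Compare only the media type, ignoring parameters such as "; charset=utf-8".
  const mediaType = (contentType ?? "").split(";")[0].trim().toLowerCase();
  if (mediaType !== "text/plain" && mediaType !== "text/markdown") {
    return { status: "fail", reason: `unexpected Content-Type: ${mediaType || "(none)"}` };
  }
  // 3. The body must not be empty or whitespace-only.
  if (body.trim().length === 0) {
    return { status: "fail", reason: "file is empty" };
  }
  // 4. At least one Markdown heading at the start of a line.
  if (!/^#{1,6}\s+\S/m.test(body)) {
    return { status: "fail", reason: "no Markdown heading found" };
  }
  // 5. At least one [text](url) link anywhere in the body.
  if (!/\[[^\]]+\]\([^)\s]+\)/.test(body)) {
    return { status: "fail", reason: "no Markdown link found" };
  }
  return { status: "pass" };
}
```

Under these assumptions, a custom 404 page served with status 200 fails at step 2, because such pages are normally sent as text/html.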

Common mistakes: the server returns an HTML page at /llms.txt (typically a custom 404 served with status 200), the file contains no Markdown links, or the file is too large (over 500 KB triggers a warning).

Sources and specifications