# GEO

## What is GEO?

**GEO (Generative Engine Optimization)** is the practice of optimizing a site's content and technical infrastructure so that AI assistants (ChatGPT, Perplexity, Gemini, YandexGPT, GigaChat, Claude) cite the site in their responses to users.

Just as SEO helps you rank in Google's organic results, GEO helps you appear in AI-generated answers. The mechanics differ: a search engine ranks pages by relevance, while an AI model **extracts** ready-made fragments based on content quality and structure.

The term was coined in the academic paper "GEO: Generative Engine Optimization", published in November 2023 by researchers from Princeton University, Georgia Tech, and other institutions. [AEO](/glossary/aeo) (Answer Engine Optimization) is the practical side of GEO, focused on how content is formatted for extraction.

## Why does a site need GEO?

In technical and B2B topics, 15–30% of user queries are already directed to AI assistants instead of Google. For educational content and documentation, the share is even higher.

AI assistants often answer questions without sending the user to a website. When they **cite a source**, though, users can click through, which makes citability a new form of organic traffic.

Two business scenarios:
- A user asks ChatGPT "what tool for X" — your product needs to be in the answer
- A user asks "how to implement llms.txt" — your page needs to be cited

## What technical foundation does GEO require?

GEO is not just about content (how to write) but also about technical infrastructure (how the site is organized):

| Technical element | Why it matters for GEO |
|---|---|
| **[robots.txt](/glossary/robots-txt)** with AI rules | AI bots (GPTBot, ClaudeBot, PerplexityBot) must not be blocked; an explicit `Allow: /` removes any ambiguity |
| **[Content Signals](/glossary/content-signals)** | `search=yes` and `ai-input=yes`: explicit permission to index the content and use it in AI answers |
| **[llms.txt](/glossary/llms-txt)** | LLM bots get structured navigation through the site |
| **[Schema.org JSON-LD](/glossary/schema-org)** | AI systems understand content structure better |
| **[Markdown for Agents](/glossary/markdown-for-agents)** | Content is easier to parse than HTML with JS |
| **[sitemap.xml](/glossary/sitemap)** | AI bots discover all pages |
| **[ai-agent.json](/glossary/ai-agent-json)** | Describes the site in machine-readable form |

Of the 23 checks in our scanner, **10 directly affect GEO** (Discoverability + Content Accessibility + Bot Access Control categories).
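
As an illustration of the llms.txt row in the table above, here is a minimal Python sketch that renders a file in the commonly used llms.txt layout: an H1 with the site name, a blockquote summary, and H2 sections containing link lists. The `Link` class, the `build_llms_txt` helper, and all site names and URLs are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One navigation entry; `note` is an optional short description."""
    title: str
    url: str
    note: str = ""


def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render llms.txt text: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            entry = f"- [{link.title}]({link.url})"
            if link.note:
                entry += f": {link.note}"
            lines.append(entry)
        lines.append("")
    # Exactly one trailing newline at the end of the file.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and URLs, for illustration only.
LLMS_TXT = build_llms_txt(
    "Example Docs",
    "Reference documentation for the Example product.",
    {
        "Docs": [
            Link("Quick start", "https://example.com/docs/quickstart.md", "installation and first run"),
            Link("API reference", "https://example.com/docs/api.md"),
        ],
    },
)
print(LLMS_TXT)
```

The resulting text would be served at `/llms.txt` in the site root; which pages to list is an editorial decision that this sketch does not address.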

## How to implement GEO?

**Technical steps (quick wins):**

1. **Allow AI bots** in robots.txt:
   ```
   User-agent: GPTBot
   Allow: /

   User-agent: ClaudeBot
   Allow: /

   User-agent: PerplexityBot
   Allow: /
   ```

2. **Add a Content-Signal** directive to robots.txt:
   ```
   Content-Signal: ai-train=yes, search=yes, ai-input=yes
   ```

3. **Publish llms.txt** with navigation to key pages

4. **Add Schema.org JSON-LD** to key pages
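
One way to sanity-check step 1 before deploying is to parse the robots.txt text with Python's standard-library `urllib.robotparser` and ask whether each AI bot may fetch the homepage. This is a rough sketch, not how any particular crawler evaluates the file; the `allowed_ai_bots` helper, the sample rules, and the `https://example.com/` URL are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the robots.txt example above.
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def allowed_ai_bots(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return, per AI bot, whether the given robots.txt text lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network access
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}


# Hypothetical robots.txt in which one bot is blocked by mistake.
SAMPLE = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Disallow: /
"""

RESULT = allowed_ai_bots(SAMPLE)
print(RESULT)
```

For this sample, `GPTBot` and `ClaudeBot` are reported as allowed and `PerplexityBot` as blocked. Note that `urllib.robotparser` applies rules in file order, whereas RFC 9309 specifies longest-match precedence, so results can differ from real crawlers on files that mix overlapping `Allow` and `Disallow` paths.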

**Content steps (AEO format):**

- Write in a question-answer format — direct answer in the first sentence
- Add clear term definitions — `DefinedTerm` in Schema.org
- Structure content with H2/H3 headings
- Add authorship and dates — `Article` with `author` and `datePublished`
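
The Schema.org items in the list above can be combined into a single JSON-LD object. The following sketch builds an `Article` with `author` and `datePublished` and attaches a `DefinedTerm` through the `about` property; that nesting is one reasonable modelling choice rather than the only valid one. The helper names, the author, and the date are hypothetical.

```python
import json


def article_jsonld(headline: str, author: str, date_published: str,
                   term: str, definition: str) -> dict:
    """Build an Article object whose subject is a DefinedTerm."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,  # ISO 8601 date
        "about": {
            "@type": "DefinedTerm",
            "name": term,
            "description": definition,
        },
    }


def to_script_tag(data: dict) -> str:
    """Serialize to a JSON-LD script tag for embedding in an HTML page."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so the payload cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical author and date, for illustration only.
DATA = article_jsonld(
    headline="What is GEO?",
    author="Jane Doe",
    date_published="2024-01-15",
    term="GEO",
    definition="Generative Engine Optimization: optimizing a site so AI assistants cite it.",
)
SNIPPET = to_script_tag(DATA)
print(SNIPPET)
```

The printed tag would go into the page's HTML, typically in the `<head>`.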

**Monitoring:**

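- Perplexity: ask a question on your topic and check whether the site appears among the cited sources
- ChatGPT: repeat the same question with web search enabled and check the listed sources
- Google AI Overviews: impressions and clicks are included in the Performance report in Google Search Console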

## How do we check GEO?

GEO-relevant checks in our scanner cover the entire technical foundation:

- `robots_txt` — file exists and is valid
- `sitemap` — sitemap is published
- `link_headers` — Link headers for discovery
- `llms_txt` — llms.txt is published with structure
- `ai_agent_json` — ai-agent.json is published
- `schema_org` — Schema.org JSON-LD on the homepage
- `markdown_negotiation` — Markdown for Agents is supported
- `ai_bot_rules` — AI bots are explicitly allowed in robots.txt
- `content_signals` — Content-Signal is declared
- `web_bot_auth` — Web Bot Auth (informational check)
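
To give a sense of what a check such as `content_signals` might involve, the sketch below extracts `Content-Signal` values from robots.txt text. It is not the scanner's actual implementation: the `parse_content_signals` function is hypothetical, and it assumes the directive is declared in robots.txt as a comma-separated list of `key=value` pairs, as in the example earlier on this page.

```python
def parse_content_signals(robots_txt: str) -> dict[str, str]:
    """Collect key=value pairs from every Content-Signal line (later lines win)."""
    signals: dict[str, str] = {}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            key, eq, val = pair.partition("=")
            if eq:
                signals[key.strip().lower()] = val.strip().lower()
    return signals


# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Content-Signal: ai-train=yes, search=yes, ai-input=yes
Allow: /
"""

SIGNALS = parse_content_signals(SAMPLE_ROBOTS)
print(SIGNALS)
```

A check built on this could then pass when the returned mapping is non-empty, or apply stricter rules such as requiring `search` to be `yes`.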

