
What is llms.txt? The File Every Website Needs in 2026

Updated Mar 17, 2026 · 10 min read

There is a quiet revolution happening at the root directories of websites across the internet. Next to the familiar robots.txt and sitemap.xml, a new file is appearing: llms.txt. It is a plain-text markdown file designed not for search engine crawlers, but for AI models. And as of early 2026, more than 844,000 websites have one.

If you build or manage a website, this file is worth understanding. Here is what it is, where it came from, how to create one, and why it matters for the future of how AI interacts with the web.

The origin story

In 2024, Jeremy Howard, co-founder of Answer.AI and creator of fast.ai, proposed a simple idea. AI assistants like Claude, ChatGPT, and Gemini are increasingly being used to answer questions about websites, summarize documentation, and navigate complex knowledge bases. But these models often struggle with websites because HTML pages are cluttered with navigation, ads, scripts, and boilerplate. The useful content gets buried.

Howard's proposal: give AI models a curated, human-readable map of your site's most important content. Not a machine-readable XML sitemap. Not a crawl directive file. A markdown document, written in plain English, that tells an AI assistant what your site is about and where to find the good stuff.

He called it llms.txt.

The specification is deliberately simple. It lives at yoursite.com/llms.txt, it is written in markdown, and it follows a lightweight structure that any AI model can parse without special tooling.

How llms.txt is structured

The format has only one required element: an H1 heading with your project or site name. Everything else is optional but encouraged. Here is the general structure:

# Your Site Name

> A brief description of your site or project in a blockquote.

## Docs

- [Getting Started](/docs/getting-started): Introduction and setup guide
- [API Reference](/docs/api): Complete API documentation
- [Tutorials](/docs/tutorials): Step-by-step walkthroughs

## Blog

- [Understanding Our Architecture](/blog/architecture): Technical deep-dive
- [2026 Product Roadmap](/blog/roadmap): Where we are heading

## Optional

- [Terms of Service](/terms)
- [Privacy Policy](/privacy)

Each section uses an H2 heading, followed by a list of markdown links. Each link can include a colon and a brief description of what the page contains. That description is important. It gives the AI model enough context to decide whether a particular page is relevant to the user's question, without needing to fetch and process every page on your site.

Some sites also provide an llms-full.txt file, which contains the complete text content of all key pages in a single document. This is useful for AI models that prefer to ingest everything at once rather than following individual links.
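Because the format is so regular, parsing it takes only a few lines of code. Here is a minimal sketch (standard-library Python only, not an official parser) that splits an llms.txt file into its title, summary, and section-to-links map:

```python
import re

def parse_llms_txt(text):
    """Parse llms.txt markdown into a title, summary, and section -> links map."""
    title, summary, sections = None, [], {}
    current = None
    # Matches "- [Name](url)" with an optional ": description" suffix.
    link_re = re.compile(r"^- \[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$")
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and title is None:
            title = line[2:]                  # required H1: the site name
        elif line.startswith("> "):
            summary.append(line[2:])          # optional blockquote summary
        elif line.startswith("## "):
            current = line[3:]                # a new H2 section
            sections[current] = []
        else:
            m = link_re.match(line)
            if m and current:
                sections[current].append(
                    {"name": m["name"], "url": m["url"], "desc": m["desc"] or ""}
                )
    return {"title": title, "summary": " ".join(summary), "sections": sections}

sample = """# Acme Dev Tools

> CI/CD pipelines for development teams.

## Docs

- [Quick Start](/docs/quickstart): Get running in 5 minutes
- [API Reference](/docs/api)
"""

parsed = parse_llms_txt(sample)
print(parsed["title"])                        # Acme Dev Tools
print(parsed["sections"]["Docs"][0]["desc"])  # Get running in 5 minutes
```

Note that a link's description is optional, so a parser should tolerate its absence, as the API Reference entry above does.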

The growth numbers

The adoption curve has been remarkable. When Howard first proposed the standard in late 2024, a handful of developer-focused sites adopted it. By early 2025, a few hundred had followed. Then the momentum shifted.

As of March 2026, over 844,000 websites include an llms.txt file. That represents roughly 1,800% growth over the past year. The adopters are not just small blogs or hobby projects. Anthropic uses llms.txt to help AI models navigate the Claude documentation. Cloudflare has one. Stripe has one. These are companies that think carefully about how their content is consumed, and they have decided that making their sites AI-readable is worth the effort.

How llms.txt differs from robots.txt and sitemap.xml

It is easy to confuse these three files, since they all live in a site's root directory and deal with how external systems interact with your content. But they serve fundamentally different purposes.

robots.txt tells search engine crawlers what they are allowed to index. It is a permission file. "You may crawl these pages. You may not crawl those pages." It says nothing about what the content actually is.

sitemap.xml tells search engine crawlers what pages exist and how recently they were updated. It is a discovery file. "Here are all my URLs, organized by priority and modification date." It is written in XML and designed for machines, not humans.

llms.txt tells AI models what your site is about and where the most important content lives. It is a comprehension file. "Here is what we do, here is what matters, and here is where to find it." It is written in markdown and designed to be readable by both humans and AI models.

The distinction matters because search engine crawlers and AI assistants have different needs. A crawler wants to index every public page. An AI assistant wants to understand your site well enough to answer a user's question about it. The crawler needs URLs. The AI needs context.
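To make the contrast concrete, here is how the same documentation page might appear in each of the three files (illustrative excerpts with placeholder URLs, not complete files):

```text
robots.txt (permission):
User-agent: *
Allow: /docs/

sitemap.xml (discovery):
<url>
  <loc>https://example.com/docs/api</loc>
  <lastmod>2026-03-01</lastmod>
</url>

llms.txt (comprehension):
- [API Reference](/docs/api): Complete REST API documentation
```

Only the llms.txt entry tells a reader, human or AI, what is actually on the page.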

What the major AI companies say (and do not say)

Here is an important caveat: none of the major LLM companies (OpenAI, Google, Anthropic) has officially confirmed that its models actively follow llms.txt files when crawling or generating responses. There is no public documentation from any of these companies stating that their systems look for this file.

This matters because it means llms.txt is, as of now, a community-driven standard. It is gaining momentum through adoption rather than through official endorsement by the companies whose models would benefit from it most.

That said, the logic behind the standard is sound. AI models do process web content. They do perform better when that content is well-structured and clearly organized. And markdown is the format these models handle most naturally, as we have explored in depth. Whether or not a model explicitly seeks out llms.txt, the act of creating one forces you to think about your site's content from an AI's perspective. That exercise has value regardless.

The Cloudflare connection: Markdown for Agents

In February 2026, Cloudflare announced a feature called "Markdown for Agents" that adds significant context to the llms.txt conversation. The feature automatically converts HTML pages into clean markdown when an AI agent requests them. According to Cloudflare, this reduces token consumption by roughly 80% compared to raw HTML.

This is a meaningful signal. One of the largest infrastructure companies on the internet is now building tools specifically to make web content more accessible to AI models, and the format they chose for that accessibility layer is markdown.

The connection to llms.txt is direct. While llms.txt provides a curated summary and navigation layer, Cloudflare's Markdown for Agents handles the content itself. Together, they represent a vision of the web where AI models can efficiently discover what a site offers (via llms.txt) and then consume that content in a token-efficient format (via markdown conversion).
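The token savings are easy to demonstrate. The toy converter below (a simplified sketch, not Cloudflare's actual implementation) drops scripts, styles, and navigation chrome from an HTML page and keeps only headings, paragraphs, and links:

```python
from html.parser import HTMLParser

class MarkdownExtractor(HTMLParser):
    """Toy HTML-to-markdown pass: keep headings, paragraphs, and links;
    drop scripts, styles, titles, and navigation chrome."""

    SKIP = {"script", "style", "nav", "title"}

    def __init__(self):
        super().__init__()
        self.out = []
        self.href = None
        self.skipping = 0  # depth inside elements we want to drop

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1
        elif not self.skipping:
            if tag == "h1":
                self.out.append("\n# ")
            elif tag == "h2":
                self.out.append("\n## ")
            elif tag == "a":
                self.href = dict(attrs).get("href")
                self.out.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skipping -= 1
        elif not self.skipping:
            if tag == "a" and self.href:
                self.out.append(f"]({self.href})")
                self.href = None
            elif tag in ("h1", "h2", "p"):
                self.out.append("\n")

    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.out.append(data.strip())

html_page = (
    "<html><head><title>Docs</title><script>trackPageView();</script>"
    "<style>nav { color: red }</style></head><body>"
    "<nav><a href='/'>Home</a> | <a href='/docs'>Docs</a></nav>"
    "<h1>Quick Start</h1>"
    "<p>Install the CLI, then read <a href='/docs/config'>the config guide</a>.</p>"
    "</body></html>"
)

extractor = MarkdownExtractor()
extractor.feed(html_page)
markdown = "".join(extractor.out).strip()
print(markdown)
print(f"{len(html_page)} chars of HTML -> {len(markdown)} chars of markdown")
```

Even on this tiny page, most of the bytes are boilerplate; on a real page, with analytics scripts, CSS frameworks, and layout divs, the reduction is far larger, which is where figures like Cloudflare's roughly 80% come from.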

Agentic Engine Optimization: the new SEO

A new acronym is circulating in marketing and developer circles: AEO, or Agentic Engine Optimization. The idea is straightforward. Just as SEO optimized websites for search engine crawlers, AEO optimizes websites for AI agents.

The shift is more than terminological. When a user asks ChatGPT or Claude a question, the AI does not return ten blue links. It synthesizes an answer. If your website's content is not structured in a way that AI models can understand and summarize, your content may never surface in those answers, regardless of how well it ranks in traditional search results.

llms.txt is one of the first practical tools for AEO. By providing a clear, markdown-formatted overview of your site, you make it easier for AI models to include your content in their responses. You are not gaming an algorithm. You are making your content genuinely more accessible to a new class of information consumers.

This does not replace SEO. Search engines are not going away. But the traffic patterns are shifting. AI-powered search (Google's AI Overviews, Perplexity, ChatGPT with browsing) is capturing an increasing share of informational queries. Optimizing for both traditional search and AI comprehension is becoming a practical necessity.

How to create an llms.txt file for your website

Creating an llms.txt file is straightforward. Here is a step-by-step guide.

Step 1: Identify your most important pages. Think about what an AI assistant would need to know to accurately answer questions about your site. For a SaaS product, that might be your docs, pricing page, and key feature pages. For a blog, it might be your pillar content and about page. For an API, it is your reference docs and getting-started guide.

Step 2: Write the file. Open a text editor and create a markdown file following the structure described above. Start with an H1 of your site name, add a blockquote summary, then organize your key pages into logical sections with H2 headings.

Here is a real-world example for a hypothetical developer tools company:

# Acme Dev Tools

> Acme Dev Tools provides CI/CD pipelines, container orchestration,
> and monitoring for development teams. Founded in 2022, serving
> 12,000 teams worldwide.

## Documentation

- [Quick Start](/docs/quickstart): Get your first pipeline running in 5 minutes
- [Configuration Reference](/docs/config): Complete YAML configuration options
- [API Reference](/docs/api): REST API for programmatic access
- [Integrations](/docs/integrations): Connect with GitHub, GitLab, Bitbucket

## Product

- [Pricing](/pricing): Free tier, Team, and Enterprise plans
- [Changelog](/changelog): Recent updates and new features
- [Status Page](https://status.acme.dev): Current system status

## Resources

- [Blog](/blog): Technical articles and product updates
- [Case Studies](/customers): How teams use Acme Dev Tools
- [Security](/security): SOC 2, encryption, and compliance details

Step 3: Deploy it. Place the file at the root of your website so it is accessible at yoursite.com/llms.txt. If you use a static site generator, put it in your public directory. If you use a CMS, you may need to configure a route or redirect.

Step 4 (optional): Create llms-full.txt. If your key content is not too long, consider creating an llms-full.txt that includes the actual text content of your most important pages. This gives AI models everything they need in a single request.
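Step 4 lends itself to automation. Given the pages you already listed in llms.txt, a short script can concatenate them into llms-full.txt. This is a sketch with placeholder page content; in practice you would load your real markdown sources from disk or your CMS:

```python
def build_llms_full(site_name, pages):
    """Concatenate page contents into a single llms-full.txt document.

    pages: list of (title, path, markdown_content) tuples.
    """
    parts = [f"# {site_name}"]
    for title, path, content in pages:
        # Record each page's source path so the origin of every section is clear.
        parts.append(f"\n## {title}\n\nSource: {path}\n\n{content.strip()}")
    return "\n".join(parts) + "\n"

# Placeholder content for illustration only.
pages = [
    ("Quick Start", "/docs/quickstart",
     "Install the CLI with `acme install`, then run `acme init`."),
    ("Pricing", "/pricing",
     "Free tier for individuals. Team and Enterprise plans for organizations."),
]

full = build_llms_full("Acme Dev Tools", pages)
print(full)
```

Regenerating the file as part of your build or deploy step keeps llms-full.txt from drifting out of sync with the pages it summarizes.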

The bigger picture: markdown as the language of AI

The rise of llms.txt is part of a larger trend. Markdown is steadily becoming the universal interchange format between humans and AI systems.

AI models write in markdown. Developers use markdown for documentation, READMEs, and knowledge bases. The Model Context Protocol (MCP) uses markdown as its primary content format for tool descriptions and responses. Cloudflare is converting HTML to markdown for AI agents. And now, llms.txt uses markdown as the format for helping AI understand entire websites.

This convergence is not accidental. Markdown is lightweight, human-readable, and token-efficient. It strips away the visual presentation layer and preserves the semantic structure that both humans and AI models need. A heading is a heading. A list is a list. A link is a link. There is no ambiguity, no rendering engine required, no parsing complexity.

Unmarkdown™ sits at this intersection. As markdown becomes the language of the AI web, the need to create, format, style, and publish markdown content for both human and AI audiences grows. Whether you are formatting AI output for a Google Doc, publishing a styled document from markdown, or preparing content that AI models will consume, the workflow starts and ends with markdown.

What to watch for

The llms.txt standard is still young. Several things could shape its trajectory in the coming months.

First, official support from major AI companies would be transformative. If OpenAI, Google, or Anthropic announced that their models actively check for llms.txt files, adoption would accelerate dramatically. The fact that Anthropic's own documentation site already has one is suggestive, but not the same as an official protocol commitment.

Second, tooling is catching up. Plugins for WordPress, Next.js, and other frameworks are beginning to appear that auto-generate llms.txt files from existing site structure. As creation becomes automated, the barrier to adoption drops further.

Third, the relationship between llms.txt and AI-powered search will become clearer as more data emerges. If sites with well-crafted llms.txt files see measurably better representation in AI-generated answers, the business case becomes concrete.

For now, the investment required to create an llms.txt file is minimal: an hour of thought about your site's most important content, a few dozen lines of markdown, and a file upload. The potential upside (better AI comprehension of your site as AI-mediated browsing grows) makes it worth the effort.

The web was built for browsers. Then it was optimized for search engines. Now, it is being adapted for AI. And the format that connects all three eras is plain, simple markdown.
