Google's AI Overviews, ChatGPT, and Perplexity are now answering "best [software type]" queries directly. If your SaaS product isn't legible to LLM crawlers, you're invisible in the fastest-growing discovery channel in search.
Ask ChatGPT "best project management tools for SaaS startups" and it will give you a list. Ask Perplexity "what SEO agencies work with Indian B2B SaaS" and it will cite specific companies. Ask Google the same and you might get an AI Overview above all the organic results.
These answers come from somewhere. They come from sources LLMs have indexed and understood well enough to cite with confidence. Most SaaS sites are not set up for this. They have HTML pages full of nav bars, JavaScript components, cookie banners, and marketing copy, all of which LLMs have to fight through to extract useful information.
llms.txt and markdown mirrors solve this. They give LLM crawlers a clean, structured version of your product to read, understand, and cite.
llms.txt is a plain-text file placed at the root of your domain (yourdomain.com/llms.txt) that tells AI language models what your site is about.
The concept was proposed by Jeremy Howard (fast.ai) and follows the same mental model as robots.txt: a simple, convention-based file at a predictable location that communicates intent to automated systems. Where robots.txt tells search bots which pages to crawl, llms.txt tells LLMs how to understand your content.
The format is markdown. Headers structure the document. Links point to key pages. No HTML, no JavaScript, no styling, just content an LLM can consume directly.
Adoption status (May 2026): Perplexity, several Claude crawlers, and AI research tools actively fetch llms.txt. Google has not confirmed support, but the file costs nothing to serve and has no downside. Early adoption creates an asymmetric advantage: you're indexed while competitors aren't.
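The file should answer the five questions an LLM needs to understand your product: what the product is, who it serves, which services or features it offers, what it costs, and where the key pages live.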
Here's a minimal but complete example:
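One way to produce such a file is to render it from a handful of product facts. The Python sketch below assumes a hypothetical product (ExampleApp at example.com) and a layout based on the llms.txt proposal: an H1 title, a blockquote summary, plain paragraphs, then an H2 section of links. The function name and every value in it are illustrative, not part of any standard tooling.

```python
# Sketch: render a minimal llms.txt from a few product facts.
# All names, descriptions, and URLs below are hypothetical placeholders.

def render_llms_txt(name, summary, audience, services, pricing, links):
    """Return llms.txt content as markdown text."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    lines += [f"Audience: {audience}", ""]
    lines += ["Services:", ""]
    lines += [f"- {service}" for service in services]
    lines += ["", f"Pricing: {pricing}", ""]
    # H2 section holding the list of key URLs.
    lines += ["## Key pages", ""]
    lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
    return "\n".join(lines) + "\n"


llms_txt = render_llms_txt(
    name="ExampleApp",
    summary="ExampleApp runs technical SEO audits for B2B SaaS companies.",
    audience="Founders and marketing leads at B2B SaaS companies.",
    services=["Technical SEO audit", "Schema markup setup"],
    pricing="One-time fee per audit; see the pricing page for current rates.",
    links=[
        ("Home", "https://example.com/", "product overview"),
        ("Pricing", "https://example.com/pricing.md", "plans and rates"),
        ("Full content", "https://example.com/llms-full.txt", "complete site mirror"),
    ],
)
print(llms_txt)
```

Write the returned string to llms.txt at your site root, replacing the placeholder values with your own facts.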
Keep it factual and specific. Generic marketing language ("we help businesses grow") is useless to an LLM that needs to determine whether your product is relevant for a specific query. "We run technical SEO audits for B2B SaaS companies, delivered in 48 hours, starting at $2" gives the model what it needs to cite you accurately.
A common companion file is llms-full.txt: a comprehensive content mirror of your entire site in a single file. Think of it as everything an LLM would want to know about your product, without having to crawl individual pages.
A good llms-full.txt includes the full text of your homepage, services page, and about page, plus selected blog posts.
Reference it from llms.txt with a link line, for example `- [Full site content](https://yourdomain.com/llms-full.txt)`.
This lets LLM crawlers fetch the summary first, then pull the full content if they need depth. Same pattern as a sitemap pointing to individual pages.
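Assembling llms-full.txt can be as simple as concatenating the markdown for each page under its own heading. The sketch below is one possible approach, not a prescribed format: the `build_llms_full` function, the "Source:" line, and the sample pages are all assumptions made for illustration.

```python
# Sketch: assemble llms-full.txt from per-page markdown.
# Page titles, URLs, and bodies are hypothetical placeholders.

def build_llms_full(site_name, pages):
    """Join (title, url, markdown_body) tuples into one markdown document."""
    parts = [f"# {site_name}", ""]
    for title, url, body in pages:
        # Each page gets an H2 heading and a pointer back to its HTML source.
        parts += [f"## {title}", "", f"Source: {url}", "", body.strip(), ""]
    return "\n".join(parts)


pages = [
    ("About", "https://example.com/about", "We audit B2B SaaS sites.\n"),
    ("Services", "https://example.com/services", "- Technical SEO audit\n"),
]
llms_full = build_llms_full("ExampleApp", pages)
print(llms_full)
```

Regenerating the file whenever a page changes keeps the mirror from drifting out of date.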
The second piece of the AI indexing puzzle is markdown mirrors: .md versions of your key pages served at predictable URLs alongside the HTML versions.
So yourdomain.com/about gets a sibling at yourdomain.com/about.md. Your services page at /services gets a mirror at /services.md.
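The URL mapping is mechanical, so it can be expressed as a small helper. This sketch assumes the mirror lives at the page path plus `.md`, and that the site root maps to `/index.html.md`; the root rule is an assumption here, so adjust it to whatever your site actually serves.

```python
from urllib.parse import urlsplit, urlunsplit


def mirror_url(page_url):
    """Map an HTML page URL to its assumed markdown sibling."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    # Assumption: a URL with no file name (the site root) mirrors to /index.html.md.
    path = f"{path}.md" if path else "/index.html.md"
    # Query strings and fragments are dropped; the mirror is a static file.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(mirror_url("https://yourdomain.com/about"))
print(mirror_url("https://yourdomain.com/services/"))
```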
Why this matters: LLMs extract information from text, not HTML structure. When an LLM crawler fetches your about page, it gets the full HTML document: navigation, scripts, cookie consent, footer, sidebar, everything. The actual content is buried in there. Markdown mirrors strip all of that and serve clean, structured text that an LLM can parse with zero noise.
| What LLMs see in HTML | What LLMs see in Markdown mirror |
|---|---|
| Navigation links, scripts, cookie banners, styled divs | Clean headings and paragraphs only |
| Inline styles and class names mixed with content | Content-first structure with markdown formatting |
| JavaScript-rendered content may be missing entirely | All content is present in plain text |
| LLM has to infer context from surrounding noise | LLM gets factual, citable content directly |
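To make the contrast in the table concrete, here is a rough sketch of turning an HTML page into a markdown mirror with Python's standard-library `html.parser`. It is deliberately simplistic: the set of "noise" tags is an assumption, only headings, paragraphs, and list items are kept, and anything marked up with generic divs (a cookie banner, for instance) would need site-specific handling.

```python
from html.parser import HTMLParser

# Tags whose entire contents are treated as noise (an assumption of this sketch).
SKIP = {"nav", "script", "style", "footer", "aside", "noscript"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCKS = {"p", "li", *HEADINGS}


class MirrorExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a noise element
        self.prefix = None    # markdown prefix of the open block, if any
        self.buffer = []      # text fragments of the open block
        self.lines = []       # finished markdown lines

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif tag in BLOCKS and not self.skip_depth:
            self.prefix = HEADINGS.get(tag, "- " if tag == "li" else "")
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCKS and self.prefix is not None:
            # Collapse runs of whitespace left over from HTML indentation.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.lines.append(self.prefix + text)
            self.prefix = None

    def handle_data(self, data):
        if not self.skip_depth and self.prefix is not None:
            self.buffer.append(data)


def html_to_mirror(html):
    """Return a markdown-style text mirror of the given HTML."""
    parser = MirrorExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.lines) + "\n"
```

Calling `html_to_mirror(page_html)` on a fetched or rendered page returns text you can save as the page's .md sibling. For production use, a maintained HTML-to-markdown converter is likely a better fit than this sketch.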
In a Next.js project deployed on Vercel, put llms.txt, llms-full.txt, about.md, and services.md in your /public directory. Vercel and Next.js serve everything in /public as static files at the domain root, so no routing configuration is needed.
Upload llms.txt via Webflow's Asset Manager. For markdown files, you have two options: (1) host them on a subdomain or CDN and reference from llms.txt, or (2) create a Webflow CMS page with /llms-full slug serving plain text via a custom template with no nav/footer.
Place llms.txt in the WordPress root directory (same level as wp-config.php). For markdown mirrors, plugins like WP Markdown or custom page templates with Content-Type: text/plain header can serve .md versions.
Drop the files directly into the root directory. If your hosting serves static files, they'll be available at the root URL immediately.
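Whichever platform you use, it is worth confirming that the files actually resolve at the root. The following sketch sends a HEAD request for each file and reports the HTTP status. The file list mirrors the names used in this article, `check_files` and `http_status` are hypothetical helper names, and some hosts reject HEAD requests, in which case a GET would be needed instead.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# File names taken from this article; adjust to match your own site.
FILES = ["llms.txt", "llms-full.txt", "about.md", "services.md"]


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code   # e.g. 404 when the file is not served
    except OSError:
        return None       # DNS failure, refused connection, timeout


def check_files(base_url, fetch=http_status):
    """Map each expected file name to the status returned for its URL."""
    base = base_url.rstrip("/")
    return {name: fetch(f"{base}/{name}") for name in FILES}
```

Calling `check_files("https://yourdomain.com")` returns a dictionary such as file name to status code; anything other than 200 points to a file that is missing or served from the wrong path. The `fetch` parameter exists so the check can be exercised without network access.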
Two things to be clear about:
It won't guarantee AI citations. LLMs decide what to cite based on relevance, credibility, and content quality, not just the presence of a file. llms.txt makes your content more accessible and legible. It doesn't override the underlying quality signal.
It won't replace schema markup. SoftwareApplication, Article, and FAQPage schema are still the primary signals for Google search results and AI Overviews. llms.txt is an additional layer, not a replacement. Both matter.
The right mental model: Schema markup is what tells Google what your page represents. llms.txt is what tells LLM crawlers how your product fits their user's query. They operate on different layers and both should be implemented.
There's a second reason llms.txt matters beyond raw indexing: credibility.
If your SaaS product is positioning itself anywhere near AI search, AI visibility, or staying relevant in a post-Google-AI-Overviews world, not having llms.txt on your own site is a gap. You're telling prospects to optimize for AI search while your own domain doesn't do the basics.
For us at AutoSEOBot, this was non-negotiable. We implement schema, llms.txt, and markdown mirrors as part of our own technical SEO, and we include it as a service for clients. It's table stakes for any SaaS taking AI search seriously in 2026.
- /llms.txt: product description, who you serve, services, pricing, key URLs
- /llms-full.txt: full site content mirror (homepage, services, about, selected blog posts)
- /about.md: clean markdown version of your about page
- /services.md or /pricing.md: feature/pricing breakdown in plain text
- Link llms-full.txt and markdown mirrors from llms.txt
- Add the llms.txt URL to your sitemap.xml (optional but good practice)

We add llms.txt, markdown mirrors, and full schema markup as part of our SEO Health Fix: ₹9,999 one-time, done in 3 days.
llms.txt is a plain-text file at your domain root (yourdomain.com/llms.txt) that tells AI language models what your site is about. It follows the same convention as robots.txt but is designed for LLM crawlers. It includes a product description, target audience, services, pricing, and links to key pages, all in structured markdown.
Adoption is growing. Perplexity, some Claude crawlers, and AI research tools actively fetch llms.txt. Google has not confirmed support, but the file costs nothing to serve and has no downside. The standard was proposed by fast.ai's Jeremy Howard and has gained adoption across developer and SaaS communities in 2025-2026.
A markdown mirror is a .md or .txt version of your page content at a predictable URL, for example yourdomain.com/about.md alongside yourdomain.com/about. LLMs extract information from text, not HTML structure. Serving clean mirrors eliminates all the noise (nav, scripts, cookie banners) so models get accurate, citable content directly.
A product description (factual, not marketing language), your target audience, core features or services as a bullet list, pricing summary, and URLs to your homepage, pricing page, and blog. Link to llms-full.txt for comprehensive content. Keep it updated: stale content makes you less reliable as a citation source.
Place llms.txt in your /public directory. Vercel and Next.js serve everything in /public as static files at the domain root, so no routing configuration is needed. /public/llms.txt becomes yourdomain.com/llms.txt automatically.
No. SoftwareApplication, Article, and FAQPage schema are still the primary signals for Google search and AI Overviews. llms.txt is an additional layer for LLM crawlers, not a replacement for structured data. Both should be implemented. Schema for Google; llms.txt for LLM crawlers.