AI Crawlers and Robots.txt: What to Block, What to Allow

John Kyprianou
February 13, 2026

AI crawlers are now a normal part of technical SEO. Alongside classic search bots (Googlebot, Bingbot), sites are seeing more traffic from model-related agents such as ClaudeBot, GPTBot, OAI-SearchBot, and PerplexityBot.

The challenge is simple: do you want these systems to access your content, or not?

This guide explains how to decide, and how to implement a clean policy in robots.txt without accidentally harming your core search traffic.

What AI Crawlers Actually Do

Not all AI-related user agents do the same thing. In practice, they usually fall into three buckets:

  1. Crawlers that fetch pages for indexing or training-related pipelines.
  2. Crawlers that support live retrieval for AI answers and citations.
  3. Control tokens, such as Google-Extended, that do not crawl on their own but signal usage preferences for AI features.

For most websites, the operational question is still binary: allow or disallow.

Should You Block AI Crawlers?

There is no universal answer. Decide based on your business goals:

Usually block if:

  • You publish proprietary content that you do not want reused in AI-generated outputs.
  • You run a subscription or paywalled content business and want tighter control.
  • You are seeing heavy crawl load with little referral value from AI platforms.

Usually allow if:

  • You want visibility and citations in AI assistants.
  • You publish educational content, comparison pages, or thought leadership.
  • You are actively investing in AI search optimization.

For many brands, allowing selective access is now part of demand generation. If your content cannot be fetched, it cannot be cited.

Safe Robots.txt Pattern

If your policy is to block common AI crawlers, add explicit blocks like:

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

Then keep your normal baseline section for everything else:

User-agent: *
Disallow:

This structure is clear, auditable, and easy to update as policies change.
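
One way to sanity-check a policy like this before deploying it is Python's standard-library urllib.robotparser. The sketch below is only an illustration under stated assumptions: ROBOTS_TXT is an abbreviated copy of the rules above, example.com is a placeholder URL, and the helper name blocked_agents is hypothetical. Real crawlers apply their own matching logic, so treat the result as a quick check rather than proof of how any given bot will behave.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the policy above (two AI agents plus the wildcard baseline).
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return the agents from `agents` that may NOT fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    # Only the explicitly blocked AI agents should be listed;
    # Googlebot and Bingbot fall through to the allow-all wildcard group.
    print(blocked_agents(ROBOTS_TXT, ["ClaudeBot", "GPTBot", "Googlebot", "Bingbot"]))
```

If Googlebot or Bingbot ever shows up in the returned list, the wildcard baseline has probably been changed by mistake.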

Common Mistakes to Avoid

  1. Blocking User-agent: * by accident.
    If you set Disallow: / under the wildcard group, every crawler that has no group of its own, including Googlebot and Bingbot, is told to stay off the entire site.

  2. Mixing contradictory rules without intent.
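    A crawler follows the group that names it and falls back to the wildcard group only when no named group matches. Set your specific AI-agent rules explicitly, then define a clear wildcard baseline.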

  3. Forgetting crawl monitoring after policy changes.
    Use logs and crawl stats to confirm your directives are being respected.

  4. Treating robots.txt as security.
    robots.txt is a crawl directive, not an access control mechanism. Sensitive data should be protected at the server/application layer.
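
For the monitoring point in item 3, a rough count of AI-crawler requests in your access logs is often enough to see whether a policy change had any effect. The sketch below is a minimal example, not a full log parser: it assumes the user-agent string appears somewhere in each log line, the function name count_ai_hits is hypothetical, and the sample lines are made up for illustration. User-agent strings can be spoofed, so read the counts as an indication only.

```python
from collections import Counter

# User-agent tokens named in this article; extend the tuple to match your policy.
AI_AGENT_TOKENS = (
    "ClaudeBot", "anthropic-ai", "GPTBot", "OAI-SearchBot",
    "ChatGPT-User", "PerplexityBot", "Perplexity-User",
)


def count_ai_hits(log_lines, tokens=AI_AGENT_TOKENS):
    """Count log lines that mention each AI user-agent token (case-insensitive)."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in tokens:
            if token.lower() in lowered:
                hits[token] += 1
                break  # attribute each request to at most one token
    return hits


if __name__ == "__main__":
    # Made-up sample lines; in practice, iterate over your real access log file.
    sample = [
        '203.0.113.5 "GET /pricing HTTP/1.1" 200 "ExampleAgent (compatible; GPTBot)"',
        '203.0.113.9 "GET /robots.txt HTTP/1.1" 200 "ExampleAgent (compatible; ClaudeBot)"',
        '203.0.113.7 "GET /blog HTTP/1.1" 200 "ExampleAgent (compatible; Googlebot)"',
    ]
    for token, count in count_ai_hits(sample).most_common():
        print(f"{token}: {count}")
```

Comparing these counts for the weeks before and after a robots.txt change shows whether the named agents are respecting the new rules.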

A Practical Policy Framework

Use a simple review framework every quarter:

  1. Business value: Are AI platforms sending qualified traffic or branded searches?
  2. Content risk: Is your content highly sensitive or commercially unique?
  3. Infrastructure cost: Is crawler load causing measurable performance issues?
  4. Brand strategy: Do you want broader AI mention visibility this quarter?

If value outweighs risk, allow selectively.
If risk outweighs value, block the relevant agents and revisit the decision at the next quarterly review.

How We Handle It in the Robots.txt Generator

In our robots.txt generator, you can now use the Block AI crawlers toggle to automatically add disallow rules for common AI agents while keeping your default crawler policy intact.

That gives non-technical teams a safer way to apply policy without manually editing syntax.
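
To make the idea concrete, here is a minimal sketch of what a toggle like this could emit. It is not the generator's actual implementation: the function name build_robots_txt, its parameters, and the agent list (copied from the example policy earlier in this article) are assumptions for illustration.

```python
# Agents copied from the example policy earlier in this article.
AI_AGENTS = (
    "ClaudeBot", "anthropic-ai", "GPTBot", "OAI-SearchBot",
    "ChatGPT-User", "PerplexityBot", "Perplexity-User",
)


def build_robots_txt(block_ai=True, ai_agents=AI_AGENTS):
    """Return robots.txt text: optional AI block groups, then an allow-all baseline."""
    groups = []
    if block_ai:
        for agent in ai_agents:
            groups.append(f"User-agent: {agent}\nDisallow: /")
    # The wildcard baseline stays the same whether or not the toggle is on.
    groups.append("User-agent: *\nDisallow:")
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Selective policy: block only two agents and leave the rest on the baseline.
    print(build_robots_txt(block_ai=True, ai_agents=("GPTBot", "ClaudeBot")))
```

Passing a shorter ai_agents tuple is one way to express a selective policy, where some AI agents are blocked and the others fall through to the wildcard group.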

Final Recommendation

Treat AI crawler policy like any other SEO control: test, measure, iterate.

Do not choose a permanent stance once and forget it. AI referral patterns, crawler behavior, and model ecosystems are still changing quickly in 2026.

If you want help deciding whether to block or allow specific AI agents for your site, request a free SEO review and we can map policy to your goals.

John Kyprianou

Founder & SEO Strategist

John brings over a decade of experience in SEO and digital marketing. With expertise in technical SEO, content strategy, and data analytics, he helps businesses achieve sustainable growth through search.