Your Managed WordPress Host Might Be Blocking AI Bots Without Telling You

Some managed WordPress hosts block AI crawlers at the infrastructure level — before your robots.txt is even checked. Here's how to find out if it's happening to you.

You configured your robots.txt. You made deliberate decisions about which AI crawlers to allow and which to block. You might even have a plugin managing per-bot response strategies.

None of that matters if your hosting provider is blocking AI bots before the request ever reaches WordPress.

This is happening right now on several major managed WordPress platforms. AI crawlers are being intercepted at the infrastructure level and rejected before your server even processes the request. Your robots.txt is never read. Your plugins never fire.

The Infrastructure Layer You Don’t Control

Managed WordPress hosting works by adding layers of infrastructure between the public internet and your WordPress installation. A CDN like Cloudflare, a web application firewall (WAF), caching services — these all handle requests before they reach your server.

This is normally a good thing. It blocks malicious traffic and improves performance. But it also means your hosting provider can filter traffic based on their own policies, not yours.

When a managed host decides to block AI crawlers at the infrastructure level, the block happens upstream of WordPress. The request never touches PHP. It never checks your robots.txt.
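To make the ordering concrete, here is a toy model of that request path. It is only a sketch: the block list, the `Origin` class, and `edge_then_origin` are hypothetical names invented for illustration, and no real host's WAF is implemented this way.

```python
from dataclasses import dataclass, field

# Hypothetical edge block list; a real host's WAF ruleset is not public.
EDGE_BLOCKED_AGENTS = ("GPTBot", "ClaudeBot")


@dataclass
class Origin:
    """Stands in for WordPress: it only sees requests that reach it."""
    seen_agents: list = field(default_factory=list)

    def handle(self, user_agent: str, path: str) -> int:
        # This is the only place a plugin could log the visit.
        self.seen_agents.append(user_agent)
        return 200


def edge_then_origin(origin: Origin, user_agent: str, path: str) -> int:
    """Apply the edge filter first; forward to the origin only if it passes."""
    if any(bot.lower() in user_agent.lower() for bot in EDGE_BLOCKED_AGENTS):
        return 403  # rejected upstream; origin.handle() is never called
    return origin.handle(user_agent, path)


origin = Origin()
statuses = {
    agent: edge_then_origin(origin, agent, "/robots.txt")
    for agent in ("GPTBot/1.1", "PerplexityBot/1.0", "Mozilla/5.0")
}
print(statuses)            # status code each client received
print(origin.seen_agents)  # what the origin (and any plugin) observed
```

In this model the blocked crawler never appears in `seen_agents`, even though it asked for /robots.txt. From the origin's point of view, the visit never happened.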

Nobody is arguing that blocking bots is inherently wrong. For many sites, reducing AI crawler load is genuinely helpful. The issue is that this is happening without disclosure and without per-site opt-out controls, so site owners have no obvious way to know it’s happening.

Which Hosts Block AI Bots (and Which Don’t)

Not all managed WordPress hosts handle AI crawlers the same way. Here’s what we’ve been able to verify as of May 2026.

WP Engine — Blocks by Default, No Self-Service Toggle

WP Engine blocks AI crawlers at the Cloudflare WAF layer. In 2025, WP Engine reported mitigating 75 billion bot requests across its platform. AI training crawlers like GPTBot, ClaudeBot, and Google-Extended are filtered before they reach WordPress.

There is no user-facing toggle to disable this blocking. If you’re on WP Engine and want to allow a specific AI crawler, you need to contact their support team. Your robots.txt rules for AI bots are never consulted — the crawlers are rejected before they get far enough to read them.

WP Engine is one of the largest managed WordPress hosts, powering over 1.5 million sites. That is a lot of infrastructure where AI crawlers are silently blocked with no self-service opt-out.

Flywheel — No Documented AI Bot Blocking

Flywheel was acquired by WP Engine in 2019 and now shares the same underlying Google Cloud infrastructure, though it keeps a separate management dashboard. As of this writing, Flywheel has no documented AI bot blocking policy. Sites on Flywheel may receive AI crawler traffic that identical sites on WP Engine do not. Whether WP Engine’s platform-level filtering also applies to Flywheel sites is an open question given the shared infrastructure.

SiteGround — Blocks Training Crawlers, More Transparent

SiteGround blocks AI training crawlers by default, but takes a more transparent approach. They distinguish between training bots (GPTBot, ClaudeBot, Google-Extended) and user-action bots (ChatGPT-User, OAI-SearchBot, PerplexityBot). Training crawlers are blocked. User-facing search bots are allowed through.

This distinction matters. Training crawlers collect your content for model training, a one-way extraction with no direct benefit to your site. Search bots fetch content in response to a user query and typically cite or link back to your site. Blocking training bots while allowing search bots is a reasonable default, and SiteGround is more upfront about doing it.

That said, opting out of SiteGround’s training bot blocking still requires contacting their support team. There is no self-service dashboard toggle here either.

Kinsta — Does Not Block by Default

Kinsta provides bot protection tools but leaves the decision to site owners. AI crawlers are not blocked by default. If you’re on Kinsta, your robots.txt and any bot management plugins you’ve installed are the actual gatekeepers — which is how most site owners assume things already work everywhere.

Pressable — Does Not Block by Default

Pressable similarly does not block AI bots at the infrastructure level. Site owners retain control over their bot access policies.

Cloudflare (Direct) — Free Toggle, You Decide

Cloudflare offers an “AI Scrapers and Crawlers” toggle on all plans, including the free tier. When enabled, it blocks known AI bot traffic at the network edge. The difference from managed hosting: this is an opt-in feature that you control. You can see it and turn it on or off whenever you want.

If your managed host uses Cloudflare as its CDN (as WP Engine does), the host may have this toggle enabled at the platform level — without exposing the setting to you.

Xserver (Japan) — One-Click Feature, Off by Default

Japan’s Xserver launched a one-click AI crawler blocking feature in January 2026, the first major Japanese host to offer it. The feature blocks 22 known AI crawler user agents and is turned off by default. Site owners opt in explicitly. The tool exists, it’s documented, and the default is permissive. That’s how it should work.

Why This Matters More Than You Think

Silent AI bot blocking creates a specific problem: you lose AI search visibility without knowing it. You can’t troubleshoot what you can’t see.

The Training vs. Search Bot Split

There are over 60 distinct AI bot user agents active today, and they fall into two broad categories:

Training crawlers (GPTBot, ClaudeBot, Google-Extended, CCBot) collect content for model training. Your content improves the model, but you don’t get traffic or citations in return.

Search and retrieval bots (OAI-SearchBot, ChatGPT-User, Claude-SearchBot, PerplexityBot) fetch content in response to user queries. These bots cite your pages and send referral traffic. Referral volume is small compared with crawl volume: Cloudflare’s Q1 2026 data shows roughly 20,583 ClaudeBot crawl requests for every referral that comes back. That’s a steep ratio, but it still represents real, measurable traffic.

That traffic also converts well. Multiple Q1 2026 analyses found that AI-referred visitors convert at 4.4x the rate of standard organic search traffic, particularly for informational queries. These are users who got a direct recommendation to visit your site from an AI assistant. They arrive with high intent.

When a managed host blocks all AI crawlers at the infrastructure level, it often catches both categories. Your site disappears from AI search results and AI-generated recommendations. You never see a decline in bot traffic because the traffic was never visible to begin with. Your robots.txt, where you may have deliberately allowed certain AI crawlers, is never consulted.

robots.txt Becomes Decorative

The entire premise of robots.txt-based bot management assumes that crawlers reach your server and read the file. When infrastructure-level blocking intercepts requests upstream, your robots.txt is irrelevant.

If you’ve configured per-bot rules in robots.txt, or set up ai.txt or llms.txt files to declare nuanced AI access policies, none of those declarations are being read. The bots your host is blocking never see them. You’ve done the work of making informed decisions, but your hosting provider has overridden them without telling you.
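For reference, this is roughly what per-bot robots.txt rules look like and how a compliant crawler would evaluate them. The sketch uses Python's standard `urllib.robotparser`; the rules and the example.com URL are made up for illustration and are not a recommended policy.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only: block one training crawler, allow one search bot.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str = "https://example.com/blog/post/") -> bool:
    """Return True if the parsed rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


for agent in ("GPTBot", "PerplexityBot", "Mozilla"):
    print(agent, allowed(agent))
```

The evaluation runs on the crawler's side, after it has downloaded the file. If the request for /robots.txt is rejected at the edge, this step never happens and the Allow rule for PerplexityBot has no effect.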

Plugin-Level Bot Management Fails Silently

The same applies to WordPress plugins that manage AI bot access. If you’re running a bot management plugin like AI Bot Tracker, it operates at the PHP/WordPress layer. It can only detect and respond to requests that actually reach WordPress.

When your host blocks a bot at the CDN or WAF layer, the request never arrives at WordPress. Your plugin never sees it. No log entry, no response strategy, nothing. From the plugin’s perspective, the bot simply never visited.

This creates a blind spot. Your bot tracking dashboard shows zero visits from certain crawlers, and you interpret that as “those bots aren’t interested in my site.” In reality, those bots may be hitting your site hundreds of times per day and getting rejected before you ever see them.

How to Find Out If Your Host Is Blocking AI Bots

There’s no single definitive test, but a combination of signals can give you a clear answer.

1. Check Your Host’s Documentation

Search your hosting provider’s knowledge base for terms like “AI bots,” “AI crawlers,” “bot protection,” and “web scraping.” If the documentation is silent on the topic, that doesn’t mean it isn’t happening — it may just mean it isn’t documented.

2. Compare Server Logs to Plugin Logs

If you have access to raw server or access logs (not all managed hosts provide this), check for AI bot user agents like GPTBot, ClaudeBot, or Bytespider. If your raw logs show zero AI bot requests but you know those bots are actively crawling millions of sites, the requests are likely being filtered upstream.
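If you can download your access log, a short script can do the counting. The sketch below assumes the common "combined" log format, where the user agent is the last quoted field. The sample lines are invented, and the training/search grouping is our own labeling based on the categories in this article (Bytespider's category is an assumption).

```python
import re
from collections import Counter

# Grouping is an assumption for illustration, not an official taxonomy.
# Google-Extended is omitted: it is a robots.txt control token and, as far
# as we know, does not show up as a request user agent in logs.
AI_BOTS = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "CCBot": "training",
    "Bytespider": "training",
    "OAI-SearchBot": "search",
    "ChatGPT-User": "search",
    "Claude-SearchBot": "search",
    "PerplexityBot": "search",
}

# In combined log format the user agent is the final quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_bots(lines):
    """Count requests per known AI bot name found in the user-agent field."""
    counts = Counter()
    for line in lines:
        match = UA_FIELD.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for bot in AI_BOTS:
            if bot.lower() in user_agent:
                counts[bot] += 1
                break
    return counts


# Invented sample lines; replace with lines read from your own log file.
sample = [
    '203.0.113.5 - - [01/May/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.9 - - [01/May/2026:10:00:02 +0000] "GET /blog/ HTTP/1.1" 200 7300 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [01/May/2026:10:00:05 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0"',
]

counts = count_ai_bots(sample)
by_category = Counter()
for bot, hits in counts.items():
    by_category[AI_BOTS[bot]] += hits

print(dict(counts))
print(dict(by_category))
```

To run it against a real file, pass an open file handle to `count_ai_bots` instead of `sample`. An empty result on a site that has been live for weeks is the signal to look for.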

On a host that doesn’t block AI bots, AI Bot Tracker typically detects 5 to 15 distinct AI bot user agents within the first week. If you’re seeing none, or only one or two, after a week of monitoring, infrastructure-level blocking is the likely explanation.

3. Test From an External Perspective

Use a service that fetches your site with AI bot user agents and reports the response. If the response is a 403 or a Cloudflare challenge page when you request your homepage as GPTBot, but a normal 200 when you request it as a regular browser, your host is filtering based on user-agent at the edge.
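You can also run a rough version of this check yourself. The sketch below uses only Python's standard library. The user-agent strings are simplified stand-ins, and the set of status codes treated as "blocked" is an assumption. Some edge filters also verify a bot's source IP, so a spoofed user agent from your own machine may be treated differently from the real crawler. Treat the result as one signal, not proof.

```python
import urllib.error
import urllib.request

# Simplified, illustrative user-agent strings (not the bots' exact strings).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36"
BOT_UA = "Mozilla/5.0 (compatible; GPTBot/1.1)"

# Assumption: statuses that commonly indicate a block or challenge.
BLOCK_LIKE = (403, 429, 503)


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Request `url` with the given User-Agent and return the HTTP status."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions


def interpret(browser_status: int, bot_status: int) -> str:
    """Turn the pair of status codes into a rough verdict."""
    if browser_status == 200 and bot_status in BLOCK_LIKE:
        return "likely user-agent filtering"
    if browser_status == bot_status:
        return "no difference by user agent"
    return "inconclusive"


# Example verdicts for status pairs you might observe.
print(interpret(200, 403))
print(interpret(200, 200))
```

To test a live site, call `interpret(fetch_status(url, BROWSER_UA), fetch_status(url, BOT_UA))` with your own homepage URL.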

4. Contact Support Directly

Ask your hosting provider: “Does your platform block AI web crawlers at the CDN or WAF level? If so, which crawlers are blocked, and can I opt out for specific bots?” The answer will tell you what you need to know. So will an evasive non-answer.

What to Do About It

Your response depends on your goals.

If You Want AI Bots Blocked

If your primary concern is preventing AI training crawlers from accessing your content, your host doing this for you might be exactly what you want. Just be aware that you’re also likely losing AI search visibility. And platform-level blocks may not catch newer crawlers that aren’t in the WAF’s ruleset yet.

Even with host-level blocking, running a detection plugin is worthwhile. Tracking which bots visit your site tells you what’s getting through. If a bot bypasses your host’s WAF, your plugin catches it. If nothing gets through, the empty dashboard confirms the blocking is working.

If You Want Some AI Bots Allowed

This is where silent blocking becomes a real problem. You’ve decided that certain crawlers — say, PerplexityBot for search citations or ChatGPT-User for real-time browsing — should be allowed to access your site. But your host is blocking them upstream, and there’s no toggle for you to flip.

Your options:

  1. Contact your host and request an exception for specific user agents. Some hosts will accommodate this; others won’t.
  2. Switch to a host that gives you control. Kinsta and Pressable both leave bot management decisions to site owners. If granular control over AI bot access is important to your content strategy, your hosting provider’s bot policy should be part of your evaluation criteria.
  3. Use Cloudflare directly (not through your host) so you control the WAF rules. This requires pointing your DNS to Cloudflare and may not be compatible with all managed hosting setups.

If You Want Full Visibility Regardless

Install AI Bot Tracker on your site. On a host that doesn’t block AI bots upstream, you’ll see the full picture within a few days — which bots visit, how often, and which pages they target. On a host that does block upstream, the absence of expected bot traffic in your dashboard is itself a signal that infrastructure-level filtering is happening.

From there, you can set up per-bot response strategies for the crawlers that do reach WordPress — and use honeypot detection to catch disguised crawlers that slip through user-agent filters.

The point is to make these decisions yourself, with full visibility, rather than having them made for you by default.

The Broader Trend

This is not going away. AI crawling volume is increasing, and more hosting providers will implement platform-level bot filtering. The bandwidth costs are real. Hosts have a financial incentive to reduce AI bot traffic across their networks.

The question is whether the industry moves toward the Xserver/Kinsta model (clear tools, off by default, site owner decides) or the model where blocking is on by default with limited self-service opt-out and minimal documentation.

For site owners: don’t assume your robots.txt is the final word on AI bot access. Check what your hosting provider is doing at the infrastructure level. If you’re on managed WordPress hosting, there’s a real chance that AI crawlers are being filtered before they reach your site, and your bot management strategy is running on incomplete data.

Visibility is the first step. Once you know what’s actually happening, you can decide what should happen.

Try AI Bot Tracker — Free on WordPress.org

Detect, monitor, and respond to AI crawlers on your WordPress site. Full bot detection is free forever.
