WordPress Honeypot Detection: How to Catch AI Crawlers That Ignore robots.txt

Honeypot traps catch bots that ignore robots.txt by planting invisible links that only crawlers will follow. Here's how the technique works and how to set it up on WordPress.

Not all AI bots respect robots.txt. Of the 60+ AI crawlers active in 2026, roughly 13% have been observed ignoring robots.txt directives, and an unknown number of disguised crawlers never check it at all. When a crawler ignores your disallow rules, you need a way to detect it, and that’s exactly what honeypot traps are designed to do.

Honeypot detection is one of the oldest techniques in web security, adapted here for the specific problem of AI crawler identification. The principle is simple, the implementation is small, and false positives are rare when the trap is set up carefully.

How Honeypot Detection Works

A honeypot trap works in three steps:

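Step 1: Plant the bait. A hidden link is injected into your site’s HTML. This link points to a path that doesn’t appear in your navigation, sitemap, or any visible content. To single out crawlers that ignore robots.txt, the path should also be disallowed there so that compliant crawlers stay away. The link is hidden from human visitors using CSS (e.g., display: none, off-screen positioning, or zero-opacity styling), and ideally from screen readers and keyboard focus as well.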
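As a rough illustration of Step 1, the sketch below generates a random trap path and the hidden anchor that points to it. The function names, the "archive" link text, and the exact attributes are assumptions made for this example; only the /_ai-honeypot/ prefix is taken from the example path used later in this article, and nothing here describes how any particular plugin builds its link.

```python
import secrets
from html import escape


def make_honeypot_path(prefix: str = "/_ai-honeypot") -> str:
    """Return a random, hard-to-guess trap path such as /_ai-honeypot/f8a3b1/."""
    token = secrets.token_hex(3)  # 3 random bytes -> 6 hex characters
    return f"{prefix}/{token}/"


def make_hidden_link(path: str) -> str:
    """Build an anchor that stays in the HTML source but is hidden from people.

    display:none hides it visually, while aria-hidden and tabindex=-1 keep it
    away from screen readers and keyboard navigation (hypothetical choices).
    """
    return (
        f'<a href="{escape(path, quote=True)}" '
        'style="display:none" aria-hidden="true" tabindex="-1" rel="nofollow">'
        "archive</a>"
    )


path = make_honeypot_path()
print(make_hidden_link(path))
```

A random token makes the path hard to guess, so a request for it is unlikely to come from anything other than a client that read the link out of the page source.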

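Step 2: Wait for bots. Human visitors never see the link, so they never click it. But bots that parse raw HTML will find it and follow it, because most crawlers read the markup without applying CSS. To a crawler reading your page’s source code, the hidden link looks like any other link on the page. Crawlers that drive a full headless browser can detect hidden elements and may skip the link, so the trap is not guaranteed to catch every bot.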
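To see why the hidden link is visible to a crawler, here is a small sketch of how a simple bot might extract links from raw HTML using Python's standard html.parser module. The page snippet and the LinkCollector class are made up for illustration; real crawlers vary widely in how they parse pages.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect every href found in <a> tags, ignoring all styling."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page: one visible link and one link hidden with CSS.
page = """
<nav><a href="/blog/">Blog</a></nav>
<a href="/_ai-honeypot/f8a3b1/" style="display:none" aria-hidden="true">archive</a>
"""

collector = LinkCollector()
collector.feed(page)
print(collector.links)  # ['/blog/', '/_ai-honeypot/f8a3b1/']
```

The parser never evaluates the style attribute, so both links end up in the crawl queue on equal footing.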

Step 3: Catch the crawler. When a request arrives at the honeypot path, you can be confident that it came from an automated client, not a human. No human visitor would reach a path that’s invisible on the page, so the bot has revealed itself.
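On the server side, Step 3 can be as small as a path check. The following is a minimal sketch that assumes all trap URLs live under a single prefix; HONEYPOT_PREFIX and is_honeypot_hit are hypothetical names, not part of any plugin API.

```python
from urllib.parse import urlsplit

# Assumed prefix shared by all trap URLs on this site.
HONEYPOT_PREFIX = "/_ai-honeypot/"


def is_honeypot_hit(request_target: str) -> bool:
    """Return True when the requested URL falls under the trap prefix.

    urlsplit drops the query string and fragment, so a bot cannot dodge
    the check by appending parameters.
    """
    return urlsplit(request_target).path.startswith(HONEYPOT_PREFIX)


print(is_honeypot_hit("/_ai-honeypot/f8a3b1/?page=2"))  # True
print(is_honeypot_hit("/blog/honeypot-detection/"))     # False
```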

Why It Works So Well

The strength of honeypot detection is its near-zero false positive rate. Unlike user-agent matching (which relies on bots honestly identifying themselves) or rate limiting (which can accidentally catch humans who browse quickly), honeypot traps only trigger when something follows an invisible link. Humans can’t follow a link they can’t see, provided the link is also hidden from screen readers and keyboard navigation.

This makes it particularly effective against AI bots that disguise their user-agent or hide where they connect from.

The honeypot doesn’t care what the bot calls itself or where it connects from. It catches behavior, not identity.

Setting Up Honeypot Detection on WordPress

You can implement a basic honeypot manually by creating a hidden page and logging requests to it. But maintaining this yourself means building the link injection, request logging, bot identification, and response logic from scratch.

AI Bot Tracker handles all of this automatically:

  1. Install and activate the plugin from the WordPress plugin directory
  2. The honeypot deploys immediately — a hidden link is injected into your pages pointing to a randomly generated path (e.g., /_ai-honeypot/f8a3b1/)
  3. Every bot that follows the link is logged with its user-agent, IP hash, timestamp, and the URL it was crawling when it found the honeypot
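To make the logged fields concrete, here is a sketch of what one log record could look like. It is not the plugin's actual schema: the HoneypotHit class, the salted SHA-256 hashing, and the use of the Referer header as "the URL it was crawling" are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class HoneypotHit:
    user_agent: str
    ip_hash: str      # hash of the IP, so the raw address is never stored
    timestamp: str    # ISO 8601, UTC
    source_url: str   # page the bot was on when it found the link


def hash_ip(ip: str, salt: str) -> str:
    """Salted SHA-256 of the IP address (hypothetical hashing scheme)."""
    return hashlib.sha256(f"{salt}:{ip}".encode("utf-8")).hexdigest()


def record_hit(user_agent: str, ip: str, referer: str, salt: str = "change-me") -> HoneypotHit:
    """Build one log record; many bots send no Referer, so it may be empty."""
    return HoneypotHit(
        user_agent=user_agent,
        ip_hash=hash_ip(ip, salt),
        timestamp=datetime.now(timezone.utc).isoformat(),
        source_url=referer or "",
    )


hit = record_hit("ExampleBot/1.0", "203.0.113.7", "https://example.com/blog/")
print(hit)
```

Hashing with a site-specific salt lets repeated hits from the same address be grouped without keeping the address itself.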

No configuration is needed. The honeypot is active the moment you activate the plugin.

What Happens After a Bot Is Caught

On the free Monitor tier, honeypot hits are logged and displayed in your dashboard. You can see which bots are following hidden links, how often, and which pages they were crawling.

On the Protect tier and above, you can configure what happens when a bot hits the honeypot. The available response strategies range from logging to tarpitting to shadowbanning, and each serves a different purpose.

You can also enable auto-blocking, which automatically applies your chosen response strategy to any bot that hits the honeypot. Unknown bots are blocked on the first hit. Known bots get a configurable threshold (default 3 hits) before being blocked — just in case a legitimate crawler accidentally follows the link once.
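The threshold rule described above can be sketched as a small counter. This is an illustrative model only: the AutoBlocker class is hypothetical, and it assumes that "default 3 hits" means a known bot is blocked on its third hit.

```python
from collections import defaultdict

KNOWN_BOT_THRESHOLD = 3  # default from the article; assumed to mean "block on the 3rd hit"


class AutoBlocker:
    """Track honeypot hits per bot and decide when to block (hypothetical model)."""

    def __init__(self, known_bots, threshold: int = KNOWN_BOT_THRESHOLD) -> None:
        self.known_bots = set(known_bots)
        self.threshold = threshold
        self.hits = defaultdict(int)

    def register_hit(self, bot_id: str) -> bool:
        """Count one honeypot hit and return True if the bot should now be blocked."""
        self.hits[bot_id] += 1
        if bot_id not in self.known_bots:
            return True  # unknown bots are blocked on the first hit
        return self.hits[bot_id] >= self.threshold


blocker = AutoBlocker(known_bots={"ExampleKnownBot"})
print(blocker.register_hit("mystery-crawler"))                       # True
print([blocker.register_hit("ExampleKnownBot") for _ in range(3)])  # [False, False, True]
```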

Advanced: Custom Honeypot Paths

The default auto-generated honeypot path works well for most sites. But on the Protect tier and above, you can configure up to 5 custom honeypot paths. This is useful for:

Which Bots Get Caught?

Based on aggregated honeypot data from AI Bot Tracker installations, the most common bots caught by honeypot traps fall into three categories:

The honeypot is particularly valuable for catching the first category. As more AI companies train models using data collected through disguised crawlers and residential proxy networks, behavioral detection becomes one of the few reliable defenses against these hidden crawlers.

Honeypot + robots.txt: Complementary Tools

Honeypot detection doesn’t replace robots.txt; it complements it. Think of it as the detection layer within a broader bot management strategy.

Together, these layers give you policy declaration, violation detection, and enforcement. Use robots.txt to set your boundaries. Use honeypot detection to find out who’s crossing them. And use response strategies to decide what happens to violators — from logging to tarpitting to shadowbanning.
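One way to picture how the two tools fit together: if the trap path is disallowed in robots.txt, a compliant crawler will never request it, so any hit comes from a client that skipped or ignored the file. The sketch below checks this with Python's standard urllib.robotparser. The robots.txt content is an assumed example, and this article does not say whether AI Bot Tracker writes such a rule for you.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for illustration: the trap prefix is off-limits to everyone.
ROBOTS_TXT = """\
User-agent: *
Disallow: /_ai-honeypot/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def compliant_crawler_may_fetch(path: str, user_agent: str = "*") -> bool:
    """Answer the question a rule-following crawler asks before each request."""
    return parser.can_fetch(user_agent, path)


print(compliant_crawler_may_fetch("/_ai-honeypot/f8a3b1/"))  # False
print(compliant_crawler_may_fetch("/blog/"))                 # True
```

A crawler that honors this file stays out of the trap, which is what lets a honeypot hit be read as a policy violation.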

For the full details on configuring honeypot detection, see the documentation.
