Most of your website visitors aren’t human

Take a look at your website analytics. Maybe traffic is steady, or perhaps it’s trending up. Either way, the numbers you are looking at (your visitors, your sessions, your pageviews) are not necessarily telling you the full story.

While this might not surprise many readers, a significant and growing share of the traffic hitting your website isn’t from humans at all. It comes from bots. Not the traditional hackers, spammers or scripts trying to log in to your website backend, but AI crawlers, training bots, and automated tools that quietly work their way through your website around the clock, at a scale most business owners would find surprising.

This isn’t a new problem, but it has changed dramatically in a very short time. According to research published by Kinsta, analysing more than 10 billion web requests, AI bot traffic surged 300% in a single year. At the start of 2025, roughly 1 in 200 web visits was an AI bot. By the end of the year, that ratio had shifted to 1 in 31 (TollBit State of the Bots Q4 2025).

That is a significant shift in a very short time.

And it matters for reasons that go beyond an interesting statistic. Depending on how your website is built, this traffic could inflate your analytics, slow your site down for real visitors, and quietly consume server resources you’re paying for. All without showing up in any report that would normally raise a flag.

While none of this requires entering panic mode, it does require a clearer picture of what’s actually happening on your website and why the dashboard you’ve been relying on is only showing you part of it.

The shift happened faster than anyone expected

For most of the web’s history, bots have been a background consideration, something only developers were really concerned about. Search engines crawled your site; the occasional scraper, bot, or DDoS attack showed up; and everything else was human traffic. That model held for a long time. It no longer does.

The rise of large language models has created an entirely new category of web crawlers. It is no secret that these systems need content to learn from, to index, and, increasingly, to answer questions in real time. And they get that content by crawling websites. This blog post will probably be crawled by an AI model soon, and so will your website.

The numbers reflect just how quickly this has changed. According to Cloudflare Radar 2025 Year in Review, GPTBot, OpenAI’s training crawler, grew 305% between May 2024 and May 2025 alone.

What makes this relevant for website and business owners is what most of this traffic is actually doing. Around 80% of AI crawling activity is purely for model training, which means it is not sending visitors back to your site. It is not driving enquiries or sales. It is consuming your server resources and appearing in your traffic data without delivering anything in return. The other 20% is more interesting, and we’ll get to that shortly.

The point here is not that AI bots are necessarily a bad thing, or that something has gone wrong with your website. It is simply that the web has changed significantly, and the assumptions and rules built into most analytics setups haven’t kept pace or are only now starting to adapt.

Understanding what is hitting your site is the first step to making sense of the data you’re seeing and the performance issues you might already be experiencing without a clear explanation.

Not all bots are the same

When most people hear “bot traffic,” they picture something vaguely suspicious. The most common assumptions are automated form submissions (kept in check by an annoying reCAPTCHA), fake clicks, or attempts to access the website backend. The reality is more sophisticated nowadays, and understanding the distinction is useful for how you think about your website and your data.

There are roughly three categories of AI-related bot traffic:

Training crawlers (GPTBot, ClaudeBot, Google-Extended and others).

These run in the background on the bot operator’s schedule, crawling the web to build or refresh the training data that AI models learn from. Whether you want them on your site is a legitimate question, but they are not responding to any individual user’s request. They operate entirely independently of what any human is doing right now.

AI search crawlers (OAI-SearchBot, Claude-SearchBot, Perplexity’s index bots).

These build lookup indexes that AI assistants query when answering questions. Still scheduled, still operator-driven, but the output is a reference index rather than model training data. Think of them as the AI equivalent of Googlebot building a catalogue that is later consulted.

On-demand user fetches (ChatGPT-User, Claude-User, and a growing list of others).

This third category is the most interesting and the most misunderstood. These bots only fire because a specific human, right now, asked an AI assistant a question, and the assistant decided to fetch a specific URL to help answer it.

That last category represents something new, and probably more useful for businesses than the rest: a real person is reading your content through an AI intermediary, yet your analytics most likely never records the visit.

Joost de Valk, creator of Yoast SEO, found exactly this on his own site. Analytics: 254 human visitors in 24 hours. Server logs: 536 on-demand AI fetches on top of that, each triggered by a real person asking a real question. Two accurate numbers, two completely different worlds.

Most businesses only look at one of them.
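To see the other number, you need to look at the user-agent strings in your server logs. The sketch below is one hypothetical way to sort them into the three categories above. The function names and the sample strings are illustrative assumptions, and the matching tokens are simply the crawler names mentioned in this article. Real user-agent strings vary and can be spoofed, so treat the result as a rough count rather than a precise measurement.

```python
from collections import Counter

# Tokens are the crawler names mentioned in this article; the list is
# illustrative, not exhaustive. Google-Extended is left out on the assumption
# that it works as a robots.txt control token and does not appear in logs
# as its own user agent.
AI_BOT_CATEGORIES = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "OAI-SearchBot": "search",
    "Claude-SearchBot": "search",
    "ChatGPT-User": "on_demand",
    "Claude-User": "on_demand",
}


def classify_user_agent(user_agent: str) -> str:
    """Return 'training', 'search', 'on_demand', or 'other' for one user agent."""
    ua = user_agent.lower()
    for token, category in AI_BOT_CATEGORIES.items():
        if token.lower() in ua:
            return category
    return "other"


def count_by_category(user_agents) -> Counter:
    """Tally an iterable of user-agent strings by category."""
    return Counter(classify_user_agent(ua) for ua in user_agents)


# Hypothetical, simplified user-agent strings for demonstration only.
sample_user_agents = [
    "Mozilla/5.0 (compatible; GPTBot/1.1)",
    "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
    "Mozilla/5.0 (compatible; Claude-User/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0",
]

print(count_by_category(sample_user_agents))
```

In practice you would feed this the user-agent column from your access log rather than a hard-coded list. The "on_demand" tally is the one that corresponds to real people reading your content through an AI assistant.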

What a well-built site handles quietly in the background

None of what we’ve covered here requires a dramatic response. No need to overhaul your analytics setup overnight or block every bot that knocks on the door. But it does change what “a well-built website” means in 2026, and what you should reasonably expect from the people responsible for yours.

Hosting matters more than most people realise.

Not all hosting environments handle bot traffic the same way. Quality managed hosting actively filters inefficient crawler patterns at the infrastructure level before they ever reach your WordPress or CMS installation. Cheap shared hosting typically doesn’t, which means every request lands on your server, and your site has to deal with it. The difference shows up in page speed for real visitors and server stability during traffic spikes.

How your site is built affects how exposed it is.

WordPress sites with odd URL structures, poorly configured permalink settings, missing 301 redirects, or only basic plugin configuration give bots more surface area to get stuck in.

A well-architected site minimises unnecessary endpoints or URLs, keeps robots.txt properly configured and maintained, and ensures that high-cost pages like cart and checkout URLs aren’t being crawled unnecessarily. These aren’t complicated changes, but they require someone who understands both the technical side and the practical consequences.
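As a rough illustration of the robots.txt point, the sketch below checks a small example policy using Python’s standard urllib.robotparser module. The rules, the /cart/ and /checkout/ paths, and the example.com URLs are assumptions made for the example, not a recommended configuration. Blocking GPTBot here only demonstrates a per-bot rule; whether to block training crawlers is a separate decision. Keep in mind that robots.txt is a request that well-behaved crawlers honour, not an enforcement mechanism.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: keep every crawler out of cart and checkout pages,
# and (purely as an example) keep one training crawler off the whole site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart/
Disallow: /checkout/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named crawler is blocked everywhere under this example policy.
print(parser.can_fetch("GPTBot", "https://example.com/blog/post"))      # False
# Other crawlers may read normal pages...
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))   # True
# ...but not the high-cost cart and checkout URLs.
print(parser.can_fetch("Googlebot", "https://example.com/checkout/"))   # False
```

Testing the rules this way before deploying them is a cheap guard against accidentally blocking pages you want crawled, or leaving expensive ones open.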

Agent readiness is quietly becoming part of the standard.

This one is worth flagging for anyone who takes their web presence seriously. Joost de Valk recently launched specification.website, an open-source project that documents what a good website should do regardless of platform. Alongside the familiar categories such as SEO, accessibility, performance, and security, there is now a dedicated section for agent readiness. Making your content properly structured and readable by AI systems is no longer an experimental consideration. It is becoming a must-have for anyone interested in improving how their site performs in an AI-driven ecosystem.

The broader point is this: the web has always evolved, and the businesses that adapt early tend to fare better than those that wait until the change is impossible to ignore. SSL certificates were once an optional extra. Mobile responsiveness was once a nice-to-have. Bot management and agent readiness are following the same path.

A website that is well-hosted, well-built, and actively maintained handles most of this without you ever needing to think about it.

If you’re not sure whether your current website meets that bar, it’s worth asking the question. Because the gap between a site that handles this well and one that doesn’t is only going to widen from here.

Juan Ruiz

Web Developer & Director

Juan is an experienced web developer with a career spanning multiple industries and roles. Juan leads the web development team building tailored websites, custom applications, and integrations that make a real difference for clients.
