A recent analysis of internet usage discovered that in the first quarter of 2016, nonhuman sources accounted for roughly 48 percent of all traffic to thousands of sites. These nonhuman visitors might be search engine crawlers indexing a site for Google, harmless scripts that automate routine processes, or malicious code doing everything from click fraud to full-blown cyberattacks.
Whatever they are, though, they aren’t human. For anyone attempting to get pure customer analytics for a website, that’s a problem.
The Big Bot Problem
Attributing nearly half of online traffic to bots and crawlers sounds outrageous to anyone who has spent time poring over website analytics services. Most of these services – such as Google Analytics and Adobe Analytics – filter out known bot and crawler traffic before the number ever reaches the user, which can give both marketers and analysts false impressions of just how much nonhuman website traffic exists.
By their very nature, bots and crawlers are more active than humans. Bots never sleep, and our internal research recently found that they produce 80 times the clicks and traffic that humans do. For every click a human makes, a bot makes 80, further complicating the quest for reliable analytics.
Today’s bots are far more advanced than those from just a few years ago. They are low-cost, automated tools that can mimic human behavior. They’re no longer limited to click fraud – now they’re able to comment on content, download software, retweet messages, and even purchase products.
Modern bots can do all this because they have become better at evading the filters designed to weed out nonhuman traffic, but they can still be caught. Unfortunately, many ad networks, affiliates, and publishers are reluctant to cut bots from their reports because the inflated numbers look better, which has a decidedly negative effect on advertising ROI.
Bots Behaving Badly
Cost per acquisition and cost per lead are the only advertising models that rely more heavily on the advertisers' own data than on the data of the advertising network or ad platform. Fake bot clicks and impressions inflate cost-per-click and cost-per-thousand-impressions numbers, driving up ad rates and making advertiser trust a rare commodity.
A better alternative to click- or impression-based campaigns is to tie advertising spend to concrete business goals, such as user acquisitions, form submissions, or actual conversions, because these are typically reported by the advertiser and based on its own numbers. These conversion actions also reduce the likelihood of fraud through human verification processes such as CAPTCHAs, email validation, and credit card verification.
Crawler and bot activity can take up a lot of bandwidth, slow a site down, and sometimes even crash it with a full-scale bot assault. If consumers have to wait more than a few seconds for a page to load, they’ll leave the site before they even have an opportunity to navigate through it, make a purchase, or complete a contact form – wasting the marketing budget that attracted them to the site in the first place.
How to Beat the Bots
With bots and crawlers painting a distorted picture of internet usage and complicating the ability to analyze web traffic, what can business owners do to fight back?
1. Limit their opportunities.
As your first line of defense, limit the openings through which bots can affect your site. Try requiring visitors to log in to comment on your blog, gating your website either partially or fully, adding CAPTCHAs to contact forms, or sending verification emails. Make it challenging for bots to get through your site.
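One lightweight way to harden forms, alongside the CAPTCHAs and verification emails mentioned above, is a honeypot field: a hidden input that human visitors never see but naive form-filling bots complete automatically. The sketch below is a hypothetical illustration (the field name `website_url` is invented for the example), not a production implementation.

```python
# Minimal honeypot-check sketch. The hidden field "website_url" is a
# hypothetical form input concealed from humans via CSS; bots that
# auto-fill every field will populate it and reveal themselves.

def is_likely_bot(form_data: dict) -> bool:
    """Flag submissions that fill the hidden honeypot field."""
    return bool(form_data.get("website_url", "").strip())

# A human leaves the hidden field empty; a form-filling bot does not.
human = {"name": "Ada", "message": "Hello", "website_url": ""}
bot = {"name": "x", "message": "buy now", "website_url": "http://spam.example"}

print(is_likely_bot(human))  # False
print(is_likely_bot(bot))    # True
```

A check like this costs real users nothing, which is why it pairs well with heavier defenses such as CAPTCHAs.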
2. Learn the tells.
Most bots and crawlers have known tells: high bounce rates, heavy traffic from a single IP address, and an outrageous number of clicks and page views in a short time. If you discover a bot exhibiting these tendencies, block it at the IP level, but be careful not to block the search crawlers that index your site, since your appearance in organic search results depends on them.
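The single-IP tell above can be checked directly against your access logs. Here is a hedged sketch, assuming you can extract (IP, timestamp) pairs for a short time window; real log formats and sensible thresholds vary by site, so the numbers here are illustrative only.

```python
from collections import Counter

def flag_suspect_ips(requests, threshold=100):
    """Return IPs whose request count in the window exceeds the threshold.

    requests: iterable of (ip, timestamp) tuples parsed from an access log.
    The threshold is a placeholder; tune it to your site's normal traffic.
    """
    hits = Counter(ip for ip, _ in requests)
    return {ip for ip, count in hits.items() if count > threshold}

# Example: one IP making 500 requests in the window stands out
# against another making 20.
log = ([("203.0.113.7", t) for t in range(500)]
       + [("198.51.100.2", t) for t in range(20)])
print(flag_suspect_ips(log))  # {'203.0.113.7'}
```

Before blocking a flagged IP, confirm it is not a legitimate search crawler (for example, by checking its reverse DNS), per the caution above.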
3. Employ strict KPIs on third-party activity.
Judge the performance of your marketing initiatives by leads and sales generated, not general impressions or clicks. These KPIs are harder to fake and yield a data set that is more human than machine, especially after you limit the bots' entryways into your site.
4. Find human data sets.
Marketing analytics companies can provide web analytics and competitive intelligence based on the activity of a human consumer panel. My company's data panel, for instance, contains a negligible 0.02 percent of nonhuman devices, which we identify at the device and activity levels. Do your research and consult with a large, reliable source, then supplement the data with your own to identify where nonhuman actions are most common.
5. Limit the crawlers’ access.
This tactic is controversial, as search engines would rather developers treat crawlers like human visitors for optimal indexing, but limiting crawler access reduces the bandwidth consumed by their activity and can improve page load speed for real users. Don’t lose business to a competitor because your pages take too long to load.
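One common way to throttle well-behaved crawlers is a robots.txt file. The sketch below is illustrative (the disallowed paths are hypothetical); note that support varies by crawler, and Google in particular does not honor the Crawl-delay directive.

```
# robots.txt sketch -- paths below are hypothetical examples
User-agent: *
# Some crawlers (e.g. Bing) honor Crawl-delay; Google does not
Crawl-delay: 10
# Keep crawlers out of low-value, bandwidth-heavy areas
Disallow: /search/
Disallow: /api/
```

Remember that robots.txt only restrains compliant crawlers; malicious bots ignore it, which is why the blocking and detection steps above still matter.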
6. Add your own tracking.
Supplement third-party tracking with your own tracking, identify the discrepancies, and communicate them to the vendors. Ad networks and affiliates typically base their decisions on their own reporting, but adding your own tracker, pixel, or cookie to the site, or basing your commercial relationship on actions you have better data for (such as leads and actual sales), can create a more mutually beneficial partnership.
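Once you have first-party numbers, the discrepancy check is straightforward. This is a hypothetical sketch with invented campaign names and counts; it flags campaigns where the network reports substantially more clicks than your own tracker saw.

```python
# Hypothetical figures: compare an ad network's reported clicks
# against your own first-party tracker.
network_reported = {"campaign_a": 1200, "campaign_b": 300}
first_party = {"campaign_a": 650, "campaign_b": 290}

def discrepancies(reported, measured, tolerance=0.15):
    """Flag campaigns where the network's count exceeds yours by more
    than the tolerance (a fraction of the reported count)."""
    flagged = {}
    for campaign, clicks in reported.items():
        own = measured.get(campaign, 0)
        gap = (clicks - own) / clicks if clicks else 0.0
        if gap > tolerance:
            flagged[campaign] = round(gap, 2)
    return flagged

print(discrepancies(network_reported, first_party))  # {'campaign_a': 0.46}
```

A gap this large is exactly the kind of evidence worth bringing to a vendor conversation; small gaps are normal, since no two trackers count identically.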
Don’t be fooled by the bots and crawlers that may be skewing your data. Set precautions, analyze closely, and take quick action to ensure that the analytics you’re receiving are the ones you need.