Log Hero makes bots and spiders visible in Google Analytics, so you can monitor them as they crawl your website. To help you better understand what kind of bots they are and what they are up to, the Hero transmits several custom dimensions for you.
Currently, there are eleven custom dimensions:
- Bot Name
- Is Bot
- Bot IP
- Request Method
- Status Code
- Page Load Time
- User Agent
- Protocol
- Official Bot
- Spam Bot
- Timestamp
Below you will find a short explanation of each of them. To use them most effectively, we recommend opening the Log Hero standard view and then selecting them from the secondary dimensions tab.
Bot Name
Bot Name is the real name of the bot that is crawling your website. We maintain one of the most extensive and most complete lists of bots, which we identified through their IPs and transmitted User Agents.
What you use the Bot Name dimension for
This is important because many User Agents are spoofed. Of the thousands of hits you might get from crawlers that identify themselves as Google Bots through their User Agent, up to 25% can be fake. The Bot Name helps you find out which ones those are.
So, if a bot pretends to be an official Google Bot by faking its User Agent, but is actually a competitor of yours crawling your site, it will have the Bot Name Unknown Bot.
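As a rough illustration of how the Bot Name exposes spoofed User Agents, here is a minimal Python sketch that works on exported hits. The field names `user_agent` and `bot_name`, the sample rows, and the labels "Googlebot" and "Bingbot" are assumptions made for this example, not Log Hero's actual export format; only the label Unknown Bot is taken from the text above.

```python
# Hypothetical exported hits; field names and sample values are assumptions.
hits = [
    {"user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
     "bot_name": "Googlebot"},
    {"user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
     "bot_name": "Unknown Bot"},
    {"user_agent": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
     "bot_name": "Bingbot"},
]


def find_spoofed(hits, claimed="googlebot"):
    """Return hits whose User Agent claims `claimed` but whose Bot Name is Unknown Bot."""
    return [
        h for h in hits
        if claimed in h["user_agent"].lower() and h["bot_name"] == "Unknown Bot"
    ]


# All hits whose User Agent claims to be Googlebot, real or not.
claiming = [h for h in hits if "googlebot" in h["user_agent"].lower()]
spoofed = find_spoofed(hits)
fake_share = len(spoofed) / len(claiming)
print(f"{len(spoofed)} of {len(claiming)} Googlebot-claiming hits look spoofed ({fake_share:.0%})")
```

In Google Analytics itself, the equivalent would be filtering on the User Agent and then adding Bot Name as the secondary dimension.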
Is Bot
Is Bot is a binary true / false answer to whether the session was triggered by a bot. If true, it was a bot; if false, it was a human.
What you use the Is Bot dimension for
This is very useful if you want to see the ratio of human visitors to bot visitors at a single glance, or if you want to see how many human visitors you really have. Since many people block Google Analytics or JavaScript, up to 20% of your users may not be tracked by GA.
To find out how many human visitors you really have, just filter for Is Bot false and you’ll see the total number of human visitors, which you can compare to your standard GA account.
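The comparison described above comes down to simple arithmetic. The following Python sketch assumes a hypothetical list of sessions with an `is_bot` field holding "true" or "false", and a made-up user count from the standard GA account; both are illustrative values, not real data.

```python
# Hypothetical sessions as seen through Log Hero; the "is_bot" field is an assumption.
sessions = [
    {"is_bot": "true"}, {"is_bot": "true"}, {"is_bot": "true"},
    {"is_bot": "false"}, {"is_bot": "false"}, {"is_bot": "false"},
    {"is_bot": "false"}, {"is_bot": "false"},
]

# Hypothetical number of users reported by the standard (JavaScript-based) GA account.
ga_tracked_humans = 4


def split_sessions(sessions):
    """Return (bot_count, human_count) based on the Is Bot value."""
    bots = sum(1 for s in sessions if s["is_bot"] == "true")
    return bots, len(sessions) - bots


bots, humans = split_sessions(sessions)
bot_ratio = bots / len(sessions)
# Share of human visitors that the standard GA account did not see.
untracked_share = (humans - ga_tracked_humans) / humans
print(f"bots: {bots}, humans: {humans}, untracked humans: {untracked_share:.0%}")
```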
Bot IP
This is the bot’s IP address. The IP address will only be transmitted if we are sure that a bot really triggered the session, so you won’t ever get any issues with privacy laws such as GDPR.
What you use the Bot IP dimension for
You can use the Bot IP to dig deeper into who is crawling you. It helps you identify who is crawling your content, understand where Google is crawling you from, and so on.
Request Method
This explains what request method the bot used to fetch the content of your site. Here is an explanation of all possible methods by Mozilla.
What you use the Request Method dimension for
The Request Method is primarily useful for development purposes such as debugging plugins, but it can also help you understand which request methods often result in bad Status Codes.
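To see which request methods most often end in bad Status Codes, you can count error responses per method. This is a minimal Python sketch over hypothetical exported hits; the field names `method` and `status` and the sample rows are assumptions, and "bad" is taken here to mean any 4xx or 5xx code.

```python
from collections import Counter

# Hypothetical exported hits; field names and values are assumptions.
hits = [
    {"method": "GET", "status": 200},
    {"method": "GET", "status": 404},
    {"method": "POST", "status": 500},
    {"method": "HEAD", "status": 200},
    {"method": "POST", "status": 403},
]


def errors_by_method(hits):
    """Count hits with a 4xx or 5xx Status Code per Request Method."""
    return Counter(h["method"] for h in hits if h["status"] >= 400)


error_counts = errors_by_method(hits)
for method, count in error_counts.most_common():
    print(method, count)
```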
Status Code
The HTTP Status Code describes how your site responds to a request, whether it comes from a bot or a human. The best known is the infamous 404, which is returned when a requested URL could not be found.
The first digit of the code indicates the broader class of response. Codes beginning with 1, such as 101, are informational responses, but those are rare. Codes beginning with 2 are successes, 3 are redirects, 4 are client errors, and 5 are server errors. For a full list read on here.
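The first-digit rule can be expressed in a few lines of Python. The function name `status_class` is hypothetical and only illustrates the grouping described above.

```python
def status_class(code: int) -> str:
    """Map an HTTP Status Code to its broader class using its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 yields the leading digit of a three-digit code.
    return classes.get(code // 100, "unknown")


for code in (101, 200, 301, 404, 503):
    print(code, status_class(code))
```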
What you use the Status Code dimension for
Status Codes are one of the most essential features of Log Hero. They tell you how your site responds to calls from bots (and humans). Using them, we found that one of our sites had far too many broken links and redirect chains. In another case, we uncovered a DoS protection feature that was also blocking Google’s bots.
Page Load Time
This is how many milliseconds it took for the server to make the entire content of the site available to whoever requested it, whether it was a bot or a human.
This is not how long it takes the client to render the content but how long it took your servers to send the data over.
Here is an article on how to add the Calculated Metric AveragePageLoadTime and how it compares to Time on Site, which is sometimes not a great metric with bots.
What you use the Page Load Time dimension for
Page Load Time is an essential metric for both humans and bots. Search engines, as well as people, don’t like to wait. Each additional second of Page Load Time reduces the conversion rate by 11% and increases the bounce rate by 17%.
User Agent
Anyone requesting information from a server identifies itself via a User Agent, even if you are just surfing with a browser (to see your UA, ask Google here).
The problem with User Agents is that they can be faked very easily. A hit that claims to be a Google Mobile Bot is not necessarily one; it could be any other bot, including ones used by others to scrape your site or even attack you. Read our big User Agent article here.
What you use the User Agent dimension for
The User Agent dimension helps you identify which browser is being used, what version, and on which operating system.
Protocol
This tells you which Protocol was used to serve the content of your page, e.g. HTTP or HTTPS.
Official Bot
This tells you whether you are dealing with a real, known bot or not. To determine this, we use a large, daily updated database of IP addresses, IP ranges and User Agents of officially known bots.
Official Bot is a binary true / false answer. If true, the User Agent matches the IP addresses officially registered by the organization; if false, it is either a fake bot or a bot that is not known.
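The general idea of matching a User Agent against registered IP ranges can be sketched with Python's standard `ipaddress` module. This is not Log Hero's implementation: the bot token `examplebot`, the table `OFFICIAL_RANGES`, and the function `is_official_bot` are hypothetical, and the range used is a reserved documentation range rather than a real bot's network.

```python
import ipaddress

# Hypothetical lookup table: User Agent token -> registered networks.
# 192.0.2.0/24 is a reserved documentation range, not a real bot network.
OFFICIAL_RANGES = {
    "examplebot": [ipaddress.ip_network("192.0.2.0/24")],
}


def is_official_bot(user_agent: str, ip: str) -> bool:
    """Return True only if the User Agent names a known bot AND the IP is in its ranges."""
    address = ipaddress.ip_address(ip)
    ua = user_agent.lower()
    for token, networks in OFFICIAL_RANGES.items():
        if token in ua:
            return any(address in network for network in networks)
    # The bot is not in the table at all: unknown, so not official.
    return False


print(is_official_bot("ExampleBot/1.0", "192.0.2.10"))    # known bot, registered IP
print(is_official_bot("ExampleBot/1.0", "198.51.100.7"))  # known name, foreign IP
```

The second call shows the spoofing case: the User Agent claims a known bot, but the request comes from outside the registered ranges, so the answer is false.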
Spam Bot
This tells you whether the bot is blacklisted and, if so, why.
Timestamp
This shows you the exact date and time when the content of your website was accessed.