How to Block Crawlers and Other User Agents
When running a large WordPress site, performance can be critical. Unwanted crawlers consume valuable server resources without giving you any benefit in return.
Of course, blocking crawlers isn't a replacement for proper site optimization, and these bots usually aren't much of an issue if you're getting proper cache hits. Still, there can be times when a crawler is hitting your site hard enough that it needs to be blocked.
We offer two options for blocking bots at the server level:
Site-Specific Blocking (Recommended)
Note: This method requires your site to be in NGINX-Only mode. For more information, take a look at our article on switching your site to NGINX-Only mode.
Once a website is switched to NGINX-Only mode, we can place the .conf file above so that it applies to just that site.
NGINX-Only mode already has all the default WordPress routing configurations built in, so unless there are customizations in your .htaccess files, the site should work as before. (That said, we always recommend testing after the switch to make sure the site's plugins still work as intended.)
With this method, you would also be able to modify the user/nginx-server/block-bots.conf file and add or remove User-Agents by editing the block list. For example, if you saw a User-Agent in the access logs that isn't among the common ones we already block, you could append it to the list and reload nginx to apply the new block. The newly listed User-Agents would then be blocked quickly, and their requests would never be cached.
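As a rough illustration of what editing the block list can look like, here is a minimal sketch of a User-Agent block in nginx. It is not the actual contents of block-bots.conf, which may be structured differently. The bot names are placeholders chosen for the example, and the snippet assumes the file is included inside the site's server block.

```nginx
# Hypothetical sketch only; the real block-bots.conf may differ.
# Assumes this file is included inside the site's server { } block.

# Case-insensitive match against the request's User-Agent header.
# The names below are example entries, not an official block list.
# To block another crawler, append its name to the alternation,
# e.g. change "...|MJ12bot)" to "...|MJ12bot|ExampleBot)".
if ($http_user_agent ~* "(AhrefsBot|SemrushBot|MJ12bot)") {
    # Reject the request before it reaches WordPress or the cache.
    return 403;
}
```

After changing a file like this, the configuration is normally checked with nginx -t and nginx is then reloaded so the new list takes effect. How a reload is triggered on your server may differ.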
Server-Wide Blocking
We could place the block-bots.conf file above in the universal server-wide configs for you, which would block the listed User-Agents on every site on the server.
Unfortunately, this would not be customizable by you and could only be enabled or disabled by submitting a support ticket.