Admin Tools for WordPress

#35922 Googlebot / User agents to block, one per line / Robots.txt


Environment Information

WordPress version
n/a
PHP version
n/a
Admin Tools version
n/a

Latest post on Friday, 12 November 2021 20:17 CST

pcshost

Two things actually.

1) Is there a particular line in the Admin Tools options that would prevent verifying ownership of a site with Google Search Console? When I call the verification file up via its URL, I get a 403 error. With the .htaccess file disabled it works fine.

2) Is there any conflict between the user agents listed in the "User agents to block, one per line" option and using a robots.txt file to issue directives for Googlebot? For example, if I put something like:

User-agent: Googlebot
Disallow: /nogooglebot/ 

User-agent: *
Allow: /

Will it bypass what's in the current user agent list in the .htaccess file and allow those bots to crawl the site?

John P.

tampe125
Akeeba Staff

Hello,

Regarding site ownership: Google gives you a unique file that you have to upload. You simply have to add an exception inside the Htaccess Maker to allow direct access to that file: put the name of the file Google gives you in the "Allow direct access to these files" field, then recreate the .htaccess file. After that, your website should verify.
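Just to illustrate the idea, the exception lets that single file through while everything else stays protected. In plain Apache terms it is similar to the following sketch (google1234abcd.html is a made-up placeholder, and the rules the Htaccess Maker actually generates may look different):

# Hypothetical illustration only -- not the literal Htaccess Maker output.
# Allow direct access to one specific file; the other protection rules
# keep applying to everything else.
<Files "google1234abcd.html">
    # Apache 2.4 syntax; on Apache 2.2 this would be "Order allow,deny"
    # followed by "Allow from all".
    Require all granted
</Files>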

Regarding the robots.txt file: this file gently asks search engines not to index specific folders. Please note that there is no guarantee search engines will respect it; it's a kind of gentlemen's agreement. Usually it's used to tell search engines that specific folders contain uninteresting data, like cache files or temporary images.

So the .htaccess rules will always "win", since they tell your server not to serve any content at all to the matching visitors.

That being said, the blocked user agents belong to a list of known offenders and scrapers, so they won't block legitimate users or search engines.
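For illustration, a stayout-style flag is typically consumed by a deny block along these lines (a sketch in stock Apache 2.4 syntax; the exact block the Htaccess Maker writes may differ):

# Tag the offending user agent string...
SetEnvIf user-agent "(?i:WebBandit)" stayout=1

# ...then refuse to serve any request that carries the flag.
<IfModule mod_authz_core.c>
    <RequireAll>
        Require all granted
        Require not env stayout
    </RequireAll>
</IfModule>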

Davide Tampellini

Developer and Support Staff

🇮🇹Italian: native 🇬🇧English: good • 🕐 My time zone is Europe / Rome (UTC +1)
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

pcshost

Hi. I rebuilt the .htaccess file. I see that it's coded a little differently, so I wanted to make sure it's correct. Using the hacking tools block as an example, the new format is:

##### Common hacking tools and bandwidth hoggers block -- BEGIN
SetEnvIf user-agent "(?i:WebBandit)" stayout=1
SetEnvIf user-agent "(?i:webbandit)" stayout=1

The old format was:

##### Common hacking tools and bandwidth hoggers block -- BEGIN
SetEnvIf user-agent "WebBandit" stayout=1
SetEnvIf user-agent "webbandit" stayout=1

Everything else looks the same except for the other additions I made, which are working fine.

John P.

tampe125
Akeeba Staff

Yes, we changed it to be case-insensitive; otherwise a single character in a different case would be enough to elude the protection.
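For example, with the old literal patterns, a visitor sending "WeBBaNdIt" as its user agent would match neither "WebBandit" nor "webbandit" and slip through. The (?i:...) flag makes one rule cover every casing:

# The (?i:...) modifier makes the regular expression case-insensitive,
# so WebBandit, webbandit, WEBBANDIT and WeBBaNdIt all set stayout=1
# with this single line.
SetEnvIf user-agent "(?i:WebBandit)" stayout=1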

Davide Tampellini

Developer and Support Staff

🇮🇹Italian: native 🇬🇧English: good • 🕐 My time zone is Europe / Rome (UTC +1)
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

System Task
system
This ticket has been automatically closed. All tickets which have been inactive for a long time are automatically closed. If you believe that this ticket was closed in error, please contact us.