Support

Admin Tools

#35537 Save Bandwidth by blocking robots and reducing crawl rate

Posted in ‘Admin Tools for Joomla! 4 & 5’
This is a public ticket

Everybody will be able to see its contents. Do not include usernames, passwords or any other sensitive information.

Environment Information

Joomla! version
n/a
PHP version
n/a
Admin Tools version
n/a

Latest post by on Wednesday, 25 August 2021 20:17 CDT

kataskeviwebsite

I have tried to blocking different bots through Admin Tools htaccess maker ( I left default settings ). I used this feature in https://www.rakispilacourisltd.com/ website which in  July alone has taken from server resources around 45GB which seems to be coming primarily from bots. I have done that yesterday night so hope I will be able to see the difference by tomorrow morning on bandwidth. 

I need to also reduce crawl rate for bing bot and google bot. Is that something that the Admin Tools can also do ? 

 

 

Please look at the bottom of this page (under Support Policy Summary) for our support policy summary, containing important information regarding our working hours and our support policy. Thank you!

PKikas

JCSL Ltd 

Best SEO Cyprus and Web Design Services. 

 

kataskeviwebsite

This is the code added to my .htaccess to block the bots

 

##### Common hacking tools and bandwidth hoggers block -- BEGIN
SetEnvIf user-agent "WebBandit" stayout=1
SetEnvIf user-agent "webbandit" stayout=1
SetEnvIf user-agent "Acunetix" stayout=1
SetEnvIf user-agent "binlar" stayout=1
SetEnvIf user-agent "BlackWidow" stayout=1
SetEnvIf user-agent "Bolt 0" stayout=1
SetEnvIf user-agent "Bot mailto:[email protected]" stayout=1
SetEnvIf user-agent "BOT for JCE" stayout=1
SetEnvIf user-agent "casper" stayout=1
SetEnvIf user-agent "checkprivacy" stayout=1
SetEnvIf user-agent "ChinaClaw" stayout=1
SetEnvIf user-agent "clshttp" stayout=1
SetEnvIf user-agent "cmsworldmap" stayout=1
SetEnvIf user-agent "comodo" stayout=1
SetEnvIf user-agent "Custo" stayout=1
SetEnvIf user-agent "Default Browser 0" stayout=1
SetEnvIf user-agent "diavol" stayout=1
SetEnvIf user-agent "DIIbot" stayout=1
SetEnvIf user-agent "DISCo" stayout=1
SetEnvIf user-agent "dotbot" stayout=1
SetEnvIf user-agent "Download Demon" stayout=1
SetEnvIf user-agent "eCatch" stayout=1
SetEnvIf user-agent "EirGrabber" stayout=1
SetEnvIf user-agent "EmailCollector" stayout=1
SetEnvIf user-agent "EmailSiphon" stayout=1
SetEnvIf user-agent "EmailWolf" stayout=1
SetEnvIf user-agent "Express WebPictures" stayout=1
SetEnvIf user-agent "extract" stayout=1
SetEnvIf user-agent "ExtractorPro" stayout=1
SetEnvIf user-agent "EyeNetIE" stayout=1
SetEnvIf user-agent "feedfinder" stayout=1
SetEnvIf user-agent "FHscan" stayout=1
SetEnvIf user-agent "FlashGet" stayout=1
SetEnvIf user-agent "flicky" stayout=1
SetEnvIf user-agent "GetRight" stayout=1
SetEnvIf user-agent "GetWeb!" stayout=1
SetEnvIf user-agent "Go-Ahead-Got-It" stayout=1
SetEnvIf user-agent "Go!Zilla" stayout=1
SetEnvIf user-agent "grab" stayout=1
SetEnvIf user-agent "GrabNet" stayout=1
SetEnvIf user-agent "Grafula" stayout=1
SetEnvIf user-agent "harvest" stayout=1
SetEnvIf user-agent "HMView" stayout=1
SetEnvIf user-agent "ia_archiver" stayout=1
SetEnvIf user-agent "Image Stripper" stayout=1
SetEnvIf user-agent "Image Sucker" stayout=1
SetEnvIf user-agent "InterGET" stayout=1
SetEnvIf user-agent "Internet Ninja" stayout=1
SetEnvIf user-agent "InternetSeer.com" stayout=1
SetEnvIf user-agent "jakarta" stayout=1
SetEnvIf user-agent "Java" stayout=1
SetEnvIf user-agent "JetCar" stayout=1
SetEnvIf user-agent "JOC Web Spider" stayout=1
SetEnvIf user-agent "kmccrew" stayout=1
SetEnvIf user-agent "larbin" stayout=1
SetEnvIf user-agent "LeechFTP" stayout=1
SetEnvIf user-agent "libwww" stayout=1
SetEnvIf user-agent "Mass Downloader" stayout=1
SetEnvIf user-agent "Maxthon$" stayout=1
SetEnvIf user-agent "microsoft.url" stayout=1
SetEnvIf user-agent "MIDown tool" stayout=1
SetEnvIf user-agent "miner" stayout=1
SetEnvIf user-agent "Mister PiX" stayout=1
SetEnvIf user-agent "NEWT" stayout=1
SetEnvIf user-agent "MSFrontPage" stayout=1
SetEnvIf user-agent "Navroad" stayout=1
SetEnvIf user-agent "NearSite" stayout=1
SetEnvIf user-agent "Net Vampire" stayout=1
SetEnvIf user-agent "NetAnts" stayout=1
SetEnvIf user-agent "NetSpider" stayout=1
SetEnvIf user-agent "NetZIP" stayout=1
SetEnvIf user-agent "nutch" stayout=1
SetEnvIf user-agent "Octopus" stayout=1
SetEnvIf user-agent "Offline Explorer" stayout=1
SetEnvIf user-agent "Offline Navigator" stayout=1
SetEnvIf user-agent "PageGrabber" stayout=1
SetEnvIf user-agent "Papa Foto" stayout=1
SetEnvIf user-agent "pavuk" stayout=1
SetEnvIf user-agent "pcBrowser" stayout=1
SetEnvIf user-agent "PeoplePal" stayout=1
SetEnvIf user-agent "planetwork" stayout=1
SetEnvIf user-agent "psbot" stayout=1
SetEnvIf user-agent "purebot" stayout=1
SetEnvIf user-agent "pycurl" stayout=1
SetEnvIf user-agent "RealDownload" stayout=1
SetEnvIf user-agent "ReGet" stayout=1
SetEnvIf user-agent "Rippers 0" stayout=1
SetEnvIf user-agent "SeaMonkey$" stayout=1
SetEnvIf user-agent "sitecheck.internetseer.com" stayout=1
SetEnvIf user-agent "SiteSnagger" stayout=1
SetEnvIf user-agent "skygrid" stayout=1
SetEnvIf user-agent "SmartDownload" stayout=1
SetEnvIf user-agent "sucker" stayout=1
SetEnvIf user-agent "SuperBot" stayout=1
SetEnvIf user-agent "SuperHTTP" stayout=1
SetEnvIf user-agent "Surfbot" stayout=1
SetEnvIf user-agent "tAkeOut" stayout=1
SetEnvIf user-agent "Teleport Pro" stayout=1
SetEnvIf user-agent "Toata dragostea mea pentru diavola" stayout=1
SetEnvIf user-agent "turnit" stayout=1
SetEnvIf user-agent "vikspider" stayout=1
SetEnvIf user-agent "VoidEYE" stayout=1
SetEnvIf user-agent "Web Image Collector" stayout=1
SetEnvIf user-agent "Web Sucker" stayout=1
SetEnvIf user-agent "WebAuto" stayout=1
SetEnvIf user-agent "WebCopier" stayout=1
SetEnvIf user-agent "WebFetch" stayout=1
SetEnvIf user-agent "WebGo IS" stayout=1
SetEnvIf user-agent "WebLeacher" stayout=1
SetEnvIf user-agent "WebReaper" stayout=1
SetEnvIf user-agent "WebSauger" stayout=1
SetEnvIf user-agent "Website eXtractor" stayout=1
SetEnvIf user-agent "Website Quester" stayout=1
SetEnvIf user-agent "WebStripper" stayout=1
SetEnvIf user-agent "WebWhacker" stayout=1
SetEnvIf user-agent "WebZIP" stayout=1
SetEnvIf user-agent "Widow" stayout=1
SetEnvIf user-agent "WWW-Mechanize" stayout=1
SetEnvIf user-agent "WWWOFFLE" stayout=1
SetEnvIf user-agent "Xaldon WebSpider" stayout=1
SetEnvIf user-agent "Yandex" stayout=1
SetEnvIf user-agent "Zeus" stayout=1
SetEnvIf user-agent "zmeu" stayout=1
SetEnvIf user-agent "CazoodleBot" stayout=1
SetEnvIf user-agent "discobot" stayout=1
SetEnvIf user-agent "ecxi" stayout=1
SetEnvIf user-agent "GT::WWW" stayout=1
SetEnvIf user-agent "heritrix" stayout=1
SetEnvIf user-agent "HTTP::Lite" stayout=1
SetEnvIf user-agent "HTTrack" stayout=1
SetEnvIf user-agent "ia_archiver" stayout=1
SetEnvIf user-agent "id-search" stayout=1
SetEnvIf user-agent "id-search.org" stayout=1
SetEnvIf user-agent "IDBot" stayout=1
SetEnvIf user-agent "Indy Library" stayout=1
SetEnvIf user-agent "IRLbot" stayout=1
SetEnvIf user-agent "ISC Systems iRc Search 2.1" stayout=1
SetEnvIf user-agent "LinksManager.com_bot" stayout=1
SetEnvIf user-agent "linkwalker" stayout=1
SetEnvIf user-agent "lwp-trivial" stayout=1
SetEnvIf user-agent "MFC_Tear_Sample" stayout=1
SetEnvIf user-agent "Microsoft URL Control" stayout=1
SetEnvIf user-agent "Missigua Locator" stayout=1
SetEnvIf user-agent "panscient.com" stayout=1
SetEnvIf user-agent "PECL::HTTP" stayout=1
SetEnvIf user-agent "PHPCrawl" stayout=1
SetEnvIf user-agent "PleaseCrawl" stayout=1
SetEnvIf user-agent "SBIder" stayout=1
SetEnvIf user-agent "Snoopy" stayout=1
SetEnvIf user-agent "Steeler" stayout=1
SetEnvIf user-agent "URI::Fetch" stayout=1
SetEnvIf user-agent "urllib" stayout=1
SetEnvIf user-agent "Web Sucker" stayout=1
SetEnvIf user-agent "webalta" stayout=1
SetEnvIf user-agent "WebCollage" stayout=1
SetEnvIf user-agent "Wells Search II" stayout=1
SetEnvIf user-agent "WEP Search" stayout=1
SetEnvIf user-agent "zermelo" stayout=1
SetEnvIf user-agent "ZyBorg" stayout=1
SetEnvIf user-agent "Indy Library" stayout=1
SetEnvIf user-agent "libwww-perl" stayout=1
SetEnvIf user-agent "Go!Zilla" stayout=1
SetEnvIf user-agent "TurnitinBot" stayout=1
SetEnvIf user-agent "MJ12bot" stayout=1
SetEnvIf user-agent "SemrushBot" stayout=1
SetEnvIf user-agent "AhrefsBot" stayout=1
SetEnvIf user-agent "DotBot" stayout=1
SetEnvIf user-agent "ZoominfoBot" stayout=1
SetEnvIf user-agent "BLEXBot" stayout=1
<IfModule !mod_authz_core.c>
deny from env=stayout
</IfModule>
<IfModule mod_authz_core.c>
<RequireAll>
Require all granted
Require not env stayout
</RequireAll>
</IfModule>
##### Common hacking tools and bandwidth hoggers block -- END

PKikas

JCSL Ltd 

Best SEO Cyprus and Web Design Services. 

 

nicholas
Akeeba Staff
Manager

Blocking the bots will definitely help, in the sense that useless bots will no longer consume much bandwidth. Just a few hundred bytes per request to receive an HTTP 403. The default list of bots is just what we have assembled from our sites and online sources. You can always process your server's access log files to see the bandwidth per user agent and add more if needed.

As for rate limiting crawlers, no, you cannot do that with .htaccess. If you return an HTTP error code the crawlers won't rate limit, they will simply report broken pages and remove them from their index. You can set the metadata and microdata of your pages to indicate they will never change or change infrequently. This is well outside the scope of Admin Tools; Admin Tools is a security extension. These optimisations are the realm of SEO tools. Check out https://weeblr.com/joomla-seo/4seo which seems to be doing exactly that and comes from a developer I trust (Weeblr is Yannick Gaultier, the developer of sh404SEF).

Nicholas K. Dionysopoulos

Lead Developer and Director

🇬🇷Greek: native 🇬🇧English: excellent 🇫🇷French: basic • 🕐 My time zone is Europe / Athens
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

System Task
system
This ticket has been automatically closed. All tickets which have been inactive for a long time are automatically closed. If you believe that this ticket was closed in error, please contact us.

Support Information

Working hours: We are open Monday to Friday, 9am to 7pm Cyprus timezone (EET / EEST). Support is provided by the same developers writing the software, all of which live in Europe. You can still file tickets outside of our working hours, but we cannot respond to them until we're back at the office.

Support policy: We would like to kindly inform you that when using our support you have already agreed to the Support Policy which is part of our Terms of Service. Thank you for your understanding and for helping us help you!