#39382 Help with blocking requests like GET /directory/page-7?start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150

Posted in ‘Admin Tools for Joomla! 4 & 5’

Environment Information

Joomla! version: 3.10.12
PHP version: 7.4
Admin Tools version: latest

Latest post by tampe125 on Friday, 25 August 2023 10:44 CDT

natecovington

Hi, I've been having trouble with bots scraping my site REALLY hard, ignoring robots.txt, etc.

Here's one from this morning that locked up my VPS:

157.55.39.206 - - [24/Aug/2023:08:18:20 -0400] "GET /directory/page-7?start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150&start=150 HTTP/1.1" 500 - "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/103.0.5060.134 Safari/537.36"

Is there a way using Akeeba Admin Tools that I can strip this type of thing back to the base URL:

/directory/page-7

without breaking other parts of the site? This seems to affect only the K2 component's Author and Category views/layouts.
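To make the request concrete, here is a minimal sketch of the rewrite being asked for, written in Python purely for illustration. It is not an Admin Tools or K2 feature, and the function name strip_if_repeated is hypothetical. It assumes that a URL should be cut back to its path only when some query parameter name occurs more than once, and left untouched otherwise.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def strip_if_repeated(url: str) -> str:
    """Return the URL without its query string if any parameter name repeats.

    Hypothetical helper for illustration only; URLs with no repeated
    parameter names are returned unchanged.
    """
    parts = urlsplit(url)
    # Count how many times each parameter name appears in the raw query string.
    names = Counter(
        name for name, _ in parse_qsl(parts.query, keep_blank_values=True)
    )
    if any(count > 1 for count in names.values()):
        # Drop the whole query string, keeping scheme, host, path and fragment.
        return urlunsplit(parts._replace(query=""))
    return url


if __name__ == "__main__":
    print(strip_if_repeated("/directory/page-7?start=150&start=150&start=150"))
    print(strip_if_repeated("/directory/page-7?start=150"))
```

Leaving single-occurrence parameters alone is the design choice that keeps normal pagination links such as ?start=150 working.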

I have a forum discussion going with the K2 developers where they share a URL normalizer plugin and a code snippet:
https://www.joomlaworks.net/forum/k2-en/63421-duplicate-k2-category-listing

But it didn't fix the issue.

Any tips or suggestions are much appreciated.

Thanks!
-Nate

natecovington

Not sure if the attachments went through the first time.

tampe125
Akeeba Staff

Hello,

I'm sorry, but that's not possible. If a bot scraping the website is caught in a loop, appending the same parameter over and over, there's nothing we can do. On the .htaccess side, a request that uses a "legit" user agent is no different from one made by a regular user. On the PHP side, the repeated parameter is collapsed before our code runs, so we only see one parameter with one value and can't block it.
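The point about the PHP side can be illustrated with a short sketch. It is written in Python rather than PHP and is not Admin Tools code; both function names are hypothetical. It assumes the usual behaviour where a repeated, non-array parameter name keeps only its last value, which is what application code ends up seeing, while the repetition is only visible in the raw query string.

```python
from urllib.parse import parse_qsl


def last_value_wins(query: str) -> dict:
    """Collapse repeated names so each keeps only its last value.

    Hypothetical helper: mimics what application code sees once the
    query string has been parsed into a name -> value map.
    """
    return dict(parse_qsl(query, keep_blank_values=True))


def repeat_count(query: str, name: str) -> int:
    """Count how often a parameter name occurs in the raw query string."""
    return sum(
        1 for key, _ in parse_qsl(query, keep_blank_values=True) if key == name
    )


if __name__ == "__main__":
    # Same shape as the request in the log, shortened to eight repetitions.
    query = "&".join(["start=150"] * 8)
    print(last_value_wins(query))        # one parameter, one value
    print(repeat_count(query, "start"))  # the raw string still shows repeats
```

After parsing, the looping request is indistinguishable from an ordinary ?start=150 request, which is why a check that works on the parsed parameters has nothing to act on.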

If the bot is clogging your traffic, the only solution is to put a CDN such as Cloudflare in front of the site.

Davide Tampellini

Developer and Support Staff

🇮🇹Italian: native 🇬🇧English: good • 🕐 My time zone is Europe / Rome (UTC +1)
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!
