Automating the scans (front-end scheduling URL)

Tip

Consult the PHP File Change Scanner Scheduling page for detailed information, tailored to your site, without having to read this documentation page.

The front-end scheduling URL feature is intended to let you perform an unattended, scheduled scan of your site. It is not the recommended method, though. Use it only if your server does not support regular command line CRON jobs.

The front-end scheduling URL performs a single scan step and sends a redirection (HTTP 302) header to force the client to advance to the next page, which performs the next step and so forth. You will only see a message upon completion, whether the scan was successful or not. There are a few limitations, though:

  • It is not designed to be run from a normal web browser, but from an unattended cron script, using wget or curl as a means of accessing the function.

  • The script is not capable of showing progress messages.

  • Normal web browsers tend to be "impatient". If a web page returns a series of redirection headers, the web browser assumes that the web server has malfunctioned and stops loading the page. It will also show some kind of "destination unreachable" message. Remember, these browsers are meant to be used on web pages which are supposed to show some content to a human. This behaviour is normal. Most browsers will quit after they encounter the twentieth page redirect response, which is bound to happen. If you ask for support about this we will tell you it's not an issue; there is nothing to fix, because this is exactly how the browser is supposed to work.

  • Command line utilities, by default, will also give up loading a page after it has been redirected a number of times. For example, wget gives up after 20 redirects and curl does so after 50 redirects. Since Admin Tools redirects once for each of the several dozen scan steps, it is advisable to configure your command line utility with a large number of redirects; about 10000 should be more than enough for virtually all sites.

Tip

Do you want to automate your scans despite your host not supporting CRON? Webcron.org fully supports Admin Tools' front-end scan scheduling feature and is dirt cheap - you need to spend about 1 Euro for a year of daily site scan runs. Just make sure you set up your Webcron CRON job time limit to be at least 10% more than the time it takes for Admin Tools to perform a scan of your site.

Before beginning to use this feature, you must set up Admin Tools to support the front-end scan scheduling option. First, go to Admin Tools' main page and click on the Options button. Find the option titled Enable front-end scheduling and set it to Yes. Below it, you will find the option named Secret key. In that box you have to enter a password which will allow your CRON job to convince Admin Tools that it has the right to request a scan. Think of it as the password required to enter the VIP area of a night club. After you're done, click the Save button on top to save the settings and close the dialog.

Tip

Try entering a complex password here. Do note that special characters and non-latin letters need to be "URL escaped" (written as something like %20, i.e. percent sign followed by two hexadecimal digits) in the scheduling URL. The easiest way to get the correct URL is using the PHP File Scanner Scheduling button in Admin Tools' main page.

Most hosts offer a control panel of some kind, such as cPanel. It should have a section called something like "CRON Jobs" or "scheduled tasks". The help screen in there describes how to set up a scheduled job. The one part it will not tell you is the command to issue. Simply putting the URL in there is not going to work.

Warning

If your host only supports entering a URL in their "CRON" feature, this will most likely not work with Admin Tools. There is no workaround. It is a hard limitation imposed by your host. We would like to help you, but we can't. As always, the only barrier to the different ways we can help you is server configuration. You can, however, use a third party service such as WebCron.org.

If you are on a UNIX-style OS host (usually, a Linux host) you most probably have access to a command line utility called wget. It's almost trivial to use:

wget --max-redirect=10000 "http://www.yoursite.com/index.php?option=com_admintools&

view=filescanner&key=YourSecretKey"

Of course, the line breaks are included for formatting clarity only. You should not have a line break in your command line!

Important

Do not miss the --max-redirect=10000 part of the wget command! If you fail to include it, the scan will not complete and wget will complain that the maximum number of redirections has been reached. This is normal behavior, not a bug.

Important

YourSecretKey must be URL-encoded. You can use an online tool like http://www.url-encode-decode.com or simply consult the PHP File Change Scanner Scheduling page.
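
If you prefer to encode the key on the command line, the idea can be sketched roughly as follows. This is a minimal illustration, not something Admin Tools ships: urlencode is a hypothetical helper written for this example, the key 'p@ss word&1' is made up, and the script assumes a POSIX sh with the standard od, tr and sed utilities.

```shell
#!/bin/sh
# urlencode is a hypothetical helper for this sketch, not part of Admin Tools.
# It percent-encodes every byte except the unreserved characters
# A-Z a-z 0-9 - . _ ~
urlencode() (
    LC_ALL=C                    # treat the string as plain bytes
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}             # everything after the first character
        c=${s%"$rest"}          # the first character itself
        case "$c" in
            [A-Za-z0-9._~-])
                out="$out$c"
                ;;
            *)
                # Dump the bytes as hex and put a percent sign before each one.
                hex=$(printf '%s' "$c" | od -An -v -tx1 | tr -d ' \n' |
                      sed 's/../%&/g' | tr 'a-f' 'A-F')
                out="$out$hex"
                ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

SECRET_KEY='p@ss word&1'        # made-up example key
echo "http://www.yoursite.com/index.php?option=com_admintools&view=filescanner&key=$(urlencode "$SECRET_KEY")"
# The printed URL ends in key=p%40ss%20word%261
```

The URL shown on the PHP File Change Scanner Scheduling page remains the authoritative one; the sketch is only meant to show what "URL-encoded" means in practice.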

Warning

Do not forget to surround the URL in double quotes. If you don't, the scan will fail to execute! The reason is that the shell also uses the ampersand to separate multiple commands in a single command line. If you don't use the double quotes at the start and end of the scheduling URL, your host will think that you tried to run multiple commands and will load your site's homepage instead of the front-end scheduling URL.
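
The effect of the missing quotes can be observed without touching your site by letting the shell print the URL instead of fetching it. This is a small illustration only, assuming a POSIX sh; echo stands in for wget, and the URL is the placeholder used in the command above.

```shell
#!/bin/sh
# Without quotes, the command line is cut at every "&": echo runs in the
# background with a truncated URL, and the remaining pieces are treated as
# separate commands (here, two harmless variable assignments).
unquoted=$(sh -c 'echo http://www.yoursite.com/index.php?option=com_admintools&view=filescanner&key=YourSecretKey; wait')

# With double quotes, the whole URL is passed to the command as one argument.
quoted=$(sh -c 'echo "http://www.yoursite.com/index.php?option=com_admintools&view=filescanner&key=YourSecretKey"')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The first line of output stops right before the first ampersand, so the view and key parameters never reach the server; the second line shows the complete URL.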

If you're unsure, check with your host. Sometimes you have to get the full path to wget from them in order for CRON to work, which turns the above command line into something like:

/usr/bin/wget --max-redirect=10000 "http://www.yoursite.com/index.php?option=com_admintools&

view=filescanner&key=YourSecretKey"

Contact your host; they usually have a nifty help page for all this stuff. Read also the section on CRON jobs below.

The ampersands above should be written as a single ampersand, not as an HTML entity (&amp;). Failure to do so will result in a 403: Forbidden error message and no scan will occur. This is not a bug; it's the way wget works.

wget is a multi-platform command line utility which is not included with all operating systems. If your system does not include the wget command, it can be downloaded at this address: http://wget.addictivecode.org/FrequentlyAskedQuestions#download. The wget homepage is here: http://www.gnu.org/software/wget/wget.html. Please note that the --max-redirect option is available in wget version 1.11 and above.

Using a web browser or wget version 1.10 and earlier will most probably result in an error message about the maximum redirections limit being exceeded. This is not a bug. Most network software will stop dealing with a web site after it has redirected the request more than 20 times. This is a safety feature to avoid consuming network resources on misconfigured web sites which have entered an infinite redirection loop. Admin Tools uses redirections creatively, to force the continuation of the scan process without the need for client-side scripting. Depending on site size, Admin Tools configuration and server setup, it may exceed the limit of 20 redirections while performing a site scan operation.
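
If your host provides curl rather than wget, a roughly equivalent call can be sketched as follows. Treat it as an illustration under stated assumptions: the URL and key are the same placeholders used in the wget command, run_scan and RUN_SCAN are hypothetical names introduced for this example, and by default the script only prints the curl command so that you can inspect it before sending anything to your site.

```shell
#!/bin/sh
# Placeholder URL from the wget example; replace the host and key with your own.
SCAN_URL="http://www.yoursite.com/index.php?option=com_admintools&view=filescanner&key=YourSecretKey"

# run_scan and RUN_SCAN are hypothetical names used only in this sketch.
# With RUN_SCAN=1 the request is sent; otherwise the command is just printed.
run_scan() {
    if [ "${RUN_SCAN:-0}" = "1" ]; then
        # --location makes curl follow redirects at all;
        # --max-redirs raises its default limit of 50 redirects.
        curl --location --max-redirs 10000 --silent --show-error "$1"
    else
        printf 'curl --location --max-redirs 10000 --silent --show-error "%s"\n' "$1"
    fi
}

run_scan "$SCAN_URL"
```

Run the script once as it is to check the printed command, then set RUN_SCAN=1 in the environment of your scheduled job to perform the actual scan. As with wget, the URL has to stay inside double quotes.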