The PHP File Change Scanner

Note

This feature is only available in the paid Professional release of our software.

The powerful PHP File Change Scanner feature performs a security scan of the PHP files inside your site's root directory and, in subsequent runs, detects any modified or added files. This feature is built upon our experience making the fastest, most stable, pure-PHP site backup engine with Akeeba Backup. Each scanned file also comes with a preliminary automatic security assessment ("threat score") which gives you a quick idea of how likely it is that the file in question is suspicious.

The PHP File Change Scanner doesn't stop at scanning. It comes with an array of handy features: the ability to produce diffs (a synopsis of how modified files differ from the previous known copy), printing and exporting of the scan reports, and an interactive report viewer which lets you peek at the contents of each file. Together, these can allow power users to detect and eliminate hacks much faster than a purely manual method. You can also automate the scanner engine using a standard CRON job (available for Joomla! 1.7 and later only), making sure that you always know what's going on with your site.

By default, only files with the extensions php, phps, phtml, php3 and inc will be scanned. This list is case sensitive, i.e. files with an extension of PHP (uppercase) will not be scanned. This is configurable in the component's Options.
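The case-sensitive extension check described above can be sketched as follows. This is an illustration in Python, not Admin Tools' own code; the function name `should_scan` and the constant `DEFAULT_EXTENSIONS` are hypothetical.

```python
import os

# Default extension list quoted in the text; matching is case sensitive.
DEFAULT_EXTENSIONS = ("php", "phps", "phtml", "php3", "inc")


def should_scan(filename, extensions=DEFAULT_EXTENSIONS):
    """Return True if the file's extension is in the configured list."""
    # splitext("index.PHP") gives ("index", ".PHP"); drop the leading dot.
    extension = os.path.splitext(filename)[1][1:]
    # Deliberately no lower() call: "PHP" must not match "php".
    return extension in extensions
```

With this sketch, `should_scan("index.php")` is true while `should_scan("index.PHP")` is false, which mirrors the behaviour the text warns about.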

The idea of this feature is to scan only PHP files, because the modification or addition thereof could signify a potential problem or hack of your site. The extensions we chose are those used by virtually all PHP executable files.

Keep in mind that not all hacking scripts are written in PHP. Some of them may be written in Perl, Python, Ruby or shell script, or they could be executable binaries. Some hackers may also place malicious PDFs, PNGs, Word documents etc. which will infect your computer if you open them. None of those files will be scanned by Admin Tools' PHP File Change Scanner.

How does it work and what should I know?

The PHP File Change Scanner works by recursively enumerating all files and folders under your Joomla! site's root. In each directory scanned, it looks for PHP files and compares them to their last known state, as recorded in the database. It will then report any changes, i.e. files which have been modified or added since the previous scan. The following paragraphs explain how some aspects of the file scanning and reporting engine work.

Scope of the scan. Only files inside your Joomla! site's root and its subdirectories, no matter how deep, will be scanned. If you have placed PHP files outside of your site's root, they will not be scanned. Any readable directory under your site's root will be scanned, even if it does not belong to the current Joomla installation. For example, if you have additional sites or subdomains stored in subdirectories of your site's root, they will be scanned too.

Only PHP files are scanned. Only files with the extensions configured in the component's options will be scanned. As noted above, this defaults to PHP files, and only PHP files are meant to be scanned. Even though you can add other file types, the Threat Score results you get for them will be WRONG.

Directories automatically skipped. Admin Tools Professional will automatically skip scanning the following directories: tmp, cache, administrator/cache, log. These directories contain temporary files, logs disguised as PHP files or cache files disguised as PHP files. The contents of these directories are not supposed to be directly accessible over the web, which is why Joomla! allows you to relocate them outside your site's root. If you run across an extension which references files in those directories from a frontend or backend page, uninstall it as soon as possible, as this is a sign of a developer not knowing what they are doing. Do note that you can exclude more directories or specific files in the component's options.

Note

Regarding the tmp and log directories, Admin Tools Professional will actually take a look at your Global Configuration settings and exclude the directory for temp-files and directory for log files specified in there. Usually these are the tmp and log directories respectively, hence the reference to those directories in the paragraph above.
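The recursive enumeration with skipped directories can be sketched like this. It is a minimal Python illustration under stated assumptions, not the scanner's actual implementation: the function `enumerate_php_files` and both constants are hypothetical names, and the excluded paths are treated as relative to the site's root.

```python
import os

# Directories named in the text, relative to the site root (POSIX style).
DEFAULT_EXCLUDED_DIRS = {"tmp", "cache", "administrator/cache", "log"}
DEFAULT_EXTENSIONS = ("php", "phps", "phtml", "php3", "inc")


def enumerate_php_files(site_root, excluded_dirs=DEFAULT_EXCLUDED_DIRS,
                        extensions=DEFAULT_EXTENSIONS):
    """Yield paths (relative to site_root) of files eligible for scanning."""
    for current_dir, subdirs, files in os.walk(site_root):
        rel_dir = os.path.relpath(current_dir, site_root).replace(os.sep, "/")
        if rel_dir == ".":
            rel_dir = ""
        # Prune excluded directories in place so os.walk never enters them.
        subdirs[:] = [
            d for d in subdirs
            if (rel_dir + "/" + d).lstrip("/") not in excluded_dirs
        ]
        for name in sorted(files):
            # Case-sensitive extension match, as described earlier.
            if os.path.splitext(name)[1][1:] in extensions:
                yield (rel_dir + "/" + name).lstrip("/")
```

Because the exclusions are matched against the path relative to the root, a directory such as `sub/tmp` is still scanned in this sketch; only the top-level `tmp` is skipped. Any other site stored in a subdirectory is walked like everything else.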

File comparison terms. In order to determine if a file is modified, Admin Tools will compare its size, last modification time and MD5 sum. If any of these do not match the previous scan's results, the file is considered modified. If there is no record of that file in a previous scan, the file is considered to be new.
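A minimal sketch of this comparison, written in Python for illustration only. The helper names `fingerprint` and `classify` are hypothetical, and how the previous scan's record is stored and fetched from the database is left out.

```python
import hashlib
import os


def fingerprint(path):
    """Return (size, modification time, MD5 hex digest) for one file."""
    stat = os.stat(path)
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            md5.update(chunk)
    return (stat.st_size, int(stat.st_mtime), md5.hexdigest())


def classify(current, previous):
    """Compare a fingerprint with the one recorded by the previous scan."""
    if previous is None:
        # No record of the file in the previous scan.
        return "new"
    if current != previous:
        # Size, modification time or MD5 sum differs.
        return "modified"
    return "unchanged"
```

Comparing the whole tuple means a mismatch in any one of the three values is enough to flag the file as modified.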

When a file change is detected. A file change is detected only if the file was added or modified since the immediately previous scan. This means that if you scan now, modify a PHP file and scan again, it will show up as modified. If you perform a third scan right after the second one, the file will NOT be reported as changed. This is normal! The file was changed between the first and second scan, but not between the second and third scan. The exception to this rule is files with a non-zero Threat Score which have not been manually marked as "safe".
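The three-scan behaviour can be simulated with plain dictionaries. This Python sketch is an assumption-laden illustration: `report_changes` is a hypothetical function, each scan is modelled as a mapping from file path to an opaque fingerprint, and the status labels are chosen for the example.

```python
def report_changes(previous_scan, current_scan, threat_scores=None,
                   marked_safe=frozenset()):
    """List the files a scan would report, given the previous scan's records.

    Each scan is a dict mapping a file path to its fingerprint (any
    comparable value standing in for size, modification time and MD5 sum).
    """
    threat_scores = threat_scores or {}
    reported = {}
    for path, current in current_scan.items():
        if path not in previous_scan:
            reported[path] = "new"
        elif previous_scan[path] != current:
            reported[path] = "modified"
        elif threat_scores.get(path, 0) > 0 and path not in marked_safe:
            # The exception from the text: still listed until marked safe.
            reported[path] = "suspicious"
    return reported


scan1 = {"index.php": "fingerprint-a"}
scan2 = {"index.php": "fingerprint-b"}  # file edited between scans 1 and 2
scan3 = dict(scan2)                     # nothing changed before scan 3

second_report = report_changes(scan1, scan2)  # index.php is "modified"
third_report = report_changes(scan2, scan3)   # empty: nothing to report
```

Passing `threat_scores={"index.php": 12}` to the third call keeps the file in the report until it is also listed in `marked_safe`.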

Threat score calculation. Whenever Admin Tools Professional encounters a new or modified file, it calculates a "threat score". This is a weighted sum of potential security "red flags". Essentially, Admin Tools Professional runs a few heuristics against the PHP file in question, looking for code patterns which are commonly (but NOT NECESSARILY) used in hacking scripts and hacked files. Each of those patterns is assigned a "weight". The weight is multiplied by the number of occurrences of the pattern to give a score. The sum of these scores is what we call a "threat score". How to interpret it: the higher the threat score, the more probable it is that this could be a nefarious file and its contents should be manually assessed. Please note that a high threat score does not necessarily mean that the file is hacked or a hacking script. Likewise, a low but non-zero threat score (1-10) does not necessarily mean that the file in question is safe. Please take a look at the next few sections for more information.
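The weighted sum can be sketched as follows. The patterns and weights below are invented purely for illustration and are not the heuristics Admin Tools actually uses; `threat_score` and `EXAMPLE_PATTERNS` are hypothetical names.

```python
import re

# Hypothetical patterns and weights, chosen only for illustration.
EXAMPLE_PATTERNS = {
    r"\beval\s*\(": 5,
    r"\bbase64_decode\s*\(": 3,
    r"\bgzinflate\s*\(": 3,
}


def threat_score(source, patterns=EXAMPLE_PATTERNS):
    """Sum weight * number of occurrences over all patterns."""
    return sum(
        weight * len(re.findall(pattern, source))
        for pattern, weight in patterns.items()
    )


clean = "<?php echo 'hello';"
flagged = "<?php eval(base64_decode($x)); eval($y);"
clean_score = threat_score(clean)      # 0: no pattern matches
flagged_score = threat_score(flagged)  # 5 * 2 + 3 * 1 = 13
```

Legitimate code can match the same patterns, which is why a non-zero score only signals that a file deserves manual review.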

Removing old scans has some consequences. When you remove an old scan, Admin Tools also removes all associated file alert records. If you have defined some files with a non-zero Threat Score as "Marked Safe" in this scan's report, then this information is lost when you delete this scan. As a result, subsequent scans will, again, report the file as "Suspicious".

Heavy database usage. In order for this feature to work, Admin Tools Professional needs to make very heavy use of your database. There will be at least one database query for each and every PHP file on your site. An average site contains about 3,000 such files. Moreover, there will be one additional database query for each and every new or modified file.

Heavy resource usage. Scanning your site is a very CPU and memory intensive procedure. Admin Tools Professional has to scan your entire site, find the PHP files and, for each of them, read it, calculate an MD5 sum, read data from the database, compare it with the information already calculated and write data to the database. This puts a big strain on your server, similar to what you get when you're backing up your site.

Requirement for a writable temp-file directory. In order for this feature to work, we need to keep a temporary file in your site's temp-files directory (configurable in the Global Configuration page, usually it's tmp under your site's root). For this to be possible, your tmp directory has to be writable. In all likelihood it already is.

Depending on your file ownership and permissions, your tmp directory may be unwritable. In this case and this case only, you have to perform a trick to make it writable without compromising the security of your site. First, give that directory 0777 permissions. Then, upload (using FTP) a .htaccess file to your temp-files directory with the following contents:

<IfModule !mod_authz_core.c>
Order deny,allow
Deny from all
</IfModule>
<IfModule mod_authz_core.c>
<RequireAll>
Require all denied
</RequireAll>
</IfModule>

Give the .htaccess file you just uploaded 0444 permissions.

Remember to use Admin Tools' Permissions Configuration to set the permissions of the directory to 0777, otherwise the folder will become unwritable as soon as you use Admin Tools' Fix Permissions feature.

The trick outlined above makes the temporary directory world-writable (anyone with access to the server can write to it). This is normally unsafe. However, it is unsafe only if anyone could access the files in that directory over the web, essentially being able to execute arbitrary PHP code. By uploading the .htaccess file we mentioned, you made the directory inaccessible from the web. This means that a potential attacker could write arbitrary PHP files in this directory, but not execute them, so they no longer pose a security risk. By changing the permissions of the .htaccess file to 0444 we made it read-only, so that a potential attacker cannot overwrite it unless they have FTP access to your site (in which case your site is already hacked, so you shouldn't worry about the temp-files directory any more...).
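If you have script or shell access to the server instead of FTP, the same three steps could be automated along these lines. This is only a sketch in Python under that assumption: `protect_temp_directory` is a hypothetical helper, it must run as a user allowed to change the directory's permissions, and the .htaccess contents are the ones listed above.

```python
import os

# Same contents as the .htaccess file listed in the text.
HTACCESS_CONTENTS = """<IfModule !mod_authz_core.c>
Order deny,allow
Deny from all
</IfModule>
<IfModule mod_authz_core.c>
<RequireAll>
Require all denied
</RequireAll>
</IfModule>
"""


def protect_temp_directory(temp_dir):
    """Make temp_dir world-writable but block web access to its files."""
    # Step 1: give the directory 0777 permissions.
    os.chmod(temp_dir, 0o777)
    htaccess_path = os.path.join(temp_dir, ".htaccess")
    # A read-only file left by an earlier run must be made writable first.
    if os.path.exists(htaccess_path):
        os.chmod(htaccess_path, 0o644)
    # Step 2: write the .htaccess file that denies all web requests.
    with open(htaccess_path, "w") as handle:
        handle.write(HTACCESS_CONTENTS)
    # Step 3: make the .htaccess file read-only (0444).
    os.chmod(htaccess_path, 0o444)
    return htaccess_path
```

The order matters: the directory is opened up first, and the .htaccess file is locked down last so that it cannot be casually overwritten afterwards.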

Potential problems. As stated above, the file scan operation is very database, CPU and memory intensive. This can cause failure of the scan process due to one of several reasons, especially on lower-end hosts (usually: cheap or low quality shared hosts):

  • Memory exhaustion. Getting an out-of-memory error is not at all unlikely. We strongly recommend having at the very least 32 MB of available PHP memory. We recommend 64 MB to 128 MB for trouble-free operation. If you only have 16 MB or less of available PHP memory, the scan will most likely fail.

  • Exhausting your MySQL query limit. Some hosts have a limit on how many queries you can run per minute or per hour. Because the file scan is very database-intensive, you may exhaust this limit, causing the scan to crash.

  • MySQL server has gone away. Likewise, some hosts have set up MySQL (the database server) to forcibly close the connection if it doesn't receive data for a short time period, usually anything between 0.5 and 3 seconds. This could cause the infamous "MySQL server has gone away" error message, killing your scan.

  • Timeout. Calculating MD5 sums and diffs for large files is a very time-consuming process. It is possible that PHP times out during that operation, especially on slow, low-end hosts.

  • Hitting the CPU usage limit. Many hosts enforce a CPU usage limit. Given that the file scan is a very CPU-intensive process, it is possible that you hit that limit. What usually happens is that the host kills the script causing the "excessive" CPU usage (our file scan operation).

All of the above manifest themselves as a 500 Internal Server Error message or a never-ending scan process when trying to scan your site. Unfortunately, these are all server limitations and we cannot work around them while maintaining the usefulness of the PHP File Change Scanner feature. If you hit these limitations, our recommendation is to switch to a higher quality host.