#8846 AWS3 Thousands of unexplained .zip files

Posted in ‘Akeeba Backup for Joomla! 4 & 5’

Latest post by nicholas on Wednesday, 29 June 2011 16:02 CDT

user9354
I have been using Amazon Simple Storage Service (S3) with Akeeba Pro, and have literally thousands of small zip files in the bucket. S3 is configured with a single bucket, and a folder for each site. The unexplained files are in the bucket, but not in folders.
The backups are running fine, and the archives appear to split and store ok. They are visible in the site Administer Backup Files, with up to 64 part archives.
Viewing the files in Cloudberry doesn't tell me much, just referencing the AkeebaPro/S3PostProcessor.
I have attached a file for reference.
Using Lazy Scheduler; not sure if that is part of the problem or not. Per your advice, working on moving to cron jobs.
Any suggestion on how to eliminate these unwanted zip files would be appreciated.

nicholas
Akeeba Staff
Manager
I can't uncompress the ZIP file, but I understand what you mean. I think that's one of the many problems of the Lazy Scheduling plugin. You see why I want to kill it off? :)

Nicholas K. Dionysopoulos

Lead Developer and Director

🇬🇷 Greek: native 🇬🇧 English: excellent 🇫🇷 French: basic • 🕐 My time zone is Europe / Athens
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

user9354
Nicholas,
I switched everything over to cron jobs, using both 'backup.php' and 'altbackup.php', and I am still seeing these zip files. I converted one to text and attached it; hopefully it makes more sense to you than it does to me. I am also attaching a backup log.
What also troubles me is that they are not transferring to AS3, where with the Lazy plugin they would. Now they store locally and complete with the confirmation email sent as all is well, but the Cron email reports:
===============================================================================
!!!!! W A R N I N G !!!!!

Akeeba Backup issued warnings during the backup process. You have to review them and make sure that your backup has completed successfully. Always test a backup with warnings to make sure that it is working properly, by restoring it to a local server.
DO NOT IGNORE THIS MESSAGE! AN UNTESTED BACKUP IS AS GOOD AS NO BACKUP AT ALL.

===============================================================================
Peak memory usage: 6.15 Mb


OK, let's try that log file again, zipped.
Any advice would be appreciated.

nicholas
Akeeba Staff
Manager
OK, I know what's going on! As the log file states, PHP could not resolve bravobackups.s3.amazonaws.com to an IP address. Why did it work before but not now? Well, backup.php runs through the PHP CLI binary, whereas your web server most likely uses the PHP CGI/FastCGI binary or the mod_php Apache module. Each of these has its own set of configuration settings and runs under a user with different privileges. My guess is that a misconfiguration on your host's server doesn't allow PHP CLI to resolve the S3 hostname, which causes this problem.

There are two solutions:
1. Quick'n'dirty: instead of backup.php use altbackup.php in your CRON job definition. The difference is that whereas backup.php runs the entire backup process through the PHP CLI binary (without going through your web site at all), altbackup.php actually calls a special Joomla! URL, so the backup is executed step-by-step by your web site. It's the same as running a backup from your site's back-end or from the obsolete Lazy Scheduling plugin (without Lazy Scheduling's bugs!)

2. Proper solution: ask your host to make sure that the PHP CLI binary can resolve hostnames to IP addresses.
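
The check behind the second option can be sketched as a small script run from the same shell account that runs the cron job. This is a Python illustration, not part of Akeeba Backup and not a test of the PHP CLI binary itself; the function name `can_resolve` and its `resolver` parameter are hypothetical, the latter added only so the logic can be exercised without a network.

```python
import socket


def can_resolve(hostname, resolver=socket.gethostbyname):
    """Return (True, ip_address) if hostname resolves, else (False, error_text).

    `resolver` is a hypothetical injection point; by default the operating
    system's resolver is used through socket.gethostbyname.
    """
    try:
        return True, resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, str(exc)
```

Calling `can_resolve("bravobackups.s3.amazonaws.com")` from a cron job and from an interactive shell, then comparing the two results, is one way to see whether name resolution differs between the two environments.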

user9354
Nicholas,
Well, wouldn't you know last night they all worked fine, both backup.php and altbackup.php.
Still getting the zip files, deleted 176 of them today. Perhaps it was just a DNS issue. Amazon has been a little flaky lately.
I do have a few jpa files that are marked 'site-unknown' which is strange because the cron email stated the site name throughout, then posted the file with the name unknown.
Also, the quota management doesn't seem to be working as the older files are not being deleted.
I will review next week and keep you posted.
Thanks for your help, and a fantastic product.
Paul

nicholas
Akeeba Staff
Manager
Hi Paul,

I'm glad all is working now!

Regarding the two issues you have:
- The 'site-unknown' file name is a known issue. I have not yet been able to determine what's causing it. There is a block of code that's supposed to execute as soon as you enter Akeeba Backup's Control Panel page to update the necessary information. It looks like on some servers you have to visit that page, go to another Akeeba Backup page, return to the Control Panel, log out, log in again and re-visit Akeeba Backup's Control Panel, after which everything goes back to normal. It's a crazy solution and makes no more sense to me than it does to you, but it worked on two affected sites (to my amazement!)

- I assume you refer to remote files quota management. It doesn't work on S3 because of the uploaded file permissions. I am diving into Amazon's documentation and I hope I'll have a solution ready by the next release.

user9354
Nicholas,
Thanks for the explanation of the known issues. I can deal with an 'unknown' file name by creating a folder.

Will look forward to the ability to manage quotas.

What I really would like to know is what is the deal with all the little zip files? They are still being written, hundreds a day. Any idea what is causing them, and how we can eliminate it?

Thanks,
Paul

post script
Also getting backups that run but fail to transfer:
[110501 00:25:20] Initializing post-processing engine
[110501 00:25:20] 3 files to process found
[110501 00:25:20] Beginning post processing file /home/midmich/tmp/site-mmic.us-20110501-000014.jpa
[110501 00:25:20] S3 -- Legacy (single part) upload of site-mmic.us-20110501-000014.jpa
[110501 00:26:00] Failed to process file /home/midmich/tmp/site-mmic.us-20110501-000014.jpa
[110501 00:26:00] Error received from the post-processing engine:
[110501 00:26:00] AEUtilAmazons3::putObject(): [6] Couldn't resolve host 'bravobackups.s3.amazonaws.com'
[110501 00:26:00] Not removing processed file /home/midmich/tmp/site-mmic.us-20110501-000014.jpa
[110501 00:26:00] ----- Finished operation 2 ------

Likely part of the problem, but again this site had previously posted to AWS3 with Lazy.

Thanks,
Paul

nicholas
Akeeba Staff
Manager
Please take a look at the error message you got:

Couldn't resolve host 'bravobackups.s3.amazonaws.com'

This means that the DNS resolution through the command-line isn't working. You have to ask your host to take a look at that. I can't do anything about it. We just tell PHP's cURL extension to PUT (that's an HTTP command, hence the capital letters) information to a specific URL. In its turn, cURL will ask the operating system to resolve the domain name. This is the part that fails, which implies a problem with the server's configuration.
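
The `[6]` in that log line is the cURL error number; 6 is cURL's code for a host name that could not be resolved. To find every such failure in a long log, something like the following Python sketch could be used. It assumes only the line layout visible in the excerpt quoted earlier in this thread (a bracketed timestamp followed by a message); `failed_uploads` is a hypothetical helper, not part of Akeeba Backup.

```python
import re

# "[110501 00:26:00] message" as seen in the quoted log excerpt
LINE = re.compile(r"^\[(\d{6} \d{2}:\d{2}:\d{2})\] (.*)$")
FAILED = "Failed to process file "
HEADER = "Error received from the post-processing engine:"


def failed_uploads(log_text):
    """Return a list of (file_path, error_message) pairs for failed uploads."""
    failures = []
    pending = None  # path of a failed file whose error text has not been seen yet
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        message = match.group(2)
        if message.startswith(FAILED):
            pending = message[len(FAILED):]
        elif pending is not None and message != HEADER:
            # The first message after the header line carries the engine's error.
            failures.append((pending, message))
            pending = None
    return failures
```

Applied to the excerpt above, it would pair the `.jpa` file with the "Couldn't resolve host" message, which makes it easy to count how many nightly runs hit the same DNS failure.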

user9354
Nicholas,
Any information on the little zip files being posted to AS3?
This is an ongoing problem.
Paul

nicholas
Akeeba Staff
Manager
Are you using the Akeeba Backup Lazy Scheduling plugin? If yes, please disable it. If not, please ZIP and attach the entire log file of your last backup process. As I have explained before, unless I see the entire log, top to bottom, I can not know what is going on with your backup and I can not be of any assistance at all as I am trying to solve something I have no other way of knowing what it is :(

user9354
Nicholas,
We are not using the lazy plugin, but cron.
Not sure which sites are the problem, so I have posted two log files; one succeeded and one failed.
Thanks,
Paul

nicholas
Akeeba Staff
Manager
You have set up a small part size for split archives (accessible in Archiver Engine -> Configure -> Part size for split archives). This splits the backup into many smaller files. In the post-processing options you have chosen to process each archive part directly after its creation. This is why part files are being transferred to S3 while the backup is still in progress. At the end of the log, I can see that you ran out of disk space. That's why the backup fails and you still have all those hundreds of files sent to S3.

My advice is three-fold:
1. Make sure that you have more free disk space available for the backup to run
2. Exclude any large files from the backup that you don't need to be in the backup - e.g. large media files are better left outside of your regular backup and only stored in one copy off-line. Makes the backup much faster.
3. Try increasing the part size for split archives to reduce the number of files being sent to S3. Since you're running the backup from the command line there is no time limit, so you won't be timing out while transferring the large files.
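
The effect of the part size on the number of uploaded files is simple arithmetic, sketched below in Python. The archive and part sizes in the usage note are hypothetical and chosen only for illustration; `part_count` is not an Akeeba Backup function.

```python
MB = 1024 * 1024


def part_count(archive_bytes, part_bytes):
    """Number of part files a split archive produces (always at least one)."""
    if part_bytes <= 0:
        raise ValueError("part size must be positive")
    # Ceiling division without floating point
    return max(1, -(-archive_bytes // part_bytes))
```

Under these assumptions, a 640 MB archive split into 10 MB parts yields `part_count(640 * MB, 10 * MB)` = 64 files per backup, whereas 100 MB parts yield 7. If each part is uploaded as soon as it is created, a backup that later fails still leaves every already-uploaded part in the bucket.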

user9354
Well, we routinely limit the backup by excluding unnecessary files.
Why does the component and documentation state:
Remember to set a split archive size of 2-30Mb or you risk backup failure due to timeouts!
I'll turn it off and hope for the best.
Why would the backup fail to upload for:
[110612 00:22:26] AEUtilAmazons3::putObject(): [6] Couldn't resolve host 'bravobackups.s3.amazonaws.com'
When it had previously done so on more than one occasion?
Thanks,
Paul

nicholas
Akeeba Staff
Manager
First make sure that your bucket name in Amazon S3 is in all lowercase letters, i.e. bravobackups and not BravoBackups. If that doesn't help, disable the Use SSL option in Akeeba Backup's S3 configuration (some hosts don't play well with SSL connections). If that still doesn't help, check with your host whether they can run a traceroute to this host name. If they can't, they have a routing or DNS issue. If they can, please post back.
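
The lowercase requirement matters because the bucket name becomes part of the host name being resolved, as the log shows (bravobackups.s3.amazonaws.com). A minimal Python sketch of that first check follows; the pattern is a simplified subset of Amazon's bucket naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit), and both function names are hypothetical.

```python
import re

# Simplified subset of Amazon's bucket naming rules; an assumption for illustration
_BUCKET = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_dns_style_bucket(name):
    """True if the bucket name can be used as a host name label as-is."""
    return _BUCKET.fullmatch(name) is not None


def bucket_hostname(name):
    """Build the host name form seen in the log: <bucket>.s3.amazonaws.com."""
    if not is_dns_style_bucket(name):
        raise ValueError("bucket name is not usable in a host name: " + repr(name))
    return name + ".s3.amazonaws.com"
```

Here `is_dns_style_bucket("BravoBackups")` is false while `is_dns_style_bucket("bravobackups")` is true, which is the distinction the first step asks you to verify.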

user9354
Well, at this point I have to give a big thumbs down to the idea of using Akeeba Pro to schedule offsite backups using Amazon Simple Storage Services.
No resolution to the daily creation of hundreds of unexplained zip files.
No remote quota management.
Site names still not correct.

There have been updates to Akeeba during the life of this thread that have not addressed the problem.

I have complied with each instruction to no avail.

nicholas
Akeeba Staff
Manager
Sorry to hear that. Akeeba Backup has been used by thousands of website owners to automatically back up their sites to S3, including this site and all of the other sites I manage. I never had problems with good quality hosts like Rochen, CloudAccess.net, iRedHost or RackSpace CloudSites.

The numerous unexplained ZIP files seem to be related to the DNS issues your host has. Please note that in their response they used the command line tool dig, presumably running not on the server but on their local machine. They were debugging the WRONG PROBLEM. The problem IS NOT that the domain doesn't resolve in general. Of course it resolves. I checked that myself. The problem is that the domain does not resolve using PHP's cURL extension running on their server where your site is stored. They did absolutely nothing to debug this issue so I insist that the problem is with their server setup. How is it possible that thousands of other people on decent hosts and I are able to back up our sites daily to Amazon S3 without any such issues? Coincidence? I don't think so.

The remote quota management requires the post-processing to work correctly which, under the current flaky DNS conditions on your host, is impossible.

The site-unknown issue was resolved two releases ago, so it should not be a factor any more. It was caused by a bug in Live Update which prevented Akeeba Backup from updating some useful information in its internal component parameters storage, one of them being the site name which is used during backups made with backup.php. So, are you using Akeeba Backup 3.3.b1? If not, then yes, that's an unresolved problem in older releases. That's why I publish new releases, to resolve these known issues.

What you have is hosting configuration issues. I consider it very unfair to give thumbs down to a software which does work and to the person who actually tries to help you with a problem not caused by his software but with something completely outside his control. The easiest way out for me would be to tell you "your host sucks, I don't wanna deal with it, have 'em fix it". But that would be an utterly stupid thing to do, because it would mean that I don't care about you and what you want to do. Instead, I am spending time trying to help you, because I appreciate the fact that you want reliable backups for your site and that's what I want to help you achieve.

So, do you want me to help you direct your host to fixing this server configuration problem? I can write a couple of test scripts to demonstrate to them what the problem is so that they can track it down. What do you say? Do you want this issue fixed or not?

user9354
Nicholas,
Thank you again for your prompt and professional reply.
I apologize if I allowed my frustration to get out of hand.
I do appreciate your continued assistance, and still feel you have a very high quality product.
Of course I do want this issue resolved. Rather than asking you to go further, I will use your comments above to prod Gator again to investigate the issue.
Once I have a response I will update you.
Paul

nicholas
Akeeba Staff
Manager
You're welcome! I understand your frustration. I have been in your shoes as the user of other software. As I said, if HostGator is not able to track down the issue, give me a ping. I can prepare a test script which will allow them to step through the file transfer process, identify the issue and -hopefully- fix it. We're here to help!

user9354
Gator's response:
Hello,
Let me break it down and see what else he needs. All of our troubleshooting steps were performed ON your server from the root account. Let us know if he needs any more information from our side.

Thanks for using HostGator, and if there is anything else that we can help you with, please just reply to this ticket. Please show him the third line down, which is the result of php's gethostbyname command.

I'm sorry that you are having this much trouble.

root@bravosmartwebdesign [~]# hostname
bravosmartwebdesign.bravosmartwebdesign.com

root@bravosmartwebdesign [~]# php 72.21.203.149

root@bravosmartwebdesign [~]# dig bravobackups.s3.amazonaws.com

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-16.P1.el5 <<>> bravobackups.s3.amazonaws.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2114
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 8, ADDITIONAL: 8

;; QUESTION SECTION:
;bravobackups.s3.amazonaws.com. IN A

;; ANSWER SECTION:
bravobackups.s3.amazonaws.com. 12 IN CNAME s3-directional-w.amazonaws.com.
s3-directional-w.amazonaws.com. 3565 IN CNAME s3-1-w.amazonaws.com.
s3-1-w.amazonaws.com. 12 IN A 72.21.194.23

;; AUTHORITY SECTION:
s3-1-w.amazonaws.com. 3559 IN NS ns-942.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-911.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-912.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-921.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-922.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-923.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-924.amazon.com.
s3-1-w.amazonaws.com. 3559 IN NS ns-941.amazon.com.

;; ADDITIONAL SECTION:
ns-911.amazon.com. 3555 IN A 207.171.178.13
ns-912.amazon.com. 3552 IN A 204.246.162.5
ns-921.amazon.com. 3573 IN A 72.21.192.209
ns-922.amazon.com. 3576 IN A 72.21.208.208
ns-923.amazon.com. 3579 IN A 72.21.204.208
ns-924.amazon.com. 3582 IN A 72.21.208.210
ns-941.amazon.com. 3594 IN A 204.246.160.5
ns-942.amazon.com. 3597 IN A 204.246.160.7

;; Query time: 12 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Jun 29 10:12:16 2011
;; MSG SIZE rcvd: 418


root@bravosmartwebdesign [~]# host bravosmartwebdesign.bravosmartwebdesign.com
bravosmartwebdesign.bravosmartwebdesign.com has address 174.123.71.18
bravosmartwebdesign.bravosmartwebdesign.com mail is handled by 0 bravosmartwebdesign.bravosmartwebdesign.com.

root@bravosmartwebdesign [~]# host bravobackups.s3.amazonaws.com
bravobackups.s3.amazonaws.com is an alias for s3-directional-w.amazonaws.com.
s3-directional-w.amazonaws.com is an alias for s3-1-w.amazonaws.com.
s3-1-w.amazonaws.com has address 72.21.194.23


Regards,

Joey C.
Linux Systems Administrator
HostGator.com LLC

-Paul

nicholas
Akeeba Staff
Manager
Hi Paul,

I am attaching a ZIP file with a testing script (s3test.php). Please extract the ZIP file and edit the file. On the top of the file there are three lines where you can enter your access key, secret key and bucket name.

Upload that file to your site's root and access it as http://www.yoursite.com/s3test.php. Please paste the results here. It should copy the file key.jpg from images/stories to your S3 bucket's root. I just tested it locally and on two live servers with absolute success.

The test script uses the exact code found in Akeeba Backup Professional to perform Amazon S3 uploads. If it fails to upload the file to your bucket, please send me a Personal Message (I am user "nicholas") with the following information:
- A link back to this thread (so that I know why you sent me the Personal Message)
- URL to your site
- FTP connection information
- (optional) Super Administrator connection information to your site.
(a subdomain of your site and an FTP account limited to the subdomain will do just fine).
This will allow me to trace the code directly on your server to make sure that everything works as it is supposed to. If you optionally give me Super Administrator access I will also be able to check your Akeeba Backup configuration, just in case we've missed something important in our forum posts.

Please allow 24 hours after your PM for me to check out your site. Your request will receive priority, but debugging on a live site may take a moderately long time before it is thoroughly complete.

Thank you in advance!
