Support

Site Restoration

#42341 Broken CLI-based DB restore with charset_conversion

Posted in ‘Site restoration’
This is a public ticket

Everybody will be able to see its contents. Do not include usernames, passwords or any other sensitive information.

Environment Information

PHP version
8.3.x
CMS Type
Joomla!
CMS Version
5.3.x
Backup Tool Version
10.1.0
Kickstart version
n/a

Latest post by nicholas on Tuesday, 14 October 2025 04:45 CDT

jduerscheid

After updating Akeeba to the latest version released yesterday, I see a sudden spike in failed backup restorations. They all fail with comparable messages:

> COLLATION 'latin1_german2_ci' is not valid for CHARACTER SET 'utf8mb4' 
> COLLATION 'ascii_general_ci' is not valid for CHARACTER SET 'utf8mb4'

An example raw query would be:

CREATE TABLE `bmwe_realtimeanalytics_heatmap` (
`id` int NOT NULL AUTO_INCREMENT,
`selector` varchar(500) CHARACTER SET utf8mb4 COLLATE ascii_general_ci NOT
NULL,
`pageurl` varchar(166) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `idx_selector_onpage` (`selector`,`pageurl`)
) ENGINE=MyISAM AUTO_INCREMENT=108 DEFAULT CHARSET=utf8mb4

I standardize the charset on restore using the charset_conversion = true flag, and according to the release notes, a fix has been applied there - is this the cause of my issue? Is there a workaround?

nicholas
Akeeba Staff
Manager

The individual columns COLLATE statements cannot be converted unless they have a collation based on the Unicode character set (utf8, utf8mb3, or utf8mb4).

If you have an extension which requires the ASCII collation, it is incompatible with Unicode. You need to disable charset_conversion, utf8tables, and utf8mb4 in this case. After restoration you should go through each table manually and figure out its character set and collation.

If this collation was created in error, follow the instructions above to restore the site. After that, remove the erroneous collation from the column (use the *_general_ci collation matching the table's character set). Then take a new backup and restore it enabling charset_conversion and utf8mb4.

Nicholas K. Dionysopoulos

Lead Developer and Director

🇬🇷Greek: native 🇬🇧English: excellent 🇫🇷French: basic • 🕐 My time zone is Europe / Athens
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

jduerscheid

Was there any chance related to the conversion in the last release? I'm asking as the same site with the exact same configuration could be successfully backed up and restored with the older version.

nicholas
Akeeba Staff
Manager

There was actually a bug in the previous release which would remove the COLLATE statements. This was never the intention, and it's absolutely not the correct behaviour! In fact, this behaviour did cause my site transfers from MariaDB 12 to MySQL 8 to fail – that's how I discovered the bug, oops! Collations need to be preserved when defined, and transformed to an equivalent collation under the new character set.

The crux of the problem here is that you have a collation (ascii_general_ci) which does not nominally belong to the same character set as the original table. Either by design, or by dumb luck, the original table's character set and that column's collation were compatible. This could be the case, for example, if you had a table with the latin1 character set. The ascii_general_ci collation would work because, technically, latin1 is a superset of ASCII (ASCII is the first 128 characters of the latin1 character set, latin1 adds accented characters and some graphics characters in the top 128 positions of the 8-bit codepage). However, when converting the table to Unicode character set there's no equivalent collation exactly because ascii_general_ci is based on a different character set than the original table's. We have no way to know what the person designing the database schema was thinking (or if they were thinking at all), therefore we have no way to arrive at an equivalent collation.

If the original table's character set was latin1 and the collation of the column was latin1_general_ci we could easily find equivalency: the character set becomes utf8mb4, therefore the equivalent collation is utf8mb4_general_ci.

That's a long way to say that the problem is that the original table is structured in an ambiguous way. We can't magically read the mind of the person who applied that collation. Was there a reason? If there was a reason and we do change the collation we will break something. If there was no reason, they just screwed up and should've known better. Still, we can't guess what is going on. So, we don't change anything and let the database restoration error out in this case so you get to talk to the person who created this table, figure out what they meant, change it on your server, and then take another backup without ambiguous collations which can be restored correctly.

Nicholas K. Dionysopoulos

Lead Developer and Director

🇬🇷Greek: native 🇬🇧English: excellent 🇫🇷French: basic • 🕐 My time zone is Europe / Athens
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

Support Information

Working hours: We are open Monday to Friday, 9am to 7pm Cyprus timezone (EET / EEST). Support is provided by the same developers writing the software, all of which live in Europe. You can still file tickets outside of our working hours, but we cannot respond to them until we're back at the office.

Support policy: We would like to kindly inform you that when using our support you have already agreed to the Support Policy which is part of our Terms of Service. Thank you for your understanding and for helping us help you!