Comments - Migrating from 'innodb' checksum to 'crc32' checksum
Content reproduced on this site is the property of its respective owners,
and this content is not reviewed in advance by MariaDB. The views, information and opinions
expressed by this content do not necessarily represent those of MariaDB or any other party.
As you can read in MDEV-25105, no action is needed when you are migrating to the MariaDB 10.2 default
innodb_checksum_algorithm=crc32or the MariaDB 10.5 default (MDEV-19534)innodb_checksum_algorithm=full_crc32that was introduced in MariaDB 10.4.The
full_crc32checksum was introduced, because the old approach to checksums was insecure. In the old design, any page of a data file can use any of the previously supported checksum algorithms. The older checksums were extremely slow to compute and insecure; as a work-around someone got the idea to make it possible to disable page checksums. Thecrc32checksum is a little better if you ignore MDEV-17958, but it does not cover the entire data page and it is being computed as an insecure way (an exclusive OR of several CRC-32C checksums).Before MariaDB 10.4, the most secure option was to set
innodb_checksum_algorithm=strict_crc32so that only thecrc32checksum is allowed on any data pages. A corrupted page could accidentally carry a checksum that is valid when using one of the other algorithms.The
full_crc32checksum format is identified by a flag in the first page of the data file. If a file was created in this format, all its data pages must always use the same checksum, no matter whatinnodb_checksum_algorithmwas later set to. Another guarantee is that all unused parts of data pages will be filled with NUL bytes.Data files that were created before MySQL 5.1.48 or MariaDB 5.1.48 could contain arbitrary garbage in unused parts of data pages. Even though we try to keep this in mind, sometimes bugs like MDEV-27800 happen. The only way to convert files to the secure
innodb_checksum_algorithm=full_crc32format is to rebuild the tables. I believe that the system tablespace can only be converted by dumping and reloading the entire database.It is not feasible to implement logic in a tool like
innochecksumto try to determine which parts of each page might be unused and would need to be zeroed out, because this cannot always be determined by looking at an individual data page in isolation from others.Thank you for the detailed explanation.
The issue that I am facing right now. I am trying to upgrade MariaDB from 10.4.10 to 10.11.4. The checksum algorithm used in 10.4.10 is innodb. The checksum we are trying to use in 10.11.4 is full_crc32. We take physical backups in 10.4.10. The restore process includes innochecksum utility to compare checksums generated with stored checksums to identify any damages to data pages.
When I try restore of the backup taken in 10.4.10 with innodb checksum to MariaDB 10.11.4 with full_crc32 checksum. Checksum comparison fails for each innodb tables in my database at the first page complaining as following;
{ Fail: page::0 invalid Exceeded the maximum allowed checksum mismatch count::1 current::0 Error: Page 0 checksum mismatch, can't continue. }As mentioned previously we use innochecksum utility to make this comparison. The initial understanding is that innochecksum utility has removed deprecated checksum algorithms from use (innnodb, none). Hence, it will not be supported. MDEV-25105
Then we decided to migrate the checksum algorithm to a algorithm(crc32) supported in innochecksum utility and MariaDB 10.4.10 (full_crc32 was not considered as it was not introduced in 10.4.10 version), then, take physical backups. Hence, the physical backups taken would be compatible with MariaDB 10.11.4 if we use crc32/full_crc32 algorithm.
My followup questions:
I realize that it might be helpful to add back the
innodbandnonechecksum algorithms to theinnochecksumutility. This would require some code refactoring, duplicating some InnoDB source code instead of compiling that code with special options for inclusion in<<code>innochecksum<</code>>.On the other hand, you could use the
innochecksumutility from an older release to verify that the page checksums are valid and to convert them to thecrc32checksum. If you do not care about the verification step, then you should be able to blindly rewrite the page checksums to thecrc32format.Still, I think that the best solution would be to rebuild all InnoDB tables, simply by
OPTIMIZE TABLE. To see which tables are not in the full_crc32 format yet, try the following trick from MDEV-32230:SELECT name FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES WHERE row_format IS NULL;I am afraid that the only way to convert the InnoDB system tablespace to the
full_crc32format would be to rebuild the entire database from a logical dump. The next best thing that you could do is to useinnochecksumto convert each page to thecrc32checksum, and then start the server withinnodb_checksum_algorithm=strict_full_crc32so that new files will be created in thefull_crc32format, and for any old-format files only thecrc32page checksums will be allowed.Thank you for this Marko. The solutions provided are of real help for us. One last question, does the
innodchecksumutility supportfull_crc32validation?"and then start the server with
innodb_checksum_algorithm=strict_full_crc32so that new files will be created in the full_crc32 format, and for any old-format files only the crc32 page checksums will be allowed."In my understanding from the above statement, I believe we would end up having mix of crc32(old pages) and full_crc32(new pages), will this cause any issues during checksum validation with
innodchecksumutility?No,
innochecksumdoes not fully supportfull_crc32- see MDEV-26907. Ifinnochecksumis a required part of your workflow, you could stick withcrc32.Noted and Thank you Ian!