Atomic DDL's

You are viewing an old version of this article. View the current version here.

In MariaDB 10.6.1 we have improved reability for DDL's (Data definition language operations) to make most of them operations Atomic and the rest crash safe, even if the server would crash in the middle of the operation.

The design of Atomic/Crash-safe DDL (MDEV-17567) allows it to work with all storage engines.

Definitions

  • Atomic means that either the operations succeeds (and is logged to the binary log or is complete reversed.
  • Crash safe means that in case of a crash, after the server has restarted, all tables are consistent, there is no temporary files or tables on disk and the binary log matches the status of the server.
  • DDL Data definition langue.
  • DML Data manipulation language.
  • 'DDL recovery log' or 'DDL log' for short, is the new log file, ddl-recovery.log by default, that stores all DDL operations in progress. This is used to recover the state of the server in case of sudden crash.

Background why Atomic DDL's are needed

Before 10.6, in case of a crash, there was a small possibility that one of the following things could happen:

  • There could be left some temporary tables starting with #sql-alter or #sql-shadow or temporary files ending with ''.
  • The table in the storage engine and the table's .frm file could be out of sync.
  • During a multi-table rename, only some of the tables where renamed.

Which DDL operations are now Atomic

  • ALTER SEQUENCE is not listed above as it is internally implemented as a DML.

Which DDL operations are now crash safe

DROP TABLE of multiple tables.

DROP TABLE over multiple tables is treated as if every DROP is a separate, Atomic operations. This means that after a crash, all fully, or partly, dropped tables will be dropped and logged to the binary log. The not dropped tables will be left untouched.

CREATE OR REPLACE TABLE

CREATE OR REPLACE TABLE foo is implemented as:

DROP TABLE IF EXISTS foo;
CREATE TABLE foo ...

This means that if there is a crash during CREATE TABLE then the original table 'foo' will be dropped even if the new table was not created. If the table was not re-created, the binary log will contain the DROP TABLE.

DROP DATABASE

DROP DATABASE is implemented as:

loop over all tables
  DROP TABLE table

Each DROP TABLE is atomic, but in case of a crash, things will work the same way as DROP TABLE with multiple tables.

Atomic with different storage engines

Atomic/Crash-safe DDL's work with all storage engines that has either atomic DDL's internally or are able to re-execute DROP or RENAME in case of failure.

This should be true for most storage engines. The ones that still need some work are:

The DDL log recovery file

The new startup option --log-ddl-recovery=path (ddl-recovery.log by default) can be used to specify the place for the DDL log file. This is mainly useful in the case when one has a file system on persistent memory as there is a lot of sync on this file during DDL operations.

This file contains all DDL operations that are in progress.

At MariaDB server startup, the DDL log file is copied to a file with the same base name but with a -backup.log suffix. This is mainly done to be able to find out what went wrong if recovery would fail.

If the server would crash during recovery (unlikely but possible), the recovery will continue where was before. The recovery will retry each entry up to 3 times before giving up and proceed with the next entry.

Conclusions

  • We believe that a clean separation of layers leads to an easier-to-maintain solution. The Atomic DDL implementation in MariaDB 10.6 introduced minimal changes t o the storage engine API, mainly for native ALTER TABLE.
  • In our InnoDB implementation, no file format changes were needed on top of the RENAME undo log that was introduced in MariaDB Server 10.2.19 for a backup-safe TRUNCATE reimplementation. Correct use of sound design principles (write-ahead logging and transactions; also file creation now follows the ARIES protocol) is sufficient. We removed the hacks (at most one CREATE or DROP per transaction) and implemented correct <<code>rollback<</code>> and <<code>>purge<</code>> triggers for the InnoDB SYS_INDEXES table.
  • Numerous DDL recovery bugs in InnoDB were found and fixed quickly thanks to https://rr-project.org. We are still working on one: data files must not be deleted before the DDL transaction is committed.

Thanks to Atomic/Crash-safe DDL's, the MariaDB server is now much more stable and reliable in unstable environments. There is still ongoing work to fix the few remaining issues mentioned above to make all DDL operations Atomic. The target for these are MariaDB 10.7.

See also

  • MDEV-17567 Atomic DDL. This MDEV entry links to all other entries related to Atomic operations that contains a lot of information how things are implemented.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.