Binlog group commit and innodb_flush_log_at_trx_commit

Note: This page describes features in an unreleased version of MariaDB.

Unreleased means there are no official packages or binaries available for download which contain the features. If you want to try out any of the new features described here you will need to get and compile the code yourself.

MariaDB starting with 10.0

MariaDB 10.0 introduces a performance improvement for group commit for InnoDB/XtraDB transactions when the binary log is enabled.

When both --innodb-flush-log-at-trx-commit=1 (the default) and the binlog are enabled, there is now one fewer sync to disk inside InnoDB during commit (2 syncs shared between a group of transactions instead of 3).
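
As an illustration, the configuration this optimization applies to could look like the following my.cnf fragment. This is a minimal sketch: placing the options in a [mysqld] section is an assumption about a typical setup, and the innodb-flush-log-at-trx-commit line is written out only for clarity, since 1 is already the default.

```ini
[mysqld]
# Enable the binary log.
log-bin
# Default value; shown explicitly for clarity.
innodb-flush-log-at-trx-commit = 1
```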

Durability of commits is not decreased. Even if the server crashes before the commit is written to disk by InnoDB, the commit will be recovered from the binlog at the next server startup, and it is guaranteed that sufficient information is synced to disk for such a recovery to always be possible.

The old behavior, with 3 syncs to disk per (group) commit (and consequently lower performance), can be selected with the new --innodb-flush-log-at-trx-commit=3 value. There is normally no benefit to doing this, but there are a couple of edge cases to be aware of:

  • If --innodb-flush-log-at-trx-commit=1 and --log-bin are used with --sync-binlog=0, commits are not guaranteed to be durable inside InnoDB/XtraDB. This is because events can be lost from the binlog in a crash when --sync-binlog=0, so the binlog cannot be relied on to recover them. In this case, --innodb-flush-log-at-trx-commit=3 can be used to get durable commits in InnoDB/XtraDB. However, a crash is still likely to cause commits to be lost from the binlog, leaving the binlog and InnoDB inconsistent with each other. Thus --sync-binlog=1 is recommended. It carries a much smaller penalty in MariaDB 5.3 and later than in older versions of MariaDB and MySQL.
  • XtraBackup only sees commits that have been flushed to the redo log, so with the new optimization there may be a small delay (normally at most 1 second) between when a commit happens and when it can be included in an XtraBackup backup. The backup will still be fully consistent with itself and the binlog. This is normally not an issue: a backup usually takes many seconds and includes all transactions committed up to the end of the backup, so exactly which commits are included is somewhat arbitrary anyway. It is mentioned here only for completeness.
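
The first edge case above can be sketched as a my.cnf fragment. This is only an illustrative layout under the assumption that the options live in a [mysqld] section; the commented-out lines show the fallback for setups where --sync-binlog has to stay at 0.

```ini
[mysqld]
log-bin

# Recommended: sync the binlog to disk at each (group) commit, so that
# InnoDB/XtraDB can rely on it for recovery after a crash.
sync-binlog = 1
innodb-flush-log-at-trx-commit = 1

# Fallback if sync-binlog must remain 0: select the old 3-sync behavior so
# that commits are durable inside InnoDB/XtraDB. The binlog itself can still
# lose events in a crash and become inconsistent with InnoDB.
# sync-binlog = 0
# innodb-flush-log-at-trx-commit = 3
```

Only one of the two pairs of settings should be active at a time.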
