Group Commit for the Binary Log
MariaDB starting with 5.3
Group commit for the binary log was first introduced in MariaDB 5.3
Group commit is an important optimization in MariaDB 5.3 when the server is run with
sync_binlog=1. These settings
are needed to ensure that if a server crashes, any transaction that was
reported as committed prior to the time of crash will still be present in the
database after crash recovery ("durability", the "D" in ACID). Both settings
are also necessary to be able to recover into a consistent state at all after a
crash; if both are not set, it is possible after a crash to end up in a state
where a transaction in the binary log is missing in InnoDB/XtraDB, or vice
versa, which breaks replication for slaves.
When these settings are in effect, the server does an
fdatasync() or similar) call on the XtraDB transaction log file / binary
log file during
COMMIT in order to ensure that data is stored durably on
the disk. An
fsync() is a time-consuming operation, and can easily limit
throughput in terms of the number of commits per second which can be
The idea with group commit is to amortise the costs of each
multiple commits from multiple parallel transactions. If there are say 10
transactions in parallel trying to commit, we can force all of them to disk at
once with a single
fsync(), rather than do one
fsync() for each. This
can greatly reduce the need for
fsync() calls, and consequently greatly
improve the commits-per-second throughput.
Group commit works automatically; there are no options needed to enable it, and there are no negative consequences (performance or reliability or otherwise) from it.
However, to see the positive effects of group commit, the workload must have
sufficient parallelism. Normally, at least three parallel transactions are
needed for group commit to be effective: While the first transaction is
fsync() to complete, the other two transactions will queue up
waiting for their turn to call
fsync(). When the first transaction is done,
fsync() can be used for the two queued-up transactions, saving in
this case one of the three
In addition to sufficient parallelism, it is also necessary to have enough
transactions per second wanting to commit that the
fsync() calls are a
bottleneck. If no such bottleneck exists (i.e. transactions never or rarely
need to wait for the
fsync() of another to complete), group commit will
provide little to no improvement.
Binlog_group_commits is the total number of group commits done to the binary log (a group commit is when a group of transactions are written into the binary log
together), sharing a single
fsync() call). When
sync_binlog=1, then this
is the number of
fsync()'s done for transaction commits to the binary log
Thus the extent to which group commit is effective at reducing the number of
fsync() calls on the binary log can be determined by the ratio between
these two status variables.
binlog_commits will always be as high or higher
binlog_group_commits; the bigger the difference, the more effective
group commit was at reducing
Since MariaDB 10.1.5 and MariaDB 10.0.18,
binlog_group_commit_trigger_lock_wait can be used to examine which triggers caused a group commit to be made.
To query these variables, use a statement such as:
SHOW GLOBAL STATUS LIKE 'binlog_%commit%';
Information about the effectiveness of group commit inside XtraDB can be
obtained from similar status variables such as innodb_data_fsyncs; however,
the data may be slightly harder to interpret as XtraDB/InnoDB sometimes uses
fsync() for purposes other than (group) commits.
For more information about the design and implementation of group commit, see MWL#116.