MariaDB 10 GTID Explained
MariaDB replication in general works as follows. On a master server, all updates to the database are written into the binary log as binlog events. A slave server connects to the master, reads the binlog events, and applies them locally to replicate the same changes that were made on the master. A server can be both a master and a slave at the same time, so binlog events can be replicated through multiple levels of servers. A slave server keeps track of the position in the master's binlog of the last event it applied; this allows the slave to reconnect and resume from where it left off after replication has been temporarily stopped.
Since version 10.0.2, MariaDB supports Global Transaction IDs for replication, and this is automatically enabled. A Global Transaction ID, or GTID, consists of three numbers separated by dashes '-', for example 0-1-10. The first number (0 in this example) is the domain ID, which is specific to global transaction IDs; the second number is the server ID (the same one used in old-style replication); the third number is the sequence number.
The Global Transaction ID introduces a new event attached to each event group in the binlog (an event group is a collection of events that are always applied as a unit). As an event group is replicated from master server to slave server, its global transaction ID is preserved. Since the ID is globally unique across the entire group of servers, it is easy to identify the same binlog events on different servers that replicate from each other; this was not easily possible before MariaDB 10.0.2.
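The three-part structure of a GTID can be illustrated with a small Python sketch. This is only a model of the text format described above, not code from MariaDB; the names Gtid, parse_gtid and format_gtid are hypothetical helpers introduced for the example.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    """One GTID in the form described above: domain ID, server ID, sequence number."""
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Split 'domain-server-sequence' into its three integer parts."""
    parts = text.strip().split("-")
    if len(parts) != 3:
        raise ValueError(f"expected three dash-separated numbers, got {text!r}")
    domain_id, server_id, seq_no = (int(p) for p in parts)
    return Gtid(domain_id, server_id, seq_no)


def format_gtid(gtid: Gtid) -> str:
    """Render a Gtid back into the dash-separated text form."""
    return f"{gtid.domain_id}-{gtid.server_id}-{gtid.seq_no}"


example = parse_gtid("0-1-10")
print(example)               # Gtid(domain_id=0, server_id=1, seq_no=10)
print(format_gtid(example))  # 0-1-10
```

Because the GTID travels unchanged with the event group, two servers that hold the same event group would produce the same parsed value here, regardless of where the event sits in their own binlog files.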
The Domain ID
When events are replicated from a master server to a slave server, they are always logged into the slave's binlog in the same order in which they were read from the master's binlog. The slave uses this consistent binlog order to keep track of its current position in the replication stream. Basically, the slave remembers the GTID of the last event group replicated from the master. When reconnecting to a master, whether the same one or a new one, it sends this GTID position to the master, and the master starts sending events from the first event after the corresponding event group.
If user updates are done independently on multiple servers at the same time, then in general it is not possible for the binlog order to be identical across all servers. This can happen when using MariaDB 10 multi-source replication, with multi-master ring topologies, or if manual updates are done on a slave that is replicating from an active master. If the binlog order on the new master differs from the order on the old master, then a single GTID is not sufficient for the slave to completely record its current state. The domain ID, the first component of the GTID, is used to handle this situation: each independent stream of updates is given its own domain ID, and the slave records the last applied GTID separately for each domain.
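To make the role of the domain ID concrete, here is a minimal Python model of a slave position that holds one GTID per domain. It is a sketch under the assumption that events within a single domain arrive in a consistent order; the function names and the sample GTIDs are invented for illustration and do not come from MariaDB itself.

```python
from typing import Dict, Tuple

# Maps domain ID -> (server ID, sequence number) of the last applied event group.
Position = Dict[int, Tuple[int, int]]


def apply_event_group(position: Position, gtid: str) -> None:
    """Record `gtid` as the last applied event group in its own domain."""
    domain_id, server_id, seq_no = (int(part) for part in gtid.split("-"))
    position[domain_id] = (server_id, seq_no)


def position_as_text(position: Position) -> str:
    """Render the position as comma-separated GTIDs, one per domain."""
    return ",".join(
        f"{domain}-{server}-{seq}"
        for domain, (server, seq) in sorted(position.items())
    )


# Two independent update streams (domains 0 and 1) arrive interleaved.
slave_position: Position = {}
for gtid in ["0-1-100", "1-2-50", "0-1-101", "1-2-51"]:
    apply_event_group(slave_position, gtid)

print(position_as_text(slave_position))  # 0-1-101,1-2-51
```

However the two streams interleave, the recorded position ends up the same, which is why a per-domain position survives differences in binlog order between servers.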
The Server ID
The server ID is set to the server ID of the server where the event group is first logged into the binlog. The sequence number is increased on a server for every event group logged. Since server IDs must be unique for every server, this makes the server_id-sequence_number pair, and hence the whole GTID, globally unique.
Benefits of the GTID
With GTIDs it becomes easy to change a slave server to connect to and replicate from a different master server. The slave remembers the GTID of the last event group applied from the old master. It is then easy to know where to resume replication on the new master, since the GTIDs are known throughout the entire replication hierarchy. With old-style replication, the slave knows only the file name and offset, in the old master's binlog, of the last event applied, and there is no simple way to derive from these the correct file name and offset on a new master.
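The lookup that a new master performs can be modelled in a few lines of Python. This sketch treats a binlog as a plain list of GTID strings in a single domain, which is a simplification; events_to_send and the sample binlog contents are hypothetical.

```python
from typing import List


def events_to_send(master_binlog: List[str], slave_gtid: str) -> List[str]:
    """Return the event groups that follow `slave_gtid` in the master's binlog."""
    try:
        index = master_binlog.index(slave_gtid)
    except ValueError:
        raise LookupError(f"GTID {slave_gtid} not found in master binlog") from None
    return master_binlog[index + 1:]


# Hypothetical binlog contents on the new master. Its file names and byte
# offsets would differ from the old master's, but the GTIDs are the same.
new_master_binlog = ["0-1-98", "0-1-99", "0-1-100", "0-1-101", "0-1-102"]

# The slave last applied 0-1-100 from the old master.
print(events_to_send(new_master_binlog, "0-1-100"))  # ['0-1-101', '0-1-102']
```

The slave only has to present its last GTID; no file name or offset from the old master is involved in finding the resume point.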
Crash Safe Slave
The state of the slave is recorded in a crash-safe way. The slave keeps track of its current position (the GTID of the last transaction applied) in an InnoDB system table, mysql.gtid_slave_pos. Updates to this state are done in the same transaction as the updates to the data. This makes the state crash-safe: if the slave server crashes, crash recovery on restart ensures that the recorded replication position matches the changes that were actually replicated. For old-style replication, the state is recorded in a file, relay-log.info, which is updated independently of the actual data changes and can easily get out of sync if the slave server crashes.
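The principle of updating the position in the same transaction as the data can be demonstrated with Python's built-in sqlite3 module. This is an analogy only: SQLite stands in for InnoDB, the tables accounts and slave_pos are made up for the example, and a raised exception plays the part of a crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE slave_pos (domain_id INTEGER PRIMARY KEY, gtid TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("INSERT INTO slave_pos VALUES (0, '0-1-10')")
conn.commit()


def apply_event(conn, gtid, new_balance, crash=False):
    """Apply a data change and the position update in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = ? WHERE id = 1", (new_balance,)
            )
            if crash:
                raise RuntimeError("simulated crash before commit")
            conn.execute(
                "UPDATE slave_pos SET gtid = ? WHERE domain_id = 0", (gtid,)
            )
    except RuntimeError:
        pass  # after the simulated crash, neither change is visible


def state(conn):
    """Return (balance, recorded GTID) as currently stored."""
    balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
    gtid = conn.execute("SELECT gtid FROM slave_pos WHERE domain_id = 0").fetchone()[0]
    return balance, gtid


apply_event(conn, "0-1-11", 150)
print(state(conn))  # (150, '0-1-11')

apply_event(conn, "0-1-12", 999, crash=True)
print(state(conn))  # (150, '0-1-11')
```

After the simulated crash, the data and the recorded position still agree, so re-applying event group 0-1-12 would be safe. A position kept in a separate file has no such guarantee.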
Setting Up a New Slave
SET GLOBAL GTID_SLAVE_POS = BINLOG_GTID_POS("master-bin.00024", 1600);
CHANGE MASTER TO master_host="10.2.3.4", master_use_gtid=slave_pos;
START SLAVE;
Switching a Slave to Use GTID
STOP SLAVE;
CHANGE MASTER TO master_use_gtid=current_pos;
START SLAVE;
Changing a Slave to Replicate from a Different Master
STOP SLAVE;
CHANGE MASTER TO master_host="10.2.3.5";
START SLAVE;