March 7, 2017

Replication Manager, Flashback and Much More!

Note: Replication Manager is developed and maintained by community members. It can be used with MariaDB Server, but it is not supported with an enterprise subscription.

MariaDB 10.2.4 has fantastic new features that perfectly match Replication Manager's ultimate goal: transparent automated failover on a MariaDB master-slave architecture, with as few transactions lost as possible. :) We are going to explore these new features and how Replication Manager uses them for your benefit!

The first feature is continuous binlog fetching from a remote master via mysqlbinlog.

Replication Manager will use this feature when your old master comes back to life. It takes a snapshot of the transaction events between the position at which the new master was elected and the current position of the rejoining old master.

These events are saved in a crash directory under the replication-manager working directory for later use.
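Under the hood, this kind of continuous remote fetching relies on standard mysqlbinlog options; here is a minimal sketch (host, credentials and binlog file name are placeholders for your own deployment):

```shell
# Continuously stream binlog events from a remote master into the current
# directory. --raw writes the events in binary form, and --stop-never keeps
# the connection open, following binlog rotation as new files appear.
mysqlbinlog --read-from-remote-server \
    --host=old-master.example.com --user=repl --password=secret \
    --raw --stop-never mysql-bin.000001
```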

Another exciting feature is the binlog flashback.  

Replication Manager will use this feature to reintroduce the old master into the active topology in several cases.

The first case is when semi-sync replication was in the SYNC state at the time of the crash. This is good news, as it saves you from recovering the dead master from a backup of the new master.

The picture looks like this: in semi-sync, the transactional state of the dead master can be ahead of the new leader, because the "sync" part of the feature name refers to the state of the client, not the state of the database.

I'll try a metaphor:

The goal of regular MariaDB replication is to make sure you almost never lose a transaction under HA, while accepting the nature of an unpredictable future. If this were the performance of a show, you could enter even if you were a criminal. If you disturb the show, the show can recover: Replication Manager will transfer you and the others to the same show at a later time, or in another place, or begin again at the same position in the show. Semi-sync is the speed-of-light delay: the event has already happened on the active stage but never made it to your eyes. We will transfer you before or after that moment, which is under your control, and make sure that the show stays closely synchronized!

So in semi-sync, when the state is SYNC, the show stops in a state that is ahead of where others would be stuck with a "delayed show". Since clients' connections have never seen such differences, you can flashback by rewinding the show to where the disturbance occurred and continue it from the same point in another location.

This is exactly the same concept as a delayed broadcast. If going to the bathroom takes more time than the broadcast delay, you may have lost some important parts of the story when you press resume.

The second case is when the semi-sync delay has passed, or you were not running semi-sync replication at all: we can resume the show, but some events may have been lost. Replication Manager can either flashback or use a dump of the new master for recovery.
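MariaDB 10.2's mysqlbinlog can compute the reverse events itself via the new --flashback option (it requires ROW-format binlogs). A sketch of a manual flashback, with file names and position as placeholders:

```shell
# Produce the inverse of the row events recorded after the failover position
# (the binlog file and --start-position are placeholders) and apply them to
# rewind the old master to that position.
mysqlbinlog --flashback --start-position=372 mysql-bin.000042 > rewind.sql
mysql -u root -p < rewind.sql
```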

Let's examine the options available to make this happen.

      # MARIADB >= 10.2
      # ---------------
      mariadb-binary-path = "/usr/sbin"
      # REJOIN
      # --------
      autorejoin = true
      autorejoin-semisync = true
      autorejoin-flashback = true
      autorejoin-mysqldump = false

Don't forget to set, in the cluster configuration, that you want automatic failover:

      interactive = false

The default behavior of Replication Manager is to alert on failure, not to perform the failover itself.

Another exciting feature is the "no slaves behind" option.

Can you use your slaves and transparently load balance reads with a replication-manager topology? The answer used to be "maybe, with MaxScale read-write splitting", but only if you didn't mind reading from delayed slaves in auto-commit workloads.

For example, an insert followed by closing the connection and passing the ball to another microservice that reads the same data would be unsafe.

Now it is possible to configure the read-write splitter to fail reads back to the master whenever a slave's replication delay exceeds the limit set via "no slaves behind".

This brings a solution: slow down the master's commit workload to stay under that delay, so that reads on the slaves become committed reads!
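On the MaxScale side, the matching knob is the readwritesplit router's max_slave_replication_lag parameter; a sketch of a service fragment (the service and server names are illustrative, not from the original post):

```ini
[Splitter-Service]
type=service
router=readwritesplit
servers=server1,server2
user=maxscale
passwd=mariadb
# Slaves lagging more than 30 seconds are excluded from read load
# balancing; reads fall back to the master instead.
max_slave_replication_lag=30
```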

Extra features:

The new Replication Manager release also addresses the requirement to manage multiple clusters within the same replication-manager instance (note the change in the configuration file).

      [cluster1]
      # specific options for this cluster
      [cluster3]
      # specific options for this cluster
      [default]
      # options common to all clusters

If you have a single cluster, just use [default].

In console mode one can switch clusters using Ctrl-P & Ctrl-N, and in HTTP mode a drop-down box is available to switch the active cluster view.

Some respected members of the community have raised possible issues with the choice to keep the failover logic in Replication Manager instead of putting it directly in the MariaDB MaxScale proxy. This new release addresses such concerns.

Let's look at the core new features of Replication Manager when it comes to MaxScale Proxy.

      failover-falsepositive-heartbeat = true
      failover-falsepositive-heartbeat-timeout = 3
      failover-falsepositive-maxscale = true
      failover-falsepositive-maxscale-timeout = 14

The names speak for themselves. Having separate pieces makes better false-positive detection of leader death possible: all your slaves act as leader-failure detectors, and MaxScale does as well. This is on top of all the previous checks and conditions.

      failcount = 5
      failover-max-slave-delay = 30
      failover-limit = 3
      failover-at-sync = true
      failover-time-limit = 10 

Stay tuned: as time passes, more failover-falsepositive methods will be added, as they are already on the roadmap. I guess this task addresses some of our fellow Ace Director's musings found here. Also, etcd is already on the roadmap, will be worked on in the future, and will receive contributions for sure!

While failover and MaxScale monitoring can be tricky (as noted by Johan and Shlomi), Replication Manager addresses the issue of the last available slave being elected as the new master.

In this case MaxScale is lost without a topology, and this would be similar to having a single slave for HA. The solution to this issue is to let Replication Manager fully drive the MaxScale server states.

      maxscale-monitor = false
      # maxinfo|maxadmin
      maxscale-get-info-method = "maxinfo"
      maxscale-maxinfo-port = 4002
      maxscale-host = "192.168.0.201"
      maxscale-port = 4003
      maxscale-user = "admin"
      maxscale-pass = "mariadb"

By setting maxscale-monitor = false, replication-manager tells MaxScale to disable its own monitoring, and it will impose the server states on MaxScale instead.

Don't forget to activate MaxScale usage:

      maxscale = true 
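With the maxinfo method configured above, you can check what MaxScale itself currently sees: the maxinfo plugin serves resources as JSON over HTTP (host and port are taken from the sample config; adjust them to your deployment):

```shell
# List the servers and the states MaxScale has assigned to them, as JSON.
curl -s http://192.168.0.201:4002/servers
```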

Last but not least of Replication Manager's new features is metrics tracking via an embedded Carbon/Graphite server. This internal server can act as a relay to Graphite for custom reporting and is also used by Replication Manager's HTTP server.

 

(Image: replicationmanager.png)

 

      graphite-metrics = true
      graphite-carbon-host = "127.0.0.1"
      graphite-carbon-port = 2003
      graphite-embedded = true
      graphite-carbon-api-port = 10002
      graphite-carbon-server-port = 10003
      graphite-carbon-link-port = 7002
      graphite-carbon-pickle-port = 2004
      graphite-carbon-pprof-port = 7007
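Any tool that speaks Carbon's plaintext protocol can push extra metrics into the embedded server: each datapoint is one line of the form `<metric.path> <value> <unix-timestamp>`. A sketch (the metric name is made up for illustration):

```shell
# Build one datapoint in Carbon's plaintext format.
ts=$(date +%s)
line="replication_manager.cluster1.custom_metric 1 ${ts}"
echo "$line"
# To actually send it to the embedded carbon server from the config above:
#   echo "$line" | nc 127.0.0.1 2003
```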

All these features pass the new non-regression test cases and can be found in the dev branch of Replication Manager.

Are you thinking, "Hey, this is good technical content, but my team does not know much about replication internals"? No worries! We DO have helpers in Replication Manager to enforce best practices, and it's always best to plan HA before starting any serious new DB project.

      force-slave-heartbeat = true
      force-slave-gtid-mode = true
      force-slave-semisync = true
      force-slave-readonly = true
      force-binlog-row = true
      force-binlog-annotate = true
      force-binlog-slowqueries = true
      force-inmemory-binlog-cache-size = true
      force-disk-relaylog-size-limit = true
      force-sync-binlog = true
      force-sync-innodb = true
      force-binlog-checksum = true

*Note that some of the enforcements above are not yet covered by test cases; we would welcome any contributors.