December 14, 2017

MariaDB ColumnStore Data Redundancy – A Look Under the Hood

In this blog post, we take a close look at MariaDB ColumnStore data redundancy, a new feature of MariaDB AX. This feature enables you to have highly available storage and automated PM (Performance Module) failover when using local disk storage.

MariaDB ColumnStore data redundancy leverages GlusterFS, an open source distributed file system maintained by Red Hat that provides continued access to data and can scale to very large data volumes. To enable data redundancy, you must install and enable GlusterFS prior to running postConfigure. For more information on this topic, refer to Preparing for MariaDB ColumnStore Installation - 1.1.X. Failover is configured automatically by MariaDB ColumnStore, so that if a physical server experiences a service interruption, data is still accessible from another PM node.

During postConfigure, you are prompted to enter the number of redundant copies for each DBRoot:

Enter Number of Copies [2-N] (2) >

N = Number of PMs. (An actual number is displayed by postConfigure)
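
The accepted range can be written as a small check. This is only a minimal sketch of the [2-N] rule described above, not postConfigure's actual validation code; the function name valid_copies is hypothetical.

```shell
# Hypothetical helper (not part of postConfigure): succeed only when the
# requested copy count is a whole number in the range 2..N, where N is the
# number of PMs.
valid_copies() {
    copies=$1
    pm_count=$2
    # Reject empty or non-numeric input before the numeric comparison.
    case "$copies" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$copies" -ge 2 ] && [ "$copies" -le "$pm_count" ]
}

# Example: on a three-PM system, 2 is accepted and 4 is not.
if valid_copies 2 3; then echo "2 copies: ok"; fi
if ! valid_copies 4 3; then echo "4 copies: out of range"; fi
```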

On a multi-node install with internal storage, each DBRoot is tied directly to a single PM.

image2.png

 

On a multi-node install with data redundancy, replicated GlusterFS volumes are created for each DBRoot. To users on the outside this appears to be the same as above. Under the hood, a DBRoot is now a gluster volume, where a gluster volume is a collection of gluster bricks that map to directories on the local file system located here:

/usr/local/mariadb/columnstore/gluster/brick(n) (Default). 

The gluster directory will contain the subdirectories brick1 to brick[n], where n is the number of copies configured. Note: Bricks are numbered sequentially on each PM as they are created by MariaDB ColumnStore and are not related to each other or to DBRoot IDs.

Visually:

image5.png
A Three-PM Installation with Data Redundancy Copies = 2

 

The mcsadmin getStorageConfig command displays the same layout in text form, like this:

Data Redundant Configuration

Copies Per DBroot = 2
DBRoot #1 has copies on PMs = 1 2 
DBRoot #2 has copies on PMs = 2 3 
DBRoot #3 has copies on PMs = 1 3 
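
The listing above is consistent with a ring layout, in which DBRoot i has a copy on PM i and on the next PMs in order, wrapping around after the last PM. The sketch below reproduces the listing under that assumption. It is an illustration only: the ring rule and the function name ring_placement are assumptions, not a description of how MariaDB ColumnStore actually chooses brick locations.

```shell
# Assumed ring layout: DBRoot i is stored on PM i and on the following
# (copies - 1) PMs, wrapping around after the last PM.
ring_placement() {
    pm_count=$1
    copies=$2
    dbroot=1
    while [ "$dbroot" -le "$pm_count" ]; do
        pms=""
        k=0
        while [ "$k" -lt "$copies" ]; do
            pm=$(( (dbroot - 1 + k) % pm_count + 1 ))
            pms="$pms $pm"
            k=$((k + 1))
        done
        # Sort the PM IDs numerically so the order matches the listing.
        sorted=$(printf '%s\n' $pms | sort -n | tr '\n' ' ')
        echo "DBRoot #$dbroot has copies on PMs = ${sorted% }"
        dbroot=$((dbroot + 1))
    done
}

# Three PMs, two copies per DBRoot.
ring_placement 3 2
```

Calling ring_placement 3 3 instead places every DBRoot on all three PMs.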

The number of copies can be increased as high as the number of PMs. For a three-PM system, that would look like this:

image4.png
A Three-PM Installation with Data Redundancy Copies = 3

 

It is important to note that as the number of copies increases, the network resources needed to distribute redundant data between PMs also increase. Keep the number of copies as low as your data-redundancy requirements allow. Alternatively, if your hardware allows, a dedicated network can be configured during installation with postConfigure to help offload gluster network traffic.
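
As a rough back-of-the-envelope illustration of that growth, the sketch below estimates replication traffic under simplifying assumptions: one copy of each write lands on a brick local to the writing PM, every other copy crosses the network once, and protocol overhead and resync traffic are ignored. The function name replication_traffic is hypothetical, and real traffic will differ.

```shell
# Rough estimate (assumptions: one copy is local, each remaining copy
# crosses the network once, no protocol overhead or resync traffic).
# Arguments: amount of data written (any unit), number of copies.
replication_traffic() {
    written=$1
    copies=$2
    echo $(( written * (copies - 1) ))
}

# Writing 100 GB with 2 copies vs. 3 copies.
replication_traffic 100 2
replication_traffic 100 3
```

Under these assumptions, going from two copies to three doubles the data sent between PMs for the same write load.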

MariaDB ColumnStore assigns a DBRoot to a PM by using GlusterFS to mount the DBRoot volume on its associated data directory, which is then used as normal.

PM1:

mount -tglusterfs PM1:/dbroot1 /usr/local/mariadb/columnstore/data1

PM2:

mount -tglusterfs PM2:/dbroot2 /usr/local/mariadb/columnstore/data2

PM3:

mount -tglusterfs PM3:/dbroot3 /usr/local/mariadb/columnstore/data3

At this point, when a change is made to any file in a data(n) directory, it is copied to the connected brick. Only the assigned bricks are mounted as the logical DBRoots. The unassigned bricks are standby copies waiting for a failover event.
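
The per-PM mount commands shown earlier follow a simple naming pattern. The sketch below only prints the command for a given ID rather than running it, so it is safe to try anywhere. The function name print_mount_cmd is hypothetical, and the base path is the default location used in this post.

```shell
# Print (do not run) the mount command for the DBRoot assigned to a PM,
# following the naming pattern used in the example above.
# Arguments: PM/DBRoot ID, optional install base path.
print_mount_cmd() {
    id=$1
    base=${2:-/usr/local/mariadb/columnstore}
    echo "mount -tglusterfs PM${id}:/dbroot${id} ${base}/data${id}"
}

# Show the commands for a three-PM system.
for id in 1 2 3; do
    print_mount_cmd "$id"
done
```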

image3.png

Three-PM Data Redundancy Copies = 2

 

A failover occurs when a service interruption is detected on a PM. In a normal local disk installation, data stored on the DBRoot for that module would be inaccessible. With data redundancy, only a small interruption occurs while the DBRoot is reassigned to the secondary brick.

In our example system, PM #3 has lost power. PM #1 is assigned DBRoot3 along with DBRoot1, since it has been maintaining the replica brick for DBRoot3. PM #2 sees no change.
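
The reassignment in this example can be sketched as picking a surviving PM from the DBRoot's replica list. This is a simplified illustration, not MariaDB ColumnStore's failover logic; the function name failover_target and the "first surviving PM" rule are assumptions.

```shell
# Pick a new owner for a DBRoot: the first PM in its replica list that is
# not the failed PM. Arguments: failed PM ID, then the PM IDs that hold a
# copy of the DBRoot. Returns non-zero if no surviving copy exists.
failover_target() {
    failed_pm=$1
    shift
    for pm in "$@"; do
        if [ "$pm" != "$failed_pm" ]; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# DBRoot3 has copies on PMs 1 and 3; PM 3 has failed.
failover_target 3 1 3
```

For the example system this prints 1, matching the reassignment of DBRoot3 to PM #1.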

image1.png

Three-PM Data Redundancy Copies = 2 & Failure of PM #3

 

When PM #3 returns, data changes for DBRoot3 and DBRoot2 are synced across the bricks of those volumes by GlusterFS. PM #3 then returns to operational status, and DBRoot3 is unmounted from PM #1 and returned to PM #3.
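
DBRoot2 and DBRoot3 are the volumes that need syncing because, in the example layout, PM #3 holds a brick for each of them. The sketch below looks this up from a replica map copied from the getStorageConfig listing earlier in this post. The variable and function names are hypothetical.

```shell
# Replica map for the example system, one "dbroot:pm list" entry per line,
# copied from the getStorageConfig listing.
replica_map='1:1 2
2:2 3
3:1 3'

# List the DBRoots that have a brick on the given PM, i.e. the volumes
# GlusterFS has to bring back in sync when that PM returns.
dbroots_on_pm() {
    returned_pm=$1
    echo "$replica_map" | while IFS=: read -r dbroot pms; do
        for pm in $pms; do
            if [ "$pm" = "$returned_pm" ]; then
                echo "DBRoot$dbroot"
            fi
        done
    done
}

# PM 3 has returned.
dbroots_on_pm 3
```

On a live system, the GlusterFS command gluster volume heal <VOLNAME> info reports entries still waiting to be synced for a volume.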

image3.png

Three-PM Data Redundancy Copies = 2 & PM #3 Recovered

 

This is only a simple example meant to illustrate how MariaDB ColumnStore with data redundancy leverages GlusterFS to provide a simple and effective way to keep your data accessible through service interruptions. 

We are excited to offer data redundancy as part of MariaDB ColumnStore 1.1, which is available for download as part of MariaDB AX, an enterprise open source solution for modern data analytics and data warehousing.

About Ben Thompson

Ben Thompson is a Software Engineer on the MariaDB ColumnStore team.
