MariaDB Adds Xpand for Distributed SQL

With the release of MariaDB Platform X5, we’re taking the versatility of MariaDB to another level by providing a distributed SQL database solution for our customers. This is an entirely new capability for MariaDB and gives customers more flexibility to meet changing workload needs. The InnoDB storage engine’s sweet spot is high-performance mixed read/write workloads on a single node, with the option of adding scale via replication. The Xpand smart engine lets customers scale beyond that sweet spot by employing a highly available, fault-tolerant distributed solution for large-scale workloads.

Since MariaDB Xpand is a storage engine, you have the flexibility to scale on a per-table basis. Start by using Xpand for just a single table, then expand its use as your needs grow beyond what a single node, replication, or clustering can handle. When data or query volumes increase to the point of degrading performance, use Xpand to distribute individual tables or the entire database for improved throughput and concurrency. Xpand has built-in high availability and elasticity, so nodes can be added or removed transparently as needed to scale out.

Just as with ColumnStore, our columnar smart engine, cross-engine JOINs are possible (and encouraged) between replicated and distributed tables. Other distributed SQL implementations distribute the entire database and therefore incur significant overhead on smaller tables. MariaDB instead allows the combined use of InnoDB for small replicated data sets and Xpand for massive distributed data sets.

How Xpand Works

Xpand deployments consist of Enterprise Server instances with the Xpand plugin installed and the Xpand nodes running alongside each MariaDB instance. For the best performance, the Enterprise Server and the Xpand node can be installed on separate physical servers.
Just as with master-replica or Galera cluster topologies, MaxScale can be added as a proxy layer on top to manage connections and transparently fail over between the front-end Enterprise Server instances, which hold the smaller replicated data sets in InnoDB.

MariaDB Xpand Architecture

Each table using Xpand is split into a number of slices. Each slice is stored on a primary node and then replicated to one or more other nodes to ensure fault tolerance. Each Xpand node can perform both reads and writes. And each node has a map of the data distribution.
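To make the slice layout concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not Xpand's actual implementation: the node names, the slice count, the use of a hash to pick a slice, and the round-robin placement are all hypothetical choices made for illustration.

```python
import hashlib

NODES = ["node1", "node2", "node3"]  # hypothetical node names
NUM_SLICES = 6                       # hypothetical number of slices for one table
COPIES = 2                           # primary plus one replica (assumption)


def slice_for_key(key, num_slices=NUM_SLICES):
    """Map a row key to a slice with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


def build_slice_map(nodes=NODES, num_slices=NUM_SLICES, copies=COPIES):
    """Give every slice a primary node and replicas on other nodes."""
    slice_map = {}
    for s in range(num_slices):
        owners = [nodes[(s + i) % len(nodes)] for i in range(copies)]
        slice_map[s] = {"primary": owners[0], "replicas": owners[1:]}
    return slice_map


# Every node would hold a copy of this map, so any node can route a request.
slice_map = build_slice_map()
s = slice_for_key(42)
print(f"key 42 -> slice {s} -> {slice_map[s]}")
```

Because the map is small and every node has it, any node can work out where a row lives without asking a central coordinator.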

For read operations, most of the query is pushed down to Xpand, where it is evaluated and the relevant portions are sent to the appropriate Xpand nodes. MariaDB Enterprise Server then collects the data returned by the Xpand nodes to generate the result set.
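The read path described above follows a scatter-gather pattern, which can be sketched in a few lines of Python. This is an illustration only: the sample rows, the function names, and the choice of a filtered SUM/COUNT as the query are hypothetical, and a real engine would run the per-node work in parallel rather than in a loop.

```python
# Hypothetical rows, already placed on three nodes.
NODE_ROWS = {
    "node1": [{"id": 1, "amount": 10}, {"id": 4, "amount": 40}],
    "node2": [{"id": 2, "amount": 20}],
    "node3": [{"id": 3, "amount": 30}, {"id": 5, "amount": 50}],
}


def node_partial(rows, min_amount):
    """Work done on one node: filter local rows, return a partial aggregate."""
    matching = [r["amount"] for r in rows if r["amount"] >= min_amount]
    return sum(matching), len(matching)


def distributed_sum(node_rows, min_amount):
    """Front end: send the fragment to each node, then merge the partials."""
    total, count = 0, 0
    for rows in node_rows.values():
        part_sum, part_count = node_partial(rows, min_amount)
        total += part_sum
        count += part_count
    return total, count


print(distributed_sum(NODE_ROWS, 20))  # (140, 4)
```

Only the small partial results travel back to the front end, which is why pushing the filter and aggregate down to the nodes pays off on large tables.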

For write operations, MariaDB Xpand uses a component called the “rebalancer” to automatically and transparently distribute data across the available Xpand nodes.

HA and Fault Tolerance

MariaDB Xpand is fault tolerant by design. It can sustain a node failure without suffering data loss. Should an Xpand node fail, the system leverages replicas of the data slices on other Xpand nodes. The rebalancer then auto-heals, creating additional replicas of the data originally stored on the failed Xpand node. It performs the entire operation automatically, transparently, and in the background, requiring no intervention from the administrator.
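The auto-heal step can be pictured with a small Python simulation. It is a sketch, not the rebalancer's real algorithm: the three-node layout, the two-copies-per-slice setting, and the "least-loaded survivor" placement rule are assumptions made for illustration.

```python
def heal(slice_map, nodes, failed, copies=2):
    """Return a new slice map without `failed`, restoring `copies` per slice."""
    survivors = [n for n in nodes if n != failed]
    # Drop the failed node from every slice's owner list.
    healed = {s: [n for n in owners if n != failed]
              for s, owners in slice_map.items()}
    for s, owners in healed.items():
        if not owners:
            raise RuntimeError(f"slice {s} lost every copy")
        while len(owners) < min(copies, len(survivors)):
            candidates = [n for n in survivors if n not in owners]
            # Place the new copy on the survivor holding the fewest copies.
            target = min(candidates,
                         key=lambda n: sum(n in o for o in healed.values()))
            owners.append(target)
    return healed


# Hypothetical layout: the first node in each list acts as the primary.
slice_map = {0: ["node1", "node2"], 1: ["node2", "node3"], 2: ["node3", "node1"]}
healed = heal(slice_map, ["node1", "node2", "node3"], failed="node2")
print(healed)
```

In this toy run every slice still has a surviving copy after node2 fails, so no data is lost, and the loop brings each slice back to two copies on the remaining nodes.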

In a similar fashion, Xpand can also be expanded or shrunk manually as needs change. Nodes added to the Xpand distributed SQL layer are initially empty. The rebalancer then starts moving slices from the other nodes to the new ones until all data is evenly balanced across the nodes, so that Xpand can use all available hardware resources.
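Here is a minimal Python sketch of how slices might drift onto a newly added node. The greedy "move one slice from the fullest node to the emptiest" rule, the node names, and the slice numbers are assumptions for illustration, and replicas are ignored to keep the example short.

```python
def rebalance(placement, new_node):
    """Move slices one at a time from the fullest node to the emptiest."""
    placement = {node: list(slices) for node, slices in placement.items()}
    placement[new_node] = []  # a newly added node starts empty
    moves = []
    while True:
        fullest = max(placement, key=lambda n: len(placement[n]))
        emptiest = min(placement, key=lambda n: len(placement[n]))
        # Stop once no node holds more than one slice above any other.
        if len(placement[fullest]) - len(placement[emptiest]) <= 1:
            break
        moved = placement[fullest].pop()
        placement[emptiest].append(moved)
        moves.append((moved, fullest, emptiest))
    return placement, moves


before = {"node1": [0, 1, 2, 3], "node2": [4, 5, 6, 7], "node3": [8, 9, 10, 11]}
after, moves = rebalance(before, "node4")
print({node: len(slices) for node, slices in after.items()})
# {'node1': 3, 'node2': 3, 'node3': 3, 'node4': 3}
```

Only three slices move in this example, one from each existing node, which reflects the general idea that adding a node should shift a proportional share of the data rather than reshuffle everything.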

For More Information

The addition of Xpand and distributed SQL to MariaDB Platform X5 creates new possibilities for our enterprise customers, enabling them to scale on a per-table basis and gain the benefits of a highly available, fault-tolerant distributed SQL solution as their needs grow.