Architecture of MariaDB Xpand
This page is part of MariaDB's Documentation.
The parent of this page is: Architecture of MariaDB Database Products
Topics on this page:
Overview
MariaDB Xpand provides ACID-compliant distributed SQL, high availability, fault tolerance, write scaling, and horizontal scale-out for transactional workloads.
Distributed SQL databases consist of multiple database nodes working together with each node storing and querying a subset of the data.
MariaDB Xpand is a distributed SQL database that scales efficiently for modern massive-workload web applications that require strong consistency and data integrity. MariaDB Xpand achieves strong consistency through synchronous writes to replicas. MariaDB Xpand is designed for large data-set transactional workloads that require high performance, such as online transactional processing (OLTP).
This page describes the architecture of MariaDB Xpand and the role of each component.
Benefits
ACID-compliant distributed SQL for modern web applications with massive workloads that require high performance and strong consistency, such as online transactional processing (OLTP)
Elastic scale-out and scale-in to adapt for changes in workload or capacity requirements
High Availability (HA) and fault tolerance
Sophisticated data distribution partitions Xpand horizontally into slices, and distributes multiple copies of the slices among nodes as replicas which are managed automatically for data protection and load rebalancing
Read and write scaling, with a shared-nothing architecture
Parallel query evaluation for efficient execution of complex queries such as JOINs and aggregate operations
Columnar indexes can be used in Xpand 6:
Columnar indexes are best suited for queries that "scan" much or all of a table. This includes queries that perform aggregations on large amounts of data, queries that select a subset of columns from large tables, and queries that perform a full table scan of a fact table. Queries tend to match this pattern in analytical, online-analytical processing (OLAP), and data warehousing workloads.
With composite (multi-column) Columnar indexes, filtration is not dependent on column order. If a Columnar index is defined with
COLUMNAR INDEX (a, b, c)
, and the query filters withWHERE b = 1 AND c = 2
, the Columnar index can be used to filter the results.
Topologies
MariaDB Xpand supports multiple topologies. MariaDB products can be deployed in many different topologies. The topology on this page is representative. MariaDB products can be deployed to form other topologies, leverage advanced product capabilities, or combine the capabilities of multiple topologies.
The Xpand Topology consists of:
One or more MaxScale nodes
Three or more Xpand nodes
The MaxScale nodes:
Monitor the health and availability of each Xpand node using the Xpand Monitor (
xpandmon
)Accept client and application connections
Route queries to Xpand nodes using the Read Connection (
readconnroute
) or Read/Write Split (readwritesplit
) routers.
The Xpand nodes:
Receive queries from MaxScale
Store data in a distributed manner
Execute queries using parallel query evaluation
The Xpand Topology can also be deployed in a multi-zone environment:
Three or more zones
Each zone should be connected by low latency network connections
Each zone must have the same number of Xpand nodes
Distributed SQL
MariaDB Xpand provides distributed SQL capabilities that support modern massive-workload web applications which require strong consistency and data integrity.
A distributed SQL database consists of multiple database nodes working together, with each node storing and querying a subset of the data. Strong consistency is achieved by synchronously updating each copy of a row.
MariaDB Xpand uses a shared-nothing architecture. In a shared-nothing distributed computing architecture nodes do not share the same memory or storage. Xpand's shared-nothing architecture provides important benefits, including read and write scaling, and storage scalability.
Data Distribution
This documentation explains how Xpand distributes data sets across a large number of independent nodes, as well as provides reasoning behind some of our architectural decisions.
For additional information, see "Data Distribution with MariaDB Xpand".
Consistency, Fault Tolerance, and Availability
Xpand provides a consistency model that can scale using a combination of intelligent data distribution, multi-version concurrency control (MVCC), and Paxos. Our approach enables Xpand to scale writes, scale reads in the presence of write workloads, and provide strong ACID semantics.
Evaluation Model
Xpand uses parallel query evaluation for simple queries and Massively Parallel Processing (MPP) for analytic queries (similar to columnar stores).
Concurrency Control
Xpand uses a combination of Multi-Version Concurrency Control (MVCC) and 2 Phase Locking (2PL) to support mixed read-write workloads. In our system, readers enjoy lock-free snapshot isolation while writers use 2PL to manage conflict. The combination of concurrency controls means that readers never interfere with writers (or vice-versa), and writers use explicit locking to order updates.
Rebalancer
The Rebalancer is an automated system for maintaining a healthy distribution of data in the cluster. It's the Rebalancer's job to respond to an "unhealthy" cluster by modifying the distribution and placement of data. The Rebalancer is an online process that effects changes to the cluster with minimal interruption to user operations. It relieves the database administrator from the burden of manually manipulating data placement.
For additional information, see "MariaDB Xpand Rebalancer".
Query Optimizer
At the core of Xpand Query Optimizer is the ability to execute one query with maximum parallelism and many simultaneous queries with maximum concurrency. This is achieved via a distributed query planner and compiler and a distributed shared-nothing execution engine.
For additional information, see "Query Optimizer for MariaDB Xpand".