MariaDB Enterprise ColumnStore Storage Requirements

MariaDB Enterprise ColumnStore has specific storage requirements.

Storage Options

MariaDB Enterprise ColumnStore supports multiple storage types:

S3-Compatible Object Storage

  • S3-compatible object storage is optional, but recommended

  • Enterprise ColumnStore can use S3-compatible object storage to store data.

  • With multi-node Enterprise ColumnStore, the Storage Manager directory should use shared local storage for high availability.

Shared Local Storage

  • Required for multi-node Enterprise ColumnStore with high availability

  • Enterprise ColumnStore can use shared local storage to store data and metadata.

  • If S3-compatible object storage is used for data, the shared local storage is used only for the Storage Manager directory.

Non-Shared Local Storage

  • Appropriate for single-node Enterprise ColumnStore

  • Enterprise ColumnStore can use non-shared local storage to store data and metadata.

Object Storage Topology

(Diagram: Enterprise ColumnStore topology with S3-compatible object storage)

Shared Local Storage Topology

(Diagram: Enterprise ColumnStore topology with shared local storage)

S3-Compatible Object Storage

MariaDB Enterprise ColumnStore supports S3-compatible object storage.

S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage (such as NFS) for high availability.

S3-compatible object storage is:

  • Compatible: Many object storage services are compatible with the Amazon S3 API.

  • Economical: S3-compatible object storage is often very low cost.

  • Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.

  • Limitless: S3-compatible object storage is often virtually limitless.

  • Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.

  • Scalable: S3-compatible object storage is often highly optimized for read and write scaling.

  • Secure: S3-compatible object storage is often encrypted-at-rest.

Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.

If you have any questions about using specific S3-compatible object storage services with MariaDB Enterprise ColumnStore, contact us.

Storage Manager

MariaDB Enterprise ColumnStore's Storage Manager enables efficient use of remote S3-compatible object storage. The Storage Manager uses a persistent local disk cache for read/write operations, so network latency has minimal performance impact on Enterprise ColumnStore. In some cases, it can even perform better than local disk operations.

Enterprise ColumnStore only uses the Storage Manager when S3-compatible storage is used for data.

The Storage Manager is configured using the storagemanager.cnf configuration file, located at /etc/columnstore/storagemanager.cnf.
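
As a quick, unofficial check, you can inspect the active storage service in that file. On a fresh installation the default is local storage rather than S3 (the output shown assumes the default configuration and may vary by version):

    $ grep -E '^service' /etc/columnstore/storagemanager.cnf
    service = LocalStorage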

Storage Manager Directory

MariaDB Enterprise ColumnStore's Storage Manager directory is at the following path by default:

  • /var/lib/columnstore/storagemanager

To enable high availability when S3-compatible object storage is used, the Storage Manager directory should use shared local storage and be mounted on every ColumnStore node.
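
To verify that the shared mount is in place on a given node, a quick check (not an official step) is to confirm that the directory is a mount point:

    $ findmnt /var/lib/columnstore/storagemanager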

S3 API

MariaDB Enterprise ColumnStore can use any object store that is compatible with the Amazon S3 API.

Many object storage services are compatible with the Amazon S3 API. Compatible object storage services are available for both cloud and on-premises deployments, so vendor lock-in is not a concern.

S3-Compatible Object Storage for On-Premises Deployments

In addition to cloud deployments, S3-compatible object storage is also available for on-premises deployments. Many storage vendors provide S3-compatible object storage for on-premises deployments. MariaDB Corporation does not recommend a specific storage vendor.

If you have any questions, please contact us.

Create an S3 Bucket

When S3-compatible storage is used for Enterprise ColumnStore, all nodes access the data from the same bucket.

You must create the S3 bucket during the deployment process before you start Enterprise ColumnStore.

You must also configure Enterprise ColumnStore's S3 Storage Manager to use the S3 bucket, as described in the Configure the S3 Storage Manager section.
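
As an illustrative sketch, if your deployment uses Amazon S3 and the AWS CLI is installed and configured with credentials, a bucket can be created with a command like the following (the bucket name and region are placeholders; other S3-compatible services provide their own tooling):

    $ aws s3 mb s3://your-columnstore-bucket --region us-east-1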

Configure the S3 Storage Manager

When you want to use S3-compatible storage for Enterprise ColumnStore, you must configure Enterprise ColumnStore's S3 Storage Manager to use S3-compatible storage.

To configure Enterprise ColumnStore to use S3-compatible storage, edit /etc/columnstore/storagemanager.cnf:

[ObjectStorage]
service = S3

[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
# ec2_iam_mode=enabled
# port_number = your_port_number

[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path

The S3-compatible object storage options are configured under [S3]:

  • The bucket option must be set to the name of the bucket that you created in the Create an S3 Bucket step.

  • The endpoint option must be set to the endpoint for the S3-compatible object storage.

  • The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.

  • To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.

  • To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.

  • To use a non-default port number, you must set port_number to the desired port.

The local cache options are configured under [Cache]:

  • The cache_size option is set to 2 GB by default.

  • The path option is set to /var/lib/columnstore/storagemanager/cache by default.

Ensure that the specified path has sufficient storage space for the specified cache size.
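
As a quick sanity check (not an official step), you can compare the free space on the cache path against the configured cache size:

    $ df -h /var/lib/columnstore/storagemanager/cache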

Shared Local Storage

MariaDB Enterprise ColumnStore can use shared local storage.

Shared local storage is required for high availability. The specific shared local storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:

  • When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.

  • When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use shared local storage for high availability.

The most common shared local storage options for on-premises and cloud deployments are:

  • NFS (Network File System)

  • GlusterFS

The most common shared local storage options for AWS (Amazon Web Services) deployments are:

  • EBS (Elastic Block Store) Multi-Attach

  • EFS (Elastic File System)

The most common shared local storage option for GCP (Google Cloud Platform) deployments is:

  • Filestore

Shared Local Storage Options

The most common options for shared local storage are:

EBS (Elastic Block Store) Multi-Attach

  • EBS is a high-performance block-storage service for AWS (Amazon Web Services).

  • EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

  • For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

EFS (Elastic File System)

  • EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

  • For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

Filestore

  • Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

  • For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.

NFS (Network File System)

  • NFS is a distributed file system.

  • If NFS is used, the storage should be mounted with the sync option.

  • For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.

GlusterFS

  • GlusterFS is a stable and well-designed distributed file system.

  • GlusterFS supports replication and failover.

Directories Requiring Shared Local Storage for HA

Multi-node MariaDB Enterprise ColumnStore requires some directories to use shared local storage for high availability. The specific requirements depend on whether MariaDB Enterprise ColumnStore is configured to use S3-compatible object storage:

  • Using S3-compatible object storage: the Storage Manager directory must use shared local storage.

  • Not using S3-compatible object storage: the DB Root directories must use shared local storage.

DB Root Directories

MariaDB Enterprise ColumnStore stores data in DB Root directories when S3-compatible object storage is not configured.

In multi-node Enterprise ColumnStore, each node has its own DB Root directory.

The DB Root directories are at the following path by default:

  • /var/lib/columnstore/dataN

The N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:

  • /var/lib/columnstore/data1

  • /var/lib/columnstore/data2

  • /var/lib/columnstore/data3

To enable high availability for the DB Root directories, each directory should be mounted on every ColumnStore node using shared local storage.
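
For illustration only, assuming a hypothetical NFS server named nfs1 that exports one directory per DB Root, the /etc/fstab entries on each node of a 3-node deployment might look like the following (the server name, export paths, and mount options are placeholders, not official values):

    nfs1:/exports/data1 /var/lib/columnstore/data1 nfs sync,hard 0 0
    nfs1:/exports/data2 /var/lib/columnstore/data2 nfs sync,hard 0 0
    nfs1:/exports/data3 /var/lib/columnstore/data3 nfs sync,hard 0 0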

EBS (Elastic Block Store) Multi-Attach

EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported. For Enterprise ColumnStore deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

Consult the vendor's documentation for details on how to configure EBS Multi-Attach.

EFS (Elastic File System)

EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services). For Enterprise ColumnStore deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

Consult the vendor's documentation for details on how to configure EFS.

Filestore

Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform). For Enterprise ColumnStore deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.

Consult the vendor's documentation for details on how to configure Filestore.

NFS (Network File System)

NFS is an easy-to-use distributed file system that is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage should be mounted with the sync option, so that writes are committed to the server synchronously. For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.

Consult the vendor's documentation for details on how to configure NFS.
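
As a hedged sketch (the export path, server name, and node hostnames below are placeholders, not official values), exporting the Storage Manager directory from an NFS server and mounting it with the sync option might look like this:

    # On the NFS server, in /etc/exports (hypothetical export path and node hostnames):
    /exports/storagemanager mcs1(rw,sync,no_root_squash) mcs2(rw,sync,no_root_squash) mcs3(rw,sync,no_root_squash)

    # On the NFS server: re-export the file systems
    $ sudo exportfs -ra

    # On each ColumnStore node: mount with the sync option
    $ sudo mount -o sync nfs-server:/exports/storagemanager /var/lib/columnstore/storagemanager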

GlusterFS

GlusterFS is a stable and well-designed distributed file system that supports replication and failover. It is one of the supported shared local storage options, but it is not the only option.

Note

GlusterFS is a shared local storage option, but it is not one of the recommended options. For more information, see Recommended Storage Options.

To use GlusterFS:

  1. Install GlusterFS on each cluster node.

  2. Add the cluster nodes as peers.

  3. Create and mount the GlusterFS volumes.

GlusterFS Installation

GlusterFS must be installed on each Enterprise ColumnStore cluster node.

The specific steps to install GlusterFS depend on the platform.

Install GlusterFS via YUM (RHEL/CentOS)

To install GlusterFS, perform the following steps on each cluster node.

  1. Install the repository for GlusterFS.

    On CentOS 8:

    $ sudo yum install --enablerepo=PowerTools centos-release-gluster7
    

    On CentOS 7:

    $ sudo yum install centos-release-gluster7
    
  2. Install GlusterFS.

    On CentOS 8:

    $ sudo yum install --enablerepo=PowerTools glusterfs-server
    

    On CentOS 7:

    $ sudo yum install glusterfs-server
    
  3. Start GlusterFS and configure it to start automatically:

    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    

Install GlusterFS via APT (Ubuntu)

Install GlusterFS by performing the following steps on each cluster node.

  1. Install a dependency:

    $ sudo apt install software-properties-common
    
  2. Install the repository for GlusterFS:

    $ sudo add-apt-repository ppa:gluster/glusterfs-7
    $ sudo apt update
    
  3. Install GlusterFS:

    $ sudo apt install glusterfs-server
    
  4. Start GlusterFS and configure it to start automatically:

    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    

Install GlusterFS via APT (Debian)

Install GlusterFS by performing the following steps on each cluster node.

  1. Add the GlusterFS GPG key to APT:

    $ wget -O - https://download.gluster.org/pub/gluster/glusterfs/LATEST/rsa.pub | sudo apt-key add -
    
  2. Install the repository for GlusterFS:

    $ DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
    $ DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
    $ DEBARCH=$(dpkg --print-architecture)
    $ echo "deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/${DEBARCH}/apt ${DEBVER} main" | sudo tee /etc/apt/sources.list.d/gluster.list
    $ sudo apt update
    
  3. Install GlusterFS:

    $ sudo apt install glusterfs-server
    
  4. Start GlusterFS and configure it to start automatically:

    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    

Install GlusterFS via ZYpp (SLES)

Install GlusterFS by performing the following steps on each cluster node.

  1. Install GlusterFS:

    $ sudo zypper install glusterfs
    
  2. Start GlusterFS and configure it to start automatically:

    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    

Probe the GlusterFS Peers

Before you can create a volume with GlusterFS, you must probe each cluster node from a peer cluster node. The examples below assume a 3-node cluster with hostnames mcs1, mcs2, and mcs3, where mcs1 is the primary node.

  1. On the primary node, probe all of the other cluster nodes:

    $ sudo gluster peer probe mcs2
    $ sudo gluster peer probe mcs3
    
  2. On one of the replica nodes, probe the primary node:

    $ sudo gluster peer probe mcs1
    
  3. On the primary node, check the peer status:

    $ sudo gluster peer status
    Number of Peers: 2
    
    Hostname: mcs2
    Uuid: 3c8a5c79-22de-45df-9034-8ae624b7b23e
    State: Peer in Cluster (Connected)
    
    Hostname: mcs3
    Uuid: 862af7b2-bb5e-4b1c-8311-630fa32ed451
    State: Peer in Cluster (Connected)
    

GlusterFS Volumes

Before GlusterFS can be used with MariaDB Enterprise ColumnStore, the GlusterFS volumes need to be created. Each volume should have the same number of replicas as the number of ColumnStore nodes.

The required volumes are defined in the Directories Requiring Shared Local Storage for HA section, and they depend on whether you are using S3-compatible object storage for data:

GlusterFS Volumes for S3-Compatible Object Storage

If you are using S3-compatible object storage, the Storage Manager directory (/var/lib/columnstore/storagemanager by default) needs to use GlusterFS.

Perform the following procedure:

  1. On each cluster node, create the directory for each brick under the /brick directory:

    $ sudo mkdir -p /brick/storagemanager
    
  2. On the primary server, create the Gluster volumes:

    $ sudo gluster volume create storagemanager \
       replica 3 \
       mcs1:/brick/storagemanager \
       mcs2:/brick/storagemanager \
       mcs3:/brick/storagemanager \
       force
    
  3. On the primary server, start the volumes:

    $ sudo gluster volume start storagemanager
    
  4. On each cluster node, create mount points for the volumes:

    $ sudo mkdir -p /var/lib/columnstore/storagemanager
    
  5. On each cluster node, add the mount points to /etc/fstab:

    127.0.0.1:storagemanager /var/lib/columnstore/storagemanager glusterfs defaults,_netdev 0 0
    
  6. On each cluster node, mount the volumes with the mount utility:

    $ sudo mount -a
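
To confirm that the volume is healthy and mounted, an optional check (not part of the official procedure) is:

    $ sudo gluster volume info storagemanager
    $ findmnt /var/lib/columnstore/storagemanager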
    

GlusterFS Volumes without S3-Compatible Object Storage

If you are not using S3-compatible object storage, the DB Root directories (/var/lib/columnstore/dataN by default) need to use GlusterFS. The example below assumes a 3-node deployment with DB Roots data1, data2, and data3.

Perform the following procedure:

  1. On each cluster node, create the directory for each brick under the /brick directory:

    $ sudo mkdir -p /brick/data1
    $ sudo mkdir -p /brick/data2
    $ sudo mkdir -p /brick/data3
    
  2. On the primary server, create the Gluster volumes:

    $ sudo gluster volume create data1 \
       replica 3 \
       mcs1:/brick/data1 \
       mcs2:/brick/data1 \
       mcs3:/brick/data1 \
       force
    $ sudo gluster volume create data2 \
       replica 3 \
       mcs1:/brick/data2 \
       mcs2:/brick/data2 \
       mcs3:/brick/data2 \
       force
    $ sudo gluster volume create data3 \
       replica 3 \
       mcs1:/brick/data3 \
       mcs2:/brick/data3 \
       mcs3:/brick/data3 \
       force
    
  3. On the primary server, start the volumes:

    $ sudo gluster volume start data1
    $ sudo gluster volume start data2
    $ sudo gluster volume start data3
    
  4. On each cluster node, create mount points for the volumes:

    $ sudo mkdir -p /var/lib/columnstore/data1
    $ sudo mkdir -p /var/lib/columnstore/data2
    $ sudo mkdir -p /var/lib/columnstore/data3
    
  5. On each cluster node, add the mount points to /etc/fstab:

    127.0.0.1:data1 /var/lib/columnstore/data1 glusterfs defaults,_netdev 0 0
    127.0.0.1:data2 /var/lib/columnstore/data2 glusterfs defaults,_netdev 0 0
    127.0.0.1:data3 /var/lib/columnstore/data3 glusterfs defaults,_netdev 0 0
    
  6. On each cluster node, mount the volumes with the mount utility:

    $ sudo mount -a
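
Optionally, you can verify the brick and mount state of the data volumes afterwards:

    $ sudo gluster volume status
    $ findmnt /var/lib/columnstore/data1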
    

Non-Shared Local Storage

MariaDB Enterprise ColumnStore supports non-shared local storage for single-node deployments. In this configuration, data and metadata are stored in local directories, such as the default DB Root directory at /var/lib/columnstore/data1.