ColumnStore Streaming Data Adapters
The ColumnStore Bulk Data API enables the creation of higher-performance adapters for ETL integration and data ingestion. The Streaming Data Adapters are out-of-the-box adapters that use this API for specific data sources and use cases.
- The MaxScale CDC Data Adapter integrates MaxScale CDC streams into MariaDB ColumnStore.
- The Kafka Data Adapter integrates Kafka streams into MariaDB ColumnStore.
MaxScale CDC Data Adapter
Installation
CentOS 7:
sudo yum -y install epel-release
sudo yum -y install <data adapter>.rpm
Debian 9/Ubuntu Xenial:
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install
Debian 8:
echo "deb http://httpredir.debian.org/debian jessie-backports main contrib non-free" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install
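Running the Debian 8 step that adds the backports repository more than once would append the same line again. A minimal sketch of an idempotent variant follows; `add_apt_source_once` is a hypothetical helper introduced here for illustration, not something shipped with the adapter packages.

```shell
#!/bin/sh
# Sketch: append a repository line to an APT sources file only when it is missing.
# add_apt_source_once is a hypothetical helper, not part of the adapter packages.

add_apt_source_once() {
    sources_file="$1"
    repo_line="$2"
    # -x matches the whole line, -F treats the text as a fixed string, -q stays silent.
    if grep -qxF -- "$repo_line" "$sources_file" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$repo_line" >> "$sources_file"
        echo "added"
    fi
}
```

To use it against /etc/apt/sources.list, the function would have to run in a root shell, because the append is performed by the calling shell itself.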
Usage
Usage: mxs_adapter [OPTION]... DATABASE TABLE

  DATABASE       Source & Target database
  TABLE          Table to stream

  -h HOST        MaxScale host
  -P PORT        Port number where the CDC service listens
  -u USER        Username for the MaxScale CDC service
  -p PASSWORD    Password of the user
  -c CONFIG      Path to the Columnstore.xml file (installed by MariaDB ColumnStore)
  -r ROWS        Number of events to group for one bulk load (default: 1)
  -t TIMEOUT     Timeout in seconds (default: 10)
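As an illustration of how the options combine, the sketch below prints a complete command line instead of running it. `build_mxs_adapter_cmd` is a hypothetical helper, and the host, port 4001, credentials, database `test`, and table `t1` are placeholder assumptions rather than adapter defaults.

```shell
#!/bin/sh
# Sketch: print an mxs_adapter command line built from the documented options.
# build_mxs_adapter_cmd is a hypothetical helper; every value below is a placeholder.

build_mxs_adapter_cmd() {
    # Arguments: host port user password rows timeout database table
    printf 'mxs_adapter -h %s -P %s -u %s -p %s -r %s -t %s %s %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8"
}

# Dry run: group 1000 events per bulk load (-r) and set the timeout to 5 seconds (-t).
build_mxs_adapter_cmd 127.0.0.1 4001 cdcuser cdcpass 1000 5 test t1
```

The sketch does not quote its arguments for reuse by a shell, so values containing spaces would need extra care before the printed line is pasted into a terminal.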
Quick Start
Download and install both MaxScale and ColumnStore.
Copy the Columnstore.xml file, located at
/usr/local/mariadb/columnstore/etc/Columnstore.xml
on any ColumnStore UM or PM node, to the server where the adapter is installed.
Configure MaxScale according to the CDC tutorial.
Create a CDC user by executing the following MaxAdmin command on the MaxScale server. Replace the `<service>` with the name of the avrorouter service and `<user>` and `<password>` with the credentials that are to be created.
maxadmin call command cdc add_user <service> <user> <password>
Then we can start the adapter by executing the following command.
mxs_adapter -u <user> -p <password> -h <host> -P <port> -c <path to Columnstore.xml> <database> <table>
The `<database>` and `<table>` define the table that is streamed to ColumnStore. This table should exist on the master server where MaxScale is reading events from. If the table is not created on ColumnStore, the adapter will print instructions on how to define it in the correct way.
The `<user>` and `<password>` are the credentials created for the CDC user, `<host>` is the MaxScale address, and `<port>` is the port on which the CDC service listener is listening.
The `-c` flag is optional if you are running the adapter on the server where ColumnStore is located.
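A wrapper script can decide whether `-c` is needed before starting the adapter. The sketch below is one possible approach: `config_args` is a hypothetical helper, and it assumes that the presence of Columnstore.xml at the local install path means ColumnStore runs on the same server.

```shell
#!/bin/sh
# Sketch: decide whether mxs_adapter needs an explicit -c argument.
# config_args is a hypothetical helper, not part of the adapter.

config_args() {
    default_xml="$1"   # where ColumnStore installs Columnstore.xml locally
    copied_xml="$2"    # where a copy was placed on a separate adapter host
    if [ -f "$default_xml" ]; then
        # ColumnStore is local (assumption of this sketch): -c can be omitted.
        :
    elif [ -f "$copied_xml" ]; then
        printf '%s %s\n' -c "$copied_xml"
    else
        echo "Columnstore.xml not found" >&2
        return 1
    fi
}
```

A caller could splice the result into the command, for example `mxs_adapter $(config_args /usr/local/mariadb/columnstore/etc/Columnstore.xml <path to copy>) ...`, provided the paths contain no spaces.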
Kafka Data Adapter
Installation
CentOS 7:
sudo yum -y install epel-release
sudo yum -y install <data adapter>.rpm
Debian 9/Ubuntu Xenial:
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install
Debian 8:
echo "deb http://httpredir.debian.org/debian jessie-backports main contrib non-free" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install