ldi

Overview

In 24.02, 23.08:

S3 data loading filter

In 23.02, 22.08, 6, 2.5, 2.4:

Not present

DETAILS

The LDI filter enables data loading from S3 API-compatible object storage such as AWS S3, Google Cloud Storage, or locally-run storage like MinIO. (MXS-4618)

  • File paths that start with S3:// or gs:// are interpreted as S3 object files.

  • For example, after the filter is set up, the following SQL statement loads the file my-data.csv from the bucket my-bucket into the table t1:

    LOAD DATA INFILE 'S3://my-bucket/my-data.csv' INTO TABLE t1
      FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
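
As an illustration of how such a path breaks down into a bucket and an object name, here is a minimal Python sketch. The function and type names are hypothetical and are not part of MaxScale. The sketch assumes the prefix is matched exactly as written above (S3:// or gs://); the filter itself may accept other spellings.

```python
from typing import NamedTuple, Optional


class ObjectPath(NamedTuple):
    scheme: str  # "S3" or "gs", as written in the statement
    bucket: str  # bucket that holds the object
    key: str     # object name inside the bucket


def parse_object_path(path: str) -> Optional[ObjectPath]:
    """Split an S3-style path into bucket and object name.

    Returns None when the path has no S3:// or gs:// prefix, that is,
    when it would be treated as an ordinary file path.
    """
    for scheme in ("S3", "gs"):
        prefix = scheme + "://"
        if path.startswith(prefix):
            # Everything up to the first "/" is the bucket, the rest is the key.
            bucket, _, key = path[len(prefix):].partition("/")
            return ObjectPath(scheme, bucket, key)
    return None


if __name__ == "__main__":
    print(parse_object_path("S3://my-bucket/my-data.csv"))
    print(parse_object_path("/tmp/my-data.csv"))
```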
    

To Upload Data from S3 to MariaDB Server

  1. Configure LDI filter parameters.

    For example, this is a minimal LDI filter configuration for loading data from AWS S3:

    [LDI-Filter]
    type=filter
    module=ldi
    host=s3.amazonaws.com
    region=us-east-1
    
  2. Move the file to be loaded into the same region as the MaxScale and the MariaDB servers. Network latency is a factor in upload speed. Moving the source and the destination closer can improve data loading speed.

  3. Connect to MaxScale and prepare the session for an upload by configuring the credentials. For example:

    SET @maxscale.ldi.s3_key='<my-access-key>', @maxscale.ldi.s3_secret='<my-secret-key>';
    
  4. When the credentials are configured, begin the data load. For example:

    LOAD DATA INFILE 'S3://my-bucket/my-data.csv' INTO TABLE t1;
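
The two SQL statements in steps 3 and 4 can also be generated by a client program. The following Python sketch only builds the statement text; it does not connect to MaxScale. The helper names are hypothetical, the table name is assumed to be a trusted identifier, and values containing quotes or backslashes are rejected rather than escaped, because the escaping rules that apply are not covered here.

```python
def sql_literal(value: str) -> str:
    """Return value as a single-quoted SQL string literal (no escaping)."""
    if "'" in value or "\\" in value:
        raise ValueError("quotes and backslashes are not handled by this sketch")
    return "'" + value + "'"


def ldi_statements(access_key: str, secret_key: str,
                   object_path: str, table: str) -> list:
    """Build the credential SET statement followed by the LOAD DATA statement."""
    set_stmt = (
        "SET @maxscale.ldi.s3_key=" + sql_literal(access_key)
        + ", @maxscale.ldi.s3_secret=" + sql_literal(secret_key) + ";"
    )
    # The table name is inserted as-is: assumed to be a trusted identifier.
    load_stmt = (
        "LOAD DATA INFILE " + sql_literal(object_path)
        + " INTO TABLE " + table + ";"
    )
    return [set_stmt, load_stmt]


if __name__ == "__main__":
    for stmt in ldi_statements("<my-access-key>", "<my-secret-key>",
                               "S3://my-bucket/my-data.csv", "t1"):
        print(stmt)
```

The statements must be sent in this order on the same connection, because the credentials are configured for the session before the load begins.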
    

To Upload Data from S3 to MariaDB Xpand

Uploads to MariaDB Xpand are done using xpand_import. xpand_import must be installed locally on the MaxScale server and must be in the executable path of the MaxScale user.

Uploading data from S3 to MariaDB Xpand requires the @maxscale.ldi.import_user and @maxscale.ldi.import_password parameters to be set to the username and password that are used to load data into the Xpand cluster. For example:

SET @maxscale.ldi.import_user='<user>', @maxscale.ldi.import_password='<password>';

If a LOAD DATA LOCAL INFILE command is executed against an Xpand cluster, the data is redirected into xpand_import. This speeds up data imports into Xpand. For this mode, only the @maxscale.ldi.import_user and @maxscale.ldi.import_password parameters must be set. All other S3-related variables are ignored.

If xpand_import is not installed locally, the LOAD DATA INFILE and LOAD DATA LOCAL INFILE commands do not use xpand_import and instead fall back to LOAD DATA LOCAL INFILE with MariaDB Server behavior.
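
Whether xpand_import is reachable can be checked before a load is attempted. The Python sketch below is a pre-flight check to run as the MaxScale user on the MaxScale server. It is not how MaxScale itself detects the tool, and the function name is hypothetical. It relies only on the standard library function shutil.which, which searches the executable path of the current user.

```python
import shutil


def expected_load_path(tool: str = "xpand_import") -> str:
    """Report which load path the text above implies for this host.

    If the tool is found on the executable path, uploads to Xpand are
    expected to go through it; otherwise the fallback behavior applies.
    """
    if shutil.which(tool) is not None:
        return "xpand_import"
    return "LOAD DATA LOCAL INFILE"


if __name__ == "__main__":
    print(expected_load_path())
```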

PARAMETERS

Any parameters that are configured must be defined before the data load begins.

They can be defined in the MaxScale configuration file and/or with a SET statement.

Using a SET statement will override the current value of the parameter.
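
The override order described above can be pictured as a simple merge in which later sources win. The Python sketch below is a mental model, not MaxScale code. It uses the configuration-file parameter names as keys (the corresponding session variables carry the @maxscale.ldi.s3_ prefix), and the function name is hypothetical. The default values are the ones documented for each parameter.

```python
# Documented defaults; key, secret, import_user and import_password have none.
DEFAULTS = {
    "region": "us-east-1",
    "host": "s3.amazonaws.com",
    "port": 0,
    "no_verify": False,
    "use_http": False,
    "protocol_version": 0,
}


def effective_parameters(config_file: dict, session: dict) -> dict:
    """Merge sources so that later ones win: defaults, config file, SET."""
    merged = dict(DEFAULTS)
    merged.update(config_file)  # values from the MaxScale configuration file
    merged.update(session)      # values set with SET override both
    return merged


if __name__ == "__main__":
    params = effective_parameters(
        {"region": "eu-west-1", "host": "storage.googleapis.com"},
        {"region": "us-west-2"},
    )
    print(params["region"], params["host"], params["port"])
```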

| Parameter | Description | Type | Required | Dynamic | Default |
|---|---|---|---|---|---|
| key | The S3 access key used to perform all requests to the S3 object storage. | String | No | Yes | |
| secret | The S3 secret key used to perform all requests to the S3 object storage. | String | No | Yes | |
| region | The S3 region where the data is located. | String | No | Yes | us-east-1. Can be overridden before starting data load. |
| host | The location of the S3 object storage. | String | No | Yes | s3.amazonaws.com. The corresponding value for Google Cloud Storage is storage.googleapis.com. |
| port | The port on which the S3 object storage is listening. If not set, or set to the value of 0, the default S3 port is used. | Integer | No | Yes | 0 |
| no_verify | Specifies whether TLS certificate verification is skipped. If set to true, TLS certificate verification for the object storage is skipped. | Boolean | No | Yes | false |
| use_http | Specifies whether communication with the object storage is unencrypted. If set to true, communication with the object storage uses HTTP instead of HTTPS. | Boolean | No | Yes | false |
| protocol_version | Which protocol version to use. Allowable values are 0, 1, and 2. | Integer | No | Yes | 0 |
| import_user | The Xpand user that will be used to import the data. This parameter must be defined when uploading data to a MariaDB Xpand cluster. | String | No | Yes | |
| import_password | The password for the Xpand user that will be used to import the data. This parameter must be defined if the data is being uploaded to a MariaDB Xpand cluster. The password can be encrypted with maxpasswd before use. | Password | No | Yes | |

key

The S3 access key used to perform all requests to the S3 object storage.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
key=my-S3-access-key

To set dynamically:

SET @maxscale.ldi.s3_key='<my-S3-access-key>'

secret

The S3 secret key used to perform all requests to the S3 object storage.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
secret=my-S3-secret-key

To set dynamically:

SET @maxscale.ldi.s3_secret='<my-S3-secret-key>'

region

The S3 region where the data is located.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=region-where-data-resides

To set dynamically:

SET @maxscale.ldi.s3_region='<region-where-data-resides>'

host

The location of the S3 object storage.

The AWS S3 host, s3.amazonaws.com, is used by default. For Google Cloud Storage, the value is storage.googleapis.com.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=host-name
region=us-east-1

To set dynamically:

SET @maxscale.ldi.s3_host='<host-name>'

port

The port on which the S3 object storage is listening.

If not set or set to the value of 0, the default S3 port is used.

The value must be specified as a SQL integer, not a SQL string.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
port=port-number

To set dynamically:

SET @maxscale.ldi.s3_port=<port-number>

no_verify

Specifies if TLS certificate verification is used for the object storage.

If set to false (the default), TLS certificate verification for the object storage is used.

If set to true, TLS certificate verification for the object storage is skipped.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
no_verify=true

To set dynamically:

SET @maxscale.ldi.s3_no_verify='<value>'

use_http

Specifies whether communication with the object storage uses unencrypted HTTP or encrypted HTTPS.

If set to true, communication with the object storage is unencrypted using HTTP.

If set to false (the default), communication with the object storage is encrypted using HTTPS.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
use_http=true

To set dynamically:

SET @maxscale.ldi.s3_use_http='<value>'

protocol_version

Specifies which protocol version to use.

By default, the protocol version is derived from the value of host, but this automatic detection does not always produce the correct result.

For legacy path-style requests used by older S3 storage buckets, the value must be set to 1. All new buckets use protocol version 2.

For object storage programs like MinIO, the value must be set to 1, because the bucket name cannot be resolved through a subdomain the way it is for object stores in the cloud.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
protocol_version=2

To set dynamically:

SET @maxscale.ldi.s3_protocol_version='<value>'
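
The difference between the two request styles that protocol_version selects can be sketched as the shape of the object URL. The Python sketch below is illustrative only and is not taken from MaxScale. Value 1 as path-style comes from the description above; treating value 2 as the subdomain style is inferred from that description, and the automatic value 0 is not modeled. The function name, the host localhost, and port 9000 are hypothetical.

```python
def object_url(host: str, bucket: str, key: str, protocol_version: int,
               use_http: bool = False, port: int = 0) -> str:
    """Return the URL shape for an object under the given request style."""
    scheme = "http" if use_http else "https"
    # A port of 0 means "use the default port", so none is written out.
    port_part = f":{port}" if port else ""
    if protocol_version == 1:
        # Path-style: the bucket is the first path segment.
        return f"{scheme}://{host}{port_part}/{bucket}/{key}"
    if protocol_version == 2:
        # Subdomain style: the bucket is resolved as part of the host name.
        return f"{scheme}://{bucket}.{host}{port_part}/{key}"
    raise ValueError("this sketch only models protocol versions 1 and 2")


if __name__ == "__main__":
    # A locally run object store, reached without TLS (hypothetical values).
    print(object_url("localhost", "my-bucket", "my-data.csv", 1,
                     use_http=True, port=9000))
    # A cloud object store using the subdomain style.
    print(object_url("s3.amazonaws.com", "my-bucket", "my-data.csv", 2))
```

With a locally run store, the subdomain form would require my-bucket.localhost to resolve, which is why the path-style value is needed there.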

import_user

The Xpand user that will be used to upload data to a MariaDB Xpand cluster.

The import_user and import_password parameters must be defined to upload data to MariaDB Xpand.

The import_user parameter must be configured before starting data load.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
import_user=xpand-user-name
import_password=xpand-user-password

To set dynamically:

SET @maxscale.ldi.import_user='<xpand-user-name>'

import_password

The password for the Xpand user that will be used to upload data to a MariaDB Xpand cluster.

The import_password and the import_user parameters must be defined to upload data to MariaDB Xpand.

The import_password parameter must be configured before starting data load.

To set in MaxScale configuration file:

[LDI-Filter]
type=filter
module=ldi
host=s3.amazonaws.com
region=us-east-1
import_user=xpand-user-name
import_password=xpand-user-password

To set dynamically:

SET @maxscale.ldi.import_password='<xpand-user-password>'

CHANGE HISTORY

| Release Series | History |
|---|---|
| 24.02 | Present starting in MariaDB MaxScale 24.02.0. |
| 23.08 | Added in MariaDB MaxScale 23.08.1. |
| 23.02 | Not present. |
| 22.08 | Not present. |
| 6 | Not present. |
| 2.5 | Not present. |
| 2.4 | Not present. |
