MariaDB Enterprise ColumnStore Data Loading with load_from_s3
This page is part of MariaDB's Documentation.
The parent of this page is: Data Import with MariaDB Enterprise ColumnStore
Topics on this page:
Overview
MariaDB Enterprise ColumnStore includes a stored procedure called columnstore_info.load_from_s3
, which can load data from a plain-text file containing delimiter-separated values (such as CSV or TSV) stored on S3-compatible cloud object storage.
Compatibility
MariaDB Enterprise ColumnStore 23.02
System Variables
Before you import data with the columnstore_info.load_from_s3
stored procedure, the authentication credentials and the region can be configured using system variables:
For example, the following statements show how to set the system variables for your current session:
SET columnstore_s3_key='S3_KEY';
SET columnstore_s3_secret='S3_SECRET';
SET columnstore_s3_region='S3_REGION';
Import Data
To import data with the columnstore_info.load_from_s3
stored procedure, use the CALL
statement:
CALL columnstore_info.load_from_s3('BUCKET_URL',
'FILE_NAME',
'DATABASE_NAME',
'TABLE_NAME',
'DELIMITED_BY',
'ENCLOSED_BY',
'ESCAPED_BY');
Replace
'BUCKET_URL'
with the URL of your bucket. The protocol in the URL must bes3://
for AWS S3 orgs://
for Google Cloud StorageReplace
'FILE_NAME'
with the file name to load from. The file must be a plain-text file containing delimiter-separated values, such as a comma-separated values (CSV) or tab-separated values (TSV) file. The supported file format is similar to the plain-text file formats supported bycpimport
andLOAD DATA [LOCAL] INFILE
. Please note that this stored procedure can't load dump files created bymariadb-dump
Replace
'DATABASE_NAME'
with the database to import intoReplace
'TABLE_NAME'
with the table name to import intoReplace
'DELIMITED_BY'
with the field terminator used in the file, similar to the-s
command-line option forcpimport
Replace
'ENCLOSED_BY'
with the quotes used in the file, similar to the-E
command-line option forcpimport
Replace
'ESCAPED_BY'
with the escape character used in the file, similar to the-C
command-line option forcpimport
All parameters are mandatory. The line terminator is determined automatically, so it is not required.
For example, to load a comma-separated values (CSV) file from AWS S3:
CALL columnstore_info.load_from_s3('s3://mariadb-columnstore-test-data/',
'test-data-db1-tab1.csv',
'db1',
'tab1',
',',
'"',
'\\');
When the stored procedure completes, it returns JSON containing the status of the operation. If the JSON shows an error or "success": false
, check your table to see if some or all of your data was loaded, because many errors are non-fatal.
Permissions
When the data file is stored in Amazon S3, the AWS user only requires the s3:GetObject
action on the bucket.
For example, the AWS user can use a user policy like the following:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "readBucket",
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
}
]
}