MariaDB Enterprise ColumnStore Data Loading with load_from_s3
This page is part of MariaDB's Documentation.
The parent of this page is: Data Import with MariaDB Enterprise ColumnStore
Topics on this page:
Overview
MariaDB Enterprise ColumnStore includes a stored procedure called columnstore_info.load_from_s3, which can load data from a plain-text file containing delimiter-separated values (such as CSV or TSV) stored on S3-compatible cloud object storage.
Compatibility
MariaDB Enterprise ColumnStore 23.02
System Variables
Before you import data with the columnstore_info.load_from_s3 stored procedure, the authentication credentials and the region can be configured using system variables:
For example, the following statements show how to set the system variables for your current session:
SET columnstore_s3_key='S3_KEY';
SET columnstore_s3_secret='S3_SECRET';
SET columnstore_s3_region='S3_REGION';
Import Data
To import data with the columnstore_info.load_from_s3 stored procedure, use the CALL statement:
CALL columnstore_info.load_from_s3('BUCKET_URL',
'FILE_NAME',
'DATABASE_NAME',
'TABLE_NAME',
'DELIMITED_BY',
'ENCLOSED_BY',
'ESCAPED_BY');
Replace
'BUCKET_URL'with the URL of your bucket. The protocol in the URL must bes3://for AWS S3 orgs://for Google Cloud StorageReplace
'FILE_NAME'with the file name to load from. The file must be a plain-text file containing delimiter-separated values, such as a comma-separated values (CSV) or tab-separated values (TSV) file. The supported file format is similar to the plain-text file formats supported bycpimportandLOAD DATA [LOCAL] INFILE. Please note that this stored procedure can't load dump files created bymariadb-dumpReplace
'DATABASE_NAME'with the database to import intoReplace
'TABLE_NAME'with the table name to import intoReplace
'DELIMITED_BY'with the field terminator used in the file, similar to the-scommand-line option forcpimportReplace
'ENCLOSED_BY'with the quotes used in the file, similar to the-Ecommand-line option forcpimportReplace
'ESCAPED_BY'with the escape character used in the file, similar to the-Ccommand-line option forcpimport
All parameters are mandatory. The line terminator is determined automatically, so it is not required.
For example, to load a comma-separated values (CSV) file from AWS S3:
CALL columnstore_info.load_from_s3('s3://mariadb-columnstore-test-data/',
'test-data-db1-tab1.csv',
'db1',
'tab1',
',',
'"',
'\\');
When the stored procedure completes, it returns JSON containing the status of the operation. If the JSON shows an error or "success": false, check your table to see if some or all of your data was loaded, because many errors are non-fatal.
Permissions
When the data file is stored in Amazon S3, the AWS user only requires the s3:GetObject action on the bucket.
For example, the AWS user can use a user policy like the following:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "readBucket",
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
}
]
}
