History Source Flag as Spam / Inappropriate Translate

Created: 5 years, 7 months ago
Modified: 5 years, 7 months ago
Type: article
Status: active
License: CC BY-SA / Gnu FDL

Attachments

No attachments exist

Product Versions

MariaDB starting with 10.5
MariaDB starting with 10.5

Using the S3 Storage Engine

You are viewing an old version of this article. View the current version here.

New Options for CREATE TABLE
mysqld Startup Options for S3
Typical my.cnf file entry would something like:
Typical Usage Case for S3 Tables
Operations Allowed on S3 Tables
Discovery
aria_s3_copy
Current Limitations
Future Development Ideas
Troubleshooting S3 on SELinux
See Also

MariaDB starting with 10.5

S3 is introduced in MariaDB 10.5

The S3 storage engine is read only and allows one to archive MariaDB tables in Amazon S3, or any third-party public or private cloud that implements S3 API (of which there are many), but still have them accessible for reading in MariaDB.

To move data to S3 one can do

ALTER TABLE old_table ENGINE=S3

To get data back to a 'normal' table one can do:

ALTER TABLE s3_table ENGINE=INNODB

New Options for CREATE TABLE

S3_BLOCK_SIZE : Set to 4M as default. This is the block size for all index and data pages stored in S3.
COMPRESSION_ALGORITHM : Set to 'none' as default. Which compression algorithm to use for block stored in S3. Options are: none or zlib.

mysqld Startup Options for S3

To be able to use S3 for storage one *must* define how to access S3 and where data are stored in S3:

S3_ACCESS_KEY : The AWS access key to access your data
S3_SECRET_KEY : The AWS secret key to access your data
S3_BUCKET: The AWS bucket where your data should be stored. All MariaDB table data is stored in this bucket.
S3_REGION: The AWS region where your data should be stored

Other, less critical options, are:

S3_BLOCK_SIZE : Set to 4M as default. This is the default block size for a table, if not specified in CREATE TABLE.
PAGECACHE_BUFFER_SIZE : Default 128M. The size of the buffer used for index blocks for S3 tables. Increase this to get better index handling (for all reads and multiple writes) to as much as you can afford.

Last some options you probably don't have to ever touch:

PAGECACHE_AGE_THRESHOLD : Default 300: This characterizes the number of hits a hot block has to be untouched until it is considered aged enough to be downgraded to a warm block. This specifies the percentage ratio of that number of hits to the total number of blocks in the page cache.
PAGECACHE_DIVISION_LIMIT : Default 100. The minimum percentage of warm blocks in key cache.
pagecache_file_hash_size : Default 512. Number of hash buckets for open files. If you have a lot of S3 files open you should increase this for faster flush of changes. A good value is probably 1/10 of number of possible open S3 files.

Typical my.cnf file entry would something like:

[mysqld-10.5]
s3-bucket=mariadb
s3-access-key=xxxx
s3-secret-key=xxx
s3-region=eu-north-1

[aria_s3_copy]
s3-bucket=mariadb
s3-access-key=xxxx
s3-secret-key=xxx
s3-region=eu-north-1

Typical Usage Case for S3 Tables

The typical use case would be that there exists tables that after some time would become fairly inactive, but are still important so that they can not be removed. In that case, an option is to move such a table to an archiving service, which is accessible through an S3 API.

Notice that S3 means the Cloud Object Storage API defined by Amazon AWS. Often the whole of Amazon’s Cloud Object Storage is referred to as S3. In the context of the S3 archive storage engine, it refers to the API itself that defines how to store objects in a cloud service, being it Amazon’s or someone else’s. OpenStack for example provides an S3 API for storing objects.

The main benefit of storing things in an S3 compatible storage is that the cost of storage is much cheaper than many other alternatives. Many S3 implementations also provide reliable long-term storage.

Operations Allowed on S3 Tables

ALTER TABLE S3 supports all types, keys and other options that are supported by the Aria engine.
DROP TABLE
SELECT Any SELECT operations you can perform on a normal table should work with an S3 table.
SHOW TABLES will show all tables that exist in the current defined S3 location.

Discovery

The S3 storage engine supports full MariaDB discovery. This means that if you have the S3 storage engine enabled and properly configured, the table stored in S3 will be automatically discovered when it's accessed with [show-tables|SHOW TABLES]], SELECT or any other operation tries to access it. In the case of SELECT, the .frm file from S3 will be copied to the local storage to speed up future accesses.

When an S3 table is opened for the first time (it's not in the table cache) and there is a local .frm file, the S3 engine will check if it's still relevant, and if not, update or delete the .frm file.

This means that if the table definition changes on S3 and it's in the local cache, one has to execute FLUSH TABLES to get MariaDB to notice the change and update the .frm file.

aria_s3_copy

aria_s3_copy is an external tool that one can use to copy Aria tables to and from S3. Use aria_s3_copy --help to get the options of how to use it.

Current Limitations

One needs to do FLUSH TABLES for the MariaDB to notice that the definition has changed if the S3 table is in the table cache.
Replication of S3 engines to a slave that uses the same S3 storage doesn't work. The slave will stop at any ALTER TABLE ... ENGINE=S3 with an error that the table already exists in S3.
mysqldump will dump the S3 table in a format that can't be executed (CREATE TABLE is not supported for S3). Should be fixed so that mysqldump ignores S3 tables or uses CREATE and then ALTER.
mysql-test-run doesn't by default test the S3 engine as we can't embed AWS keys into mtr

Future Development Ideas

Add MERGE support for S3 tables (possible with another storage engine name). - Store aws keys and region in the mysql.servers table (as Spider and FederatedX). This will allow one to have different tables on different S3 servers. - Make s3 bucket, access_key and secret key thread specific

Troubleshooting S3 on SELinux

If you get errors like

ERROR 3 (HY000): Got error from put_object(bubu/produkt/frm): 5 Couldn't connect to server

One reason could be that your system doesn't allow MAaiaDB to connect to other ports than 3006. The fix is to in /etc/selinux/config change to SELINUX=disabled.

Comments

Comments loading...

Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.

Using the S3 Storage Engine

Contents

MariaDB starting with 10.5

New Options for CREATE TABLE

mysqld Startup Options for S3

Typical my.cnf file entry would something like:

Typical Usage Case for S3 Tables

Operations Allowed on S3 Tables

Discovery

aria_s3_copy

Current Limitations

Future Development Ideas

Troubleshooting S3 on SELinux

See Also

Comments

Products

Services

Pricing

Resources

About Us

Download MariaDB