InnoDB/XtraDB Page Compression
MariaDB starting with 10.1
Page compression is an alternative way to compress your tables. It is similar to, but distinct from, the InnoDB COMPRESSED storage format. With page compression, only uncompressed pages are stored in the buffer pool. This differs significantly from legacy InnoDB compressed tables using row_format=compressed, where both uncompressed and compressed pages can be in the buffer pool. With page compression, pages are compressed just before being written to the tablespace. Similarly, pages are uncompressed when they are read from the tablespace, before being placed in the buffer pool. Additionally, page compression supports several compression algorithms, not only zlib. For the moment, the only engines that fully support page compression are XtraDB and InnoDB. Page compression is available for a given table only if it uses a file-per-table tablespace and if the table was created with
Page compression can be used on any storage device and any file system, but it is most beneficial on SSDs and Non-Volatile Memory (NVM) devices, such as FusionIO atomic-series devices. It also performs best when your storage device and file system support atomic writes, since that allows the doublewrite buffer to be disabled. However, it still works on any device and file system, even without atomic write support and with the doublewrite buffer enabled.
Choosing Compression Algorithm
You specify which compression algorithm to use with the innodb-compression-algorithm startup option for MariaDB. The options are:
|Option||Description|
|none||Default. Data is not compressed.|
|zlib||Pages are compressed with the bundled zlib compression method.|
|lz4||Pages are compressed using the lz4 compression method (https://code.google.com/p/lz4/).|
|lzo||Pages are compressed using the lzo compression method (http://www.oberhumer.com/opensource/lzo/).|
|lzma||Pages are compressed using the lzma compression method (http://tukaani.org/xz/).|
|bzip2||Pages are compressed using the bzip2 compression method (http://www.bzip.org/).|
|snappy||Pages are compressed using the snappy compression method (http://google.github.io/snappy/).|
Not all of these compression methods are available by default on all distributions, and MariaDB server does not bundle them (other than zlib). You may therefore need to download the desired compression library from the links above, install it, and then recompile MariaDB server from the source distribution with:
cmake .
make
make install
After the first command above, please check that cmake has found the desired compression method from your system.
The compression method can be changed whenever needed. Currently the compression method is global (i.e. you cannot specify a compression method per table).
set global innodb_compression_algorithm=lz4;
From this point on, page compressed tables will use the lz4 compression method. This setting does not change pages that were already compressed with a different compression method. MariaDB supports uncompressed pages, pages compressed with e.g. lzo, and pages compressed with e.g. lz4 in the same tablespace. This is possible because every page in an InnoDB tablespace records its compression method in the page header metadata.
Choosing Compression Level
You specify the default compression level to use with the innodb-compression-level startup option for MariaDB. Values are 0-9, default is 6. Note that not all compression methods allow choosing the compression level and in those cases the compression level value is ignored.
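To get a feel for what the compression level trades off, the sketch below compresses one 16K buffer with Python's standard zlib module at a few levels. It only illustrates zlib levels in general: the sample data and the helper name compressed_sizes are made up for illustration, and InnoDB's own code path and real page contents will behave differently.

```python
import zlib

PAGE_SIZE = 16 * 1024  # default innodb-page-size, per the text


def compressed_sizes(page: bytes, levels=(1, 6, 9)) -> dict:
    """Return {level: compressed length in bytes} for one page-sized buffer."""
    return {level: len(zlib.compress(page, level)) for level in levels}


# Hypothetical, highly repetitive sample data cut to exactly one page.
row = b"user_id=0001;name=example;"
sample_page = (row * (PAGE_SIZE // len(row) + 1))[:PAGE_SIZE]

if __name__ == "__main__":
    for level, size in compressed_sizes(sample_page).items():
        print(f"level {level}: {size} bytes (from {PAGE_SIZE})")
```

Running the same function over data that resembles your own rows is a cheap way to judge whether a higher level is likely to be worth the extra CPU time before benchmarking on the server itself.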
Creating Compressed Tables
By default, only tables that are specified to be compressed are actually compressed. You can create a page compressed table by specifying PAGE_COMPRESSED=1 in the CREATE TABLE statement, for example:
CREATE TABLE users(user_id int not null, b varchar(200), primary key(user_id)) ENGINE=innodb PAGE_COMPRESSED=1;
The innodb_compression_default system variable allows you to specify whether or not new InnoDB tables are compressed by default. It is off by default (no compression).
Normally InnoDB always writes the full page, i.e. 16K by default (innodb-page-size). However, when compression is used, a different approach is possible. On file systems that support fallocate() and sparse files (e.g. ext3/4, xfs, nvmfs), InnoDB uses FALLOC_FL_PUNCH_HOLE so that it only writes the actual compressed page size, aligned to the sector size. The rest of the page is trimmed using fallocate(file_handle, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, file_offset, remainder_len);. This is needed because InnoDB always reads the full page size (default 16K).
Note that innodb_use_trim favors NVMFS (see InnoDB holepunch compression vs the filesystem in MariaDB 10.1).
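The sector alignment described above comes down to simple arithmetic, sketched here in Python. The function name punch_hole_layout and the 512-byte default sector size are assumptions for illustration, not InnoDB's actual code. The sketch only computes the lengths involved and does not call fallocate().

```python
def punch_hole_layout(compressed_len: int, sector_size: int = 512,
                      page_size: int = 16 * 1024) -> tuple:
    """Return (write_len, remainder_len) for one compressed page.

    write_len is compressed_len rounded up to a whole number of sectors.
    remainder_len is the rest of the page, i.e. the length that would be
    trimmed with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE.
    """
    if not 0 < compressed_len <= page_size:
        raise ValueError("compressed_len must be in 1..page_size")
    sectors = -(-compressed_len // sector_size)  # ceiling division
    write_len = min(sectors * sector_size, page_size)
    return write_len, page_size - write_len


if __name__ == "__main__":
    print(punch_hole_layout(5000))        # (5120, 11264)
    print(punch_hole_layout(5000, 4096))  # (8192, 8192)
```

With 512-byte sectors, a page that compresses to 5000 bytes occupies 5120 bytes on disk. With 4096-byte sectors, the same page occupies 8192 bytes, so the space actually reclaimed depends on the sector size as well as on the compression ratio.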
MariaDB starting with 10.2.2
InnoDB Page Cleaners were introduced in version 10.2.2 of MariaDB.
Until MariaDB 10.2.1, MariaDB by default used a single thread to flush dirty pages from the buffer pool. From MariaDB 10.2.2, MariaDB benefits from using Page Cleaners, particularly when using compression on systems with multiple cores or CPUs and a fast I/O device (such as an NVM device).
You can define the number of threads to use for page cleaning through the innodb_page_cleaners system variable. It defaults to either 4 or the value given to the innodb_buffer_pool_instances system variable, whichever is lower. If set to 1, MariaDB falls back to using a single thread. Cleaner threads flush dirty pages from the buffer pool, performing flush list and LRU flushing.
innodb_page_cleaners = 8
Prior to version 10.3.2, users had access to the Multi-Threaded Flush mechanism for similar functionality. You can enable this feature using the innodb_use_mtflush system variable and set the number of threads to use with innodb_mtflush_threads, which defaults to 8. The current maximum is 64 threads. On multi-core systems, a value similar to innodb_buffer_pool_instances and close to the number of cores has been shown to be effective. Use your own benchmarks to find a suitable value for your particular application.
SHOW STATUS contains new status variables that can be used to monitor compression:
|Status variable name||Values||Description|
|Innodb_page_compression_saved||0||Bytes saved by compression|
|Innodb_page_compression_trim_sect512||0||Number of 512 sectors trimmed|
|Innodb_page_compression_trim_sect1024||0||Number of 1024 sectors trimmed|
|Innodb_page_compression_trim_sect2048||0||Number of 2048 sectors trimmed|
|Innodb_page_compression_trim_sect4096||0||Number of 4096 sectors trimmed|
|Innodb_page_compression_trim_sect8192||0||Number of 8192 sectors trimmed|
|Innodb_page_compression_trim_sect16384||0||Number of 16384 sectors trimmed|
|Innodb_page_compression_trim_sect32768||0||Number of 32768 sectors trimmed|
|Innodb_num_pages_page_compressed||0||Number of pages compressed|
|Innodb_num_page_compressed_trim_op||0||Number of trim operations|
|Innodb_num_page_compressed_trim_op_saved||0||Number of trim operations saved|
|Innodb_num_pages_page_decompressed||0||Number of pages decompressed|
|Innodb_num_pages_page_compression_error||0||Number of compression errors|
|Innodb_have_lz4||ON||Does system have lz4 compression method available|
|Innodb_have_lzo||ON||Does system have lzo compression method available|
|Innodb_have_lzma||ON||Does system have lzma compression method available|
|Innodb_have_bzip2||ON||Does system have bzip2 compression method available|
|Innodb_have_snappy||ON||Does system have snappy compression method available|
Keep in mind that page compression is performed when InnoDB pages are flushed to disk. If you monitor page compression via these status variables, their values are incremented only when the pages are flushed, which does not necessarily happen immediately. Here's an example:
CREATE TABLE `tab` (
  `id` int(11) NOT NULL,
  `str` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO tab VALUES (1, 'str1');

SHOW GLOBAL STATUS LIKE 'Innodb_num_pages_page_compressed';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_num_pages_page_compressed | 0     |
+----------------------------------+-------+

SET GLOBAL innodb_compression_algorithm=zlib;

ALTER TABLE tab PAGE_COMPRESSED=1;

SHOW GLOBAL STATUS LIKE 'Innodb_num_pages_page_compressed';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_num_pages_page_compressed | 0     |
+----------------------------------+-------+

SELECT SLEEP(10);
+-----------+
| SLEEP(10) |
+-----------+
|         0 |
+-----------+

SHOW GLOBAL STATUS LIKE 'Innodb_num_pages_page_compressed';
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_num_pages_page_compressed | 3     |
+----------------------------------+-------+
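Once the counters have moved, two of the status variables can be combined into a rough indicator of how well compression is working. The Python sketch below assumes the values have already been fetched with SHOW GLOBAL STATUS and copied into a dictionary. The helper name average_bytes_saved and the sample numbers are hypothetical.

```python
def average_bytes_saved(status: dict) -> float:
    """Average bytes saved per compressed page, or 0.0 if none were compressed."""
    pages = int(status.get("Innodb_num_pages_page_compressed", 0))
    saved = int(status.get("Innodb_page_compression_saved", 0))
    return saved / pages if pages else 0.0


# Hypothetical values, as they might be copied from SHOW GLOBAL STATUS output.
status = {
    "Innodb_page_compression_saved": "36864",
    "Innodb_num_pages_page_compressed": "3",
}

if __name__ == "__main__":
    print(average_bytes_saved(status))  # 12288.0
```

Because the counters are cumulative and only advance on flush, comparing two samples taken some time apart gives a more useful picture than a single reading.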
Note also that the written *.ibd files are sparse files. A plain ls therefore does not show the effects of the compression; use ls -s or du to view the space the files actually occupy.
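The same distinction between apparent size and allocated size can be checked programmatically. The following Python sketch uses os.stat on a temporary sparse file that stands in for an .ibd file. The helper name file_sizes is made up for illustration, and st_blocks is only available on Unix-like systems.

```python
import os
import tempfile


def file_sizes(path: str) -> tuple:
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is the length a plain ls shows; allocated_bytes is
    derived from st_blocks, which counts 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Hypothetical stand-in for an .ibd file: 1 MiB long, nothing written,
    # so a file system with sparse-file support allocates little or nothing.
    with tempfile.NamedTemporaryFile(suffix=".ibd", delete=False) as f:
        f.truncate(1024 * 1024)
    try:
        apparent, allocated = file_sizes(f.name)
        print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
    finally:
        os.remove(f.name)
```

For a page compressed table on a file system that supports hole punching, the allocated figure is the one that reflects the space saved.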
- Table compression was developed in cooperation with Fusion-io (http://fusionio.com), especially Dhananjoy Das and Torben Mathiasen.