XtraDB/InnoDB Storage Formats
The XtraDB/InnoDB storage engine supports four different table storage formats.
These are COMPACT, REDUNDANT, COMPRESSED and DYNAMIC, and they can be set with the ROW FORMAT option in the CREATE TABLE statement. COMPACT is the default format.
Compact is the default format, and suitable for general use if Antelope file format is used. It was introduced in MySQL 5.0.
In the Compact storage format (as in Redundant) BLOB and TEXT columns are partly stored in the row page. At least 767 bytes are stored in the row; values which exceed this value are are stored in dedicated pages. Since Compact and Redundant rows maximum size is about 8000 bytes, this limits the number of BLOB or TEXT columns that can be used in a table. Each BLOB page is 16KB, regardless the size of the data.
Other columns can be stored in different pages too, if they exceed the row page's size limit.
Redundant is the old, non-compacted format supported by old versions of MySQL. It was the only available format before MySQL 5.0, and was replaced as the default in MySQL 5.0.3. It is recommended to set innodb_strict_mode when using this format.
Dynamic tables contain records of a variable length, resulting in more efficient data storage than compact or redundant, especially for tables containing BLOBs, although less than the compressed format.
It can only be used with the newer XtraDB Barracuda file format, and requires tables and indexes to be stored in their own tablespaces, so can only be enabled if the system variables innodb_file_per_table=1 and innodb_file_format=barracuda. It is also recommended to set innodb_strict_mode when using this format.
The new row size limit is 65,535 bytes.
With Dynamic (and Compressed) storage format, BLOB and TEXT columns are stored differently from Compact. If the values cannot be entirely contained in the row page, only a pointer to a dedicated page is stored. Each external page contains part of the data and a pointer to the next page, if any. Pointers are 20KB. This allows to store a high number of BLOB or TEXT columns in a table.
The compressed format results in the smallest data size in most cases. It can only be used with the newer XtraDB Barracuda file format, and requires tables and indexes to be stored in their own tablespaces, so can only be enabled if the system variables innodb_file_per_table=1 and innodb_file_format=barracuda. It is also recommended to set innodb_strict_mode when using this format.
Using the compressed format also reduces the default KEY_BLOCK_SIZE. If KEY_BLOCK_SIZE is ommitted from the CREATE TABLE or ALTER TABLE statement, it will default to 8KB - usually it is 16 (see innodb_page_size). It is also possible to set KEY_BLOCK_SIZE to 1KB, 2KB, 4KB or 16KB. Setting to 16, the regular size, will usually result in minimal compression unless there are many long BLOB, TEXT or VARCHAR columns.
MariaDB 10.1 permits innodb_page_size to be set to 32k and 64k, based on a request for a larger size to permit more blob fields, but compressed tables can still only be 16k or less. Increasing the size for compressed tables will be revisited at a later stage.
Note that specifying a KEY_BLOCK_SIZE in an InnoDB table definition will automatically result in the table being compressed - it is not necessary to specify
To avoid compressing and uncompressing pages too many times, XtraDB/InnoDB tries to keep in the buffer pool both compressed and uncompressed pages, when there is enough room. This results in a bigger cache. When there is not enough room, an adaptive LRU algorithm is used to decide whether compressed or uncompressed pages should be evicted from the buffer: for CPU-bound workloads, the compressed pages are evicted first; for I/O-bound workloads, the uncompressed pages are evicted first. Of course, when necessary, both the compressed and uncompressed version of the same data can be evicted from the buffer.
Each compressed page has an uncompressed modification log, stored within the page itself. XtraDB/InnoDB writes small changes into it. When the space in the modification log runs out, the page is uncompressed, changes are applied, and the page is recompressed again. This is done to avoid some unnecessary uncompressing and recompressing operations.
Sometimes a compression failure might happen, because the data has grown too much to fit the page. When this happens, the page (and the index node) is split into two different pages. This process can be repeated recursively until the data fit the pages. This can be CPU-consuming on some busy servers which perform many write operations.
Before writing a compressed page into a data file, XtraDB/InnoDB writes it into the redo log. This ensures that the redo log can always be used to recover tables after a crash, even if the compression library is updated and some incompatibilities are introduced. But this also means that the redo log will grow faster and might need more space, or the frequency of checkpoints might need to increase.
INFORMATION_SCHEMA tables can be used to monitor the performances of XtraDB/InnoDB compressed tables:
- INNODB_CMP and INNODB_CMP_RESET
- INNODB_CMP_PER_INDEX and INNODB_CMP_PER_INDEX_RESET
- INNODB_CMPMEM and INNODB_CMPMEM_RESET
SET SESSION innodb_strict_mode = ON; CREATE TABLE compressed_table (x INT PRIMARY KEY) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4; -- check that the table is compressed SELECT ROW_FORMAT FROM information_schema.INNODB_SYS_TABLES WHERE NAME = 'test/compressed_table'; +------------+ | ROW_FORMAT | +------------+ | Compressed | +------------+