Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Explore different partitioning types for MariaDB Server tables. Understand range, list, hash, and key partitioning to optimize data management and improve query performance.
Learn about LINEAR KEY partitioning, which combines the internal key hashing with a linear algorithm for efficient partition handling.
LINEAR PARTITION BY KEY ([column_names])
[PARTITIONS (number_of_partitions)]LINEAR KEY partitioning is a form of , similar to .
LINEAR KEY partitioning makes use of a powers-of-two algorithm, while KEY partitioning uses modulo arithmetic to determine the partition number.
Adding, dropping, merging and splitting partitions is much faster than with the ; however, data is less likely to be evenly distributed over the partitions.
This page is licensed: CC BY-SA / Gnu FDL
An introduction to the various partitioning strategies available in MariaDB, helping you choose the right method for your data distribution needs.
A partitioning type determines how a partitioned table's rows are distributed across partitions. Some partition types require the user to specify a partitioning expression that determines in which partition a row are stored.
The size of individual partitions depends on the partitioning type. Read and write performance are affected by the partitioning expression. Therefore, these choices should be made carefully.
MariaDB supports the following partitioning types:
This page is licensed: CC BY-SA / Gnu FDL
It requires a column value, or an expression based on a column value, which is hashed, as well as the number of partitions into which to divide the table.
partitioning_expression needs to return a non-constant, deterministic integer. It is evaluated for each insert and update, so overly complex expressions can lead to performance issues. A hashing function operating on a single column, and where the value changes consistently with the column value, allows for easy pruning on ranges of partitions, and is usually a better choice. For this reason, using multiple columns in a hashing expression is not usually recommended.
number_of_partitions is a positive integer specifying the number of partitions into which to divide the table. If the PARTITIONS clause is omitted, the default number of partitions is one.
To determine which partition to use, perform the following calculation:
For example, if the expression is TO_DAYS(datetime_column) and the number of partitions is 5, inserting a datetime value of '2023-11-15' would determine the partition as follows:
TO_DAYS('2023-11-15') gives a value of 739204.
MOD(739204,5) returns 4, so the 4th partition is used.
HASH partitioning makes use of the modulus of the hashing function's value. The LINEAR HASH partitioning type is similar, using a powers-of-two algorithm. Data is more likely to be evenly distributed over the partitions than with the LINEAR HASH partitioning type; however, adding, dropping, merging and splitting partitions is much slower.
Using the Information Schema PARTITIONS Table for more information:
Partition Maintenance for suggestions on using partitions
This page is licensed: CC BY-SA / Gnu FDL
LINEAR HASH partitioning makes use of a powers-of-two algorithm, while HASH partitioning uses the modulus of the hashing function's value. Adding, dropping, merging and splitting partitions is much faster than with the HASH partitioning type, however, data is less likely to be evenly distributed over the partitions.
This page is licensed: CC BY-SA / Gnu FDL
PARTITION BY LINEAR HASH (partitioning_expression)
[PARTITIONS(number_of_partitions)]CREATE OR REPLACE TABLE t1 (c1 INT, c2 DATETIME)
PARTITION BY LINEAR HASH(TO_DAYS(c2))
PARTITIONS 5;CREATE OR REPLACE TABLE t1 (v1 INT)
PARTITION BY LINEAR KEY (v1)
PARTITIONS 2;Understand KEY partitioning, similar to HASH but using MariaDB's internal hashing function on one or more columns to distribute data.
Partitioning by key is a type of partitioning that is similar to and can be used in a similar way as partitioning by hash.
KEY takes an optional list of column_names, and the hashing function is given by the server.
Just like HASH partitioning, in KEY partitioning the server takes care of the partition and ensures an even distribution among the partitions. However, the largest difference is that KEY partitioning makes use of column_names, and cannot accept a partitioning_expression which is based on column_names, in contrast to HASH partitioning, which can.
If no column_names are specified, the table's primary key is used if present, or not null unique key if no primary key is present. If neither of these keys are present, not specifying any column_names will result in an error:
Unlike other partitioning types, columns used for partitioning by KEY are not limited to integer or NULL values.
KEY partitions do not support column index prefixes. Any columns in the partitioning key that make use of column prefixes are not used.
The unique key must be NOT NULL:
KEY requires column_values if no primary key or not null unique key is present:
Primary key columns with index prefixes are silently ignored, so the following two queries are equivalent:
a(5) and c(5) are silently ignored in the former.
If all columns use index prefixes, the statement fails with a slightly misleading error:
This page is licensed: CC BY-SA / Gnu FDL
PARTITION BY HASH (partitioning_expression)
[PARTITIONS(number_of_partitions)]MOD(partitioning_expression, number_of_partitions)CREATE OR REPLACE TABLE t1 (c1 INT, c2 DATETIME)
PARTITION BY HASH(TO_DAYS(c2))
PARTITIONS 5;INSERT INTO t1 VALUES (1,'2023-11-15');
SELECT PARTITION_NAME,TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA='test' AND TABLE_NAME='t1';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0 | 0 |
| p1 | 0 |
| p2 | 0 |
| p3 | 0 |
| p4 | 1 |
+----------------+------------+PARTITION BY KEY ([column_names])
[PARTITIONS (number_of_partitions)] ERROR 1488 (HY000): Field in list of fields for partition function not found in tableCREATE OR REPLACE TABLE t1 (v1 INT)
PARTITION BY KEY (v1)
PARTITIONS 2;CREATE OR REPLACE TABLE t1 (v1 INT, v2 INT)
PARTITION BY KEY (v1,v2)
PARTITIONS 2;CREATE OR REPLACE TABLE t1 (
id INT NOT NULL PRIMARY KEY,
name VARCHAR(5)
)
PARTITION BY KEY()
PARTITIONS 2;CREATE OR REPLACE TABLE t1 (
id INT NOT NULL UNIQUE KEY,
name VARCHAR(5)
)
PARTITION BY KEY()
PARTITIONS 2;CREATE OR REPLACE TABLE t1 (
id INT NULL UNIQUE KEY,
name VARCHAR(5)
)
PARTITION BY KEY()
PARTITIONS 2;
ERROR 1488 (HY000): Field in list of fields for partition function not found in tableCREATE OR REPLACE TABLE t1 (
id INT NULL UNIQUE KEY,
name VARCHAR(5)
)
PARTITION BY KEY()
PARTITIONS 2;
ERROR 1488 (HY000): Field in list of fields for partition function not found in tableCREATE OR REPLACE TABLE t1 (
id INT NULL UNIQUE KEY,
name VARCHAR(5)
)
PARTITION BY KEY(name)
PARTITIONS 2;CREATE OR REPLACE TABLE t1 (
a VARCHAR(10),
b VARCHAR(10),
c VARCHAR(10),
PRIMARY KEY (a(5), b, c(5))
) PARTITION BY KEY() PARTITIONS 2;
CREATE OR REPLACE TABLE t1 (
a VARCHAR(10),
b VARCHAR(10),
c VARCHAR(10),
PRIMARY KEY (b)
) PARTITION BY KEY() PARTITIONS 2;CREATE OR REPLACE TABLE t1 (
a VARCHAR(10),
b VARCHAR(10),
c VARCHAR(10),
PRIMARY KEY (a(5), b(5), c(5))
) PARTITION BY KEY() PARTITIONS 2;
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning functionUnderstand LIST partitioning, where rows are assigned to partitions based on whether a column value matches one in a defined list of values.
LIST partitioning is conceptually similar to RANGE partitioning. In both cases you decide a partitioning expression (a column, or a slightly more complex calculation) and use it to determine which partitions will contain each row. However, with the RANGE type, partitioning is done by assigning a range of values to each partition. With the LIST type, we assign a set of values to each partition. This is usually preferred if the partitioning expression can return a limited set of values.
A variant of this partitioning method, LIST COLUMNS, allows us to use multiple columns and more datatypes.
The last part of a statement can be the definition of the new table's partitions. In the case of LIST partitioning, the syntax is as follows:
PARTITION BY LIST indicates that the partitioning type is LIST.
The partitioning_expression is an SQL expression that returns a value from each row. In the simplest cases, it is a column name. This value is used to determine which partition should contain a row.
partition_name is the name of a partition.
value_list is a list of values. If partitioning_expression returns one of these values, the row are stored in this partition. If we try to insert something that does not belong to any of these value lists, the row are rejected with an error.
The DEFAULT partition catches all records which do not fit into other partitions.
LIST partitioning can be useful when we have a column that can only contain a limited set of values. Even in that case, RANGE partitioning could be used instead; but LIST partitioning allows us to equally distribute the rows by assigning a proper set of values to each partition.
This page is licensed: CC BY-SA / Gnu FDL
PARTITION BY LIST (partitioning_expression)
(
PARTITION partition_name VALUES IN (value_list),
[ PARTITION partition_name VALUES IN (value_list), ... ]
[ PARTITION partition_name DEFAULT ]
)CREATE OR REPLACE TABLE t1 (
num TINYINT(1) NOT NULL
)
ENGINE = InnoDB
PARTITION BY LIST (num) (
PARTITION p0 VALUES IN (0,1),
PARTITION p1 VALUES IN (2,3),
PARTITION p2 DEFAULT
);Discover these variants that allow partitioning based on multiple columns and non-integer types, offering greater flexibility than standard RANGE/LIST.
RANGE COLUMNS and LIST COLUMNS are variants of, respectively, RANGE and LIST. With these partitioning types, there is not a single partitioning expression; instead, a list of one or more columns is accepted. The following rules apply:
The list can contain one or more columns.
Only bare columns are permitted; no expressions.
All the specified columns are compared to the specified values to determine which partition should contain a specific row. See below for details.
The last part of a statement can be definition of the new table's partitions. In the case of RANGE COLUMNS partitioning, the syntax is as follows:
The syntax for LIST COLUMNS is as follows:
partition_name is the name of a partition.
To determine which partition should contain a row, all specified columns are compared to each partition definition.
With LIST COLUMNS, a row matches a partition if all row values are identical to the specified values. At most one partition can match the row.
With RANGE COLUMNS, a row matches a partition if it is less than the specified value tuple in lexicographic order. The first partition that matches the row values are used.
The DEFAULT partition catches all records which do not fit in other partitions. Only one DEFAULT partition is allowed.
RANGE COLUMNS partition:
LIST COLUMNS partition:
This page is licensed: CC BY-SA / Gnu FDL
The RANGE partitioning type assigns rows to partitions based on whether column values fall within contiguous, non-overlapping ranges.
The RANGE partitioning type is used to assign each partition a range of values generated by the partitioning expression. Ranges must be ordered, contiguous and non-overlapping. The minimum value is always included in the first range. The highest value may or may not be included in the last range.
A variant of this partitioning method, , allows us to use multiple columns and more datatypes.
The last part of a statement can be definition of the new table's partitions. In the case of RANGE
PARTITION BY RANGE COLUMNS (col1, col2, ...)
(
PARTITION partition_name VALUES LESS THAN (value1, value2, ...),
[ PARTITION partition_name VALUES LESS THAN (value1, value2, ...), ... ]
)PARTITION BY LIST COLUMNS (partitioning_expression)
(
PARTITION partition_name VALUES IN (value1, value2, ...),
[ PARTITION partition_name VALUES IN (value1, value2, ...), ... ]
[ PARTITION partititon_name DEFAULT ]
)CREATE OR REPLACE TABLE t1 (
date1 DATE NOT NULL,
date2 DATE NOT NULL
)
ENGINE = InnoDB
PARTITION BY RANGE COLUMNS (date1,date2) (
PARTITION p0 VALUES LESS THAN ('2013-01-01', '1994-12-01'),
PARTITION p1 VALUES LESS THAN ('2014-01-01', '1995-12-01'),
PARTITION p2 VALUES LESS THAN ('2015-01-01', '1996-12-01')
);CREATE OR REPLACE TABLE t1 (
num TINYINT(1) NOT NULL
)
ENGINE = InnoDB
PARTITION BY LIST COLUMNS (num) (
PARTITION p0 VALUES IN (0,1),
PARTITION p1 VALUES IN (2,3),
PARTITION p2 DEFAULT
);PARTITION BY RANGE indicates that the partitioning type is RANGE.
partitioning_expression is an SQL expression that returns a value from each row. In the simplest cases, it is a column name. This value is used to determine which partition should contain a row.
partition_name is the name of a partition.
value indicates the upper bound for that partition. The values must be ascending. For the first partition, the lower limit is NULL. When trying to insert a row, if its value is higher than the upper limit of the last partition, the row are rejected (with an error, if the keyword is not used).
As a catchall, MAXVALUE can be specified as a value for the last partition. Note, however, that in order to append a new partition, it is not possible to use ADD PARTITION; instead, REORGANIZE PARTITION must be used.
A typical use case is when we want to partition a table whose rows refer to a moment or period in time; for example commercial transactions, blog posts, or events of some kind. We can partition the table by year, to keep all recent data in one partition and distribute historical data in big partitions that are stored on slower disks. Or, if our queries always read rows which refer to the same month or week, we can partition the table by month or year week (in this case, historical data and recent data are stored together).
AUTO_INCREMENT values also represent a chronological order. So, these values can be used to store old data in separate partitions. However, partitioning by id is not the best choice if we usually query a table by date.
Partitioning a log table by year:
Partitioning the table by both year and month:
In the last example, the UNIX_TIMESTAMP function is used to accomplish the purpose. Also, the first two partitions cover longer periods of time (probably because the logged activities were less intensive).
In both cases, when our tables become huge and we don't need to store all historical data any more, we can drop the oldest partitions in this way:
We will still be able to drop a partition that does not contain the oldest data, but all rows stored in it will disappear.
Example of an error when inserting outside a defined partition range:
To avoid the error, use the IGNORE keyword:
An alternative definition with MAXVALUE as a catchall:
This page is licensed: CC BY-SA / Gnu FDL
PARTITION BY RANGE (partitioning_expression)
(
PARTITION partition_name VALUES LESS THAN (value),
[ PARTITION partition_name VALUES LESS THAN (value), ... ]
[ PARTITION partition_name VALUES LESS THAN MAXVALUE ]
)CREATE TABLE log
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
dt DATETIME NOT NULL,
user INT UNSIGNED,
PRIMARY KEY (id, dt)
)
ENGINE = InnoDB
PARTITION BY RANGE (YEAR(dt))
(
PARTITION p0 VALUES LESS THAN (2013),
PARTITION p1 VALUES LESS THAN (2014),
PARTITION p2 VALUES LESS THAN (2015),
PARTITION p3 VALUES LESS THAN (2016)
);CREATE TABLE log2
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
ts TIMESTAMP NOT NULL,
user INT UNSIGNED,
PRIMARY KEY (id, ts)
)
ENGINE = InnoDB
PARTITION BY RANGE (UNIX_TIMESTAMP(ts))
(
PARTITION p0 VALUES LESS THAN (UNIX_TIMESTAMP('2014-08-01 00:00:00')),
PARTITION p1 VALUES LESS THAN (UNIX_TIMESTAMP('2014-11-01 00:00:00')),
PARTITION p2 VALUES LESS THAN (UNIX_TIMESTAMP('2015-01-01 00:00:00')),
PARTITION p3 VALUES LESS THAN (UNIX_TIMESTAMP('2015-02-01 00:00:00'))
);ALTER TABLE log DROP PARTITION p0;INSERT INTO log(id,dt) VALUES
(1, '2016-01-01 01:01:01'),
(2, '2015-01-01 01:01:01');
ERROR 1526 (HY000): Table has no partition for value 2016INSERT IGNORE INTO log(id,dt) VALUES
(1, '2016-01-01 01:01:01'),
(2, '2015-01-01 01:01:01');
SELECT * FROM log;
+----+---------------------+------+
| id | timestamp | user |
+----+---------------------+------+
| 2 | 2015-01-01 01:01:01 | NULL |
+----+---------------------+------+CREATE TABLE log
(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
dt DATETIME NOT NULL,
user INT UNSIGNED,
PRIMARY KEY (id, dt)
)
ENGINE = InnoDB
PARTITION BY RANGE (YEAR(dt))
(
PARTITION p0 VALUES LESS THAN (2013),
PARTITION p1 VALUES LESS THAN (2014),
PARTITION p2 VALUES LESS THAN (2015),
PARTITION p3 VALUES LESS THAN (2016),
PARTITION p4 VALUES LESS THAN MAXVALUE
);