RANGE Partitioning Type
The RANGE partitioning type is used to assign each partition a range of values generated by the partitioning expression. Ranges must be ordered, contiguous and non-overlapping. The minimum value is always included in the first range. The highest value may or may not be included in the last range.
A variant of this partitioning method, RANGE COLUMNS, allows us to use multiple columns and more data types.
The last part of a CREATE TABLE statement can be the definition of the new table's partitions. In the case of RANGE partitioning, the syntax is the following:
PARTITION BY RANGE (partitioning_expression)
(
    PARTITION partition_name VALUES LESS THAN (value),
    [ PARTITION partition_name VALUES LESS THAN (value), ... ]
)
PARTITION BY RANGE indicates that the partitioning type is RANGE.
partitioning_expression is an SQL expression that returns a value from each row. In the simplest cases, it is a column name. This value is used to determine which partition should contain a row.
partition_name is the name of a partition.
value indicates the upper bound for that partition. The bound itself is excluded, because a row is stored in a partition only if its value is less than that partition's bound. The values must be ascending. For the first partition, the lower limit is NULL. When trying to insert a row, if its value is equal to or higher than the upper limit of the last partition, the row will be rejected (with an error, unless the IGNORE keyword is used).
If this is a problem, MAXVALUE can be specified as a value for the last partition. Note, however, that it is not possible to split partitions of an existing RANGE partitioned table. New partitions can be appended, but this will not be possible if the last partition's upper bound is MAXVALUE.
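The behavior described above can be tried on a small table. The following is only a sketch: the table range_demo, its column and the partition names are hypothetical and chosen for illustration. The statements are meant to be run one at a time in an interactive client, since one of them is expected to fail.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE range_demo (
    n INT NOT NULL
)
    PARTITION BY RANGE (n)
(
    PARTITION p0 VALUES LESS THAN (10),
    PARTITION p1 VALUES LESS THAN (20)
);

-- Stored in p1, because 10 <= 19 < 20.
INSERT INTO range_demo VALUES (19);

-- Expected to be rejected: 20 is not less than the upper bound
-- of the last partition, so no partition can hold it.
INSERT INTO range_demo VALUES (20);

-- Append a new partition. Its bound must be higher than
-- the highest existing bound.
ALTER TABLE range_demo ADD PARTITION (
    PARTITION p2 VALUES LESS THAN (30)
);

-- Append a catch-all partition. Once MAXVALUE is the last bound,
-- no further partition can be appended with ADD PARTITION.
ALTER TABLE range_demo ADD PARTITION (
    PARTITION p_max VALUES LESS THAN MAXVALUE
);

-- Now accepted, and stored in p_max.
INSERT INTO range_demo VALUES (1000);
```

Whether to end with a MAXVALUE partition is a trade-off: it guarantees that no insert is rejected, at the cost of not being able to append further partitions later.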
A typical use case is a table whose rows refer to a moment or period in time; for example commercial transactions, blog posts, or events of some kind. We can partition the table by year, to keep all recent data in one partition and distribute historical data in big partitions that are stored on slower disks. Or, if our queries always read rows which refer to the same month or week, we can partition the table by month or by week of the year (in this case, historical data and recent data will be stored together).
AUTO_INCREMENT values also represent a chronological order. So, these values can be used to store old data in separate partitions. However, partitioning by id is not the best choice if we usually query a table by date.
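As a sketch of partitioning by id, the following definition separates older rows from newer ones by AUTO_INCREMENT value. The table message, its columns and the bounds are hypothetical and would have to be adapted to the real data volume.

```sql
-- Hypothetical table; names and bounds are illustrative only.
CREATE TABLE message (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    body TEXT NOT NULL,
    PRIMARY KEY (id)
)
    ENGINE = InnoDB
    PARTITION BY RANGE (id)
(
    -- Oldest rows: ids below 1000000.
    PARTITION p_old VALUES LESS THAN (1000000),
    -- Intermediate rows: ids from 1000000 to 1999999.
    PARTITION p_mid VALUES LESS THAN (2000000),
    -- Most recent rows: any higher id.
    PARTITION p_new VALUES LESS THAN MAXVALUE
);
```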
In the following example, we will partition a log table by year.
CREATE TABLE log (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    timestamp DATETIME NOT NULL,
    user INT UNSIGNED,
    ip BINARY(16) NOT NULL,
    action VARCHAR(20) NOT NULL,
    PRIMARY KEY (id, timestamp)
)
    ENGINE = InnoDB
    PARTITION BY RANGE (YEAR(timestamp))
(
    PARTITION p0 VALUES LESS THAN (2013),
    PARTITION p1 VALUES LESS THAN (2014),
    PARTITION p2 VALUES LESS THAN (2015),
    PARTITION p3 VALUES LESS THAN (2016)
);
As an alternative, we can partition the table by both year and month:
CREATE TABLE log (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    timestamp TIMESTAMP NOT NULL,
    user INT UNSIGNED,
    ip BINARY(16) NOT NULL,
    action VARCHAR(20) NOT NULL,
    PRIMARY KEY (id, timestamp)
)
    ENGINE = InnoDB
    PARTITION BY RANGE (UNIX_TIMESTAMP(timestamp))
(
    PARTITION p0 VALUES LESS THAN (UNIX_TIMESTAMP('2014-08-01 00:00:00')),
    PARTITION p1 VALUES LESS THAN (UNIX_TIMESTAMP('2014-11-01 00:00:00')),
    PARTITION p2 VALUES LESS THAN (UNIX_TIMESTAMP('2015-01-01 00:00:00')),
    PARTITION p3 VALUES LESS THAN (UNIX_TIMESTAMP('2015-02-01 00:00:00'))
);
As you can see, we used the UNIX_TIMESTAMP function to convert each TIMESTAMP value into an integer, so that month boundaries can be used as partition bounds. Also, the earlier partitions cover longer periods of time than the last one (probably because the logged activities were less intensive).
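To check how rows are distributed, and whether a query reads only the partitions it needs, something like the following could be used. This is a sketch that assumes the log table above exists in the current database. The TABLE_ROWS figure is only an estimate for InnoDB tables, and EXPLAIN PARTITIONS may not be available in every server version (some versions show the partitions column in plain EXPLAIN output instead).

```sql
-- List the partitions of the log table with their upper bounds
-- and approximate row counts.
SELECT PARTITION_NAME, PARTITION_DESCRIPTION, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'log'
ORDER BY PARTITION_ORDINAL_POSITION;

-- Show which partitions would be read by a query restricted
-- to January 2015.
EXPLAIN PARTITIONS
SELECT COUNT(*)
FROM log
WHERE timestamp >= '2015-01-01 00:00:00'
  AND timestamp <  '2015-02-01 00:00:00';
```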
In both cases, when our tables become huge and we don't need to store all historical data any more, we can drop the oldest partitions in this way:
ALTER TABLE log DROP PARTITION p0;
It is also possible to drop a partition that does not contain the oldest data, but all rows stored in it will be deleted.
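Because dropping a partition deletes its rows, it can be worth checking or copying them first. The following sketch assumes a server version that supports explicit partition selection with the PARTITION clause in SELECT; the table name log_p0_archive is hypothetical.

```sql
-- Count the rows that would be lost by dropping p0.
SELECT COUNT(*) FROM log PARTITION (p0);

-- Optionally copy those rows to a separate, non-partitioned table.
-- Note that CREATE TABLE ... SELECT does not copy indexes.
CREATE TABLE log_p0_archive AS
SELECT * FROM log PARTITION (p0);

-- Then drop the partition together with its rows.
ALTER TABLE log DROP PARTITION p0;
```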