Temporal Data Tables
MariaDB supports temporal data tables in three forms: system-versioned tables (allowing you to query and operate on historic data), application-time periods (allowing you to query and operate on a temporal range of data), and bitemporal tables (which combine system versioning and application-time periods).
- System-versioning Tables
  - Creating a System-Versioned Table
  - Adding or Removing System Versioning To/From a Table
  - Querying Historical Data
  - Transaction-Precise History in InnoDB
  - Storing the History Separately
  - Removing Old History
  - Excluding Columns From Versioning
- Application-time period-versioning Tables
  - Creating Tables with Time Periods
  - Adding and Removing Time Periods
  - Deletion by Portion
  - Updating by Portion
- Using Bitemporal Tables
- System Variables
- See Also
System-versioning Tables
System-versioned tables store the history of all changes, not only data which is currently valid. This allows data analysis for any point in time, auditing of changes and comparison of data from different points in time. Typical use cases are:
- Forensic analysis & legal requirements to store data for N years.
- Data analytics (retrospective, trends etc.), e.g. to get your staff information as of one year ago.
- Point-in-time recovery - recover a table state as of a particular point in time.
System-versioned tables were first introduced in the SQL:2011 standard.
Creating a System-Versioned Table
The CREATE TABLE syntax has been extended to permit creating a system-versioned table. To be system-versioned, according to SQL:2011, a table must have two generated columns, a period, and a special table option clause:
CREATE TABLE t( x INT, start_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW START, end_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW END, PERIOD FOR SYSTEM_TIME(start_timestamp, end_timestamp) ) WITH SYSTEM VERSIONING;
In MariaDB one can also use a simplified syntax:
CREATE TABLE t ( x INT ) WITH SYSTEM VERSIONING;
In the latter case no extra columns are created, so they won't clutter the output of, say, SELECT * FROM t. The versioning information is still stored, and it can be accessed via the pseudo-columns ROW_START and ROW_END:
SELECT x, ROW_START, ROW_END FROM t;
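As a minimal sketch of how history accumulates, the statements below create a table with the simplified syntax, change its single row, and then list every row version. The table name t_demo is made up for illustration; FOR SYSTEM_TIME ALL is the MariaDB extension described under Querying Historical Data.

```sql
-- Hypothetical demo table; any table created WITH SYSTEM VERSIONING behaves the same way.
CREATE TABLE t_demo (x INT) WITH SYSTEM VERSIONING;

INSERT INTO t_demo VALUES (1);
UPDATE t_demo SET x = 2;

-- Current data only; the pseudo-columns are not part of SELECT *.
SELECT * FROM t_demo;

-- Every row version, with the period during which each one was current.
SELECT x, ROW_START, ROW_END FROM t_demo FOR SYSTEM_TIME ALL;
```

The last query is expected to list both the superseded version (x = 1) and the current one (x = 2), since the UPDATE leaves the old version behind as a history row.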
Adding or Removing System Versioning To/From a Table
An existing table can be altered to enable system versioning for it.
CREATE TABLE t( x INT );
ALTER TABLE t ADD SYSTEM VERSIONING;
SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
Similarly, system versioning can be removed from a table:
ALTER TABLE t DROP SYSTEM VERSIONING;
SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
One can also add system versioning with all columns created explicitly:
ALTER TABLE t ADD COLUMN ts TIMESTAMP(6) GENERATED ALWAYS AS ROW START, ADD COLUMN te TIMESTAMP(6) GENERATED ALWAYS AS ROW END, ADD PERIOD FOR SYSTEM_TIME(ts, te), ADD SYSTEM VERSIONING;
SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `ts` timestamp(6) GENERATED ALWAYS AS ROW START,
  `te` timestamp(6) GENERATED ALWAYS AS ROW END,
  PERIOD FOR SYSTEM_TIME (`ts`, `te`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
Querying Historical Data
To query the historical data one uses the clause FOR SYSTEM_TIME directly after the table name (before the table alias, if any). SQL:2011 provides three syntactic extensions:
AS OF is used to see the table as it was at a specific point in time in the past:
SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06'
BETWEEN start AND end will show all rows that were visible at any point between two specified points in time. It is inclusive: a row visible exactly at start or exactly at end will be shown too.
SELECT * FROM t FOR SYSTEM_TIME BETWEEN (NOW() - INTERVAL 1 YEAR) AND NOW()
FROM start TO end will also show all rows that were visible at any point between two specified points in time, including start, but excluding end.
SELECT * FROM t FOR SYSTEM_TIME FROM '2016-01-01 00:00:00' TO '2017-01-01 00:00:00'
Additionally MariaDB implements a non-standard extension:
ALL will show all rows, historical and current.
SELECT * FROM t FOR SYSTEM_TIME ALL
If the FOR SYSTEM_TIME clause is not used, the table will show the current data, as if one had specified FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP.
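Because FOR SYSTEM_TIME attaches to an individual table reference, a single query can read the same table at two different moments. The sketch below assumes a hypothetical versioned table accounts with a key column; the table, its columns and the aliases are illustrative only.

```sql
-- Hypothetical versioned table with a key column to join on.
CREATE TABLE accounts (
  id INT PRIMARY KEY,
  balance INT
) WITH SYSTEM VERSIONING;

-- Compare each account's current balance with its balance at a past moment.
-- FOR SYSTEM_TIME comes directly after the table name, before the alias "past".
SELECT cur.id,
       past.balance AS balance_then,
       cur.balance  AS balance_now
FROM accounts AS cur
JOIN accounts FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06' AS past
  ON past.id = cur.id;
```

The reference aliased cur carries no FOR SYSTEM_TIME clause, so it reads current data only.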
Views and Subqueries
When a system-versioned table is used in a view or in a subquery in the FROM clause, FOR SYSTEM_TIME can be used directly in the view or subquery body, or (non-standard) applied to the whole view when it's being used in a SELECT:
CREATE VIEW v1 AS SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06'
CREATE VIEW v1 AS SELECT * FROM t;
SELECT * FROM v1 FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06';
Use in Replication and Binary Logs
Tables that use system-versioning implicitly add the row_end column to the Primary Key. While this is generally not an issue for most use cases, it can lead to problems when re-applying write statements from the binary log or in replication environments, where a master retries an SQL statement on the slave.
Specifically, these writes include a value in the row_end column containing the timestamp from when the write was initially made. The reoccurrence of the Primary Key with the old system-versioning columns raises an error due to the duplication.
To mitigate this with MariaDB Replication, set the secure_timestamp system variable to YES on the slave. When set, the slave uses its own system clock when applying to the row log, meaning that the master can retry as many times as needed without causing a conflict. The retries generate new historical rows with new values for the row_start and row_end columns.
Transaction-Precise History in InnoDB
The point in time when a row was inserted or deleted is not necessarily the moment at which the change became visible. With transactional tables, a row might have been inserted in a long transaction and become visible hours after it was inserted.
For some applications — for example, when doing data analytics on one-year-old data — this distinction does not matter much. For others — forensic analysis — it might be crucial.
MariaDB supports transaction-precise history (only for the InnoDB storage engine) that allows seeing the data exactly as it would have been seen by a new connection doing a SELECT at the specified point in time: rows inserted before that point but committed after it will not be shown.
To use transaction-precise history, InnoDB needs to remember not timestamps but a transaction identifier per row. This is done by creating the generated columns as BIGINT UNSIGNED, not TIMESTAMP(6):
CREATE TABLE t( x INT, start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START, end_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW END, PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid) ) WITH SYSTEM VERSIONING;
These columns must be specified explicitly, but they can be made INVISIBLE to avoid cluttering SELECT * output.
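A sketch of the INVISIBLE variant follows. It mirrors the table above under a different, made-up name (t_trx) so that both definitions can coexist.

```sql
-- Transaction-precise history with the transaction-id columns hidden from SELECT *.
-- t_trx is a hypothetical table name used only for this sketch.
CREATE TABLE t_trx (
  x INT,
  start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START INVISIBLE,
  end_trxid   BIGINT UNSIGNED GENERATED ALWAYS AS ROW END INVISIBLE,
  PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid)
) ENGINE=InnoDB WITH SYSTEM VERSIONING;

-- Only x is returned here.
SELECT * FROM t_trx;

-- Invisible columns still exist and can be selected by name.
SELECT x, start_trxid, end_trxid FROM t_trx;
```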
When one uses transaction-precise history, one can optionally use transaction identifiers in the FOR SYSTEM_TIME clause:
SELECT * FROM t FOR SYSTEM_TIME AS OF TRANSACTION 12345;
This will show the data, exactly as it was seen by the transaction with the identifier 12345.
Storing the History Separately
When the history is stored together with the current data, it increases the size of the table, so current data queries — table scans and index searches — will take more time, because they will need to skip over historical data. If most queries on that table use only current data, it might make sense to store the history separately, to reduce the overhead from versioning.
This is done by partitioning the table by SYSTEM_TIME. Because of the partition pruning optimization, all current data queries will only access one partition, the one that stores current data.
This example shows how to create such a partitioned table:
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME ( PARTITION p_hist HISTORY, PARTITION p_cur CURRENT );
In this example all history will be stored in the partition p_hist while all current data will be in the partition p_cur. The table must have exactly one current partition and at least one historical partition.
SYSTEM_TIME also supports automatic partition rotation. One can rotate historical partitions by time or by size. This example shows how to rotate partitions by size:
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME LIMIT 100000 ( PARTITION p0 HISTORY, PARTITION p1 HISTORY, PARTITION pcur CURRENT );
MariaDB will start writing history rows into partition p0, and when it reaches a size of 100000 rows, MariaDB will switch to partition p1. There are only two historical partitions, so when p1 overflows, MariaDB will issue a warning, but will continue writing into it.
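One way to avoid the overflow is to add another historical partition before the last one fills up. The statement below is a sketch that assumes the table t from the LIMIT example; the partition name p2 is chosen only for illustration.

```sql
-- Append one more history partition to the LIMIT-rotated table.
-- p2 is a hypothetical name; any unused partition name would do.
ALTER TABLE t ADD PARTITION (PARTITION p2 HISTORY);
```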
Similarly, one can rotate partitions by time:
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME INTERVAL 1 WEEK ( PARTITION p0 HISTORY, PARTITION p1 HISTORY, PARTITION p2 HISTORY, PARTITION pcur CURRENT );
This means that the history for the first week after the table was created will be stored in p0. The history for the second week will be stored in p1, and all later history will go into p2. One can see the exact rotation time for each partition in the INFORMATION_SCHEMA.PARTITIONS table.
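Partition metadata for a versioned table can be inspected through the standard INFORMATION_SCHEMA.PARTITIONS table. The query below is a sketch that assumes the table is named t and lives in the current default database; which column carries the rotation boundary is best confirmed against the output of your server version.

```sql
-- List the partitions of table t with their descriptions and row estimates.
SELECT PARTITION_NAME,
       PARTITION_METHOD,
       PARTITION_DESCRIPTION,
       TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 't'
ORDER BY PARTITION_ORDINAL_POSITION;
```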
It is possible to combine partitioning by SYSTEM_TIME and subpartitions:
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME SUBPARTITION BY KEY (x) SUBPARTITIONS 4 ( PARTITION ph HISTORY, PARTITION pc CURRENT );
Removing Old History
Because it stores all the history, a system-versioned table might grow very large over time. There are many options to trim down the space and remove the old history.
One can completely drop the versioning from the table and add it back again; this will delete all the history:
ALTER TABLE t DROP SYSTEM VERSIONING;
ALTER TABLE t ADD SYSTEM VERSIONING;
It might be a rather time-consuming operation, though, as the table will need to be rebuilt, possibly twice (depending on the storage engine).
Another option would be to use partitioning and drop some of the historical partitions:
ALTER TABLE t DROP PARTITION p0;
Note that one cannot drop a current partition or the only historical partition.
As a third option, one can use a variant of the DELETE statement to prune the history:
DELETE HISTORY FROM t;
or only old history up to a specific point in time:
DELETE HISTORY FROM t BEFORE SYSTEM_TIME '2016-10-09 08:07:06';
or to a specific transaction (with BEFORE SYSTEM_TIME TRANSACTION xxx).
To protect the integrity of the history, this statement requires a special DELETE HISTORY privilege.
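The privilege is granted like any other table-level privilege. In the sketch below the account name, its password and the test schema are assumptions made for illustration.

```sql
-- 'archivist'@'localhost' is a hypothetical account used only for this sketch.
CREATE USER 'archivist'@'localhost' IDENTIFIED BY 'change-me';

-- Allow that account to prune history on one versioned table.
GRANT DELETE HISTORY ON test.t TO 'archivist'@'localhost';
```

Granting it per table rather than globally keeps the ability to erase history limited to the tables where pruning is actually intended.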
The TRUNCATE TABLE statement drops all historical records from a system-versioned table.
Excluding Columns From Versioning
Another MariaDB extension allows versioning only a subset of columns in a table. This is useful, for example, if you have a table with user information that should be versioned, but one column is, let's say, a login counter that is incremented often and is not interesting to version. Such a column can be excluded from versioning by declaring it WITHOUT SYSTEM VERSIONING:
CREATE TABLE t ( x INT, y INT WITHOUT SYSTEM VERSIONING ) WITH SYSTEM VERSIONING;
A column can also be declared WITH VERSIONING, which will automatically make the table versioned. The statement below is equivalent to the one above:
CREATE TABLE t ( x INT WITH SYSTEM VERSIONING, y INT );
Application-time period-versioning Tables
Extending system-versioning tables, MariaDB 10.4 supports application-time period-versioning tables. Time periods are defined by a range between two temporal columns. The columns must be of the same temporal data type, such as TIMESTAMP (though not TIME, which is currently unsupported), and of the same width.
Using time periods implicitly defines the two columns as NOT NULL. It also adds a constraint to check whether the first value is less than the second value. The constraint is invisible to SHOW CREATE TABLE statements. The name of this constraint is prefixed by the time period name, to avoid conflict with other constraints.
Creating Tables with Time Periods
To create a table with a time period, use a CREATE TABLE statement with the PERIOD table option.
CREATE TABLE test.t1( name VARCHAR(50), date_1 DATE, date_2 DATE, PERIOD FOR date_period(date_1, date_2));
This creates a table with a date_period period defined over the date_1 and date_2 columns.
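The implicit NOT NULL and "start before end" rules can be observed with a few inserts against test.t1. This is a sketch: the row values are invented, and the exact error text depends on the server version, so none is shown.

```sql
-- Accepted: date_1 is earlier than date_2.
INSERT INTO test.t1 (name, date_1, date_2)
VALUES ('valid', '2019-01-01', '2019-06-30');

-- Expected to be rejected by the implicit period constraint:
-- the start of the period is not earlier than its end.
INSERT INTO test.t1 (name, date_1, date_2)
VALUES ('reversed', '2019-06-30', '2019-01-01');

-- Expected to be rejected because period columns are implicitly NOT NULL.
INSERT INTO test.t1 (name, date_1, date_2)
VALUES ('open', NULL, '2019-01-01');
```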
Examples are available in the MariaDB Server source code, at
Adding and Removing Time Periods
The ALTER TABLE statement now supports syntax for adding and removing time periods from a table. To add a period, use the ADD PERIOD clause.
CREATE TABLE test.t2 ( id INT PRIMARY KEY, time_1 TIMESTAMP(6), time_2 TIMESTAMP(6) );
ALTER TABLE test.t2 ADD PERIOD FOR time_period(time_1, time_2);
To remove a period, use the DROP PERIOD clause:
ALTER TABLE test.t2 DROP PERIOD FOR time_period;
The ADD PERIOD and DROP PERIOD clauses include an option to handle whether the period already exists:
ALTER TABLE test.t2 ADD PERIOD IF NOT EXISTS FOR time_period(time_1, time_2);
ALTER TABLE test.t2 DROP PERIOD IF EXISTS FOR time_period;
Deletion by Portion
In addition to updates, you can also remove rows that fall within certain time periods.
When MariaDB executes a DELETE FOR PORTION statement, it handles each affected row as follows:
- When the row period falls completely within the delete period, it removes the row.
- When the row period partially overlaps the delete period, it shrinks the row, removing the overlap from the first or second row period value.
- When the delete period falls completely within the row period, it splits the row into two rows. The first row runs from the starting row period to the starting delete period. The second runs from the ending delete period to the ending row period.
To test this, first populate the table with some data to operate on:
TRUNCATE test.t1;
INSERT INTO test.t1 (date_1, date_2) VALUES
  ('1999-01-01', '2018-12-12'),
  ('1999-01-01', '2017-01-01'),
  ('2017-01-01', '2019-01-01'),
  ('1998-01-01', '2018-12-12'),
  ('1997-01-01', '2015-01-01'),
  ('2016-01-01', '2020-01-01'),
  ('2010-01-01', '2015-01-01');
SELECT * FROM test.t1;
+------+------------+------------+
| name | date_1     | date_2     |
+------+------------+------------+
| NULL | 1999-01-01 | 2018-12-12 |
| NULL | 1999-01-01 | 2017-01-01 |
| NULL | 2017-01-01 | 2019-01-01 |
| NULL | 1998-01-01 | 2018-12-12 |
| NULL | 1997-01-01 | 2015-01-01 |
| NULL | 2016-01-01 | 2020-01-01 |
| NULL | 2010-01-01 | 2015-01-01 |
+------+------------+------------+
Then, run the DELETE FOR PORTION statement:
DELETE FROM test.t1 FOR PORTION OF date_period FROM '2001-01-01' TO '2018-01-01';
SELECT * FROM test.t1;
+------+------------+------------+
| name | date_1     | date_2     |
+------+------------+------------+
| NULL | 1999-01-01 | 2001-01-01 |
| NULL | 2018-01-01 | 2018-12-12 |
| NULL | 1999-01-01 | 2001-01-01 |
| NULL | 2018-01-01 | 2019-01-01 |
| NULL | 1998-01-01 | 2001-01-01 |
| NULL | 2018-01-01 | 2018-12-12 |
| NULL | 1997-01-01 | 2001-01-01 |
| NULL | 2018-01-01 | 2020-01-01 |
+------+------------+------------+
Here, the second row, with values ranging from 1999 to 2017, has been shrunk to 1999 to 2001. The first row, with a range from 1999 to 2018, was split into two rows: 1999 to 2001 and 2018-01 to 2018-12. The seventh row, ranging from 2010 to 2015, was deleted entirely.
The DELETE FOR PORTION statement has the following restrictions:
- The FROM...TO clause must be constant
- Multi-delete is not supported
If there are INSERT triggers, it works as follows: any matched row is deleted, and then one or two rows are inserted. If the record is deleted completely, nothing is inserted.
Updating by Portion
The UPDATE syntax now supports UPDATE FOR PORTION, which modifies rows based on their occurrence in a range.
To test it, first populate the table with some data:
TRUNCATE test.t1;
INSERT INTO test.t1 (date_1, date_2) VALUES
  ('1999-01-01', '2018-12-12'),
  ('1999-01-01', '2017-01-01'),
  ('2017-01-01', '2019-01-01'),
  ('1998-01-01', '2018-12-12'),
  ('1997-01-01', '2015-01-01'),
  ('2016-01-01', '2020-01-01'),
  ('2010-01-01', '2015-01-01');
Then run the update:
UPDATE test.t1 FOR PORTION OF date_period FROM '2000-01-01' TO '2018-01-01' SET name = 'Generation Z';
SELECT * FROM test.t1;
+----+--------------+------------+------------+
| id | name         | date_1     | date_2     |
+----+--------------+------------+------------+
|  1 | Generation Z | 2000-01-01 | 2018-01-01 |
|  2 | Generation Z | 2000-01-01 | 2017-01-01 |
|  3 | Generation Z | 2017-01-01 | 2018-01-01 |
|  4 | Generation Z | 2000-01-01 | 2018-01-01 |
|  5 | Generation Z | 2000-01-01 | 2015-01-01 |
|  6 | Generation Z | 2016-01-01 | 2018-01-01 |
|  7 | Generation Z | 2010-01-01 | 2015-01-01 |
|  8 | NULL         | 1999-01-01 | 2000-01-01 |
|  9 | NULL         | 2018-01-01 | 2018-12-12 |
| 10 | NULL         | 1999-01-01 | 2000-01-01 |
| 11 | NULL         | 2018-01-01 | 2019-01-01 |
| 12 | NULL         | 1998-01-01 | 2000-01-01 |
| 13 | NULL         | 2018-01-01 | 2018-12-12 |
| 14 | NULL         | 1997-01-01 | 2000-01-01 |
| 15 | NULL         | 2018-01-01 | 2020-01-01 |
+----+--------------+------------+------------+
The UPDATE FOR PORTION statement has the following limitations:
- Operation cannot modify the two temporal columns used by the time period
- Operation cannot reference period values in the SET expression
- FROM...TO expressions must be constant
- Multi-update is not supported
Using Bitemporal Tables
MariaDB 10.4 also supports bitemporal tables. These are tables that use versioning both at the system and application-time period levels. To create a bitemporal table, define both an application-time period and the system_time period in the CREATE TABLE statement:
CREATE TABLE test.t3 (
  date_1 DATE,
  date_2 DATE,
  row_start TIMESTAMP(6) AS ROW START INVISIBLE,
  row_end TIMESTAMP(6) AS ROW END INVISIBLE,
  PERIOD FOR application_time(date_1, date_2),
  PERIOD FOR system_time(row_start, row_end)
) WITH SYSTEM VERSIONING;
Note that, while system_time here is also a time period, it cannot be used in DELETE FOR PORTION or UPDATE FOR PORTION statements.
DELETE FROM test.t3 FOR PORTION OF system_time FROM '2000-01-01' TO '2018-01-01';
ERROR 42000: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'of system_time from '2000-01-01' to '2018-01-01'' at line 1
System Variables
There are a number of system variables related to system-versioned tables:
- Description: SQL:2011 does not allow ALTER TABLE on system-versioned tables. When this variable is set to ERROR, an attempt to alter a system-versioned table will result in an error. When this variable is set to KEEP, ALTER TABLE will be allowed, but the history will become incorrect: querying historical data will show the new table structure. This mode is still useful, for example, when adding new columns to a table.
- Scope: Global, Session
- Dynamic: Yes
- Type: Enum
- Default Value:
- Valid Values: ERROR, KEEP
- Introduced: MariaDB 10.3.4
- Description: If set to a specific timestamp value, an implicit FOR SYSTEM_TIME AS OF clause will be applied to all queries. This is useful if one wants to do many queries for history at the specific point in time. Set it to DEFAULT to restore the default behavior.
- Commandline: None
- Scope: Global, Session
- Dynamic: Yes
- Type: Varchar
- Default Value:
- Introduced: MariaDB 10.3.4
- Description: Never fully implemented and removed in the following release.
- Scope: Global, Session
- Dynamic: Yes
- Type: Boolean
- Default Value:
- Introduced: MariaDB 10.3.4
- Removed: MariaDB 10.3.5
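As an illustration of the variable that applies an implicit AS OF clause, the sketch below assumes it is the one named system_versioning_asof (the variable names are not shown in the list above) and reuses the table t and the timestamp from the earlier examples.

```sql
-- Pin every query in this session to one historical moment.
-- The variable name system_versioning_asof is an assumption of this sketch.
SET SESSION system_versioning_asof = '2016-10-09 08:07:06';

-- Behaves as if FOR SYSTEM_TIME AS OF '2016-10-09 08:07:06' had been written.
SELECT * FROM t;

-- Return to reading current data.
SET SESSION system_versioning_asof = DEFAULT;
```

Setting it at session scope keeps other connections reading current data.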
- Versioning clauses cannot be applied to generated (virtual and persistent) columns.
- mysqldump does not read historical rows from versioned tables, and so historical data will not be backed up. Also, a restore of the timestamps would not be possible as they cannot be defined by an insert/a user.