System-Versioned Tables

MariaDB supports temporal data tables in the form of system-versioning tables (allowing you to query and operate on historic data, discussed below), application-time periods (allow you to query and operate on a temporal range of data), and bitemporal tables (which combine both system-versioning and application-time periods).

System-Versioned Tables

System-versioned tables store the history of all changes, not only data which is currently applicable. This allows data analysis for any point in time, auditing of changes and comparison of data from different points in time. Typical uses cases are:

  • Forensic analysis & legal requirements to store data for N years.
  • Data analytics (retrospective, trends etc.), e.g. to get your staff information as of one year ago.
  • Point-in-time recovery - recover a table state as of particular point in time.

System-versioned tables were first introduced in the SQL:2011 standard.

Creating a System-Versioned Table


The CREATE TABLE syntax has been extended to permit creating a system-versioned table. To be system-versioned, according to SQL:2011, a table must have two generated columns, a period, and a special table option clause:

CREATE TABLE t(
   x INT,
   start_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW START,
   end_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW END,
   PERIOD FOR SYSTEM_TIME(start_timestamp, end_timestamp)
) WITH SYSTEM VERSIONING;

In MariaDB one can also use a simplified syntax:

CREATE TABLE t (
   x INT
) WITH SYSTEM VERSIONING;

In the latter case no extra columns will be created and they won't clutter the output of, say, SELECT * FROM t. The versioning information will still be stored, and it can be accessed via the pseudo-columns ROW_START and ROW_END:

SELECT x, ROW_START, ROW_END FROM t;

Adding or Removing System Versioning To/From a Table

An existing table can be altered to enable system versioning for it.

CREATE TABLE t(
  x INT
);
ALTER TABLE t ADD SYSTEM VERSIONING;
SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING

Similarly, system versioning can be removed from a table:

ALTER TABLE t DROP SYSTEM VERSIONING;
SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

One can also add system versioning with all columns created explicitly:

ALTER TABLE t ADD COLUMN ts TIMESTAMP(6) GENERATED ALWAYS AS ROW START,
              ADD COLUMN te TIMESTAMP(6) GENERATED ALWAYS AS ROW END,
              ADD PERIOD FOR SYSTEM_TIME(ts, te),
              ADD SYSTEM VERSIONING;
SHOW CREATE TABLE t\G
*************************** 1. row ***************************
       Table: t
Create Table: CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL,
  `ts` timestamp(6) GENERATED ALWAYS AS ROW START,
  `te` timestamp(6) GENERATED ALWAYS AS ROW END,
  PERIOD FOR SYSTEM_TIME (`ts`, `te`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING

Inserting Data

When data is inserted into a system-versioned table, it is given a row_start value of the current timestamp, and a row_end value of FROM_UNIXTIME(2147483647.999999). The current timestamp can be adjusted by setting the timestamp system variable, for example:

SELECT NOW();
+---------------------+
| NOW()               |
+---------------------+
| 2022-10-24 23:09:38 |
+---------------------+
 
INSERT INTO t VALUES(1);
 
SET @@timestamp = UNIX_TIMESTAMP('2033-10-24');

INSERT INTO t VALUES(2);
 
SET @@timestamp = default;

INSERT INTO t VALUES(3);
 
SELECT a,row_start,row_end FROM t;
+------+----------------------------+----------------------------+
| a    | row_start                  | row_end                    |
+------+----------------------------+----------------------------+
|    1 | 2022-10-24 23:09:38.951347 | 2038-01-19 05:14:07.999999 |
|    2 | 2033-10-24 00:00:00.000000 | 2038-01-19 05:14:07.999999 |
|    3 | 2022-10-24 23:09:38.961857 | 2038-01-19 05:14:07.999999 |
+------+----------------------------+----------------------------+

Querying Historical Data

SELECT

To query the historical data one uses the clause FOR SYSTEM_TIME directly after the table name (before the table alias, if any). SQL:2011 provides three syntactic extensions:

  • AS OF is used to see the table as it was at a specific point in time in the past:
SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06';
  • BETWEEN start AND end will show all rows that were visible at any point between two specified points in time. It works inclusively, a row visible exactly at start or exactly at end will be shown too.
SELECT * FROM t FOR SYSTEM_TIME BETWEEN (NOW() - INTERVAL 1 YEAR) AND NOW();
  • FROM start TO end will also show all rows that were visible at any point between two specified points in time, including start, but excluding end.
SELECT * FROM t FOR SYSTEM_TIME FROM '2016-01-01 00:00:00' TO '2017-01-01 00:00:00';

Additionally MariaDB implements a non-standard extension:

  • ALL will show all rows, historical and current.
SELECT * FROM t FOR SYSTEM_TIME ALL;

If the FOR SYSTEM_TIME clause is not used, the table will show the current data. This is usually the same as if one had specified FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP, unless one has adjusted the row_start value (until MariaDB 10.11, only possible by setting the secure_timestamp variable). For example:

CREATE OR REPLACE TABLE t (a int) WITH SYSTEM VERSIONING;

SELECT NOW();
+---------------------+
| NOW()               |
+---------------------+
| 2022-10-24 23:43:37 |
+---------------------+

INSERT INTO t VALUES (1);

SET @@timestamp = UNIX_TIMESTAMP('2033-03-03');

INSERT INTO t VALUES (2);

DELETE FROM t;

SET @@timestamp = default;


SELECT a, row_start, row_end FROM t FOR SYSTEM_TIME ALL;
+------+----------------------------+----------------------------+
| a    | row_start                  | row_end                    |
+------+----------------------------+----------------------------+
|    1 | 2022-10-24 23:43:37.192725 | 2033-03-03 00:00:00.000000 |
|    2 | 2033-03-03 00:00:00.000000 | 2033-03-03 00:00:00.000000 |
+------+----------------------------+----------------------------+
2 rows in set (0.000 sec)


SELECT a, row_start, row_end FROM t FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP;
+------+----------------------------+----------------------------+
| a    | row_start                  | row_end                    |
+------+----------------------------+----------------------------+
|    1 | 2022-10-24 23:43:37.192725 | 2033-03-03 00:00:00.000000 |
+------+----------------------------+----------------------------+
1 row in set (0.000 sec)


SELECT a, row_start, row_end FROM t;
Empty set (0.001 sec)

Views and Subqueries

When a system-versioned tables is used in a view or in a subquery in the from clause, FOR SYSTEM_TIME can be used directly in the view or subquery body, or (non-standard) applied to the whole view when it's being used in a SELECT:

CREATE VIEW v1 AS SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06';

Or

CREATE VIEW v1 AS SELECT * FROM t;
SELECT * FROM v1 FOR SYSTEM_TIME AS OF TIMESTAMP'2016-10-09 08:07:06';

Use in Replication and Binary Logs

Tables that use system-versioning implicitly add the row_end column to the Primary Key. While this is generally not an issue for most use cases, it can lead to problems when re-applying write statements from the binary log or in replication environments, where a primary retries an SQL statement on the replica.

Specifically, these writes include a value on the row_end column containing the timestamp from when the write was initially made. The re-occurrence of the Primary Key with the old system-versioning columns raises an error due to the duplication.

To mitigate this with MariaDB Replication, set the secure_timestamp system variable to YES on the replica. When set, the replica uses its own system clock when applying to the row log, meaning that the primary can retry as many times as needed without causing a conflict. The retries generate new historical rows with new values for the row_start and row_end columns.

Transaction-Precise History in InnoDB

A point in time when a row was inserted or deleted does not necessarily mean that a change became visible at the same moment. With transactional tables, a row might have been inserted in a long transaction, and became visible hours after it was inserted.

For some applications — for example, when doing data analytics on one-year-old data — this distinction does not matter much. For others — forensic analysis — it might be crucial.

MariaDB supports transaction-precise history (only for the InnoDB storage engine) that allows seeing the data exactly as it would've been seen by a new connection doing a SELECT at the specified point in time — rows inserted before that point, but committed after will not be shown.

To use transaction-precise history, InnoDB needs to remember not timestamps, but transaction identifier per row. This is done by creating generated columns as BIGINT UNSIGNED, not TIMESTAMP(6):

CREATE TABLE t(
   x INT,
   start_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW START,
   end_trxid BIGINT UNSIGNED GENERATED ALWAYS AS ROW END,
   PERIOD FOR SYSTEM_TIME(start_trxid, end_trxid)
) WITH SYSTEM VERSIONING;

These columns must be specified explicitly, but they can be made INVISIBLE to avoid cluttering SELECT * output.

When one uses transaction-precise history, one can optionally use transaction identifiers in the FOR SYSTEM_TIME clause:

SELECT * FROM t FOR SYSTEM_TIME AS OF TRANSACTION 12345;

This will show the data, exactly as it was seen by the transaction with the identifier 12345.

Storing the History Separately

When the history is stored together with the current data, it increases the size of the table, so current data queries — table scans and index searches — will take more time, because they will need to skip over historical data. If most queries on that table use only current data, it might make sense to store the history separately, to reduce the overhead from versioning.

This is done by partitioning the table by SYSTEM_TIME. Because of the partition pruning optimization, all current data queries will only access one partition, the one that stores current data.

This example shows how to create such a partitioned table:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME (
    PARTITION p_hist HISTORY,
    PARTITION p_cur CURRENT
  );

In this example all history will be stored in the partition p_hist while all current data will be in the partition p_cur. The table must have exactly one current partition and at least one historical partition.

Partitioning by SYSTEM_TIME also supports automatic partition rotation. One can rotate historical partitions by time or by size. This example shows how to rotate partitions by size:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME LIMIT 100000 (
    PARTITION p0 HISTORY,
    PARTITION p1 HISTORY,
    PARTITION pcur CURRENT
  );

MariaDB will start writing history rows into partition p0, and when it reaches a size of 100000 rows, MariaDB will switch to partition p1. There are only two historical partitions, so when p1 overflows, MariaDB will issue a warning, but will continue writing into it.

Similarly, one can rotate partitions by time:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME INTERVAL 1 WEEK (
    PARTITION p0 HISTORY,
    PARTITION p1 HISTORY,
    PARTITION p2 HISTORY,
    PARTITION pcur CURRENT
  );

This means that the history for the first week after the table was created will be stored in p0. The history for the second week — in p1, and all later history will go into p2. One can see the exact rotation time for each partition in the INFORMATION_SCHEMA.PARTITIONS table.

It is possible to combine partitioning by SYSTEM_TIME and subpartitions:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME
    SUBPARTITION BY KEY (x)
    SUBPARTITIONS 4 (
    PARTITION ph HISTORY,
    PARTITION pc CURRENT
  );

Default Partitions

MariaDB starting with 10.5.0

Since partitioning by current and historical data is such a typical usecase, from MariaDB 10.5, it is possible to use a simplified statement to do so. For example, instead of

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING 
  PARTITION BY SYSTEM_TIME (
    PARTITION p0 HISTORY,  
    PARTITION pn CURRENT 
);

you can use

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING 
  PARTITION BY SYSTEM_TIME;

You can also specify the number of partitions, which is useful if you want to rotate history by time, for example:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING 
  PARTITION BY SYSTEM_TIME 
    INTERVAL 1 MONTH 
    PARTITIONS 12;

Specifying the number of partitions without specifying a rotation condition will result in a warning:

CREATE OR REPLACE TABLE t (x INT) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME PARTITIONS 12;
Query OK, 0 rows affected, 1 warning (0.518 sec)

Warning (Code 4115): Maybe missing parameters: no rotation condition for multiple HISTORY partitions.

while specifying only 1 partition will result in an error:

CREATE OR REPLACE TABLE t (x INT) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME PARTITIONS 1;
ERROR 4128 (HY000): Wrong partitions for `t`: must have at least one HISTORY and exactly one last CURRENT

Automatically Creating Partitions

MariaDB starting with 10.9.1

From MariaDB 10.9.1, the AUTO keyword can be used to automatically create history partitions.

For example

CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
    PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;

CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
   PARTITION BY SYSTEM_TIME INTERVAL 1 MONTH
   STARTS '2021-01-01 00:00:00' AUTO PARTITIONS 12;

CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME LIMIT 1000 AUTO;

Or with explicit partitions:

CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO
  (PARTITION p0 HISTORY, PARTITION pn CURRENT);

To disable or enable auto-creation one can use ALTER TABLE by adding or removing AUTO from the partitioning specification:

CREATE TABLE t1 (x int) WITH SYSTEM VERSIONING
  PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;

# Disables auto-creation:
ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR;

# Enables auto-creation:
ALTER TABLE t1 PARTITION BY SYSTEM_TIME INTERVAL 1 HOUR AUTO;

If the rest of the partitioning specification is identical to CREATE TABLE, no repartitioning will be done (for details see MDEV-27328).

Removing Old History

Because it stores all the history, a system-versioned table might grow very large over time. There are many options to trim down the space and remove the old history.

One can completely drop the versioning from the table and add it back again, this will delete all the history:

ALTER TABLE t DROP SYSTEM VERSIONING;
ALTER TABLE t ADD SYSTEM VERSIONING;

It might be a rather time-consuming operation, though, as the table will need to be rebuilt, possibly twice (depending on the storage engine).

Another option would be to use partitioning and drop some of historical partitions:

ALTER TABLE t DROP PARTITION p0;

Note, that one cannot drop a current partition or the only historical partition.

And the third option; one can use a variant of the DELETE statement to prune the history:

DELETE HISTORY FROM t;

or only old history up to a specific point in time:

DELETE HISTORY FROM t BEFORE SYSTEM_TIME '2016-10-09 08:07:06';

or to a specific transaction (with BEFORE SYSTEM_TIME TRANSACTION xxx).

To protect the integrity of the history, this statement requires a special DELETE HISTORY privilege.

Currently, using the DELETE HISTORY statement with a BEFORE SYSTEM_TIME greater than the ROW_END of the active records (as a TIMESTAMP, this has a maximum value of '2038-01-19 03:14:07' UTC) will result in the historical records being dropped, and the active records being deleted and moved to history. See MDEV-25468.

Prior to MariaDB 10.4.5, the TRUNCATE TABLE statement drops all historical records from a system-versioned-table.

From MariaDB 10.4.5, historic data is protected from TRUNCATE statements, as per the SQL standard, and an Error 4137 is instead raised:

TRUNCATE t;
ERROR 4137 (HY000): System-versioned tables do not support TRUNCATE TABLE

Excluding Columns From Versioning

Another MariaDB extension allows one to version only a subset of columns in a table. This is useful, for example, if you have a table with user information that should be versioned, but one column is, let's say, a login counter that is incremented often and is not interesting to version. Such a column can be excluded from versioning by declaring it WITHOUT VERSIONING

CREATE TABLE t (
   x INT,
   y INT WITHOUT SYSTEM VERSIONING
) WITH SYSTEM VERSIONING;

A column can also be declared WITH VERSIONING, that will automatically make the table versioned. The statement below is equivalent to the one above:

CREATE TABLE t (
   x INT WITH SYSTEM VERSIONING,
   y INT
);

System Variables

There are a number of system variables related to system-versioned tables:

system_versioning_alter_history

  • Description: SQL:2011 does not allow ALTER TABLE on system-versioned tables. When this variable is set to ERROR, an attempt to alter a system-versioned table will result in an error. When this variable is set to KEEP, ALTER TABLE will be allowed, but the history will become incorrect — querying historical data will show the new table structure. This mode is still useful, for example, when adding new columns to a table. Note that if historical data contains or would contain nulls, attempting to ALTER these columns to be NOT NULL will return an error (or warning if strict_mode is not set).
  • Commandline: --system-versioning-alter-history=value
  • Scope: Global, Session
  • Dynamic: Yes
  • Type: Enum
  • Default Value: ERROR
  • Valid Values: ERROR, KEEP

system_versioning_asof

  • Description: If set to a specific timestamp value, an implicit FOR SYSTEM_TIME AS OF clause will be applied to all queries. This is useful if one wants to do many queries for history at the specific point in time. Set it to 'DEFAULT' to restore the default behavior. Has no effect on DML, so queries such as INSERT .. SELECT and REPLACE .. SELECT need to state AS OF explicitly.

Note: You need to use quotes around the name 'DEFAULT' when setting the session value, unquoted literal DEFAULT will restore the current global value instead.

  • Commandline: None
  • Scope: Global, Session
  • Dynamic: Yes
  • Type: Varchar
  • Default Value: DEFAULT

system_versioning_innodb_algorithm_simple

  • Description: Never fully implemented and removed in the following release.
  • Commandline: --system-versioning-innodb-algorithm-simple[={0|1}]
  • Scope: Global, Session
  • Dynamic: Yes
  • Type: Boolean
  • Default Value: ON
  • Introduced: MariaDB 10.3.4
  • Removed: MariaDB 10.3.5

system_versioning_insert_history

  • Description: Allows direct inserts into ROW_START and ROW_END columns if secure_timestamp allows changing timestamp.
  • Commandline: --system-versioning-insert-history[={0|1}]
  • Scope: Global, Session
  • Dynamic: Yes
  • Type: Boolean
  • Default Value: OFF
  • Introduced: MariaDB 10.11.0

Limitations

  • Versioning clauses can not be applied to generated (virtual and persistent) columns.
  • Before MariaDB 10.11, mariadb-dump did not read historical rows from versioned tables, and so historical data would not be backed up. Also, a restore of the timestamps would not be possible as they cannot be defined by an insert/a user. From MariaDB 10.11, use the -H or --dump-history options to include the history.

See Also

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.