May 25, 2018

Automatic Data Versioning in MariaDB Server 10.3

MariaDB Server 10.3 comes with a new, very useful feature that will ease the design of many applications. Data versioning is important for several perspectives. Compliance might require that you need to store data changes. For analytical queries, you may want to look at data at a specific point in time and for auditing purposes, what changes were made and when is important. Also, in the case of a table being deleted it can be of great value to recover it from history. MariaDB Server now includes a feature named System-Versioned Tables, which is based on the specification in the SQL:2011 standard. It provides automatic versioning of table data.

I’ll walk through the concept of System-Versioned Tables with a very simple example, which will show you what it is all about. Let’s start by creating a database and a table.

CREATE DATABASE Company; 

CREATE TABLE Person (
  Id int(11) NOT NULL AUTO_INCREMENT,
  FirstName varchar(50) NOT NULL,
  LastName varchar(50) NOT NULL,
  Gender char(1) NOT NULL,
  DepartmentId int(11) NOT NULL,
  PRIMARY KEY (Id),
  CONSTRAINT con_gender CHECK (Gender in ('f','m')))
WITH SYSTEM VERSIONING;

It looks exactly like before except the last table option WITH SYSTEM_VERSIONING, which turns on the automatic versioning on the table. Let’s see how it works by inserting a row into the table.

MariaDB [Company]> INSERT INTO Person (FirstName, LastName, Gender, DepartmentId) VALUES ('Rasmus', 'Johansson', 'm', 1);
Query OK, 1 row affected (0.002 sec)

MariaDB [Company]> SELECT * FROM Person;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName  | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
|  1 | Rasmus    | Johansson | m      |            1 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)

There we have me as one row in a table. The interesting part starts when we update rows. I’ll change departments a couple of times.

MariaDB [Company]> UPDATE Person SET DepartmentId = 2 WHERE Id = 1;                                      Query OK, 1 row affected (0.002 sec)
Rows matched: 1  Changed: 1  Inserted: 1  Warnings: 0

MariaDB [Company]> SELECT * FROM Person;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName  | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
|  1 | Rasmus    | Johansson | m      |            2 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)

MariaDB [Company]> UPDATE Person SET DepartmentId = 3 WHERE Id = 1;
Query OK, 1 row affected (0.003 sec)
Rows matched: 1  Changed: 1  Inserted: 1  Warnings: 0

MariaDB [Company]> SELECT * FROM Person;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName  | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
|  1 | Rasmus    | Johansson | m      |            3 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)

As you can see MariaDB Server tells that there is 1 changed row for each update as usual, but also 1 inserted, which wouldn’t be the case for a table without versioning. Each update causes a new version of the row, which has to be inserted into the table. As you also see above a normal SELECT will only show the latest version. To see all versions of the rows MariaDB Server provides the following syntax.

MariaDB [Company]> SELECT * FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName  | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
|  1 | Rasmus    | Johansson | m      |            1 |
|  1 | Rasmus    | Johansson | m      |            2 |
|  1 | Rasmus    | Johansson | m      |            3 |
+----+-----------+-----------+--------+--------------+
3 rows in set (0.001 sec)

To be able to see when the rows have been updated we want to include two invisible columns that are created by the automatic versioning. Invisible Columns is another exciting new feature of MariaDB Server 10.3. The invisible columns of automatic versioning are ROW_START and ROW_END. They define the time period for which the version of the row was/is valid.

MariaDB [Company]> SELECT *, ROW_START, ROW_END FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| Id | FirstName | LastName  | Gender | DepartmentId | ROW_START                  | ROW_END                    |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
|  1 | Rasmus    | Johansson | m      |            1 | 2018-05-03 07:21:12.386980 | 2018-05-03 07:22:29.188266 |
|  1 | Rasmus    | Johansson | m      |            2 | 2018-05-03 07:22:29.188266 | 2018-05-03 07:22:47.596481 |
|  1 | Rasmus    | Johansson | m      |            3 | 2018-05-03 07:22:47.596481 | 2038-01-19 03:14:07.999999 |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
3 rows in set (0.000 sec)

The interesting thing now is to do point in time queries to retrieve exactly what the table looked like at a specific date and time. We can do this by using the AS OF syntax:

MariaDB [Company]> SELECT * FROM Person FOR SYSTEM_TIME AS OF TIMESTAMP '2018-05-03 07:22:33';
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName  | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
|  1 | Rasmus    | Johansson | m      |            2 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)

This was just a little glimpse into System-Versioned Tables. In addition to the examples shown above you can place history on separate partitions for performance reasons, exclude columns from versioning and much more.

Read more about System-Versioned Tables in the MariaDB documentation. Get MariaDB Server 10.3 as part of the MariaDB TX 3.0 download – now available.

About Rasmus Johansson

Rasmus has worked for MariaDB since 2010 and was appointed VP Engineering in 2013. He is responsible for the architecting and development of MariaDB Server and MariaDB Cluster.

Read all posts by Rasmus Johansson