August 11, 2014

Putting Virtual Columns to good use

We all know that in general it's a bad idea to have columns values contain too much "hidden" information, and in particular for primary keys, this is a big no-no, although I know that not everybody agrees here. In some cases though, there is data that at it's heart contains several aspects and we just cannot avoid this, the prime example being data and time values. What I mean here is that a single datetime value has aspects that aren't always obvious from the datetime value itself. Examples include leap year information and weekday. Sometimes datetime value also contain the timezone, in which case you might want to ask if this is the timezone of the server or the client.

Can you make a datetime unique? Let's assume you know that every row that contains a datetime value are at least 5 ms apart, from which you can determine that if the datetime is stored with ms precision, you will always get a unique datetime? Right? Well, yes, assuming you don't count on daylight savings time, in which case the same time will appear twice for one hour of the year (and another time will be "illegal" for one hour another time of the year).

But for now, let's instead have a look at the weekday, as this is a pretty common attribute derived from datetime and which is commonly used.

Let's assume we have an orders table:

CREATE TABLE `orders` ( `id` int(11) NOT NULL AUTO_INCREMENT, 
  `order_date` datetime NOT NULL,
  `customer_id` int(11) NOT NULL,
  PRIMARY KEY (`id`) );

If I now want to check which day of the week I get my orders, I can do this:

SELECT id, customer_id, DAYNAME(order_date) FROM orders;

But if we work with the dayname all the time, and maybe even want to select based on it, then this is clumsy and ineffective, and I can not use an index to select based on the name of the weekday. On the other hand, having the application add a column with the weekday is error prone and requires application changes. This is where a Virtual Column comes in, as this allows you to add a column whose value is derived from another column. In this case we could modify our table like this:

CREATE TABLE `orders` ( `id` int(11) NOT NULL AUTO_INCREMENT,
  `order_date` datetime NOT NULL,
  `order_dayname` varchar(8) AS (dayname(order_date)) VIRTUAL,
  `customer_id` int(11) NOT NULL,
  PRIMARY KEY (`id`) ); 

So now we can do this instead:

SELECT id, customer_id, order_dayname FROM orders;

In this case, the order_dayname column is created as needed, but you have the option of also storing it, which is required if you want to use an index on it, and this is how this is done:

CREATE TABLE `orders` ( `id` int(11) NOT NULL AUTO_INCREMENT,
  `order_date` datetime NOT NULL,
  `order_dayname` varchar(8) AS (dayname(order_date)) PERSISTENT,
  `customer_id` int(11) NOT NULL,
  PRIMARY KEY (`id`) );

This works the same way as above, but now we can create an index on this column:

CREATE INDEX ix1 ON orders(order_dayname);

And now, to get all orders entered on a Monday, this newly created index will be used:

EXPLAIN SELECT * FROM orders WHERE order_dayname = 'Monday'
+------+-------------+--------+------+---------------+------+---------+-------+------+-----------------------+
|   id | select_type | table  | type | possible_keys |  key | key_len |   ref | rows | Extra                 |
+------+-------------+--------+------+---------------+------+---------+-------+------+-----------------------+
|    1 | SIMPLE      | orders | ref  |           ix1 |  ix1 |      11 | const |    1 | Using index condition |
+------+-------------+--------+------+---------------+------+---------+-------+------+-----------------------+

 

Further reading

See the MariaDB Knowledge base for more information on Virtual Columns.

About Anders Karlsson

Anders Karlsson is a Sales Engineer with a long experience from the field of database software. Anders has worked for many of the major database software companies.

Read all posts by Anders Karlsson