Using MariaDB MaxScale 2.1 Regex Filter for Migrations

Migrating applications from one database system to another is sometimes easy and sometimes not. But they are hardly ever effortless. Among the obvious issues are schema and data, migrating from one datatype to another, with slightly different behavior and semantics is one thing and another is migrating the actual data, is it UTF8 and if so how many bytes? What is the collation? What is the required accuracy of numeric types?

And on top of this are things such as triggers, stored procedures and such. Not to mention performance tuning and the optimal way to construct SQL statements.

Speaking of SQL statements, we have application code also. Yes, most databases have some kind of application running on them, often more than one, and these access the database using SQL over some kind of API such as JDBC, ODBC or some proprietary driver. And application code, even simple SQL tends to have one or two database specific constructs in them, and that is what this blog is about.

Before moving on to that though, a few words on MariaDB Server 10.2.6 which is GA since May 23. MariaDB Server 10.2 does contain more than a few things that make migration from other database systems to MariaDB a lot easier. Among these features are:

CHECK constraints. The syntax for these has been supported before, but in MariaDB Server 10.2 these are actually implementing proper constraints.

  • DEFAULT values. In MariaDB Server before version 10.2, these were several restrictions around what DEFAULT values could be used, and how, but in 10.2 these are lifted.
  • Multiple triggers per event. In MariaDB Server before 10.2 you could only have one trigger per DML eevent, i.e. several BEFORE INSERT triggers. This has two advantages, one is the obvious one that the database you are migrating from might be supporting multiple triggers per event. Another is that sometimes you want to add a trigger to implement some compatibility with the database system you are migrating from, and this feature makes doing this a lot easier.

With that said, let’s say you have migrated the schema and the procedures and what have you not, and also the data. Then you have replaced that ODBC driver from the existing database system with one from MariaDB, which means we are all set to try the application. And the application falls over on the first SQL statement because it uses some proprietary feature of the database we are migrating from. There are some ways of getting around that, with MariaDB there are two ways that have been used in the past:

  • Use MariaDB compatibility features. As stated above, there are many new compatibility features in MariaDB Server 10.2.6 GA. In addition there are some settings for the SQL_MODE parameter for compatibility, such as the PIPES_AS_CONCAT that ensures that the ANSI SQL concatenation operator, two pipes (||), is interpreted as a MariaDB CONCAT.
  • Develop procedures, functions and user defined functions that mimic procedures and functions in other database system.

There is nothing wrong with the above means of migration, but they don’t cover all aspect of a migration. One more tool that is available now is the new MariaDB MaxScale 2.1.3 GA and there is a plugin that is particularly useful, the Regex one. What this allows us to do is to replace text in the SQL statement so that it matches something that MariaDB Server can work with, and a good example is the Oracle DECODE() function. This function is rather special, in a few ways:

  • It takes a variable number of arguments, from 3 and up.
  • The type of the return value depends on the type of the arguments.

The SQL Standard construct for this is the CASE statement, which has the same attributes as above. We cannot solve the use of the DECODE function by adding a STORED FUNCTION. A UDF (User Defined Function) is possible as this can take any number and type of arguments. Also even though a UDF can only return a predefined type, this is not a big issue as MariaDB is loosely typed, so we can always, for numeric results, return a numeric string. A bigger issue though is that MariaDB already has a DECODE function that does something else.

Also, we would really like to use the CASE function and a way to deal with that is to use the MariaDB MaxScale Regex filter. Let me show you how. To begin with, we need to set up the Regex filter itself, and the way I do it here, I will use multiple filters, one for each of the number of arguments I pass to DECODE. I guess there is some way of doing this in a smarter way, but here I am just showing the principle. Also note that the Regex filter use the PCRE2 regular expressions, not the Posix one. Let’s start with a couple of filter specification for a DECODE with 3 and 4 arguments and define them in our MariaDB MaxScale configuration file:

[DecodeFilter3]
type=filter
module=regexfilter
options=ignorecase
match=DECODE(([^,)]*),([^,)]*),([^,)]*))
replace=CASE $1 WHEN $2 THEN $3 END

[DecodeFilter4]
type=filter
module=regexfilter
options=ignorecase
match=DECODE(([^,)]*),([^,)]*),([^,)]*),([^,)]*))
replace=CASE $1 WHEN $2 THEN $3 ELSE $4 END

As anyone can see, the above really isn’t perfect, things like strings with embedded comas and what have you not will not work, but in the general case, this should work reasonable well, which is not to say you would want to use this in production, but for a test or a proof-of-concept this is good enough. For DECODE with 5, 6 or more arguments, you add these following the pattern above.
Before we show this in action, let me add one more useful filter for the Oracle SYSDATE psedocolumn. In Oracle SQL, SYSDATE is the same as NOW() in MariaDB, so this is a simple replacement, but as SYSDATE is a pseudocolumn and not a function, like NOW(), we cannot write a simple Stored Function to handle it, but using a MaraiaDB MaxScale filter should do the trick, like this:

[sysdate]
type=filter
module=regexfilter
options=ignorecase
match=([^[:alpha:]])SYSDATE
replace=$1NOW()

With this, it is now time to enable these filters, and that is done by adding them to the Service in MariaDB MaxScale which we will use:

[Read-Write Service]
type=service
router=readwritesplit
servers=srv1
user=rwuser
passwd=rwpwd
max_slave_connections=100%
filters=DecodeFilter3|DecodeFilter4|sysdate

Assuming you have your MariaDB MaxScale correctly configured in any other place, let’s see if this works as expected. First, we have to restart MariaDB MaxScale and then when we connect to MariaDB and do a call to DECODE the way it looks like in Oracle and see what is returned:

$ mysql -h moe -P 4008 -u theuser -pthepassword
Welcome to the MariaDB monitor.  Commands end with ; or g.
Your MySQL connection id is 21205
Server version: 10.0.0 2.1.3-maxscale MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

MariaDB> SELECT DECODE(1, 2, 3, 4) FROM dual;
+------------------------------------+
| CASE 1 WHEN  2 THEN  3 ELSE  4 END |
+------------------------------------+
|                                  4 |
+------------------------------------+
MariaDB> SELECT DECODE('Str1', 'Str1', 'Was str1', 'Not str1') FROM dual;
+----------------------------------------------------------------+
| CASE 'Str1' WHEN  'Str1' THEN  'Was str1' ELSE  'Not str1' END |
+----------------------------------------------------------------+
| Was str1                                                       |
+----------------------------------------------------------------+’
MariaDB> SELECT DECODE('Str1', 'Str2', 'Was str1') FROM dual;
+-----------------------------------------------+
| CASE 'Str1' WHEN  'Str2' THEN  'Was str1' END |
+-----------------------------------------------+
| NULL                                          |
+-----------------------------------------------+

As can be seen, the translation from DECODE to a CASE statement seems to work as expected. Let’s also try with SYSDATE

MariaDB> SELECT DECODE('Today', 'Today', SYSDATE, 'Some other day') FROM dual;
+-------------------------------------------------------------------+
| CASE 'Today' WHEN  'Today' THEN  NOW() ELSE  'Some other day' END |
+-------------------------------------------------------------------+
| 2017-05-19 18:51:22                                               |
+-------------------------------------------------------------------+

As we see here, not only does SYSDATE work as expected, we can handle both DECODE and SYSDATE conversions as the filters are piped to each other. Using MariaDB MaxScale with the Regex filter is yet another tool for migrating applications.

Happy SQL’ing
/Karlsson