ColumnStore User Defined Aggregate and Window Functions
Introduction
Starting with MariaDB ColumnStore 1.1, the ability to create and use user defined aggregate and window functions is supported in addition to scalar functions. A C++ SDK is provided as well as 3 reference examples that provide additional functions that may be of general use:
- median - mathematical median, equivalent to percentile_cont(0.5)
- avg_mode - mathematical mode, i.e the most frequent value in the set
- ssq - sum of squares, i.e the sum of each individual number squared in the set
Using user defined aggregate functions
The reference examples above are included in the standard build of MariaDB ColumnStore and so can be used by registering them as user defined aggregate functions. The same can be done for new functions assuming the instance has the updated libraries included. From a mcsmysql prompt:
CREATE AGGREGATE FUNCTION median returns REAL soname 'libudf_mysql.so'; CREATE AGGREGATE FUNCTION avg_mode returns REAL soname 'libudf_mysql.so'; CREATE AGGREGATE FUNCTION ssq returns REAL soname 'libudf_mysql.so';
After this these maybe used in the same way as any other aggregate or window function like sum:
SELECT grade, AVG(loan_amnt) avg, MEDIAN(loan_amnt) median FROM loanstats GROUP BY grade ORDER BY grade;
Developing a new function
This requires a MariaDB ColumnStore source tree and necessary tools to compile C/C++ code. The SDK and reference examples are available in the utils/udfsdk directory of the source tree. This contains the SDK documentation which is also available here:
- 1.1.0 Beta: TODO