ColumnStore User Defined Functions


Introduction

MariaDB provides extensibility support through user defined functions. For more details on the MariaDB server framework see the user-defined-functions article.

MariaDB ColumnStore provides scale-out query processing and therefore requires a separate distributed implementation of each SQL function. This allows the function to be applied on each PM server node, which gives distributed scale-out performance.

Thus, fully implementing a user defined function for MariaDB ColumnStore requires implementing two different APIs:

  • The MariaDB server UDF API: This allows the function to be used with all storage engines and is the implementation used when the function appears in the select list.
  • The ColumnStore distributed UDF API: This enables distributed execution when the function is used in a WHERE clause or GROUP BY, and the function is pushed down to the PM nodes for execution where possible.

MariaDB ColumnStore supports user defined function implementations in C/C++. User defined aggregate and window functions are not supported in ColumnStore 1.0.

Developing a user defined function

The development kit can be found under the utils/udfsdk directory of the mariadb-columnstore-engine source tree. Developing a user defined function requires you to set up a development environment and to be comfortable with C++ development. To set up a ColumnStore development environment, please follow the instructions on dependencies in the ColumnStore server fork repository.

Three main files will need to be modified in order to add a new UDF:

  • udfmysql.cpp : MariaDB server UDF implementation.
  • udfsdk.h : class headers.
  • udfinfinidb.cpp : distributed ColumnStore UDF implementation.

Two reference implementations are provided as guidance for creating your own functions:

  • IDB_IsNull : a simple one-argument function that returns a boolean indicating whether the expression parameter is null.
  • IDB_Add : a simple two-argument function that adds two values and returns the sum.

It is simplest to copy these and adapt them to your needs. Nothing in the system depends on the included reference implementations, so they can be removed to simplify the class files if preferred.

MariaDB server UDF implementation

Three functions must be implemented, where 'x' is the function name (for more details see user-defined-functions):

  • x_init : perform any parameter validation or setup such as memory allocation.
  • x : perform the actual function implementation.
  • x_deinit : perform any clean up tasks such as deallocating memory.
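
The three entry points can be sketched roughly as follows for a hypothetical two-argument function named my_add. So that the sketch compiles on its own, UDF_INIT, UDF_ARGS and Item_result are declared here as simplified stand-ins that contain only the members used; a real UDF includes the server headers instead, and its x_init returns the server's my_bool type rather than bool. The function name and the error message are illustrative assumptions, not part of the SDK.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-ins (assumption) for the server's UDF structures.
// A real UDF gets these from the MariaDB server headers.
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };

struct UDF_INIT {
    bool maybe_null;          // true if the function may return NULL
};

struct UDF_ARGS {
    unsigned int arg_count;   // number of arguments passed in the SQL call
    Item_result* arg_type;    // requested type of each argument
    char** args;              // pointer to each value, nullptr for SQL NULL
};

extern "C" {

// my_add_init: validate the parameters once per statement.
// Returning true reports an error and 'message' is shown to the user.
bool my_add_init(UDF_INIT* initid, UDF_ARGS* args, char* message)
{
    if (args->arg_count != 2) {
        std::strcpy(message, "my_add() requires two arguments");
        return true;
    }
    // Ask for both arguments to be passed as integers.
    args->arg_type[0] = INT_RESULT;
    args->arg_type[1] = INT_RESULT;
    initid->maybe_null = true;
    return false;
}

// my_add: compute the result for one row.
long long my_add(UDF_INIT*, UDF_ARGS* args, char* is_null, char* /*error*/)
{
    assert(args->arg_count == 2);  // guaranteed by my_add_init
    if (args->args[0] == nullptr || args->args[1] == nullptr) {
        *is_null = 1;              // NULL in, NULL out
        return 0;
    }
    *is_null = 0;
    long long a = *reinterpret_cast<long long*>(args->args[0]);
    long long b = *reinterpret_cast<long long*>(args->args[1]);
    return a + b;
}

// my_add_deinit: nothing was allocated in my_add_init, so nothing to free.
void my_add_deinit(UDF_INIT*)
{
}

}  // extern "C"
```

The entry points use C linkage so that the server can locate them by name in the shared library. The same function name then needs a matching distributed implementation, described in the next section.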

ColumnStore distributed UDF implementation

The function name and class must be registered in order to be recognized and used by the ColumnStore primitive processor. This is done by adding a line to perform the registration in the UDFSDK::UDFMap() function in the file udfinfinidb.cpp:

FuncMap UDFSDK::UDFMap() const
{
	FuncMap fm;
	
	// first: function name
	// second: pointer to the function object
	// Please use lower case for the function name. Names may be
	// case-insensitive in MySQL depending on the server settings, in which
	// case the names passed to the interface are always in lower case.
	fm["idb_add"] = new IDB_add();
	fm["idb_isnull"] = new IDB_isnull();

	return fm;
}

For each new user defined function, add an entry to the FuncMap object that maps the lower-case function name to an instance of the UDF class.
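
To illustrate why the lower-case rule matters, here is a minimal self-contained sketch of the registration pattern. The Func base class, the FuncMap typedef, the makeUDFMap and findUDF functions, and the IDB_mul class are all simplified stand-ins or hypothetical names introduced for this example; the real types live in the ColumnStore source tree and the real lookup is performed by the engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Stand-in (assumption) for the funcexp::Func base class.
struct Func {
    virtual ~Func() {}
    virtual std::string name() const = 0;
};

struct IDB_add : Func {
    std::string name() const override { return "idb_add"; }
};
struct IDB_isnull : Func {
    std::string name() const override { return "idb_isnull"; }
};
// Hypothetical new UDF added by the developer.
struct IDB_mul : Func {
    std::string name() const override { return "idb_mul"; }
};

// Stand-in for the SDK's FuncMap: function name -> function object.
typedef std::map<std::string, Func*> FuncMap;

// Return a lower-cased copy of a name.
std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Mirrors the shape of UDFSDK::UDFMap(): one line per registered function.
FuncMap makeUDFMap()
{
    FuncMap fm;
    fm["idb_add"] = new IDB_add();
    fm["idb_isnull"] = new IDB_isnull();
    fm["idb_mul"] = new IDB_mul();  // the only line added for the new UDF

    // Guard the lower-case rule: a mixed-case key would never be found.
    for (const auto& entry : fm)
        assert(entry.first == toLower(entry.first));
    return fm;
}

// Look up a function by a name that has been normalized to lower case.
// Returns nullptr when no function is registered under that name.
Func* findUDF(const FuncMap& fm, const std::string& sqlName)
{
    FuncMap::const_iterator it = fm.find(toLower(sqlName));
    return it == fm.end() ? nullptr : it->second;
}
```

In this sketch a call written as IDB_Mul(...) in SQL still resolves, because the lookup key is lower-cased before the map is searched, whereas an entry registered as "IDB_Mul" would never match. The caller owns the raw pointers stored in the map.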

The UDF class should be defined in the file udfsdk.h and implemented in the file udfinfinidb.cpp. It is easiest to adapt the example classes for a new function, but each class must implement the funcexp::Func C++ class definition:

  • constructor: any initialization necessary
  • destructor: any de-initialization.
  • getOperationType: this performs parameter validation and returns the result data type.
  • get<DATATYPE>Val : computes and returns the value of the user defined function for each given return datatype.
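
The shape of such a class can be sketched as follows for a hypothetical IDB_mul function that multiplies two values. The Func base class, the DataType enum, and the Value and Params types are simplified stand-ins invented for this example so that it compiles on its own; the real funcexp::Func methods take additional engine-specific parameters (such as the current row and column type information), so treat the signatures below as assumptions and copy the exact ones from the reference classes in udfsdk.h.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Simplified stand-ins (assumptions) for the engine's types.
enum class DataType { BigInt, Double };

struct Value {
    double num;    // argument value, already evaluated for the current row
    bool isNull;   // true if the argument is SQL NULL
};
typedef std::vector<Value> Params;

// Stand-in for the funcexp::Func interface described above.
class Func {
public:
    virtual ~Func() {}
    virtual DataType getOperationType(const Params& fp) const = 0;
    virtual int64_t getIntVal(const Params& fp, bool& isNull) const = 0;
    virtual double getDoubleVal(const Params& fp, bool& isNull) const = 0;
};

// Hypothetical UDF: multiplies its two arguments.
class IDB_mul : public Func {
public:
    IDB_mul() {}             // constructor: no initialization needed here
    ~IDB_mul() override {}   // destructor: nothing to release

    // Validate the parameters and report the result data type.
    DataType getOperationType(const Params& fp) const override
    {
        if (fp.size() != 2)
            throw std::invalid_argument("idb_mul() requires two arguments");
        return DataType::Double;
    }

    // Compute the value when a double result is requested.
    double getDoubleVal(const Params& fp, bool& isNull) const override
    {
        assert(fp.size() == 2);  // checked by getOperationType
        if (fp[0].isNull || fp[1].isNull) {
            isNull = true;       // NULL in, NULL out
            return 0.0;
        }
        isNull = false;
        return fp[0].num * fp[1].num;
    }

    // Compute the value when an integer result is requested.
    int64_t getIntVal(const Params& fp, bool& isNull) const override
    {
        return static_cast<int64_t>(getDoubleVal(fp, isNull));
    }
};
```

Implementing one getter in terms of another, as getIntVal does here, keeps the per-datatype methods consistent; whether that is appropriate depends on the precision the function needs for each return type.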
