Top database challenges for MariaDB

Recent database research identifies a number of major trends and challenges for databases. Jan Lindström, developer at MariaDB, summarises the key issues and how MariaDB approaches them.

First of all, the opinions expressed in this article are the author’s own and do not necessarily reflect the views of MariaDB Corporation. They are based on the review article by Abadi et al., "The Beckman Report on Database Research", Communications of the ACM, Vol. 59, No. 2, February 2016. Although the meeting behind the report, which brought together thirty leaders from the database research community, took place in October 2013, my view is that the issues it raised are still valid.

The review article identifies big data as a defining challenge of our time. This is because it has become cheaper to generate data due to inexpensive storage, sensors, smart devices, social software, multiplayer games, and the Internet of Things. Additionally, it has become cheaper to process large amounts of data, due to advances in multicore CPUs, solid state storage, cheap cloud computing, and open source software.

The International Data Corporation (IDC) predicts that by 2020 the amount of digital information created and replicated in the world will grow to almost 40 zettabytes (ZB), more than 50 times what existed in 2010 and equivalent to 5,247 gigabytes for every person on the planet (see http://www.datacenterjournal.com/birth-death-big-data/).
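As a quick sanity check, the per-person figure can be turned around to see what world population it implies. The sketch below is only illustrative arithmetic: it assumes decimal (SI) units, with 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes, and the variable names are my own.

```python
# Sanity check of the IDC per-person figure, assuming decimal (SI) units.
ZETTABYTE = 10**21  # bytes (assumption: SI definition)
GIGABYTE = 10**9    # bytes (assumption: SI definition)

total_bytes = 40 * ZETTABYTE         # predicted data volume for 2020
per_person_bytes = 5_247 * GIGABYTE  # quoted share per person

# Dividing the total by the per-person share gives the implied population.
implied_population = total_bytes / per_person_bytes
print(f"Implied world population: {implied_population / 1e9:.2f} billion")
```

The result is a little over 7.6 billion people, so the two numbers are consistent with a plausible 2020 population estimate.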

This means that organizations have more and more unstructured and unused data that could contain valuable information for predicting business trends and making business decisions. Forbes predicted in 2015 that buying and selling data will become the new business bread and butter.

In recent years the database research and development community has strengthened core research and development in relational DBMSs and branched out into new directions: security, privacy, data pricing, data attribution, social and mobile data, spatiotemporal data, personalization and contextualization, energy-constrained processing, and scientific data management.

These lofty research challenges must be broken down to the level of functionality, if not individual features. Here are some features we’re working on, in a chewing-an-elephant-a-bite-at-a-time fashion.

Security:
Personalization:
- Default roles
Spatiotemporal data:
- Support for spatial reference systems for GIS data: the new REF_SYSTEM_ID column attribute can be used to specify the Spatial Reference System ID for columns of spatial data types.
- More functions from the OGC standard

The review article identifies five big data challenges: scalable big/fast data infrastructures; coping with diversity in data management; end-to-end processing of data; cloud services; and the roles of people in the data life cycle. The first three challenges deal with the volume, velocity, and variety aspects of big data. The last two deal with extending big data applications into the cloud and managing the involvement of people in these applications.

How can MariaDB address these challenges? One way is by developing new storage engines for big data, such as MariaDB ColumnStore (formerly InfiniDB). MariaDB ColumnStore is a scalable columnar database management system built for big data analytics, business intelligence, data warehousing and other read-intensive applications. Its column-store architecture enables very quick load and query times, and its massively parallel processing (MPP) technology scales with any type of storage hardware.

Furthermore, MariaDB can support new data types and SQL functions such as:

- Window functions: first released in MariaDB 10.2.0.
- JSON
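To give a feel for what window functions offer, here is a minimal sketch of a running total per group. To keep it self-contained it runs against an in-memory SQLite database from Python rather than against MariaDB (SQLite needs version 3.25 or later for window functions). The `sales` table, its columns and the sample rows are hypothetical. The query uses only standard SQL window syntax, so it should carry over to a MariaDB version with window function support, though that is not verified here.

```python
import sqlite3

# Hypothetical sample data: (region, month, revenue).
ROWS = [
    ("north", 1, 100), ("north", 2, 150), ("north", 3, 120),
    ("south", 1, 80), ("south", 2, 60),
]

# Standard SQL window function: a running revenue total within each region.
QUERY = """
SELECT region, month, revenue,
       SUM(revenue) OVER (PARTITION BY region ORDER BY month) AS running_total
FROM sales
ORDER BY region, month
"""


def running_totals(rows):
    """Load rows into an in-memory table and return the windowed result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (region TEXT, month INTEGER, revenue INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in running_totals(ROWS):
        print(row)
```

Unlike GROUP BY, the window function keeps every input row and adds the aggregate alongside it, which is what makes running totals, rankings and moving averages expressible in a single query.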

Forbes predicted in 2015 that security will become the killer app for big data analytics. Now that the MariaDB 10.1 server provides tools for data-at-rest encryption, other storage engines can easily provide security features for their data.

However, as the research challenges show, there is still a lot of room for additional vision and development in relational database management systems like MariaDB.