Google Summer of Code 2017
We participated in the Google Summer of Code 2017 (we have participated previously in 2016, 2015, 2014, and 2013). The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. Lately, we also have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.
- Where to start
- List of tasks
- Support for GTID in mysqlbinlog
- Automatic provisioning of slave
- GIS enhancements
- mysqltest improvements
- Allow multiple alternative authentication methods for the same user
- connection encryption plugins
- OSS-Fuzz configuration for MariaDB
- NUMA for MariaDB
- Cassandra Storage Engine V2
- ColumnStore: Add proper NULL support
- ColumnStore: Add full support for DECIMAL
- ColumnStore: Full UTF8 support.
- Suggest a task
Where to start
Please join us at
irc.freenode.net at #maria to mingle with the community. Don't forget to subscribe to email@example.com (this is the main list where we discuss development).
A few handy tips for any interested students who are unsure which projects to choose: Blog post from former GSoC student & mentor
List of tasks
The complete list of tasks suggested for GSoC 2017 is located in the MariaDB Jira. A subset is listed below.
Support for GTID in mysqlbinlog
The mysqlbinlog tool needs to be updated to understand the replication feature called Global Transaction IDs (GTIDs) in MariaDB 10. The current version does not support GTIDs and the MySQL variant does not speak MariaDB 10's GTIDs.
Automatic provisioning of slave
The purpose of this task is to create an easy-to-use facility for setting up a new MariaDB replication slave.
GIS enhancements for 10.1 that we want to work on include adding support for altitude (the third coordinate), converters (eg. ST_GeomFromGeoJSON - ST_AsGeoJSON, ST_GeomFromKML - ST_AsKML, etc.), Getting data from SHP format (shp2sql convertor), as well as making sure we are fully OpenGIS compliant.
mysqltest is a client utility that runs tests in the mysql-test framework. It sends sql statements to the server, compares the results with the expected results, and uses a special small DSL for loops, assignments, and so on. It's pretty old and very ad hoc with many strange limitations. It badly needs a proper parser and a consistent logical grammar.
Allow multiple alternative authentication methods for the same user
Currently one can specify only one authentication method per user. It would make a lot of sense to support multiple authentication methods per user. PAM-style. For example, one may want to authenticate using unix_socket when connecting locally, but ask for a password if connecting remotely or if unix_socket authentication failed.
connection encryption plugins
Encrypting the client-server communications is closely related to authentication. Normally SSL is used for the on-the-wire encryption, and SSL can be used to authenticate the client too. GSSAPI can be used for authentication, and it has support for on-the-wire encryption. This task is about making on-the-wire encryption pluggable.
OSS-Fuzz configuration for MariaDB
This would involve randomizing a bunch of queries (RQG based?), configurations and replication setups to search for segfaults, race conditions and perhaps invalid results.
NUMA for MariaDB
This would involve organising a bunch of memory and threads to run on the same NUMA node. Attention to detail to ensure no additional race conditions get added in the process. A good understanding of systems programming would be useful. Ability to implement WIndows NUMA support at the same time as Linux NUMA support would be advantageous.
|Skills:||C, locking and threads, Windows system programming and Innodb internals would be a plus.|
Cassandra Storage Engine V2
Current Cassandra Storage Engine was developed against Cassandra 1.1 and it uses Thrift API to communicate with Cassandra. However, starting from Cassandra 1.2, the preferred way to access Cassandra database is use CQL (Cassandra Query Language) and DataStax C++ Driver (https://github.com/datastax/cpp-driver). Thrift-based access is deprecated and places heavy constraints on the schema.
This task is about re-implementing Cassandra Storage Engine using DataStax C++ Driver and CQL.
ColumnStore: Add proper NULL support
At the moment NULL is just the maximum integer for a column (or empty string for VARCHAR/CHAR). We need a mechanism to store NULLs separately to give us full type ranges.
ColumnStore: Add full support for DECIMAL
Right now it is cast to double which is not great for obvious reasons. It will mean modifying a lot of ColumnStore's version of MariaDB's function implementations and allowing column files to store more than 8 bytes per field.
ColumnStore: Full UTF8 support.
This includes collations and anything that works on the length of the string.
Suggest a task
Do you have an idea of your own, not listed above or in Jira? Do let us know!