Google Summer of Code 2024

This year we are again participating in the Google Summer of Code. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, C++, ODBC, Java, Node.js) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

Where to Start

Please join us on Zulip to mingle with the community. You should also subscribe to the developers mailing list (this is the main list where we discuss development - there are also other mailing lists).

To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

Also see the List of beginner friendly issues from the MariaDB Issue Tracker.

List of Tasks

MariaDB Server

Implement IVFFlat indexing strategy for MariaDB Vector and evaluate performance

Part-time (175h) or full-time project (350h) - depending on scope

MariaDB Vector is coming to MariaDB Server to serve AI Workloads. The current indexing strategy will use HNSW, but IVFFlat is a possible alternative that costs fewer resources to create. Having it as an option is desirable.
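
For orientation, the general IVFFlat idea can be sketched as follows: cluster the vectors with k-means, keep one inverted list of row ids per centroid, and at query time scan only the lists whose centroids are closest to the query. This is a minimal in-memory illustration, not a proposed MariaDB design. All names (IvfFlatIndex, ivf_build, ivf_search, nprobe) are hypothetical, and the seeding and iteration count are arbitrary assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Vec = std::vector<float>;

// Squared Euclidean distance between two vectors of equal dimension.
static float dist2(const Vec &a, const Vec &b) {
  float s = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    float d = a[i] - b[i];
    s += d * d;
  }
  return s;
}

// Hypothetical toy index: centroids plus one inverted list of ids per centroid.
struct IvfFlatIndex {
  std::vector<Vec> centroids;
  std::vector<std::vector<std::size_t>> lists;
  std::vector<Vec> data;  // the "flat" part: vectors are stored uncompressed
};

// Index of the centroid closest to v.
static std::size_t nearest_centroid(const std::vector<Vec> &centroids, const Vec &v) {
  std::size_t best = 0;
  float best_d = std::numeric_limits<float>::max();
  for (std::size_t c = 0; c < centroids.size(); ++c) {
    float d = dist2(centroids[c], v);
    if (d < best_d) { best_d = d; best = c; }
  }
  return best;
}

// Build: a few rounds of Lloyd's k-means, seeded with the first k vectors
// (an arbitrary choice for this sketch), then fill the inverted lists.
IvfFlatIndex ivf_build(const std::vector<Vec> &data, std::size_t k, int iters = 10) {
  assert(k > 0 && k <= data.size());
  IvfFlatIndex idx;
  idx.data = data;
  idx.centroids.assign(data.begin(), data.begin() + static_cast<std::ptrdiff_t>(k));
  const std::size_t dim = data[0].size();
  for (int it = 0; it < iters; ++it) {
    std::vector<Vec> sum(k, Vec(dim, 0.0f));
    std::vector<std::size_t> count(k, 0);
    for (const Vec &v : data) {
      std::size_t c = nearest_centroid(idx.centroids, v);
      for (std::size_t j = 0; j < dim; ++j) sum[c][j] += v[j];
      ++count[c];
    }
    for (std::size_t c = 0; c < k; ++c) {
      if (count[c] == 0) continue;  // keep the old centroid for an empty cluster
      for (std::size_t j = 0; j < dim; ++j)
        idx.centroids[c][j] = sum[c][j] / static_cast<float>(count[c]);
    }
  }
  idx.lists.assign(k, {});
  for (std::size_t i = 0; i < data.size(); ++i)
    idx.lists[nearest_centroid(idx.centroids, data[i])].push_back(i);
  return idx;
}

// Search: scan only the nprobe lists whose centroids are closest to q.
// Returns the id of the nearest vector found, or data.size() if none was scanned.
std::size_t ivf_search(const IvfFlatIndex &idx, const Vec &q, std::size_t nprobe) {
  std::vector<std::pair<float, std::size_t>> order;
  for (std::size_t c = 0; c < idx.centroids.size(); ++c)
    order.emplace_back(dist2(idx.centroids[c], q), c);
  std::sort(order.begin(), order.end());
  nprobe = std::min(nprobe, order.size());
  std::size_t best = idx.data.size();
  float best_d = std::numeric_limits<float>::max();
  for (std::size_t p = 0; p < nprobe; ++p) {
    for (std::size_t id : idx.lists[order[p].second]) {
      float d = dist2(idx.data[id], q);
      if (d < best_d) { best_d = d; best = id; }
    }
  }
  return best;
}
```

Building is a handful of linear passes over the data, which is where the lower creation cost comes from. The trade-off is recall: a small nprobe can miss a neighbour that sits in a list that was not probed, so any evaluation would need to measure recall against nprobe as well as build time.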

MDEV-17398 Spatial (GIS) functions in MariaDB

Part-time (175h) or full-time project (350h) - depending on scope

Our GIS functionality is limited compared to other DBMSes. Given that MariaDB looks to facilitate migration from MySQL, we should be on par. We have a list of functions that are missing in MariaDB compared to MySQL, as described in https://mariadb.com/kb/en/function-differences-between-mariadb-1010-and-mysql-80/.
Our goal is to have as many of these functions as possible available within MariaDB. Some of the functionality can be ported from MySQL, while other parts might require implementation from scratch.

Skills needed: Understanding of C++ development. Ability to navigate a large codebase (with help from mentor).
Mentors: Anel Husakovic (primary) / Vicențiu Ciorbaru (secondary)


MDEV-16482 MariaDB Oracle mode misses Synonyms

Full-time project 350h

Synonyms are an important feature, particularly as they help smooth migration from other databases. While the initial project scope seems straightforward, there are a number of aspects that must be considered:

  1. Grammar extension
  2. Where will the synonyms definitions be stored?
  3. How do synonyms map to the underlying privilege system? Who can create a synonym? Who can access a synonym?
  4. Do we require the underlying object to exist before creating a synonym? What if the underlying object gets dropped?
  5. What kind of error messages do we present to the user in various corner cases?
  6. How do synonyms interact with replication (row based vs statement based)?
  7. How do synonyms interact with views (and view execution)?
  8. How to present synonyms to users (as part of INFORMATION_SCHEMA for instance?)
  9. Performance considerations for multiple connections to the database.

Skills needed: Understanding of C++ development. Able to write and discuss various tradeoffs such that we achieve a feature set that makes sense given the database's priorities.
Mentors: Vicențiu Ciorbaru (primary) / Michael Widenius (secondary)


MDEV-30645 CREATE TRIGGER FOR { STARTUP | SHUTDOWN }

Full-time project 350h

Support generalized triggers like

CREATE TRIGGER ... AFTER STARTUP ...
CREATE TRIGGER ... BEFORE SHUTDOWN ...
CREATE TRIGGER ... ON SCHEDULE ...

the latter being a synonym for CREATE EVENT.

  • Open question: should STARTUP/SHUTDOWN triggers run exclusively? That is, does a STARTUP trigger run before any connection is allowed, or in parallel with connections? The same question applies to SHUTDOWN.

Skills needed: Understanding of C++ development. Able to write and discuss various tradeoffs such that we achieve a feature set that makes sense given the database's priorities.
Mentors:


MDEV-21978 make my_vsnprintf to use gcc-compatible format extensions

Part-time project 175h

my_vsnprintf() is used internally in the server as a portable printf replacement. And it's also exported to plugins as a service.

It supports a subset of printf formats and four extensions:

  • %`s means that a string should be quoted like an `identifier`
  • %b means that it's a binary string, not zero-terminated; printing won't stop at \0, so one should always specify the field width (like %.100b)
  • %M is used in error messages and prints the integer (errno) and the corresponding strerror() for it
  • %T takes a string and prints it like %s, but if the string has to be truncated it puts "..." at the end

gcc knows printf formats, checks whether the actual arguments match the format string, and issues a warning if they don't. Unfortunately there seems to be no easy way to teach gcc our extensions, so for now we have to disable printf format checks.
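
As a minimal sketch of the checking in question: gcc (and clang) verify the arguments of any function declared with the format(printf, ...) attribute. The wrapper below, checked_snprintf, is a hypothetical stand-in and not the server's actual declaration. A non-standard conversion such as %M in a format string checked this way would be reported by -Wformat as an unknown conversion, which is why the checks conflict with the current extensions.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical printf-like wrapper. The attribute says: argument 3 is a
// printf format string, and the arguments to check start at argument 4.
int checked_snprintf(char *buf, std::size_t size, const char *fmt, ...)
#if defined(__GNUC__)
    __attribute__((format(printf, 3, 4)))
#endif
    ;

int checked_snprintf(char *buf, std::size_t size, const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  int n = std::vsnprintf(buf, size, fmt, ap);  // standard formatting only
  va_end(ap);
  return n;
}

// Returns 0 when the formatted text is what we expect.
int format_check_demo() {
  char buf[64];
  // Accepted: %d matches int, %s matches const char *.
  checked_snprintf(buf, sizeof buf, "error %d: %s", 2, "not found");
  // With -Wformat (enabled by -Wall) a mismatching call like the following
  // would produce a compile-time warning:
  //   checked_snprintf(buf, sizeof buf, "error %d", "not a number");
  return std::strcmp(buf, "error 2: not found");
}
```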

A better approach would be to use gcc-compatible format extensions, like the Linux kernel does. We should migrate to a different syntax for our extensions:

  • %sI to mean "print as an identifier"
  • %sB to mean "print a binary string"
  • %uE to mean "print an errno"
  • %sT to put a "..." as truncation indicator
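
The suffix idea can be illustrated with a toy formatter. To gcc, "%sI" is an ordinary %s followed by the literal character I, so the format attribute stays enabled and the argument is still type-checked as a string. The function toy_format below is hypothetical and is not my_vsnprintf: it handles only %d, %s, %% and %sI, and doubling embedded backticks is an assumption about how identifiers would be quoted.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical toy formatter with one suffix extension:
// %sI prints the string argument quoted as an `identifier`.
std::string toy_format(const char *fmt, ...)
#if defined(__GNUC__)
    __attribute__((format(printf, 1, 2)))  // gcc still checks every call
#endif
    ;

std::string toy_format(const char *fmt, ...) {
  std::string out;
  va_list ap;
  va_start(ap, fmt);
  for (const char *p = fmt; *p; ++p) {
    if (*p != '%') { out += *p; continue; }
    ++p;                                   // look at the conversion character
    if (*p == '\0') { out += '%'; break; } // lone trailing '%'
    if (*p == 's') {
      const char *s = va_arg(ap, const char *);
      if (p[1] == 'I') {                   // suffix extension found
        ++p;                               // consume the 'I'
        out += '`';
        for (; *s; ++s) {
          if (*s == '`') out += '`';       // double embedded backticks
          out += *s;
        }
        out += '`';
      } else {
        out += s;                          // plain %s
      }
    } else if (*p == 'd') {
      out += std::to_string(va_arg(ap, int));
    } else if (*p == '%') {
      out += '%';
    } else {
      out += '%';                          // unknown conversion: copy through
      out += *p;
    }
  }
  va_end(ap);
  return out;
}
```

For example, toy_format("table %sI has %d rows", "my`tbl", 3) yields the text: table `my``tbl` has 3 rows. One caveat of this toy version is that a literal I directly after a plain %s becomes ambiguous, so a real design would need a rule for that case.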

Old formats can either still be supported or be removed. In the latter case, the major version of the service should be increased to signal an incompatible change.

All error messages and all usages of my_vsnprintf should be changed to use the new syntax. One way to do it is to disable old syntax conditionally, only in debug builds. All gcc printf format checks should be enabled.

Skills needed: Understanding of C development.
Mentors:


Suggest a Task

Do you have an idea of your own, not listed above? Do let us know!
