Contributing & Participating

There are many ways to contribute to MariaDB.

Sponsoring the MariaDB Project

Sponsorships are crucial for the ongoing and future development of the MariaDB project! There are a number of easy ways for you to help the project:

  • Fund the development of a specific feature. You can find a list of suggested features to sponsor here or in JIRA. Feel free to sign in and add more projects to either place!

  • Contribute with developer time. If your organization has talented developers familiar with the MariaDB or MySQL codebase, they can become part of the MariaDB team and contribute to the development of the MariaDB project.

  • Hire a developer dedicated to working on the MariaDB project.

  • Make a pure donation with no strings attached.

This page is licensed: CC BY-SA / Gnu FDL

Google Summer of Code

MariaDB has participated in Google Summer of Code since 2013. This section contains pages providing information for each year.

We believe we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors and on MariaDB Galera Cluster, which allows you to scale your reads & writes. Lately, we also have MariaDB ColumnStore, a columnar storage engine designed to process petabytes of data with real-time response to analytical queries.

Google Season of Docs

MariaDB applied to participate in the first Google Season of Docs.


Contributing to the MariaDB Project

How to Contribute to MariaDB

The success of MariaDB relies heavily on community involvement. You can contribute in various ways, even if you are not a developer:

  • Bug Reporting: Create an account to report bugs.

  • Knowledge Sharing: Answer questions in the documentation or on IRC and Zulip.

  • Testing: Upload bug tests to the FTP server for private testing.

  • Documentation: Write or translate articles and documentation for MariaDB.

  • Advocacy: Promote MariaDB and participate in open source events.

  • Support: Donate time or resources, or engage your company in supporting MariaDB.

Explore more ways to get involved on the MariaDB Foundation website.

The success of MariaDB depends on the participation of the community; MariaDB would not be as good as it is today without the involvement of the entire MariaDB community.

Getting Started

There are several ways to contribute besides contributing code, and not all of them require strong C/C++ skills. Areas for non-developers include:

  • Reporting bugs. To report a bug you'll need to sign up for an account by clicking on the Create an account link below the login fields.

  • Helping other people by answering questions or even fixing their bugs in the #maria channel on Libera.Chat or on MariaDB's Zulip instance at mariadb.zulipchat.com.

MariaDB Email Lists

A great way to get started in MariaDB is to participate in e-mail discussions via our mailing lists (whichever list best matches your interests).

Sensitive security issues can be sent directly to the persons responsible for MariaDB security: security [AT] mariadb (dot) org.

You can find additional email addresses, email archives, and ways to connect with MariaDB people here.

All MariaDB contributors are expected to follow the Ubuntu Code of Conduct.

Getting Started for Developers

See also the pages for new developers on the MariaDB Foundation website:

  • Getting Started For Developers

  • Get the Code, Build It, Test It

  • Writing Good Test Cases for MariaDB Server

  • Submitting a Pull Request

This page is licensed: CC BY-SA / Gnu FDL

MariaDB Public FTP Server

MariaDB provides a secure FTP, SFTP and WebDAV server where you can upload files to be used by MariaDB developers, for example table structures and data for bug reports.

The folder tree consists of:

  • The public folder for files that the MariaDB developers want to give the public access to (patches, samples etc).

  • The private folder for uploads. Files uploaded there can only be accessed by MariaDB developers. You will not be able to see your upload and this folder does not allow downloads. This is done to protect any sensitive information which may be in test results, mysqld & core files. Upload those into this folder.

  • The secret folder is for private downloads. Files in this folder are not visible so you will need the complete filename to successfully download a file from this folder.

To share files with MariaDB developers, upload them into the private directory with one of the following:

  • SFTP client (scp); enter 'anonymous' as the password:

scp MDEV-XXXXX.tgz anonymous@ftp.mariadb.org:private/
(anonymous@ftp.mariadb.org) Password: 
MDEV-XXXXX.tgz                                                          100%  152KB 218.8KB/s   00:00    
scp: remote fsetstat: Permission denied

You can ignore the 'fsetstat: Permission denied' error.

  • WebDAV client (curl):

curl -T MDEV-XXXXX.tgz -u anonymous:anonymous https://webdav.mariadb.org/private/MDEV-XXXXX.tgz

  • FTP client (lftp); enter 'anonymous' as the password:

lftp -u anonymous -e 'put MDEV-XXXXX.tgz' ftp://ftp.mariadb.org/private/
Password: 
cd ok, cwd=/private                                            
put: Access failed: 550 Issue during transfer: network error: error transferring data: read tcp
[...] read: connection reset by peer (MDEV-XXXXX.tgz)

You can ignore the 'network error'.

Note for MariaDB developers: if you do not already have access to the SFTP service, request it at ftp@mariadb.org (provide your public SSH key and a username). You will then be able to access the service with:

sftp user@ftp.mariadb.org

or with HTTPS at https://ftp.mariadb.org.

Contributing Code

For contributors interested in MariaDB development, explore open projects via JIRA and check for beginner-friendly tasks. Engage with the community on the maria-developers mailing list, Slack, Zulip, or the IRC channel for guidance.

General information about contributing to MariaDB (for developers and non-developers) can be found on the Contributing to the MariaDB Project page.

Finding Development Projects to Work on

There are many open development projects for MariaDB which you can contribute to (in addition to any ideas you may have yourself).

Google Summer of Code 2013

We participated in Google Summer of Code 2013. MariaDB and the MariaDB Foundation believe we are making a better database that remains a drop-in replacement for MySQL. We also work on making LGPL connectors (currently in C and Java, with C++ in development) and on MariaDB Galera Cluster, which allows you to scale your reads & writes.

Where to start

Please join us at irc.freenode.net in #maria to mingle with the community. Or subscribe to maria-developers@lists.launchpad.net. Or both.

Please keep in mind that in April we travel a lot (conferences, busy time), so if you have a question and nobody on IRC answers — do not feel disappointed, ask in an email to maria-developers@lists.launchpad.net.

  • Testing and Benchmarking

    • Bug tests can be uploaded to the 'private' directory of our FTP server.

  • Creating documentation for MariaDB.

  • Advocating MariaDB in your area.

  • Participate in open source events and talk about MariaDB.

  • Running a BuildBot on a platform that is not currently covered by our builders.

  • Donate time or money to the MariaDB project.

  • Ask your company to sponsor a feature.

    • We are using JIRA to manage the MariaDB project. Go to jira.mariadb.org and click on "Projects" to get to the MariaDB project. Browse around the unresolved and unassigned issues to see if there is something that interests you. Some issues have sponsors and you can be paid for doing them!

    • A list of beginner friendly tasks is also available.

    • Check the development plans for the next MariaDB version.

    • Join maria-developers and ask for suggestions of tasks you could do. Please include your programming experience and your knowledge of the MariaDB source and how much you know about using MySQL/MariaDB with the email so that we know which tasks we can suggest to you.

    • If this is your first project, check out the Suggested Development page. It lists projects that will make a good start.

    • Join MariaDB's Zulip instance at mariadb.zulipchat.com and ask for suggestions.

    • Join #maria on IRC and ask for suggestions.

    If you have your own ideas, please submit them to JIRA so other MariaDB developers can comment on them and suggest how to implement them. You can of course also use the maria-developers list for this.

    What to Expect From a MariaDB Server Developer

    This section is mainly directed to developers with commit rights to the MariaDB git repository. However, we hope it’s also useful for anyone wanting to contribute code to MariaDB to know what a reviewer will expect from them.

    This is not about coding style or whether one should prefer C over C++; that is a separate topic to be covered sooner or later.

    The Basics

    When coding, try to create code that 'never has to be changed again'. Try to make the code as performant as possible. In general, it is acceptable to spend 50% more time to make the code 15% faster than you originally intended. Take that into account when you plan your time estimates! That said, don't add classes or functionality that is not yet used.

    The code should be easy to read and follow the coding standards of the project. Patches that are smaller and simpler are often better than complex solutions. Don't make the server depend on new external libraries without first checking with Sergei or Monty!

    Add code comments for anything that is not obvious. When possible, use assertions within the code to document expectations of arguments etc. In general, if the code requires complex comments, think if there is a better way to structure the logic. Simpler is often better and with fewer bugs.

    What to Have in a Commit Comment

    • Jira issue number and summary, e.g.: MDEV-23839 innodb_fast_shutdown=0 hang on change buffer merge

    • An empty line

    • A short description of the problem

    • A description of the solution

    • Any extra information needed to understand the patch

    • The commit message should be self-contained; preferably, the reviewer shouldn't have to look at the Jira issue at all to understand the commit. This doesn't mean the commit message should include all the background and the different design options considered, as the Jira issue should contain those.

    • The names of all reviewers and authors should be clear from the commit message. The preferred way is one line per person:

      • Reviewed-by: email

      • Co-authored-by: email

      • See creating-a-commit-with-multiple-authors for details

    • The default is that all code should be reviewed. Only in really extraordinary cases, like a merge (where the original code was already reviewed), can it be self-reviewed; this should be clear from the commit message. In this case the code should of course be tested extra carefully, both locally and in buildbot, before pushing.
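    Putting the above together, a commit message following these rules might look like this (the issue number, problem text, and email addresses are illustrative only):

    MDEV-23839 innodb_fast_shutdown=0 hang on change buffer merge

    A slow shutdown could hang when ... (short description of the problem).

    Fix by ... (description of the solution), followed by any extra
    information needed to understand the patch.

    Reviewed-by: reviewer@example.org
    Co-authored-by: coauthor@example.org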

    Testing

    • All code should have a test case that shows that the new code works or, in case of a bug fix, that the problem is fixed! It should fail with an unpatched server and work with the new version. In the extreme case that a test case is practically impossible to do, there needs to be documentation (in the commit message, optionally also in Jira) how the code was tested.

    • The test case should have a reference to the Jira issue, if one exists.

    • Patches related to performance should be tested either by the developer (for simple commits) or by performance testers. The result should be put in Jira with a summary in the commit.

    • Complex patches should be tested by QA in a bb- branch before pushing. The Jira entry should include information that this has been done and what kind of test has been run.

      • Example: git push --force origin HEAD:bb-11.8-MDEV-1234

    • For anything non-trivial, one should run either Valgrind or ASAN/MSAN on the new code (Buildbot will do this for you if you can't get Valgrind or ASAN to work). At least the added test case should be tested with one of them. If the developer cannot do that for some reason, they should check the buildbot builders that do this and ensure that at least their test case doesn't give any warnings about use of uninitialized memory or other failures.

    • For complex code, the developer should preferably use gcov or some similar tool to ensure that at least all non-error branches are executed. "mtr --gcov" or "dgcov.pl" can help you with this.

    Getting Your Code into the Main MariaDB Tree

    All code in MariaDB comes from one of the following sources:

    • MySQL

    • Code developed by people employed by the MariaDB Foundation.

    • Code developed by people employed by MariaDB Corporation.

    • Code shared with the MariaDB Foundation under the MCA.

    • Code with a known origin that is under a permissive license (BSD or public domain).

    If you want the code to be part of the main MariaDB tree, you also have to give the MariaDB Foundation a shared copyright to your code. This is needed so that the foundation can offer the code to other projects (like MySQL).

    You do this by either:

    1. Signing the MariaDB Contributor Agreement (MCA) and then scanning and sending it to the foundation.

    2. Sending an email to maria-developers where you say that your patch and all fixes to it are provided to the MariaDB Foundation under the MCA.

    3. Licensing your code using the BSD license.

    We need shared copyright for the following reasons:

    1. to defend the copyright or GPL if someone breaks it (this is the same reason why the Free Software Foundation also requires copyright assignment for its code)

    2. to be able to donate code to MySQL (for example to fix security bugs or new features)

    3. to allow people who have a non-free license to the MySQL code to also use MariaDB (the MCA/BSD allows us to give those companies the rights to all changes between MySQL and MariaDB so they can use MariaDB instead of MySQL)

    More information about the MCA can be found on the MCA FAQ page.

    Before Pushing Code to a Stable Branch

    • Ensure that you have compiled everything for your new code in a debug server (configured with cmake -DCMAKE_BUILD_TYPE=Debug), including embedded and all plugins that may be affected by your code change.

    • Run the mysql-test-run (mtr) test suite locally with your debug server.

    • For anything complex the full test suite should be run.

    • For something absolutely trivial, at least the main suite must be run.

    • Always push first to a bb- branch to test the code. When the bb- branch is green in buildbot, you can push to the main branch. Take extra care to check that the Windows builds compile (this often fails) and that the valgrind and msan builds don't show any problems with your new test cases.

      • You can find your push at a link similar to grid?branch=your-branch-name.

    • If you have to do a rebase before pushing, you have to start from the beginning again.

    • When porting code from third parties (such as MySQL), make sure to attribute copyright to the right owner, in the header of each modified file.

    • For example: Copyright © 2000, 2018, Oracle and/or its affiliates. Copyright © 2009, 2020, MariaDB

    • The only exception is if the changes are trivial, the rebase was trivial, and the local mysql-test-run worked; then you can push directly to the main branch. Only do this if you are 99% sure there are no issues! Please don't make us regret that we have made this one exception! When we have protected git branches, the above rule will be enforced automatically, as the protection will take care of this.

    Working on a New Project

    • First create a Jira entry that explains the problem and the different solutions that can be used to solve it. If there is new syntax, include examples of queries and results.

    • After getting agreement on the solution to be used, update the Jira entry with the detailed architecture of the suggested solution.

    • When the architecture is reviewed, the assigned developer can start coding.

    • When the code is ready, the Jira entry should be updated with the reviewer.

    • The reviewer checks the code and either approves it to be pushed or gives comments to the developer that should be addressed. In the latter case the developer updates the code and gives it back to the reviewer. This continues until the code is approved.

    • If the design changes during the project, the design in Jira needs to be updated.

    Working on a Bug Fix

    • Ensure that the Jira issue is up to date.

    • For complex bugs that require redesign, follow the process in "Working on a new project"

    • For simpler bugs, one can skip the listing of different solutions and the architecture. However, one should still document the reason for the bug and how it is fixed or to be fixed, in a JIRA comment.

    Making Things Easier for Reviewers

    • Ensure that the code compiles and all MTR tests pass before asking for a review.

    • Try to split a bigger project into smaller, self-contained change sets.

    • Mechanical changes, like renames of classes, variables, functions, etc., are better done in a separate commit.

    When Reviewing Code

    Remember that the stability and security of any project depend heavily on the reviewers. If there is something wrong with an accepted patch, it's usually the reviewer who is to blame, as the reviewer was the one who allowed it to go in!

    • Ensure that the code is licensed under New BSD or another approved license for MariaDB (basically any open source license not conflicting with GPL) or that the contributor has signed the MCA.

    • GPL is only allowed for code from MySQL (as MariaDB is already depending on MySQL code).

    • Ensure that commits are not too large. If the code is very large, give suggestions how to split it into smaller pieces. Merge commits, when rebasing is possible, are not allowed, to keep history linear.

    • Check that the commit message describes the commit properly. For code that improves performance, ensure that Jira and the commit message contain information about the improvements.

    • Check that there are no unexplained changes in old tests.

    • Check the quality of the code (no obvious bugs, right algorithms used)

    • Check if any code can be simplified or optimized: using already existing functions, whether loops are optimal, whether mutexes are used correctly, etc.

    • Check that there is an appropriate test case for the code. See 'Testing' above for what is required!

    • Ensure the code follows the coding standard for MariaDB. This document should be created shortly, but in the meantime ask an experienced MySQL/MariaDB developer if you are unsure.

    • Ensure that the code follows the architecture agreed on in Jira (if it's in Jira).

    • Code should be easy to understand (good code comments, good function and variable names etc).

    • Ensure you understand every single line of code that is reviewed. If not, ask the developer to add more comments to get things clear or ask help from another reviewer.

    • Ensure there are no performance degradations in common cases.

    • Any code that touches any sensitive area (files, communication, login, encryption or security) needs to have another reviewer that is an expert in this area.

    See Also

    • Getting Started For Developers (mariadb.org)

    • Get the Code, Build It, Test It (mariadb.org)

    • Writing Good Test Cases for MariaDB Server (mariadb.org)

    • Submitting a Pull Request (mariadb.org)

    • Contributing to the MariaDB Project (for non-developers)

    This page is licensed: CC BY-SA / Gnu FDL

    LDAP authentication plugin

    We would like the authentication system to be able to authenticate against a LDAP Directory Server.

    See the corresponding Jira issue for details.

    Skills: C, working knowledge of LDAP

    Mentor: Sergei Golubchik

    Kerberos authentication plugin

    this project is taken

    Kerberos is a security mechanism used in a lot of financial institutions. A MySQL plugin that allows authentication against Kerberos is the goal here.

    See the corresponding Jira issue for details.

    Skills: C/C++, working knowledge of Kerberos

    Mentor: Sergei Golubchik

    Active Directory authentication plugin

    The Microsoft Windows world is all about Active Directory, and upstream MySQL Enterprise already has this feature (though it's a paid offering). It would be great to have an open source equivalent.

    See the corresponding Jira issue for details.

    Skills: C/C++, working knowledge of Active Directory/SAMBA, Windows-based development environment

    Mentor: Sergei Golubchik, Vladislav Vaintroub

    Keystone authentication plugin

    Keystone is the OpenStack Identity Service. The idea would be to ensure that MariaDB can authenticate to Keystone directly.

    Skills: Python, C/C++

    Mentor: Mark Riddoch

    Regex enhancements

    this project is taken

    MySQL and MariaDB use an old regex library; it works bytewise and thus only supports single-byte character sets. It needs to be replaced by a modern, multi-byte character-set-aware regex library.

    Additionally, a much-requested REGEX_REPLACE function should be implemented. (See also mysql-udf-regexp for some UDF code that could be used as a starting point for this.)

    Detailed task description: MDEV-4425
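    For illustration, a sketch of the requested behavior (MariaDB later shipped such a function as REGEXP_REPLACE, which replaces all matches):

    SELECT REGEXP_REPLACE('ab12cd34', '[0-9]+', 'N');
    -- returns 'abNcdN'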

    Skills: C/C++

    Mentor: Alexander Barkov

    Self-Tuning Optimizer

    One of the reasons for bad query plans is inadequate cost estimation of individual operations. The cost of reading a row in one engine might be a lot higher than in another, but the optimizer cannot know that. Also, it uses hard-coded constants, assuming, for example, that evaluating a WHERE clause is 5 times cheaper than reading a row from a table.

    Obviously, some kind of calibration procedure is needed to get these cost estimates to be relatively correct. It is not easy, because the estimates depend on the actual hardware where MariaDB is run (a cost of a row read is different on HD and SSD), and also — somewhat — on the application.

    A simple and low-maintenance solution would be to use self-tuning cost coefficients. They measure the timing and adjust automatically to the configuration where MariaDB is run.

    See MDEV-350.

    Skills: C/C++

    Mentor: Sergei Golubchik

    Roles

    this project is taken

    Roles, close to SQL:2003 standard. See MDEV-4397.

    Skills: C/C++

    Mentor: Sergei Golubchik

    A potential list of further projects can be found on the Suggested Development page.

    This page is licensed: CC BY-SA / Gnu FDL


    Google Season of Docs 2020

    MariaDB applied to participate in the 2020 Google Season of Docs, but was unsuccessful.

    Where to Start

    Please join us on Zulip to mingle with the community. You can also subscribe to maria-docs@lists.launchpad.net, the documentation mailing list.

    List of Tasks

    Migration Documentation

    You will choose a major relational DBMS and, focusing on the most recent stable releases, document the process of migrating to MariaDB, including MariaDB equivalents of features in that system and a detailed list of features that exist in one but not the other, as well as possible workarounds. For an example, see the work-in-progress migration pages as well as the documentation on MySQL compatibility (bearing in mind that MariaDB is a MySQL fork, and is substantially more similar to MySQL than to other systems).

    Stored Procedures Documentation

    The stored procedure capabilities of MariaDB Server are critical to producing large-scale applications. The current documentation lacks sufficient examples, and the existing examples warrant testing. Getting Started content would make this feature easier to adopt. MariaDB's implementation is based on ISO SQL/PSM.
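    As a flavor of the missing Getting Started material, a minimal stored procedure example might look like this (procedure name and argument are illustrative):

    DELIMITER //
    CREATE PROCEDURE hello(IN name VARCHAR(64))
    BEGIN
      SELECT CONCAT('Hello, ', name, '!') AS greeting;
    END //
    DELIMITER ;

    CALL hello('MariaDB');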

    Getting Started with Connector/C and Connector/J

    Connector/C and Connector/J provide the ability to access a MariaDB Server from applications built in C/C++ and Java, respectively. The current documentation for these connectors lacks Getting Started guides.

    Spider Documentation

    The Spider storage engine uses partitioning to provide data sharding across multiple servers. This task involves greatly expanding the existing documentation, including more detail about when to use Spider, basic usage tutorials, updating the reference documentation, and detailed examples of the effects of the various Spider system variables and table parameters. You will also ensure changes in the most recent Spider releases are properly documented.

    Mroonga Documentation

    Mroonga is a full-text search storage engine based on Groonga, an open-source CJK-ready full-text search engine using column-based storage. This project involves greatly expanding the existing MariaDB documentation on the use of this storage engine: a detailed tutorial and user guide, including examples of the various Mroonga system variables and the effects of changing their settings, as well as the parser options.

    Translation

    You will choose a major language and ensure that a substantial subsection of the documentation is translated into that language. See translations.


    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL

    Google Summer of Code 2014

    We participated in Google Summer of Code 2014. MariaDB and the MariaDB Foundation believe we are making a better database that remains a drop-in replacement for MySQL. We also work on making LGPL connectors (currently in C and Java, with C++ in development) and on MariaDB Galera Cluster, which allows you to scale your reads & writes.

    Where to start

    Please join us at irc.freenode.net in #maria to mingle with the community. Or subscribe to maria-developers@lists.launchpad.net. Or both.

    Please keep in mind that in April we travel a lot (conferences, busy time), so if you have a question and nobody on IRC answers — do not feel disappointed, ask in an email to maria-developers@lists.launchpad.net. Asking on the mailing list means others benefit from your Q&A too!

    Google Season of Docs 2019


    MariaDB applied to participate in the first Google Season of Docs. Unfortunately, as a pilot project, Google were only able to accept a limited number of applications, and we were unsuccessful.

    Where to Start

    Please join us on Zulip to mingle with the community. You can also subscribe to maria-docs@lists.launchpad.net, the documentation mailing list.
    LDAP authentication plugin

    We would like the authentication system to be able to authenticate against a LDAP Directory Server.

    See the corresponding Jira issue for details.

    Skills: C, working knowledge of LDAP

    Mentor: Sergei Golubchik

    Self-Tuning Optimizer

    this project is taken

    One of the reasons for bad query plans is inadequate cost estimation of individual operations. The cost of reading a row in one engine might be a lot higher than in another, but the optimizer cannot know that. Also, it uses hard-coded constants, assuming, for example, that evaluating a WHERE clause is 5 times cheaper than reading a row from a table.

    Obviously, some kind of calibration procedure is needed to get these cost estimates to be relatively correct. It is not easy, because the estimates depend on the actual hardware where MariaDB is run (a cost of a row read is different on HD and SSD), and also — somewhat — on the application.

    A simple and low-maintenance solution would be to use self-tuning cost coefficients. They measure the timing and adjust automatically to the configuration where MariaDB is run.

    See MDEV-350.

    Skills: C/C++

    Mentor: Sergei Golubchik

    Port InnoDB memcached interface to MariaDB

    MySQL 5.6 has a memcached plugin to InnoDB. MySQL 5.7 has improved performance of this. The task would be to port this to run against MariaDB, and make it work against XtraDB/InnoDB for the 10.1 series of MariaDB.

    See MDEV-4674 for more.

    Skills: C/C++

    Mentor: Colin Charles

    GIS enhancements to MariaDB

    GIS enhancements for 10.1 that we want to work on include adding support for altitude (the third coordinate), as well as making sure we are fully OpenGIS compliant. See MDEV-5813.

    Skills: C

    Mentor: Holyfoot

    User defined events

    User-defined events are supported by several other databases, in differing forms and with differing semantics. Events are used to signal a named event in the database. Applications can use named events instead of polling, which uses more resources.

    See MDEV-5532 for more.

    Skills: C/C++

    Mentor: Jan Lindstrom, Sergei Golubchik

    Indexes for BLOBs (in MyISAM and Aria)

    MyISAM and Aria support special kinds of indexes that only store the hash of the data in the index tree. When two hashes match in the index, the engine compares the actual row data to find out whether the rows are identical. This is used in internal temporary tables that the optimizer creates to resolve SELECT DISTINCT queries. Normal unique indexes cannot always be used here, because the select list can be very long or include very long strings.

    This task is to provide a direct SQL interface to this feature and to allow users to create these indexes explicitly. This way we can have unique constraints for blobs and very long strings.
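    A sketch of the SQL interface this task asks for (roughly how it later landed in MariaDB 10.5 via unique HASH indexes; the table is illustrative):

    CREATE TABLE t1 (
      b BLOB,
      UNIQUE KEY (b) USING HASH
    );

    INSERT INTO t1 VALUES ('data');
    INSERT INTO t1 VALUES ('data');  -- fails with a duplicate-key error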

    Skills: C++

    Mentor: Sergei Golubchik

    CREATE OR REPLACE, CREATE IF NOT EXISTS, and DROP IF EXISTS

    this project is taken

    This task is to add support for OR REPLACE and IF EXISTS / IF NOT EXISTS to all CREATE and DROP variants for all objects (where it makes sense). See MDEV-5359.
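    For example, after this task the following forms would all be accepted (shown here for tables; analogous forms apply to other object types):

    CREATE OR REPLACE TABLE t1 (a INT);
    CREATE TABLE IF NOT EXISTS t1 (a INT);
    DROP TABLE IF EXISTS t1;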

    Skills: C++

    Mentor: Sergei Golubchik

    Statistically optimize mysql-test runs by running less tests

    this project is taken

    This is a research task, not a coding task. See MDEV-5776.

    Skills: SQL, Perl/Python or other language of your choice, mathematical statistics

    Mentor: Elena Stepanova, Sergei Golubchik

    Improved temporary tables

    It is a well-known and very old MySQL/MariaDB limitation that a temporary table can only be used once in any query; for example, one cannot join a temporary table to itself. This task is about removing this limitation. See MDEV-5535.
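    To illustrate the limitation being removed:

    CREATE TEMPORARY TABLE tmp (a INT);
    SELECT * FROM tmp AS t1 JOIN tmp AS t2 ON t1.a = t2.a;
    -- ERROR 1137 (HY000): Can't reopen table: 't1'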

    Skills: C++

    Mentor: Sergei Golubchik

    Table UDFs

    Implement a new plugin type that adds support for table UDFs: loadable user-defined functions that return tables, not scalar values.
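    A hypothetical usage sketch (no such plugin type exists yet; the function name is purely illustrative):

    SELECT * FROM my_table_udf('some-argument') AS t;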

    Skills: C++

    Mentor: Sergei Golubchik

    GTID support in mysqlbinlog

    The mysqlbinlog tool needs to understand global transaction IDs (GTIDs). In particular, it should be possible to start and end the dump at a specified GTID, both when reading binlog files and when connecting to a running server. See MDEV-4989.

    If time permits, other client programs could be extended similarly, like mysqldump --master-data or the --sync-with-master command in mysqltest.

    Skills: C++

    Mentor:

    See Also

    • See also «GSoC 2014 tasks» list in Jira.

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2024

    In 2024, MariaDB again participated in the Google Summer of Code. We believe we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

    Please join us on Zulip to mingle with the community. You should also subscribe to the developers mailing list (this is the main list where we discuss development; there are also other mailing lists).

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

    Google Summer of Code 2022

    In 2022, we again participated in the Google Summer of Code. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

    Please join us on Zulip to mingle with the community. You should also subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

    Also see the List of beginner friendly issues from the MariaDB Issue Tracker.

    List of Tasks

    MariaDB Server

    Implement IVFFlat indexing strategy for MariaDB Vector and evaluate performance

    Part-time (175h) or full-time (350h) project, depending on scope. MariaDB Vector is coming to MariaDB Server to serve AI workloads. The current indexing strategy will use HNSW, but IVFFlat is a possible alternative that costs fewer resources to create. Having it as an option is desirable.
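    For context, a rough sketch of MariaDB Vector usage (syntax as it eventually shipped in MariaDB 11.7; the table is illustrative, and this task concerns the indexing strategy underneath the VECTOR INDEX):

    CREATE TABLE items (
      id INT PRIMARY KEY,
      embedding VECTOR(4) NOT NULL,  -- fixed-dimension embedding column
      VECTOR INDEX (embedding)       -- HNSW today; IVFFlat would be an alternative
    );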

    MDEV-17398 Spatial (GIS) functions in MariaDB

    Part-time (175h) or full-time (350h) project, depending on scope

    Our GIS functionality is limited compared to other DBMSes. Given that MariaDB aims to facilitate migration from MySQL, we should be on par. We have a list of functions that are missing in MariaDB compared to MySQL, as described in MDEV-17398. Our goal is to have as many of these functions as possible available within MariaDB. Some of the functionality can be ported from MySQL, while other parts might require implementation from scratch.

    Skills needed: Understanding of C++ development. Ability to navigate a large codebase (with help from the mentor). Mentors: Anel Husakovic (primary) / Vicențiu Ciorbaru (secondary)

    MDEV-16482 MariaDB Oracle mode misses Synonyms

    Full-time project 350h

    Synonyms are an important feature, particularly as they help smooth migration from other databases. While the initial project scope seems straightforward, there are a number of aspects that must be considered (see the syntax sketch after this list):

    1. Grammar extension

    2. Where will the synonyms definitions be stored?

    3. How do synonyms map to the underlying privilege system? Who can create a synonym? Who can access a synonym?

    4. Do we require the underlying object to exist before creating a synonym? What if the underlying object gets dropped?

    5. What kind of error messages do we present to the user in various corner cases?

    6. How do synonyms interact with replication (row-based vs statement-based)?

    7. How do synonyms interact with views (and view execution)?

    8. How to present synonyms to users (as part of INFORMATION_SCHEMA, for instance)?

    9. Performance considerations for multiple connections to the database.
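    As a starting point for the grammar discussion, Oracle-style synonym syntax (hypothetical for MariaDB) looks like:

    CREATE SYNONYM staff FOR hr.employees;
    SELECT * FROM staff;   -- resolves to hr.employees
    DROP SYNONYM staff;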

    Skills needed: Understanding of C++ development. Able to write and discuss various tradeoffs such that we achieve a feature set that makes sense given the database's priorities. Mentors: Vicențiu Ciorbaru (primary) / Michael Widenius (secondary)

    MDEV-30645 CREATE TRIGGER FOR { STARTUP | SHUTDOWN }

    Full-time project 350h

    Support generalized triggers like:

    CREATE TRIGGER ... AFTER STARTUP ...
    CREATE TRIGGER ... BEFORE SHUTDOWN ...
    CREATE TRIGGER ... ON SCHEDULE ...

    the latter being a synonym for CREATE EVENT.

    • Should STARTUP/SHUTDOWN triggers run exclusively? That is, is the STARTUP trigger run before any connections are allowed, or in parallel with them? Same for SHUTDOWN.

    Skills needed: Understanding of C++ development. Able to write and discuss various tradeoffs such that we achieve a feature set that makes sense given the database's priorities. Mentors: Sergei Golubchik

    MDEV-21978 make my_vsnprintf to use gcc-compatible format extensions

    Part-time project 175h

    my_vsnprintf() is used internally in the server as a portable printf replacement. And it's also exported to plugins as a service.

    It supports a subset of printf formats and three extensions:

    • %`s means that a string should be quoted like an identifier

    • %b means that it's a binary string, not zero-terminated; printing won't stop at \0, so one should always specify the field width (like %.100b)

    • %M is used in error messages and prints the integer (errno) and the corresponding strerror() for it

    • %T takes a string and prints it like %s, but if the string has to be truncated, puts "..." at the end

    gcc knows printf formats and checks whether the actual arguments match the format string, issuing a warning if they don't. Unfortunately there seems to be no easy way to teach gcc our extensions, so for now we have to disable printf format checks.

    A better approach would be to use gcc-compatible format extensions, like the Linux kernel does. We should migrate to a different syntax for our extensions:

    • %sI to mean "print as an identifier"

    • %sB to mean "print a binary string"

    • %uE to mean "print an errno"

    • %sT to put a "..." as truncation indicator

    Old formats can still be supported, or they can be removed; in the latter case the major version of the service should be increased to signal an incompatible change.

    All error messages and all usages of my_vsnprintf should be changed to use the new syntax. One way to do it is to disable old syntax conditionally, only in debug builds. All gcc printf format checks should be enabled.

    Skills needed: Understanding of C development. Mentors: Sergei Golubchik

    MCOL-5688 Parallel CSV read leveraging Apache Arrow

    Full-time project 350h

    cpimport is a binary that ingests data into MCS in an efficient manner, reducing ingest times significantly whilst preserving transaction isolation levels.

    cpimport is a relatively complex facility that reads data from a local file or S3, parses it, converts it, and puts it into MCS-specific files. cpimport is unable to read a big single CSV file from disk in parallel. Apache Arrow has a CSV read facility that can do parallel CSV reads. The goal of the project is to replace the existing homebrew CSV parser implemented in cpimport with the one from Apache Arrow.

    Skills needed: modern C++. Mentors: Leonid Fedorov

    MCOL-4889 Manual vacuum cleaning for on-disk data empty records

    Full-time project 350h

    Here, an extent is a unit grouping columnar values, and a partition is a group of extents that stores all column values for a specific portion of a table. MCS has a notion of an empty value for columnar segment/token files and dictionaries. Empty values are marked with a bit in a special 1-byte auxiliary column that is created for every table. When DELETE removes records from a table, the records are marked with the empty bit in the auxiliary column. The deleted records become wasted disk space. The goal of the project is to reclaim the wasted disk space by either re-creating the whole partition or moving partition values.

    Skills needed: modern C++. Mentors: Roman Nozdrin

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL


    List of Tasks

    MariaDB Server

    • MDEV-21978 make my_vsnprintf to use gcc-compatible format extensions (Part-time project - 175h) MariaDB has its own implementation of most standard C library functions. This is to ensure compatibility across different platforms. Over time this library has evolved and currently does not behave exactly like the POSIX standard library. We want to attain the principle of "least surprise" with this library: everything that is supported by the standard printf functions should work the same with MariaDB's compatibility library extension.

      • Skills needed: C/C++. Project difficulty: easy. Mentor: Sergei Golubchik

    • MDEV-19160 JSON_DETAILED output unnecessarily verbose (Part-time project - 175h) As explained in detail in the MDEV, we want to improve the JSON_DETAILED function to better suit our development and debugging purposes. This project will aim to clean up the function's implementation as well as introduce test cases and potential "nice-to-have" features to make developers' lives easier.

      • Skills needed: C/C++, understanding of JSON, OOP. Project difficulty: easy. Mentor: Vicențiu Ciorbaru / Sergei Petrunia

    • Create a utility to parse frm files and print their DDL (Full-time project - potentially part-time (175 - 350h, depending on scope)) FRM files are what MariaDB uses to store metadata about tables. These files can be used to generate DDL statements (CREATE TABLE ...). We are lacking a utility to parse these files, which could in turn make DBAs' lives easier. The task of this project is to implement this utility, making use of MariaDB's FRM parsing logic. You may have to carry out some refactoring to extract the parsing code into a reusable library, used once for MariaDB Server and once for the FRM parsing tool.

      • Skills needed: C/C++, understanding of libraries and APIs. Project difficulty: medium to hard, depending on time allocated. Mentor: Vicențiu Ciorbaru / Sergei Golubchik / Monty Widenius

    • Add linear regression functions (Full-time project - 350h) This project consists of implementing dedicated regression functions within MariaDB. The specification of each function will be decided during the project, based on what other competing databases are offering. We will choose an implementation that best matches user expectations. It is the student's job to perform research into at least one other database and come up with exact semantics for each one of the functions in the MDEV.

      • Skills needed: C/C++, understanding of regression functions and mathematics, APIs, OOP. Project difficulty: medium. Mentor: Vicențiu Ciorbaru

    • Client-compatible delimiter for mysqltest (Full-time project - potentially part-time (175 - 350h, depending on scope)) We have a DELIMITER command that has a different syntax in the client and in mysqltest: mysqltest needs an additional (previous) delimiter at the end of the DELIMITER expression, which confuses users and makes it hard to copy and paste simple scripts with stored procedures from one to the other. We would like to have a new command (--delimiter=client|mysqltest) with the current behavior as the default.

      • Expected outcomes: You will learn the finer points of C command-line tool development. You'll get familiar with a part of the MariaDB server infrastructure: the testing framework, which is written in C and Perl.

      • Skills required: good C knowledge; ability to use a console terminal and to build from the console. Project difficulty: easy. Mentor: Nikita Malyavin

    • Improve build speed of the server code base (Full-time project - 350h) We have already learned that precompiling the headers improves the build speed five times; however, the standard CMake solution doesn't suit our development comfort: CMake PCH generates one "header of headers" and pre-includes it into each of the compilation units. This makes everything that wasn't included by the unit itself available across the precompiled set. There are alternative ways of per-unit precompiling: clang modules and gcc .gch files. We want to prefer those over the single precompiled header.

      • Expected outcomes: You will make a strong practical impact on an area of high need and show off your mix of analysis and programming skills.

      • Skills required: good C knowledge, some CMake knowledge; ability to use a console terminal and to build from the console. Project difficulty: medium. Mentor: Nikita Malyavin

    • Improve the mysqltest language (Full-time project - 350h) mysqltest has a lot of historical problems: an ad hoc parser, weird limitations, commands added as needed with no view over the total language structure, etc. The purpose of this work would be improvement of the language.

      • Expected outcomes: Rewrite the mysqltest interpreter using either a real parser generator, e.g. bison, or a cleanly hand-written parser, e.g. recursive descent, that can be easily extended with new functionality. Add missing control structures, for example "else" for the existing "if". Add simple expression evaluation without contacting the server, i.e. math and string comparisons. Add functionality for minimal string manipulation, e.g. a substr function.

      • Skills required: good C/C++ knowledge, interest in parsers/interpreters. Project difficulty: medium. Mentor: Vladislav Vaintroub

    • Create a function to check for JSON intersection (Part-time project - 175h) This project aims at implementing a JSON_INTERSECT() function between two JSON objects or two JSON arrays. If the two documents are JSON arrays, we want to return all the common elements between the two arrays; in the case of objects, we want to return all common key-value pairs. See the sketch after this list.

      • Skills required: C/C++, OOP, basic understanding of JSON. Project difficulty: easy. Mentor: Rucha Deodhar, Oleksandr Byelkin
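    A hypothetical usage sketch (the function does not yet exist; the semantics follow the proposal above):

    SELECT JSON_INTERSECT('[1,2,3]', '[2,3,4]');
    -- hypothetical result: [2, 3]
    SELECT JSON_INTERSECT('{"a":1,"b":2}', '{"b":2,"c":3}');
    -- hypothetical result: {"b": 2}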

    ColumnStore

    • MCOL-4995 Research/implement basic vectorized filtering for ARM platforms (Full-time project - 350h) As of 6.2.2, ColumnStore (MCS) supports vectorization on x86_64 platforms only. The goal of the project is to implement vectorized low-level filtering for ARM platforms using the 128-bit ARM NEON extension (SVE is optional). Low-level filtering in this context means simple predicate WHERE filters, e.g. WHERE c1 = 5 AND c2 IN (10, 25). Please see the corresponding Jira issue for details.

      • Skills needed: C/C++, understanding of low-level platform specifics. Project difficulty: medium. Mentor: Roman Nozdrin

    • MCOL-4994 Build/run ColumnStore on MacOS (Part-time project - 175h) As of ColumnStore (MCS) 6.2.2 there is no way to compile/use the MCS engine on MacOS. The goal of this project is to be able to bootstrap MariaDB + basic (maybe rudimentary) MCS on MacOS. There are a number of known issues that prevent MCS compilation on MacOS: a number of offending preprocessor macros/definitions specific to the Linux x86_64 combination, and MacOS doesn't provide syslog, which MCS uses as its only log message sink. Please see the corresponding Jira issue for details.

      • Skills needed: C/C++, MacOS specifics. Project difficulty: easy. Mentor: Roman Nozdrin

    • Implement DISTRIBUTED JSON functions (Full-time project - 350h) As of ColumnStore 6.2.2 there are two query execution modes: the relatively slow but compatible Table mode and the fast Select Handler (SH) mode. Table mode execution supports all JSON_* functions, and SH mode doesn't support any. We want to add support for the JSON_* function family in the SH query execution mode. Please see the corresponding Jira issue for details.

      • Skills needed: C/C++, JSON format. Project difficulty: easy. Mentor: Roman Nozdrin

    Buildbot (CI/CD)

    • MDBF-320 Better Grid view for buildbot.mariadb.org (Python / Javascript / Web Dev) (Full-time project - 350h) Our CI/CD infrastructure uses a recent version of Buildbot. The Grid view plugin that comes with Buildbot is not adequate for our needs. In this project, you will discuss with your mentor as well as other MariaDB developers how to best improve the user experience of Buildbot's grid view for what MariaDB developers need to accomplish.

      • Skills needed: Understanding of web-dev technologies like Angular, React, and Javascript-related libraries. Python may also be required. Mentor: Vlad Bogolin / Andreia Hendea

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2017

    We participated in the Google Summer of Code 2017 (we have participated previously in 2016, 2015, 2014, and 2013). The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. Lately, we also have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to start

    Please join us at irc.freenode.net at #maria to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

    A few handy tips for any interested students who are unsure which projects to choose can be found in a blog post from a former GSoC student & mentor.

    List of tasks

    The complete list of tasks suggested for GSoC 2017 is located in the MariaDB Jira. A subset is listed below.

    Support for GTID in mysqlbinlog

    The mysqlbinlog tool needs to be updated to understand the replication feature called global transaction IDs (GTIDs) in MariaDB 10. The current version does not support GTIDs, and the MySQL variant does not speak MariaDB 10's GTIDs.

    Details: MDEV-4989

    Skills: C/C++

    Automatic provisioning of slave

    The purpose of this task is to create an easy-to-use facility for setting up a new MariaDB slave.

    Details: MDEV-7502

    Skills: C/C++

    GIS enhancements

    GIS enhancements for 10.1 that we want to work on include adding support for altitude (the third coordinate), converters (e.g. ST_GeomFromGeoJSON - ST_AsGeoJSON, ST_GeomFromKML - ST_AsKML, etc.), getting data from the SHP format (an shp2sql converter), as well as making sure we are fully OpenGIS compliant.

    Details: MDEV-5813

    Skills: C/C++

    Mentor: Holyfoot

    mysqltest improvements

    mysqltest is a client utility that runs tests in the mysql-test framework. It sends SQL statements to the server, compares the results with the expected results, and uses a special small DSL for loops, assignments, and so on. It's pretty old and very ad hoc, with many strange limitations. It badly needs a proper parser and a consistent logical grammar.

    Details: MDEV-12130

    Skills: C/C++

    Mentor: Sergei Golubchik

    Allow multiple alternative authentication methods for the same user

    Currently one can specify only one authentication method per user. It would make a lot of sense to support multiple authentication methods per user, PAM-style. For example, one may want to authenticate using unix_socket when connecting locally, but ask for a password if connecting remotely or if unix_socket authentication failed.

    Details: MDEV-11340

    Skills: C/C++

    Mentor: Sergei Golubchik
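    As later implemented in MariaDB 10.4, the resulting syntax looks roughly like this (user name and password are illustrative):

    CREATE USER alice@localhost IDENTIFIED VIA unix_socket
      OR mysql_native_password USING PASSWORD('verysecret');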

    connection encryption plugins

    Encrypting the client-server communications is closely related to authentication. Normally SSL is used for the on-the-wire encryption, and SSL can be used to authenticate the client too. GSSAPI can be used for authentication, and it has support for on-the-wire encryption. This task is about making on-the-wire encryption pluggable.

    Details: MDEV-9090

    Skills: C/C++

    Mentor: Sergei Golubchik

    OSS-Fuzz configuration for MariaDB

    This would involve randomizing a bunch of queries (RQG based?), configurations and replication setups to search for segfaults, race conditions and perhaps invalid results.

    Details: MDEV-12178

    Skills: C, SQL

    Mentor: Daniel Black

    NUMA for MariaDB

    This would involve organising a bunch of memory and threads to run on the same NUMA node. Attention to detail is needed to ensure no additional race conditions get added in the process. A good understanding of systems programming would be useful. The ability to implement Windows NUMA support at the same time as Linux NUMA support would be advantageous.

    Details: MDEV-12213

    Skills: C, locking and threads; Windows system programming and InnoDB internals would be a plus

    Mentor: Daniel Black

    Cassandra Storage Engine V2

    The current Cassandra Storage Engine was developed against Cassandra 1.1 and uses the Thrift API to communicate with Cassandra. However, starting from Cassandra 1.2, the preferred way to access a Cassandra database is to use CQL (Cassandra Query Language) and the DataStax C++ Driver (cpp-driver). Thrift-based access is deprecated and places heavy constraints on the schema.

    This task is about re-implementing Cassandra Storage Engine using DataStax C++ Driver and CQL.

    Details: MDEV-12296

    Skills: C/C++

    Mentor: Sergei Petrunia

    ColumnStore: Add proper NULL support

    At the moment NULL is just the maximum integer for a column (or empty string for VARCHAR/CHAR). We need a mechanism to store NULLs separately to give us full type ranges.

    ColumnStore: Add full support for DECIMAL

    Right now it is cast to double, which is not great for obvious reasons. It will mean modifying a lot of ColumnStore's versions of MariaDB's function implementations and allowing column files to store more than 8 bytes per field.

    Details: MCOL-641

    Skills: C/C++

    Mentor: Andrew Hutchings

    ColumnStore: Full UTF8 support.

    This includes collations and anything that works on the length of the string.

    Suggest a task

    Do you have an idea of your own, not listed above or in Jira? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL

    Google Summer of Code 2025

This year we are again participating in the Google Summer of Code. We, joined with the MariaDB Foundation, believe we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently , , , , ) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

Please join us on Zulip to mingle with the community. You should also subscribe to the developers mailing list (this is the main list where we discuss development - there are also other mailing lists).

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.


    Also see the List of beginner friendly issues from the MariaDB Issue Tracker.

    List of Tasks

    MariaDB Server

    MDEV-28395 LOAD DATA plugins

    Full-time project 350h

LOAD DATA INFILE can flexibly load data into a table from CSV-like files accessible by the mariadbd process. LOAD XML INFILE can do it for XML files. LOAD DATA LOCAL INFILE and LOAD XML LOCAL INFILE can do it with files accessible by the client, but not by the server. But there are requests to support loading more file formats and from other locations, for example, from S3.

This project is to implement support for LOAD plugins and refactor the current LOAD code accordingly. There are two kinds of plugins: data parser plugins (CSV-like and XML) and transfer plugins (file and LOCAL). Implementing new plugins is not in the scope of this task; this task is mainly about moving existing code around, creating a possibility for new plugins (like JSON or S3).

Skills needed: C++, bison. Mentors: Sergei Golubchik
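For context, a minimal example of the existing, non-pluggable interface that this project would generalize (file path, table name and delimiters are illustrative):

LOAD DATA LOCAL INFILE '/tmp/users.csv'
INTO TABLE users
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
IGNORE 1 LINES;

A LOAD plugin API would let the same statement accept, say, JSON input or an S3 location without further changes to the core parser.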

    MDEV-36100 Generate vector embeddings automatically on INSERT

    Full-time project 350h

Implement a syntax and a plugin API that the server will use to generate embeddings for documents that the user stores in the database. This should significantly simplify the vector pipeline. mariadbd will not generate embeddings internally; it will invoke a plugin to do that.

Skills needed: C++. Mentors: Sergei Golubchik
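For context, a sketch of the manual pipeline this would simplify, using the vector type available since MariaDB 11.7 (table, dimension and values are illustrative). Today the client computes the embedding itself and sends it with the row; with this project a plugin would produce it on INSERT:

CREATE TABLE docs (
  id INT PRIMARY KEY,
  body TEXT,
  emb VECTOR(4) NOT NULL,  -- tiny dimension, for illustration only
  VECTOR INDEX (emb)
);

-- The application currently has to supply the embedding itself:
INSERT INTO docs VALUES (1, 'hello world', VEC_FromText('[0.1, 0.2, 0.3, 0.4]'));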

    MDEV-36107 expressions in mysqltest

    Part-time project 175h

Extend the mysqltest language to support:

    • standard arithmetic +, -, *, /, %

    • comparisons ==, !=, <, <=, >, >=

• boolean &&, ||, maybe ?:

• if possible: string repetition, Perl-style x (to replace SELECT REPEAT() in test files)

This should work in the if and while commands.

    Can be done together with MDEV-36108 as a full-time project.

Skills needed: C++. Mentors: Sergei Golubchik

    MDEV-36108 variable substitutions in mysqltest

    Part-time project 175h

Extend the mysqltest language to support bash-like substitutions:

    • ${var}

    • ${parameter:offset:length}

    • ${#parameter}

    • ${parameter/pattern/string/flags}

• maybe ${parameter^}, ${parameter^^}, ${parameter,}, ${parameter,,}

• maybe ${parameter@function} with functions like u, U, Q, etc.

    recursive expansion:

    • ${${var}}

    Can be done together with MDEV-36107 as a full-time project.

Skills needed: C++. Mentors: Sergei Golubchik

    MDEV-18827 Create utility to parse frm files and print their DDL

    Full-time project - potential part-time (175 - 350h, depending on scope)

FRM files are what MariaDB uses to store metadata about tables. These files can be used to generate DDL statements (CREATE TABLE ...). We are lacking a utility to parse these files, which could in turn make DBAs' lives easier. The task of this project is to have this utility implemented, making use of MariaDB's FRM parsing logic. You may have to carry out some refactoring to extract the parsing code into a reusable library, once for MariaDB Server, once for the FRM parsing tool.

Skills needed: C/C++, understanding libraries and APIs. Mentors: Vicențiu Ciorbaru / Sergei Golubchik

    MDEV-9345 Replication to enable filtering on master

    Part-time project 175h

    The current methods of filtering replication events are limited to either 1) at binlog-write time, which can break point-in-time recovery because some committed transactions will be missing from the binary log, or 2) on the replica, which forces all events on the primary server to always be sent to the replica, which can be a security concern and is also not efficient. This task aims to eliminate these limitations by adding in another point at which replication filtering occurs: on the binlog dump threads. This would allow users to both maintain a consistent binary log, and minimize network traffic by guarding events which are never intended for replication.

Skills needed: C++. Mentors: Brandon Nesterenko

    Buildbot build statistics dashboard

Part-time project 175h. TODO: a more ample description will be created.

Skills needed: TBD. Mentors: Vlad Radu

    MCOL-4889 Manual extent vacuuming

    Full-time project 350h

Here, an extent is a group of columnar values, and a partition is a group of extents that stores all column values for a specific portion of a table. MCS has a notion of an empty value for columnar segment/token files and dictionaries. Empty values are marked with a bit in a special 1-byte auxiliary column that is created for every table. When DELETE removes records from a table, the records are marked with the empty bit in the auxiliary column. The deleted records become wasted disk space. The goal of the project is to reclaim the wasted disk space by either re-creating the whole partition or moving partition values.

Skills needed: modern C++. Mentors: Roman Nozdrin

    MCOL-5142 Support for recursive CTE

    Full-time project 350h

MariaDB Columnstore lacks recursive CTE handling, so as of now Columnstore hands processing back to MariaDB Server if a query contains a recursive CTE.

    Here is the info about the feature:

Skills needed: modern C++. Mentors: Leonid Fedorov
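For reference, this is the kind of query affected; the syntax is standard MariaDB SQL and the table is illustrative:

WITH RECURSIVE ancestors AS (
  SELECT id, parent_id FROM folders WHERE id = 42
  UNION ALL
  SELECT f.id, f.parent_id
  FROM folders f JOIN ancestors a ON f.id = a.parent_id
)
SELECT * FROM ancestors;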

    MCOL-5598 Support for EXCEPT and INTERSECT SQL expressions

    Full-time project 350h

MariaDB Columnstore lacks EXCEPT and INTERSECT handling, so as of now Columnstore hands processing back to MariaDB Server if a query contains EXCEPT or INTERSECT.

    Here is the info about the feature:

Skills needed: modern C++. Mentors: Alexey Antipovsky
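For reference, the operators in question are plain MariaDB SQL (the tables are illustrative); the project is about executing such queries inside Columnstore instead of handing them back to the server:

SELECT user_id FROM customers
EXCEPT
SELECT user_id FROM banned_users;

SELECT user_id FROM customers
INTERSECT
SELECT user_id FROM subscribers;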

    MCOL-XXX Bloom-filters for data scanning

    Full-time project 350h

MariaDB Columnstore lacks indexes, so it reads a lot of extra data from disk. This project introduces Bloom filters to reduce the data read from disk during the most IO-heavy operation, which is scanning.

Skills needed: modern C++. Mentors: Roman Nozdrin

    MCOL-5758 Reduce the computations in JOINS by simpler Bloom-filter-based pre-joins

    Full-time project 350h

Joins are very heavy algorithms, in both computation and memory use. They need to hold a substantial amount of data in memory and perform hashing and other operations on that data. Joins can overflow memory limits, and keeping the balance between memory use and performance is tricky. Thus we have to filter the information that goes into joins as much as possible. Columnstore already does great work in that regard, pushing WHERE filters before joins. This particular task is also concerned with that, adding Bloom filter operations that approximate JOIN results and perform a secondary read to feed into joins the data that is highly likely to be used in a join.

Skills needed: modern C++. Mentors: Sergey Zefirov

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know in the comments below (Click 'Login' on the top of the page first)!

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2023

    In 2023, MariaDB participated in the Google Summer of Code. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently , C++, , , Node.js) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

    Please join us on Zulip to mingle with the community. You should also subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

Also see the List of beginner friendly issues from the MariaDB Issue Tracker.

    List of Tasks

    MariaDB Server

    Update RocksDB to the latest upstream version

    Full-time project 350h

Our version of RocksDB is lagging behind the current upstream and needs to be updated. This isn't a case of simply updating the submodule; there have been significant API changes. It will likely require porting the latest MyRocks code over to the MariaDB storage API.

Skills needed: Understanding of C/C++ development. Preferably some experience with the MariaDB or MySQL codebase (but not essential). Mentor: Andrew Hutchings

    Spatial (GIS) functions in MariaDB

    Part-time (175h) or full-time project (350h) - depending on scope

Our GIS functionality is limited compared to other DBMSes. Given that MariaDB looks to facilitate migration from MySQL, we should be on par. We have a list of functions that are missing in MariaDB compared to MySQL, as described in . Our goal is to have as many of these functions available within MariaDB. Some of the functionality can be ported from MySQL, while other parts might require implementation from scratch.

    Skills needed: Understanding of C++ development. Ability to navigate a large codebase (with help from mentor).Mentors: Anel Husakovic (primary) / Vicențiu Ciorbaru (secondary)

    MariaDB Oracle mode misses Synonyms

    Full-time project 350h

Synonyms are an important feature, particularly as they help smooth migration from other databases. While the initial project scope seems straightforward, there are a number of aspects that must be considered:

    1. Grammar extension

    2. Where will the synonyms definitions be stored?

    3. How do synonyms map to the underlying privilege system? Who can create a synonym? Who can access a synonym?

4. Do we enforce the underlying object to exist before creating a synonym? What if the underlying object gets dropped?

5. What kind of error messages do we present to the user in various corner cases?

6. How do synonyms interact with replication (row based vs statement based)?

7. How do synonyms interact with views (and views execution)?

8. How to present synonyms to users (as part of INFORMATION_SCHEMA for instance)?

9. Performance considerations for multiple connections to the database.

Skills needed: Understanding of C++ development. Able to write and discuss various tradeoffs such that we achieve a feature set that makes sense given the database's priorities. Mentors: Vicențiu Ciorbaru (primary) / Michael Widenius (secondary)
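No syntax has been settled yet; a hypothetical, Oracle-style sketch of what the feature could look like (names and statements are illustrative only):

-- Hypothetical syntax, for illustration only:
CREATE SYNONYM app.customers FOR appdata.customers_v2;
SELECT * FROM app.customers;  -- would resolve to appdata.customers_v2
DROP SYNONYM app.customers;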

    MariaDB ColumnStore in Docker Official OCI Image

    Part-time project 175h / Full-time project 350h - depending on scope

MariaDB ships with ColumnStore as a storage engine. However, the architecture of ColumnStore is not like a traditional storage engine. Instead it relies on multiple database nodes working in unison. This means that starting up a ColumnStore-enabled MariaDB service is not a trivial endeavour. This project seeks to create the necessary tooling around starting MariaDB with ColumnStore inside OCI containers. You will be writing Dockerfiles, configuration files as well as bash scripts to achieve this.

    The challenge of this project lies in:

    1. Limited documentation around ColumnStore. There will be some time spent on the discovery process.

2. Formulating a clear plan to facilitate:

• Starting MariaDB with ColumnStore

• Upgrading MariaDB with ColumnStore on a version upgrade

• Creating health checks to validate the health of the ColumnStore service nodes

• Setting configuration variables via environment switches where appropriate

• Declaratively (likely via a docker-compose yml file) stating the system's architecture

• Documenting the necessary steps to deployment

• Producing a blog of its operation

• Optionally enabling deployment via Kubernetes

3. Implementing the plan and creating a CI/CD pipeline for testing.

Skills needed: Ability to develop durable bash scripts, understanding of container runtimes and ability to conform to container best practices. Able to incrementally develop and test functionality. Mentors: Daniel Black (primary - containers) / Andrew Hutchings (secondary - ColumnStore)

    LOAD DATA INFILE - store all warnings

    Part-time project 175h

The main focus of this project is around the developer / sysadmin experience. We want to improve the quality of life of those using MariaDB. Migrating large datasets is one of these challenges. As is described in the MDEV, a simple limitation related to LOAD DATA INFILE can severely hamper developer productivity. A related problem is discussed in .

    The goal of this project is to come up with a solution for storing warnings during LOAD DATA INFILE. This will require modifying the existing server codebase to create an SQL interface for processing the generated warnings.

    Challenges:

    • LOAD DATA INFILE can process large datasets. That means that the server must not just store all warnings in memory. You will need to make use of already existing mechanisms (creating temporary tables) so one can spill to disk.

Skills needed: Understanding of C++ development. Mentors: Anel Husakovic (primary) / Daniel Black (secondary)
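To illustrate the limitation (file, table and values are illustrative): warnings kept in memory are capped by the max_error_count system variable (64 by default), so on a large import most warnings are silently discarded, which is the pain point this project addresses:

SET SESSION max_error_count = 64;  -- the default cap on stored messages
LOAD DATA INFILE '/tmp/big.csv' INTO TABLE t1;
SHOW WARNINGS;  -- shows at most max_error_count warnings; the rest are lost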

    Buildbot (CI/CD)

Better Grid view for buildbot.mariadb.org (Python / Javascript / Web Dev)

    Full-time project 350h

Our CI/CD infrastructure uses a recent version of Buildbot. The GRID view plugin that comes with Buildbot is not adequate for our needs. In this project, you will discuss with your mentor as well as other MariaDB developers how to best improve the user experience of Buildbot's grid view for what MariaDB developers need to accomplish.

Skills needed: Understanding of web-dev technologies like Angular, React, and Javascript-related libraries. Python may also be required. Mentor: Vlad Bogolin

    MariaDB Columnstore

MariaDB Columnstore is a columnar engine for MariaDB Server for OLAP workloads. MCS is also a distributed multithreaded application written in C++. C++20 is currently used for development. There are a number of interesting MCS projects to be part of, in both research and production programming areas.

    JIT compiled SQL expressions

    Part-time project 175h / Full-time project 350h - depending on scope

MCS uses interpreted execution to calculate SQL expression results. Here is an example of a SQL expression: 'table1.col1 + FLOOR(table2.col1)'. Given that table1.col1 is DECIMAL and table2.col1 is DOUBLE, there are a number of conditions that drive the calculation of this relatively simple example at runtime. Given that SQL types and the expression tree are known before the query begins, it is possible to replace interpretation with JIT to produce specialized compiled bytecode that is:

    1. small

    2. has no or almost no branches

3. optimized for the specific platform it is run on

This is a mostly-research project whose goal is to produce a set of microbenchmarks that:

1. leverage any JIT compiler available, e.g. LLVM, MIR

2. demonstrate the negative and positive effects of using JIT

    Skills needed:

    1. C++

2. at least basic knowledge of compiler internals

    Mentor: Roman Nozdrin

    Parquet support in cpimport

    Full-time project 350h

cpimport in MCS is a standalone tool that does bulk ingestion outside the SQL interface. It takes source data as input and puts it into an MCS cluster. This put is an atomic operation that supports rollback. The sources can be either local files or files on S3. The only format cpimport now reads is CSV with custom:

    1. delimiters

    2. quotation signs

    3. NULL symbol

The goal of this project is to either teach cpimport to support the Parquet format as an input format or to introduce a modular framework for adding input formats. This project is mostly about reading/writing production code, where the challenges are:

    1. to learn a codebase

    2. produce a feature

    3. support the feature with unit and integration tests using existing frameworks

cpimport consists of:

1. a buffer where parsed data lines go in the form of low-level representations of SQL datatypes

2. a set of parser threads that populate the buffer

3. a set of writer threads that take the values that make up a single SQL record and put them into the corresponding files

Parser threads currently have a fixed Delimiter Separated Values parser that can be parametrized only with:

    1. escape character

    2. 'enclosed by' characters

The suggested approach is to replace this DSV parser with a modular one that understands how to read popular formats, e.g. Parquet, Arrow, Avro, JSON.

    Skills needed:

    1. C++

2. production development tooling like git, CMake

    Mentor: Gagan Goel

    SIMD for SQL expressions

    Part-time project 175h / Full-time project 350h - depending on scope

MCS uses scalar processing to calculate SQL expression results. The expressions can be in the projection or filtering part of a SQL query. Here is an example of a SQL expression: 'table1.col1 + FLOOR(table2.col1)'. In most cases scalar processing can be replaced with a vectorized execution that reduces the number of cycles needed to render the result of an expression. The challenge of this project is that the in-memory representation can be both vertical and horizontal.

This is a mostly-research project whose goal is to produce a set of microbenchmarks that:

1. unveil limitations or problems applying vectorization for expressions

2. compare performance for these cases:

• vectorized execution with vertical data

• vectorized execution with horizontal data

• scalar execution with horizontal data

    Skills needed:

    1. C++

2. ASM knowledge to manage the low-level part of this project

    Mentor: Andrey Piskunov

    Fuzzing infrastructure for Columnstore engine.

    Full-time project 350h - depending on scope

    Fuzzing is a well-known technique for finding various types of bugs. This task is to integrate libFuzzer, sanitizers (ASan, TSan, UBSan) and MCS Columnstore into one fuzzing pipeline and create a fuzzing infrastructure. This task requires:

1. Add support to Columnstore for building with sanitizers (ASan, TSan, UBSan).

2. Write code which integrates C++ MariaDB Connectors and libFuzzer.

3. Prepare a valid corpus with SQL scripts suitable for Columnstore.

    4. Create a fuzzing infrastructure.

    Skills needed:

1. Basic knowledge of how to work with C++ build tools (CMake, clang, ld, rtld).

    2. Basic C++.

    Mentor: Denis Khalikov

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL

    Google Summer of Code 2016

    We participated in the Google Summer of Code 2016 (we have participated previously in 2015, 2014, and 2013). The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently in C, Java, C++ in development) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. Lately, we also have MariaDB MaxScale which is a pluggable database proxy.

    Where to start

    Please join us at irc.freenode.net at #maria to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

    Please keep in mind that in April we travel a lot (conferences, busy time focusing on a release), so if you have a question and nobody on IRC answers, don't feel disappointed, ask in an email to maria-developers@lists.launchpad.net. Asking on the mailing list means others benefit from your Q&A too!

The complete list of tasks suggested for GSoC 2016 is located in the MariaDB Jira. A subset is listed below.

    Support for GTID in mysqlbinlog

The mysqlbinlog tool needs to be updated to understand the replication feature called global transaction IDs (GTIDs) in MariaDB 10. The current version does not support GTIDs, and the MySQL variant does not speak MariaDB 10's GTIDs.

Details: MDEV-4989
Skills: C/C++
Mentor: Kristian Nielsen
Students Interested: 2

    Aggregate stored functions

With CREATE FUNCTION one can create stored functions in SQL, but this syntax doesn't allow one to create an aggregate function (like SUM, AVG, etc). This task is to add support for aggregate stored functions.

Details: MDEV-7773
Skills: C/C++
Mentor: Sergei Golubchik
Students Interested: 6

    GIS enhancements

GIS enhancements for 10.1 that we want to work on include adding support for altitude (the third coordinate), converters (e.g. ST_GeomFromGeoJSON - ST_AsGeoJSON, ST_GeomFromKML - ST_AsKML, etc.), getting data from the SHP format (an shp2sql converter), as well as making sure we are fully OpenGIS compliant.

Details: MDEV-5813
Skills: C/C++
Mentor: Holyfoot
Students Interested: 2

    Indexes for BLOBs (in MyISAM and Aria)

MyISAM and Aria support special hash-based unique indexes that only store the hash of the data in the index tree. When two hashes match in the index, the engine compares actual row data to find whether the rows are identical. This is used in internal temporary tables that the optimizer creates to resolve SELECT DISTINCT queries. Normal unique indexes cannot always be used here, because the select list can be very long or include very long strings.

This task is to provide a direct SQL interface to this feature and to allow users to create these indexes explicitly. This way we can have unique constraints for blobs and very long strings.

Details: MDEV-371
Skills: C/C++
Mentor: Sergei Golubchik
Students Interested: 3
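MariaDB 10.4 later exposed this feature through the ordinary UNIQUE syntax; a minimal sketch:

CREATE TABLE t1 (b BLOB, UNIQUE KEY (b));
-- SHOW CREATE TABLE t1 reveals that the index is stored as UNIQUE KEY (b) USING HASH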

    Provide GTID support for MHA

MySQL Master HA (MHA) is a tool to assist with automating master failover and slave promotion within a short downtime, without suffering from replication consistency problems, and without a performance penalty. We would like to have this tool support MariaDB 10 replication.

Skills: Perl
Mentor: Colin Charles
Students Interested: 2

    Import and export popular data formats from and to dynamic columns

Provide import and export functions for popular data formats like JSON, XML (limited), PHP, ... for Connector/C and MariaDB Server (which use the same code base for dynamic columns).

Details: CONC-125
Skills: C
Mentor: Oleksandr Byelkin, Georg Richter
Students Interested: 2

    MaxScale filter to capture incoming operations for consumption in external sources

Design a filter that will capture incoming inserts, updates and deletes for specified tables (as a regex) in a separate log file that is consumable in JSON or CSV form, so that external ETL processes can process it for data uploading into a DWH or big data platform. Optionally, a plugin that takes this log into a Kafka broker that can put this data on a Hadoop node can be developed as a next step.

Details: MXS-2
Skills: C/C++
Mentor: Markus Makela
Students Interested: 2

MaxScale filter to translate Microsoft SQL Server syntax

    Develop a MaxScale filter that will translate SQL Server syntax to MariaDB syntax. Develop a SQL Server client protocol plugin.

Details: MXS-1
Skills: C/C++
Mentor: Markus Makela & Massimiliano Pinto
Students Interested: 1

    Additional libraries for MaxScale's experimental Luafilter

    Create additional entry points into MaxScale that the Lua side scripts can use. Various types of functions can be added ranging from SQL processing functions to utility functions which communicate with MaxScale.

Details: MXS-593
Skills: C/C++
Mentor: Markus Makela
Students Interested: 2

    Query Injection Filter

Create a filter which can inject queries before the client executes any queries. This filter could be used for various purposes, for example auditing.

Details: MXS-591
Skills: C/C++
Mentor: Markus Makela
Students Interested: 1

    Cassandra Storage Engine V2

The current Cassandra Storage Engine was developed against Cassandra 1.1 and uses the Thrift API to communicate with Cassandra. However, starting from Cassandra 1.2, the preferred way to access a Cassandra database is to use CQL (Cassandra Query Language) and the DataStax C++ Driver (cpp-driver). Thrift-based access is deprecated and places heavy constraints on the schema.

    This task is about re-implementing Cassandra Storage Engine using DataStax C++ Driver and CQL.

Details: MDEV-8947
Skills: C/C++
Mentor: Sergei Petrunia
Students Interested: 3

    NO PAD collations

    Currently MariaDB ignores trailing spaces when comparing values of the CHAR, VARCHAR, TEXT data types. In some cases it would be nice to take trailing spaces into account. This task will introduce a set of new collations that will make this possible.

Details: MDEV-9711
Skills: C/C++
Mentor: Alexander Barkov
Students Interested: 2
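Such collations were later added in MariaDB 10.2 as _nopad_ variants of every existing collation; a small illustration of the difference:

SET NAMES utf8mb4;
SELECT 'abc' = 'abc ' COLLATE utf8mb4_general_ci;        -- 1: trailing spaces ignored (PAD SPACE)
SELECT 'abc' = 'abc ' COLLATE utf8mb4_general_nopad_ci;  -- 0: trailing spaces compared (NO PAD)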

    Suggest a task

    Are you a student interested in working on something? Let us know here.

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2015

    We participated in the Google Summer of Code 2015. MariaDB and the MariaDB Foundation believe we are making a better database that remains a drop-in replacement to MySQL. We also work on making LGPL connectors (currently in C, Java, C++ in development) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. Lately, we also have MariaDB MaxScale which is a pluggable database proxy.

    Where to start

    Please join us at irc.freenode.net at #maria to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

    Please keep in mind that in April we travel a lot (conferences, busy time focusing on a release), so if you have a question and nobody on IRC answers — do not feel disappointed, ask in an email to maria-developers@lists.launchpad.net. Asking on the mailing list means others benefit from your Q&A too!

At the moment, tasks that may be suitable for GSoC 2015 are listed in the MariaDB Issue Tracker under «GSoC 2015 tasks».

    Some suggested projects

    Enhancing mysqlbinlog

    This project consists of two parts -- it can either be performed by 2 students or 1 student with the relevant skills:

    Support for GTID in mysqlbinlog

The mysqlbinlog tool needs to be updated to understand the replication feature called global transaction IDs (GTIDs) in MariaDB 10. The current version does not support GTIDs, and the MySQL variant does not speak MariaDB 10's GTIDs.

Details: MDEV-4989
Skills: C/C++
Mentor: Kristian Nielsen

    Remote backup of binary logs

mysqlbinlog in MySQL 5.6 also supports streaming from remote servers for backups of binary logs. This is important as the MHA tool can also use this feature.

Details: MDEV-5993
Skills: C/C++
Mentor: Sergey Vojtovich

    Indexes on virtual columns

We have the concept of virtual (non-materialized) columns, and currently to have an index on a virtual column one has to materialize it. To support indexes on fully virtual columns, a storage engine must call back into the server to calculate the value of the virtual column.

Details: MDEV-5800
Skills: C/C++
Mentor: Sergei Golubchik

    Table functions

    User functions (UDF-like) that return a table, rows and columns. It should be possible to use it in other statements. A possible implementation could be: the function exports a generator, we create a handler of the hidden "storage engine" class, no indexes, and convert this generator to rnd_init/rnd_next. Need to disable rnd_pos somehow. Alternatively, it can materialize the result set in a temporary table (like information_schema does), then this table can be used normally.

Details: MDEV-5199
Skills: C/C++
Mentor: Sergei Golubchik

    Aggregate stored functions

With CREATE FUNCTION one can create stored functions in SQL, but this syntax doesn't allow one to create an aggregate function (like SUM, AVG, etc). This task is to add support for aggregate stored functions.

Details: MDEV-7773
Skills: C/C++
Mentor: Sergei Golubchik

    GIS enhancements

GIS enhancements for 10.1 that we want to work on include adding support for altitude (the third coordinate), converters (e.g. ST_GeomFromGeoJSON - ST_AsGeoJSON, ST_GeomFromKML - ST_AsKML, etc.), getting data from the SHP format (an shp2sql converter), as well as making sure we are fully OpenGIS compliant.

Details: MDEV-5813
Skills: C/C++
Mentor: Holyfoot

    Port InnoDB memcached interface to MariaDB

MySQL 5.6 has a memcached plugin for InnoDB. MySQL 5.7 has improved its performance. The task would be to port this to run against MariaDB, and make it work against XtraDB/InnoDB for the 10.2 series of MariaDB.

Details: MDEV-4674
Skills: C/C++
Mentor: Colin Charles

    Automatic provisioning of slave

    The purpose of this task is to create an easy-to-use facility for setting up a new MariaDB slave.

Details: MDEV-7502
Skills: C/C++
Mentor: Kristian Nielsen

    Indexes for BLOBs (in MyISAM and Aria)

MyISAM and Aria support special hash-based unique indexes that only store the hash of the data in the index tree. When two hashes match in the index, the engine compares actual row data to find whether the rows are identical. This is used in internal temporary tables that the optimizer creates to resolve SELECT DISTINCT queries. Normal unique indexes cannot always be used here, because the select list can be very long or include very long strings.

This task is to provide a direct SQL interface to this feature and to allow users to create these indexes explicitly. This way we can have unique constraints for blobs and very long strings.

Details: MDEV-371
Skills: C/C++
Mentor: Sergei Golubchik

    Improved temporary tables

It is a well-known and very old MySQL/MariaDB limitation that temporary tables can only be used once in any query; for example, one cannot join a temporary table to itself. This task is about removing this limitation.

Details: MDEV-5535
Skills: C/C++
Mentor: Sergei Golubchik

    Provide GTID support for MHA

MySQL Master HA (MHA) is a tool to assist with automating master failover and slave promotion within a short downtime, without suffering from replication consistency problems, and without a performance penalty. We would like to have this tool support MariaDB 10 replication.

Skills: Perl
Mentor: Colin Charles

    Import and export popular data formats from and to dynamic columns

Provide import and export functions for popular data formats like JSON, XML (limited), PHP, ... for Connector/C and MariaDB Server (which use the same code base for dynamic columns).

Details: CONC-125
Skills: C
Mentor: Oleksandr Byelkin, Georg Richter

    MaxScale filter to capture incoming operations for consumption in external sources

Design a filter that will capture incoming inserts, updates and deletes for specified tables (as a regex) in a separate log file that is consumable in JSON or CSV form, so that external ETL processes can process it for data uploading into a DWH or big data platform. Optionally, a plugin that takes this log into a Kafka broker that can put this data on a Hadoop node can be developed as a next step.

Details: MXS-2
Skills: C/C++
Mentor: Markus Makela

MaxScale filter to translate Microsoft SQL Server syntax

    Develop a MaxScale filter that will translate SQL Server syntax to MariaDB syntax. Develop a SQL Server client protocol plugin.

Details: MXS-1
Skills: C/C++
Mentor: Markus Makela & Massimiliano Pinto

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2018

    We participated in the Google Summer of Code 2018. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to start

    Please join us at irc.freenode.net at #maria to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

A few handy tips for any interested students who are unsure which projects to choose: Blog post from former GSoC student & mentor

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

    List of tasks

Loaded from the MariaDB issue tracker.

Full DECIMAL support in ColumnStore

    MariaDB ColumnStore supports DECIMAL with some limitations:

    1. We do not support the full DECIMAL range that is in MariaDB

2. In several places in the code we convert the DECIMAL to DOUBLE during execution, therefore losing precision

    Implementing this will likely require the following:

    • Implementation of methods to handle MariaDB's DECIMAL format

    • Support for a longer than 8-byte numeric column type (there is an InfiniDB tree with work for this already)

    • Modification of the primitives processor for the math

• Modification of the function expression processor to handle the new type

• Version upgrade support for DECIMAL from the current form to the new form

Details: MCOL-641
Mentor: Andrew Hutchings

mcsapi needs a new read API design

    We need an ORM-style NoSQL read API to go along with the bulk write API of mcsapi.

    This will likely take the form of:

    1. A reader in ExeMgr which will convert messages from mcsapi into jobs

    2. Code in mcsapi to send/receive the messages

Although ExeMgr can already receive messages with an execution plan, the format is very complex and the ABI breaks easily (we often send whole C++ objects).

We should look at other ORM frameworks for inspiration for the API design.

This task is to do the design for this API.

Details: MCOL-1151
Mentor: Andrew Hutchings

Support for GTID in mysqlbinlog

    The mysqlbinlog client program needs to be updated to support GTID.

    Here is a suggested list of things to be done:

    • The --start-position and --stop-position options should be able to take GTID positions; or maybe there should be new --start-gtid and --stop-gtid options. Like --start-gtid=0-1-100,1-2-200,2-1-1000.

• A GTID position means the point just after that GTID. So starting from GTID 0-1-100 and stopping at GTID 0-1-200, the first GTID output will probably be 0-1-101 and the last one 0-1-200. Note that if some domain is not specified in the position, it means to start from the beginning, respectively stop immediately, in that domain.

• Starting and stopping GTID should work both with local files, and with --read-from-remote-server. For the latter, there are a couple of extra things that need doing in the master-slave protocol, see get_master_version_and_clock() in sql/slave.cc.

• At the end of the dump, put these statements, to reduce the risk of those session variables incorrectly spilling into subsequent statements run in the same session:

SET session.server_id = @@global.server_id,
      session.gtid_domain_id=@@global.gtid_domain_id;

    Probably some more things will come up during the work, but this looks like a reasonable start.

Details: MDEV-4989
Mentor: Andrei Elkin

Implement UPDATE with result set

Add an UPDATE operation that returns a result set of the changed rows to the client:

UPDATE [LOW_PRIORITY] [IGNORE] tbl_name
    SET col_name1={expr1|DEFAULT} [, col_name2={expr2|DEFAULT}] ...
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]
RETURNING select_expr [, select_expr ...]

I'm not exactly sure what the corresponding multiple-table syntax should look like, or if it is possible at all. But already having it for single-table updates would be a nice feature.

Details: MDEV-5092
Mentor: Igor Babaev

optimizer trace

In MySQL, the optimizer trace is a JSON object recording the execution path through the optimizer, decisions that were made, and the reasons for them. See optimizer-tracing.html.

    Users were asking for MariaDB to have a similar feature.

Details: MDEV-6111
Mentor: Varun Gupta

Automatic provisioning of slave

    Idea

    The purpose of this task is to create an easy-to-use facility for setting up a new MariaDB replication slave.

Setting up a new slave currently involves: 1) installing MariaDB with the initial database; 2) pointing the slave to the master with CHANGE MASTER TO; 3) copying initial data from the master to the slave; and 4) starting the slave with START SLAVE. The idea is to automate step (3), which currently needs to be done manually.

    The syntax could be something as simple as

    LOAD DATA FROM MASTER

    This would then connect to the master that is currently configured. It will load a snapshot of all the data on the master, and leave the slave position at the point of the snapshot, ready for START SLAVE to continue replication from that point.

    Implementation:

    The idea is to do this non-blocking on the master, in a way that works for any storage engine. It will rely on row-based replication to be used between the master and the slave.

    At the start of LOAD DATA FROM MASTER, the slave will enter a special provisioning mode. It will start replicating events from the master at the master's current position.

    The master dump thread will send binlog events to the slave as normal. But in addition, it will interleave a dump of all the data on the master contained in tables, views, or stored functions. Whenever the dump thread would normally go to sleep waiting for more data to arrive in the binlog, the dump thread will instead send another chunk of data in the binlog stream for the slave to apply.

    A "chunk of data" can be:

    • A CREATE OR REPLACE TABLE / VIEW / PROCEDURE / FUNCTION

    • A range of N rows (N=100, for example). Each successive chunk will do a range scan on the primary key from the end position of the last chunk.

    Sending data in small chunks avoids the need for long-lived table locks or transactions that could adversely affect master performance.

    The slave will connect in GTID mode. The master will send dumped chunks in a separate domain id, allowing the slave to process chunks in parallel with normal data.

During the provisioning, all normal replication events from the master will arrive on the slave, and the slave will attempt to apply them locally. Some of these events will fail to apply, since the affected table or row may not yet have been loaded. In the provisioning mode, all such errors will be silently ignored. Proper locking (isolation mode, e.g.) must be used on the master when fetching chunks, to ensure that updates for any row will always be applied correctly on the slave, either in a chunk, or in a later row event.

    In order to make the first version of this feature feasible to implement in a reasonable amount of time, it should set a number of reasonable restrictions (which could be relaxed in a later version of the feature):

    • Give up with an error if the slave is not configured for GTID mode (MASTER_USE_GTID != NO).

• Give up with an error if the slave receives any event in statement-based binlogging (so the master must be running in row-based replication mode, and no DDL must be done while the provisioning is running).

    • Give up with an error if the master has a table without primary key.

    • Secondary indexes will be enabled during the provisioning; this means that tables with large secondary indexes could be expensive to provision.

Details: MDEV-7502
Mentor: Andrei Elkin

connection encryption plugin support

    As a follow-on to MDEV-4691 we would like GSSAPI encryption (in addition to authentication) support in MariaDB. I am told that the current plan is to create a plugin interface and then we can build GSSAPI encryption on top of that, so here is a ticket for that.

From having written GSSAPI for the internal interface, there were a couple things I would like to see in the plugin encryption interface.

    First, GSSAPI is weird in that it does authentication before encryption (TLS/SSL are the other way around, establishing an encrypted channel and then doing authentication over it). Of course support for this is needed, but more importantly, packets must be processed in a fully serialized fashion. This is because encrypted packets may be queued while one end of the connection is still finishing up processing the authentication handshake. One way to do this is registering "handle" callbacks with connection-specific state, but there are definitely others.

    Additionally, for whatever conception there ends up being of authentication and encryption, it needs to be possible to share more data than just a socket between them. The same context will be used for authentication and encryption, much as an SSL context is (except of course we go from authentication to encryption and not the other way around).

    This ties into an issue of dependency. If authentication plugins are separate entities from encryption plugins in the final architecture, it might make sense to do mix-and-match authentication with encryption. However, there are cases - and GSSAPI is one - where doing encryption requires a certain kind of authentication (or vice versa). You can't do GSSAPI encryption without first doing GSSAPI authentication. (Whether or not it's permitted to do GSSAPI auth->encryption all over a TLS channel, for instance, is not something I'm concerned about.)

    Finally, encrypted messages are larger than their non-encrypted counterparts. The transport layer should cope with this so that plugins don't have to think about reassembly, keeping in mind that there may not be a way to get the size of a message when encrypted without first encrypting it.

    It's unfortunately been a little while since I wrote that code, but I think those were the main things that we'll need for GSSAPI. Thanks!

Details: MDEV-9090
Mentor: Vladislav Vaintroub

Aggregate Window Functions

Currently only a few aggregate functions are supported as window functions; the list can be found here.

In MDEV-7773, support for creating custom aggregate functions was added. This task would deal with extending that feature and making custom aggregate functions behave as window functions.

An example of creating a custom aggregate function is given below:

CREATE AGGREGATE FUNCTION agg_sum(x INT) RETURNS DOUBLE
BEGIN
  DECLARE z DOUBLE DEFAULT 0;
  DECLARE CONTINUE HANDLER FOR NOT FOUND RETURN z;
  LOOP
    FETCH GROUP NEXT ROW;
    SET z = z + x;
  END LOOP;
END|

Details: MDEV-11263
Mentor: Varun Gupta

True ALTER LOCK=NONE on slave

Currently no true LOCK=NONE exists on the slave. An ALTER TABLE is first committed on the master, then it is replicated on the slaves. The purpose of this task is to create a true LOCK=NONE.

    Implementation Idea

The master will write a BEGIN_DDL_EVENT in the binlog after it hits ha_prepare_inplace_alter_table. Then the master will write a QUERY_EVENT in the binlog with the actual alter query. On commit/rollback the master will write a COMMIT_DDL_EVENT/ROLLBACK_DDL_EVENT.

On the slave there will be a pool of threads (a configurable global variable) which will apply these DDLs. On receiving a BEGIN_DDL_EVENT, the slave thread will pass the QUERY_EVENT to one of the worker threads. The worker thread will execute up to ha_inplace_alter_table. The actual commit_inplace_alter will be called by the SQL thread. If the SQL thread receives some kind of rollback event, it will somehow signal the worker thread to stop executing the alter. If none of the worker threads are available, the event will be enqueued; if we then receive a rollback event we will simply discard the event from the queue, and if we receive a commit event the SQL thread will synchronously process the DDL event.

Details: MDEV-11675
Mentor: Sachin Setiya

Improve mysqltest language

    mysqltest has a lot of historical problems:

    • ad hoc parser, weird limitations

    • commands added as needed with no view over the total language structure

• historical code issues (e.g. casts that became unnecessary 10 years ago), etc.

    A lot can be done to improve it.

    Ideas

    • control structures, else in if, break and continue in while, for (or foreach) loop

• proper expression support in let, if

• rich enough expressions to make resorting to sql unnecessary in most cases

• remove unused and redundant commands (e.g. system vs exec, query_vertical vs vertical_results ONCE)

• remove complex commands that do many sql statements under the hood, if they can be scripted, e.g. sync_slave_with_master

• remove over-verbose treatment of rpl test failures

• scoped variables

• parameters for the source command

• remove dead code

Details: MDEV-12130
Mentor: Sergei Golubchik

Cassandra Storage Engine v2, based on DataStax C++ driver

MariaDB had a Cassandra Storage Engine which was developed for Cassandra 1.1.x. Back then, Cassandra provided a Thrift API, and that was what Cassandra-SE used.

    Then, Cassandra 2.0 switched to using a different network protocol (and also changed the data model).

This task is to develop a Cassandra Storage Engine V2 using DataStax's C++ client library (cpp-driver).

See also: MDEV-8947, a previous attempt to implement this engine. Unfortunately it didn't even produce a skeleton engine.

Details: MDEV-12296
Mentor: Sergei Golubchik

Histograms with equal-width bins in MariaDB

Histograms with equal-width bins are easy to construct using samples. For this it's enough to look through the given sample set and for each value from it to figure out what bin this value can be placed in. Each bin requires only one counter. Let f be a column of a table with N rows and n be the number of samples by which the equal-width histogram of k bins for this column is constructed. Let after looking through all sample rows the counters created for the histogram bins contain numbers c[1],..,c[k]. Then m[i] = c[i]/n * 100 is the percentage of the rows whose values of f are expected to be in the interval

(min(f) + (max(f)-min(f))/k * (i-1), min(f) + (max(f)-min(f))/k * i).

It means that if the sample rows have been chosen randomly, the expected number of rows with the values of f from this interval can be approximated by the number m[i]/100 * N.

To collect such statistics it is suggested to use the following variant of the ANALYZE TABLE command:

ANALYZE FAST TABLE tbl [ WITH n ROWS ] [SAMPLING p PERCENTS ]
   PERSISTENT FOR COLUMNS (col1 [IN RANGE r] [WITH k INTERVALS],...)

    Here:

    • 'WITH n ROWS' provides an estimate for the number of rows in the table in the case when this estimate cannot be obtained from statistical data.

    • 'SAMPLING p PERCENTS' provides the percentage of sample rows to collect statistics. If this is omitted the number is taken from the system variable samples_ratio.

• 'IN RANGE r' sets the range of equal-width bins of the histogram built for the column col1. If this is omitted and the min and max values for the column can be read from statistical data, then the histogram is built for the range [min(col1), max(col1)]. Otherwise the range [MIN_type(col1), MAX_type(col1)] is considered. The values beyond the given range, if any, are also taken into account in two additional bins.

• 'WITH k INTERVALS' says how many bins are included in the histogram. If it is omitted this value is taken from the system variable histogram_size.

Details: MDEV-12313
Mentor: Vicentiu Ciorbaru

Implement multiple-table UPDATE/DELETE returning a result set

    A multiple-table UPDATE first performs join operations, then it updates the matching rows. A multiple-table UPDATE returning a result set does the following:

    • first performs join operations

    • for each row of the result of the join it calculates some expressions over the columns of the join and forms from them a row of the returned result set

    • after this it updates the matching rows

    A multiple-table DELETE first performs join operations, then it deletes the matching rows. A multiple-table DELETE returning a result set does the following:

    • first performs join operations

    • for each row of the result of the join it calculates some expressions over the columns of the join and forms from them a row of the returned result set

    • after this it deletes the matching rows

Details: MDEV-12326
Mentor: Igor Babaev

Blacklist for access control a.k.a. "negative grants"

Currently, the MariaDB privilege system only performs whitelist checks for access control to certain databases, tables and columns. This makes it difficult if we need to block access to a certain database/table/column while allowing access to all others. A good solution would be to allow to REVOKE anything that a user is able to do: not only exactly those grants that were granted to a user, but also a subset. Like:

GRANT SELECT ON some_database.* TO a_user@%;
REVOKE SELECT ON some_database.secret_table FROM a_user@%;

Details: MDEV-14443
Mentor: Vicentiu Ciorbaru

Add syntax to manually encrypt/decrypt InnoDB's system tablespace

    Currently, the InnoDB system tablespace can only be automatically encrypted/decrypted by the background encryption threads if innodb_encrypt_tables=ON|FORCE, innodb_encryption_threads>0, and innodb_encryption_rotate_key_age>0. There is no way to manually encrypt/decrypt the tablespace.

File-per-table tablespaces can be manually encrypted with:

ALTER TABLE tab ENCRYPTION=YES;

File-per-table tablespaces can be manually decrypted with:

ALTER TABLE tab ENCRYPTION=NO;

    Some users want a similar method that would allow them to manually encrypt/decrypt the InnoDB system tablespace.

    This is loosely related to MDEV-14571, since both issues are related to the fact that the system tablespace can only be encrypted/decrypted by the background threads.

Details: MDEV-14610
Mentor: Jan Lindström

Control over memory allocated for SP/PS

SPs/PSs (Stored Procedures / Prepared Statements) allocate memory until the PS cache or the SP is destroyed. There is no way to see how much memory is allocated and whether it grows with each execution (the first two executions can lead to new memory allocation, but not more).

    Task minimum:

    Status variables which count the memory used/allocated for SP/PS by thread and/or for the server.

    Other ideas:

• Automatically stop allocation in the debugging version after the second execution and raise an exception on any further allocation attempt.

    • Information schema by threads and SP/PS with information about allocated and used memory

Information can be collected in the MEM_ROOTs of SPs/PSs, storing info about the status of the mem_root before execution and then checking it afterwards, so newly allocated memory can be found.

MEM_ROOT can be changed to have a debug mode which makes it read-only and which can be switched on after the second execution.

Details: MDEV-14959
Mentor: Oleksandr Byelkin

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2021

In 2021, we again participated in the Google Summer of Code. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently , , , ) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

Please join us on Zulip to mingle with the community. You should also subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

    Also see the List of beginner friendly issues and issues labelled gsoc21 from the MariaDB Issue Tracker.

    List of Tasks

    Support for GTID in mysqlbinlog

    The mysqlbinlog client program needs to be updated to support GTID. Here is a suggested list of things to be done:

    • The --start-position and --stop-position options should be able to take GTID positions; or maybe there should be new --start-gtid and --stop-gtid options. Like --start-gtid=0-1-100,1-2-200,2-1-1000.

• A GTID position means the point just after that GTID. So starting from GTID 0-1-100 and stopping at GTID 0-1-200, the first GTID output will probably be 0-1-101 and the last one 0-1-200. Note that if some domain is not specified in the position, it means to start from the beginning, or to stop immediately, in that domain, respectively.

• Starting and stopping GTID should work both with local files, and with --read-from-remote-server. For the latter, there are a couple of extra things that need doing in the master-slave protocol, see get_master_version_and_clock() in sql/slave.cc.

• At the end of the dump, put these statements, to reduce the risk of those session variables incorrectly spilling into subsequent statements run in the same session:

SET session.server_id = @@global.server_id,
      session.gtid_domain_id=@@global.gtid_domain_id;

    Probably some more things will come up during the work, but this looks like a reasonable start.

Details: MDEV-4989
Mentor: Brandon Nesterenko

    Granted to PUBLIC

Implement the standard behavior for

GRANT xxx TO PUBLIC;
REVOKE xxx FROM PUBLIC;

Also, this statement is supposed to work:

SHOW GRANTS FOR PUBLIC;

And these should not:

CREATE ROLE PUBLIC;
DROP ROLE PUBLIC;
SET ROLE PUBLIC;
GRANT PUBLIC TO xxx;
REVOKE PUBLIC FROM xxx;

Note that

SHOW GRANTS FOR xxx;

should not list roles and privileges granted to PUBLIC (unless granted to xxx too), but

SHOW GRANTS;

should, arguably, list them.
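For illustration, a hypothetical session showing the intended effect (database and user names invented):

-- Every account, current and future, gets read access to the shop database:
GRANT SELECT ON shop.* TO PUBLIC;
-- A freshly created user can now read shop.* without any direct grant:
CREATE USER app_reader@'%' IDENTIFIED BY 'secret';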

Details: MDEV-5215
Mentor: Oleksandr Byelkin

    Control over memory allocated for SP/PS

SP/PS (Stored Procedures / Prepared Statements) allocate memory until the PS cache or the SP is destroyed. There is no way to see how much memory is allocated and whether it grows with each execution (the first two executions can lead to new memory allocation, but later ones should not). Task minimum: status variables which count the memory used/allocated for SP/PS, by thread and/or for the whole server. Other ideas:

• Automatically stop allocation in the debug version after the second execution, and raise an exception on any attempt to allocate more.

• An Information Schema table listing, per thread and SP/PS, information about allocated and used memory.

Information can be collected in the MEM_ROOTs of the SP/PS: store info about the status of the mem_root before execution, then check it afterwards, so that newly allocated memory can be found. MEM_ROOT can also be changed to have a debug mode which makes it read-only and which can be switched on after the second execution.
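A sketch of what the minimal deliverable could look like; these status variable names are invented for illustration, not an existing or agreed interface:

-- Hypothetical counters for memory allocated/used by SP/PS caches:
SHOW GLOBAL STATUS LIKE 'Sp_ps_memory%';
SHOW SESSION STATUS LIKE 'Sp_ps_memory%';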

Details: MDEV-14959
Mentor: Oleksandr Byelkin

    Add JSON_NORMALIZE function to normalize JSON values

Background is this question on stackexchange: mariadb-compare-json. The task is to provide a function that can be used to compare two JSON documents for equality; the name can be e.g. JSON_NORMALIZE. JSON_COMPACT already takes care of removing spaces, but this is not sufficient. Keys need to be (recursively) sorted, and if spaces are removed, then documents can be compared as binary strings.
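A sketch of the intended behavior (the function did not exist when this task was written):

-- Key order and insignificant whitespace should not affect equality:
SELECT JSON_NORMALIZE('{"a": 1, "b": [1, 2]}') =
       JSON_NORMALIZE('{ "b":[1,2], "a":1 }') AS are_equal;
-- Expected result: 1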

Details: MDEV-16375
Mentor: Vicențiu Ciorbaru

    Add linear regression functions

The following linear regression functions exist in a number of other DBMSs, such as Oracle and PostgreSQL:

REGR_SLOPE
REGR_INTERCEPT
REGR_COUNT
REGR_R2
REGR_AVGX
REGR_AVGY
REGR_SXX
REGR_SYY
REGR_SXY

    Some have also been added to Columnstore.
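For illustration, a query using two of these functions as they behave in the other DBMSs (table and column names invented):

-- Slope and intercept of the least-squares line y = a*x + b:
SELECT REGR_SLOPE(y, x) AS a, REGR_INTERCEPT(y, x) AS b FROM points;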

Details: MDEV-17467
Mentor: Nikita Malyavin

    Create utility to parse frm files and print their DDL

It would be useful if MariaDB had a utility that was able to parse frm files and print the DDL associated with the table. For example, it would be useful for users who performed a partial backup with mariadb-backup (see "Partial Backup and Restore with MariaDB Backup") but forgot to also back up the table DDL, so they can't restore the tables with the usual tablespace-import process (sketched after the example below). mysqlfrm (mysqlfrm.py) is a tool that already exists and does similar things, but it doesn't seem very user-friendly: it needs to be able to contact the local MariaDB server, it also needs to be able to spawn a server instance, and it seems to need to create a bunch of files during this process, e.g.:

[ec2-user@ip-172-30-0-249 ~]$ cd /tmp
[ec2-user@ip-172-30-0-249 tmp]$ sudo mysqlfrm --server=root:@localhost:3306 /var/lib/mysql/db1/tab.frm --port=12345 --user=mysql
# Source on localhost: ... connected.
# Spawning server with --user=mysql.
# Starting the spawned server on port 12345 ... done.
# Reading .frm files
#
# Reading the tab.frm file.
#
# CREATE statement for /var/lib/mysql/db1/tab.frm:
#

CREATE TABLE `db1`.`tab` (
  `id` int(11) NOT NULL,
  `str` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

#...done.
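The restore process referred to above is essentially the standard InnoDB tablespace import, which needs the table's DDL up front; a rough sketch:

-- 1) Recreate the table from its DDL (the part that requires the frm utility)
-- 2) Discard the empty tablespace:
ALTER TABLE db1.tab DISCARD TABLESPACE;
-- 3) Copy the backed-up tab.ibd file into the datadir, then:
ALTER TABLE db1.tab IMPORT TABLESPACE;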

Details: MDEV-18827
Mentor: Vicențiu Ciorbaru

    JSON_DETAILED output unnecessarily verbose

The JSON_DETAILED function is described as "Represents JSON in the most understandable way emphasizing nested structures."

We now have a use case for it: Optimizer Trace output. An optimizer trace is too large to be copied in full; instead we use expressions like:

SELECT 
JSON_DETAILED(JSON_EXTRACT(trace, '$**.analyzing_range_alternatives')) 
FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

Our experience is that JSON_DETAILED has some room for improvement when it comes to the quality of automatic JSON formatting. Example:

source mdev19160-data.sql
mysql> select JSON_DETAILED(JSON_EXTRACT(a, '$**.analyzing_range_alternatives')) from t200\G
*************************** 1. row ***************************
JSON_DETAILED(JSON_EXTRACT(a, '$**.analyzing_range_alternatives')): [

    {
        "range_scan_alternatives": 
        [

            {
                "index": "a_b",
                "ranges": 
                [
                    "2 <= a <= 2 AND 4 <= b <= 4"
                ],
                "rowid_ordered": true,
                "using_mrr": false,
                "index_only": true,
                "rows": 1,
                "cost": 1.1752,
                "chosen": true
            }
        ],
        "analyzing_roworder_intersect": 
        {
            "cause": "too few roworder scans"
        },
        "analyzing_index_merge_union": 
        [
        ]
    }
]

    Things to note:

    • empty lines at the start (right before/after the "range_scan_alternatives")

    • "analyzing_index_merge_union":[] occupies 3 lines where one would be sufficient.

    • the same goes for "ranges"

One can look at the JSON pretty-printer that is used by EXPLAIN FORMAT=JSON and the optimizer trace. It produces a better result (but it has room for improvement, too). Extra: in MySQL, the function is called JSON_PRETTY. We should add the ability to use this name as an alias.

Details: MDEV-19160
Mentor: Vicențiu Ciorbaru

    Histograms: use JSON as on-disk format

Currently, histograms are stored as an array of 1-byte bucket bounds (SINGLE_PREC_HB) or 2-byte bucket bounds (DOUBLE_PREC_HB). The table storing the histograms supports different histogram formats but limits them to 256 bytes (hist_size is a tinyint):

CREATE TABLE mysql.column_stats (
  min_value varbinary(255) DEFAULT NULL, 
  max_value varbinary(255) DEFAULT NULL, 
  ...
  hist_size TINYINT UNSIGNED, 
  hist_type ENUM('SINGLE_PREC_HB','DOUBLE_PREC_HB'), 
  histogram varbinary(255), 
  ...

    This prevents us from supporting other kinds of histograms. The first low-hanging fruit would be to store the histogram bucket bounds precisely (like MySQL and PostgreSQL do, for example). The idea of this MDEV is to switch to JSON as storage format for histograms. If we do that, it will:

    • Improve the histogram precision

    • Allow the DBAs to examine the histograms

• Enable other histogram types to be collected/used.

Milestone-1: Let histogram_type have another possible value, tentative name "JSON". When that is set, let the ANALYZE TABLE syntax collect a JSON "histogram", that is, the following should work:

SET histogram_type='json';
ANALYZE TABLE t1 PERSISTENT FOR ALL;
SELECT histogram FROM mysql.column_stats WHERE TABLE_NAME='t1';

This should produce { "hello":"world"}.

Milestone-2: Produce JSON with histogram(). The exact format is not specified; for now, print the bucket endpoints and produce output like this:

[
  "value1",
  "value2",
  ...
]

Milestone-2, part#2: Make mysql.column_stats.histogram a blob.

Milestone-3: Parse the JSON back into an array. Figure out how to use the JSON parser. Parse the JSON data produced in Milestone-2 back. For now, just print the parsed values to stderr. (Additional input was provided on Zulip regarding parsing valid/invalid JSON histograms.)

Milestone-4: Make the code support different kinds of histograms. Currently, there's only one type of histogram. Smaller issue: histogram lookup functions assume the histogram stores fractions, not values. Bigger issue: memory allocation for histograms is de-coupled from reading the histograms; see alloc_statistics_for_table, read_histograms_for_table. The histogram object lives in a data structure that is bzero'ed first and then filled later (IIRC there was a bug (fixed) where the optimizer attempted to use a bzero'ed histogram). Can histograms be collected or loaded in parallel by several threads? This was an (unintentional?) possibility, but then it was disabled (see the TABLE_STATISTICS_CB object and its use).

Step #0: Make Histogram a real class. Here's the commit: 3ac32917ab6c42a5a0f9ed817dd8d3c7e20ce34d

Step #1: Separate classes for binary and JSON histograms. Need to introduce

class Histogram -- interface, no data members.
class Histogram_binary : public Histogram
class Histogram_json : public Histogram

and a factory function

Histogram *create_histogram(Histogram_type)

For now, let Histogram_json::point_selectivity() and Histogram_json::range_selectivity() return 0.1 and 0.5, respectively.

Step #2: Demonstrate saving/loading of histograms. Now, the code already can:

• collect a JSON histogram and save it.

• when loading a histogram, figure out from the histogram_type column that a JSON histogram is being loaded, create Histogram_json and invoke the parse function.

The parse function at the moment only prints to stderr. However, we should catch parse errors and make sure they are reported to the client. The test may look like this:

INSERT INTO mysql.column_stats VALUES('test','t1','column1', .... '[invalid, json, data']);
FLUSH TABLES;
# this should print some descriptive text
--error NNNN
SELECT * FROM test.t1;

Milestone-5: Parse the JSON data into a structure that allows lookups. The structure is

std::vector<std::string>

and it holds the data in KeyTupleFormat (see the comments for the reasoning; there was a suggestion to use in_vector, which is what IN subqueries use, but it didn't work out).

Milestone 5.1 (aka Milestone 44): Make a function to estimate selectivity using the data structure specified in the previous milestone.

Finally, make range_selectivity() accept key_range parameters (currently, they accept fractions, which is only suitable for binary histograms). This means Histogram_binary will need to have access to min_value and max_value to compute the fractions.

Details: MDEV-21130
Mentor: Sergei Petrunia

    make my_vsnprintf to use gcc-compatible format extensions

    my_vsnprintf() is used internally in the server as a portable printf replacement. And it's also exported to plugins as a service. It supports a subset of printf formats and three extensions:

• %`s means that a string should be quoted like an identifier

    • %b means that it's a binary string, not zero-terminated; printing won't stop at \0, so one should always specify the field width (like %.100b)

    • %M is used in error messages and prints the integer (errno) and the corresponding strerror() for it

gcc knows the printf formats and checks whether the actual arguments match the format string, issuing a warning if they don't. Unfortunately there seems to be no easy way to teach gcc our extensions, so for now we have to disable printf format checks. A better approach would be to use gcc-compatible format extensions, like the Linux kernel does. We should migrate to a different syntax for our extensions:

    • %sI to mean "print as an identifier"

    • %sB to mean "print a binary string"

    • %uE to mean "print an errno"

    • %sT to put a "..." as truncation indicator

Old formats can still be supported, or they can be removed; in the latter case the major version of the service should be increased to signal an incompatible change. All error messages and all usages of my_vsnprintf should be changed to use the new syntax, and gcc printf format checks should be enabled.

Details: MDEV-21978
Mentor: Sergei Golubchik

    Add JSON_EQUALS function to check JSON equality

    JSON_CONTAINS can be used to test for JSON object equality in some cases, but we seem to lack a clear JSON_EQUALS function.
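A sketch of the intended semantics (the function did not exist when this task was written):

-- Plain string comparison fails on formatting differences:
SELECT '{"a":1,"b":2}' = '{"b":2, "a":1}' AS string_eq;   -- 0
-- The proposed function should compare documents structurally:
SELECT JSON_EQUALS('{"a":1,"b":2}', '{"b":2, "a":1}') AS json_eq;   -- expected: 1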

Details: MDEV-23143
Mentor: Vicențiu Ciorbaru

    Concurrent multi-reader, multi-writer buffer for IO_CACHE

IO_CACHE has basically three read/write modes: only read, only write, and a sequential read/write FIFO mode, SEQ_READ_APPEND. Some performance-sensitive places, like the replication slave thread, use SEQ_READ_APPEND, and that may be a bottleneck, since reads and writes are sequential (and co-sequential, i.e. reads and writes block each other). The task is to implement a non-blocking mode for the multi-reader, multi-writer use-case through a concurrent ring buffer implementation.

Possible approaches

Lock-free n-consumer, m-producer ring buffer: this requires limiting the number of simultaneous accessors and reserving slots for them. Lock-free implementations can contain busy waits, but no locks, except when the number of consumers or producers is exceeded; this can be controlled by a semaphore with a capacity of the number of cores. This is the ideal way, but can be an overhaul because of complicated busy loops and slot management. It is also hard because writes can be bigger than the buffer; see the buffer excess discussion below.

Simple rwlock-based non-blocking approach: the bottleneck basically occurs because SEQ_READ_APPEND blocks for the whole duration of the buffer copy. We can avoid that by moving the pointers first, thus allocating a place for copying, and then making the copy from/to the buffer without a lock. An rwlock is used to access the pointers, i.e. readers access IO_CACHE::end_of_file with a read lock to ensure the borders, while writers access it with a write lock.

Buffer excess: excesses make things work sequentially. When the buffer is full, a separate write buffer is created. When the write buffer is full, a flush happens. Flushes wait for all writers to finish first, then lock the write buffer for flushing. The read buffer can be flushed in a more relaxed way: there is no need to lock for flushing, but we have to lock for buffer allocation and wait for all writers. Waiting for writers can be done with another rwlock.

Single-readerness: the real-world cases are mostly single-consumer, and that is essential for IO_CACHE: it is variable-lengthed and has no underlying data format, so the reader always has to make at least two sequential reads (one to read the size and another to read the body). Single-readerness considerations can relax some conditions and ease the implementation.

io_cache_reserve API: we can add a function to reserve space for writing big objects (both bigger than the write cache, and smaller than it but big enough to not fit into the external buffer), for cases like copying one cache to another. The function should return a future-like object, since we have to notify IO_CACHE back that the writing is finished (to make a flush, for example).

Details: MDEV-24676
Mentor: Nikita Malyavin

    Custom formatting of strings in MariaDB queries

Formatting more complex strings in a SELECT statement can get awkward when there are many concat(), format(), etc. calls involved. It would be very cool and helpful to have a function that takes an input string and a formatting specification, and returns a string formatted using the rules the user passed in the specification. A great example of such a function is the classic C printf function, which, in this context, would look something like:

SELECT printf('%s %s, %s', first_name, last_name, job_title) from employees;

But it doesn't necessarily need to look this way; an alternative syntax could be Python-ish, which would leverage the fact that the server already knows the datatype of each field used in the formatting scheme:

SELECT sformat('arg1: {}, arg2: {}', col1, col2) from table;

In that syntax one passes formatting options within the curly braces:

-- Print 'arg1: col1, arg2: col2'  where col1 from table is of datetime type and should be printed as: 'Sunday November 2021'
SELECT sformat('arg1: {%W %M %Y}, arg2: {}', col1, col2) FROM table;

    Ideally, this new function should use, behind the scenes, the existing builtin formatting functions in MariaDB (e.g. date_format(), format()) and even future formatting functions (e.g. MySQL's format_bytes(), format_pico_time()), so the syntax has to be designed in a smart way to accommodate easily future additions.

Details: MDEV-25015
Mentor: Sergei Golubchik

    Add autocompletion capabilities to the MariaDB Jupyter kernel

    As part of the Jupyter Messaging protocol, the Jupyter frontend sends a complete_request message to the MariaDB kernel when the user invokes the code completer in a Jupyter notebook. This message is handled in the do_complete function from the MariaDBKernel class. In simpler words, whenever the user hits the key shortcut for code autocompletion in a notebook, the MariaDB kernel's do_complete function is called with a number of arguments that help the kernel understand what the user wants to autocomplete. So the autocompletion infrastructure in the MariaDB kernel is already kindly provided by Jupyter, we only need to send back to Jupyter a list of suggestions based on the arguments that do_complete receives :-). Ideally we should aim to enable at least database, table and column name completion and also SQL keyword completion. But no worries, there are plenty of possibilities to extend the functionality even more if the accepted student turns out to be very productive :D

Details:
Mentor: Robert Bindar

Implement interactive editing of result sets in the MariaDB Jupyter kernel

    At this moment the MariaDB kernel is only capable of getting the results sets from the MariaDB client in HTML format and packing them in a Jupyter compatible format. Jupyter then displays them in notebooks like it would display Python Pandas dataframes. Sure, the users can easily write SQL code to modify the content of a table like they would write in a classical command line database client. But we want to go a bit further, we would love to have the capability to edit a result set returned by a SELECT statement (i.e. double click on table cells and edit) and have a button that users can press to generate a SQL statement that will update the content of the table via the MariaDB server. Apart from interacting with the Jupyter frontend for providing this UI capability, we also have to implement a field integrity functionality so that we make sure users can't enter data that is not compatible with the datatype of the column as it is seen by the MariaDB server. The project should start with a fair bit of research to understand how we can play with the Jupyter Messaging protocol to create the UI functionality and also to check other Jupyter kernels and understand what's the right and best approach for tackling this.

Details:
Mentor: Andreia Hendea

    Make the MariaDB Jupyter kernel capable of dealing with huge SELECTs

Currently the MariaDB kernel doesn't impose any internal limits on the number of rows a user can SELECT in a notebook cell. Internally the kernel gets the result set from MariaDB and stores it in a pandas DataFrame, so users can use it with magic commands to chart data. But this DataFrame is stored in memory, so if you SELECT a huge number of rows, say 500k or 1M, it's probably not a very good idea to create such a huge DataFrame. We tested with 500k rows, and the DataFrame itself is not the biggest problem, it consumed around 500MB of memory. The problem is the number of rows the browser needs to render; for 500k rows the browser tab with the notebook consumes around 2GB of memory, so the Jupyter frontend (JupyterLab, Jupyter Notebook) slows down considerably. A potential solution is to introduce two new config options which would specify:

• a limit for the number of rows the Jupyter notebook should render; a reasonable default value for this could be 50 rows, for instance (display_max_rows)

    • a limit for each SELECT statement, limit_max_rows, that the kernel would use to determine whether it should store the result set in memory in a DataFrame or store the result set on disk. A reasonable default value might be 100k rows.

    The trickiest part of the project though is that, once the kernel writes a result set on disk, the charting magic commands need to detect that the data is not in memory, it is on disk, and they should find a smart mechanism for generating the chart from the disk data without loading the entire data in memory (which would defeat the whole purpose of the project). This might involve finding a new Python plotting library (instead of current matplotlib) that can accomplish the job.

Details:
Mentor: Vlad Bogolin

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2020

We participated in the Google Summer of Code 2020. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

Please join us on Zulip and on IRC to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

A few handy tips for any interested students who are unsure which projects to choose: Blog post from former GSoC student & mentor.

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.


    Also see the List of beginner friendly issues and issues labelled gsoc20 from the MariaDB Issue Tracker.

    List of Tasks

    MariaDB Server: Optimizer

    Evaluate subquery predicates earlier or later depending on their SELECTIVITY

    (Based on conversation with Igor) There are a lot of subquery conditions out there that are inexpensive to evaluate and have good selectivity. If we just implement MDEV-83, we may get regressions. We need to take subquery condition's selectivity into account. It is difficult to get a meaningful estimate for an arbitrary, correlated subquery predicate. One possible solution is to measure selectivity during execution and reattach predicates on the fly. We don't want to change the query plan all the time, one way to dynamically move items between item trees is to wrap them inside Item_func_trig_cond so we can switch them on and off.

Details:
Mentor: Igor Babaev

    Histograms with equal-width bins in MariaDB

Histograms with equal-width bins are easy to construct using samples. For this it's enough to look through the given sample set, and for each value from it to figure out what bin this value can be placed in. Each bin requires only one counter. Let f be a column of a table with N rows, and n be the number of samples by which the equal-width histogram of k bins for this column is constructed. Suppose that, after looking through all sample rows, the counters created for the histogram bins contain the numbers c[1],..,c[k]. Then m[i] = c[i]/n * 100 is the percentage of the rows whose values of f are expected to be in the interval

(min(f) + ((max(f)-min(f))/k)*(i-1), min(f) + ((max(f)-min(f))/k)*i).

It means that if the sample rows have been chosen randomly, the expected number of rows with values of f from this interval can be approximated by the number m[i]/100 * N. To collect such statistics it is suggested to use the following variant of the ANALYZE TABLE command:

ANALYZE FAST TABLE tbl [ WITH n ROWS ] [SAMPLING p PERCENTS ]
   PERSISTENT FOR COLUMNS (col1 [IN RANGE r] [WITH k INTERVALS],...)

    Here:

    • 'WITH n ROWS' provides an estimate for the number of rows in the table in the case when this estimate cannot be obtained from statistical data.

    • 'SAMPLING p PERCENTS' provides the percentage of sample rows to collect statistics. If this is omitted the number is taken from the system variable samples_ratio.

• 'IN RANGE r' sets the range of equal-width bins of the histogram built for the column col1. If this is omitted and the min and max values for the column can be read from the statistical data, then the histogram is built for the range [min(col1), max(col1)]. Otherwise the range [MIN_type(col1), MAX_type(col1)] is considered. The values beyond the given range, if any, are also taken into account in two additional bins.

    • WITH k INTERVALS says how many bins are included in the histogram. If it is omitted this value is taken from the system variable histogram_size.
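A hypothetical invocation of the proposed syntax (table, column, range and sample percentage invented; the exact spelling of the range is a guess):

-- 20 equal-width bins for amounts between 0 and 1000, sampling 5% of the rows:
ANALYZE FAST TABLE orders SAMPLING 5 PERCENTS
   PERSISTENT FOR COLUMNS (amount IN RANGE (0,1000) WITH 20 INTERVALS);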

Details: MDEV-12313
Mentor: Vicentiu Ciorbaru

    Add FULL OUTER JOIN to MariaDB

Add support for FULL OUTER JOIN. One of the ways to implement it is to re-write the query

    into the following union all:
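The two queries were lost in conversion; the rewrite alluded to looks roughly like this, with P standing for the join condition:

SELECT t1.*, t2.* FROM t1 LEFT JOIN t2 ON P
UNION ALL
SELECT t1.*, t2.* FROM t1 RIGHT JOIN t2 ON P WHERE t1.a IS NULL;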

    Here t1.a is some non-nullable column of t1 (e.g. the column of single column primary key).

Details:
Mentor: Igor Babaev

    Recursive CTE support for UPDATE (and DELETE) statements

This is supported in MySQL-8.0 and MSSQL.
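A sketch of the kind of statement this task would enable (schema invented):

-- Delete an employee and everyone below them in the org tree:
WITH RECURSIVE subordinates AS (
  SELECT id FROM employees WHERE id = 42
  UNION ALL
  SELECT e.id FROM employees e JOIN subordinates s ON e.manager_id = s.id
)
DELETE FROM employees WHERE id IN (SELECT id FROM subordinates);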

Details:
Mentor: Igor Babaev

    MariaDB Server: others

    Support for GTID in mysqlbinlog

    The mysqlbinlog client program needs to be updated to support GTID. Here is a suggested list of things to be done:

    • The --start-position and --stop-position options should be able to take GTID positions; or maybe there should be new --start-gtid and --stop-gtid options. Like --start-gtid=0-1-100,1-2-200,2-1-1000.

• A GTID position means the point just after that GTID. So starting from GTID 0-1-100 and stopping at GTID 0-1-200, the first GTID output will probably be 0-1-101 and the last one 0-1-200. Note that if some domain is not specified in the position, it means to start from the beginning, or to stop immediately, in that domain, respectively.

    • Starting and stopping GTID should work both with local files, and with --read-from-remote-server. For the latter, there are a couple of extra things that need doing in the master-slave protocol, see get_master_version_and_clock() in sql/slave.cc.

• At the end of the dump, put these statements, to reduce the risk of those session variables incorrectly spilling into subsequent statements run in the same session:

SET session.server_id = @@global.server_id,
      session.gtid_domain_id=@@global.gtid_domain_id;

Details: MDEV-4989
Mentor: Andrei Elkin

    connection encryption plugin support

As a follow-on to MDEV-4691 we would like GSSAPI encryption (in addition to authentication) support in MariaDB. I am told that the current plan is to create a plugin interface and then we can build GSSAPI encryption on top of that, so here is a ticket for that. From having written GSSAPI for the internal interface, there were a couple things I would like to see in the plugin encryption interface.

First, GSSAPI is weird in that it does authentication before encryption (TLS/SSL are the other way around, establishing an encrypted channel and then doing authentication over it). Of course support for this is needed, but more importantly, packets must be processed in a fully serialized fashion. This is because encrypted packets may be queued while one end of the connection is still finishing up processing the authentication handshake. One way to do this is registering "handle" callbacks with connection-specific state, but there are definitely others.

Additionally, for whatever conception there ends up being of authentication and encryption, it needs to be possible to share more data than just a socket between them. The same context will be used for authentication and encryption, much as an SSL context is (except of course we go from authentication to encryption and not the other way around). This ties into an issue of dependency. If authentication plugins are separate entities from encryption plugins in the final architecture, it might make sense to do mix-and-match authentication with encryption. However, there are cases - and GSSAPI is one - where doing encryption requires a certain kind of authentication (or vice versa). You can't do GSSAPI encryption without first doing GSSAPI authentication. (Whether or not it's permitted to do GSSAPI auth->encryption all over a TLS channel, for instance, is not something I'm concerned about.)

Finally, encrypted messages are larger than their non-encrypted counterparts. The transport layer should cope with this so that plugins don't have to think about reassembly, keeping in mind that there may not be a way to get the size of a message when encrypted without first encrypting it. It's unfortunately been a little while since I wrote that code, but I think those were the main things that we'll need for GSSAPI. Thanks!

Details:
Mentor: Vicențiu Ciorbaru

    Aggregate Window Functions

With a few exceptions, most native aggregate functions are supported as window functions. In MDEV-7773, support for creating custom aggregate functions was added. This task proposes to extend that feature and allow custom aggregate functions to be used as window functions. An example of creating a custom aggregate function is given below:

CREATE AGGREGATE FUNCTION agg_sum(x INT) RETURNS DOUBLE
BEGIN
  DECLARE z DOUBLE DEFAULT 0;
  DECLARE CONTINUE HANDLER FOR NOT FOUND RETURN z;
  LOOP
    FETCH GROUP NEXT ROW;
    SET z = z + x;
  END LOOP;
END|

This function can be used in an ordinary aggregation query, and after this task is complete it must also work as a window function; both usages are sketched below.
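The original example queries were lost in conversion; a minimal sketch using the agg_sum function defined above (table and column names invented):

-- As a regular aggregate (already works after MDEV-7773):
SELECT dept, agg_sum(salary) FROM staff GROUP BY dept;
-- As a window function (what this task enables):
SELECT dept, salary, agg_sum(salary) OVER (PARTITION BY dept) FROM staff;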

Details:
Mentor: Varun Gupta

    Improve mysqltest language

    mysqltest has a lot of historical problems:

    • ad hoc parser, weird limitations

    • commands added as needed with no view over the total language structure

• historical code issues (e.g. casts that became unnecessary 10 years ago), etc.

A lot can be done to improve it. Ideas:

    • control structures, else in if, break and continue in while, for (or foreach) loop

    • proper expression support in let, if, etc

    • rich enough expressions to make resorting to sql unnecessary in most cases

    • remove unused and redundant commands (e.g. system vs exec, query_vertical vs vertical_results ONCE)

    • remove complex commands that do many sql statements under the hood, if they can be scripted, e.g. sync_slave_with_master

    • remove over-verbose treatment of rpl test failures

    • scoped variables

    • parameters for the source command

    • remove dead code

Details:
Mentor: Sergei Golubchik

    Implement multiple-table UPDATE/DELETE returning a result set

    A multiple-table UPDATE first performs join operations, then it updates the matching rows. A multiple-table UPDATE returning a result set does the following:

    • first performs join operations

    • for each row of the result of the join it calculates some expressions over the columns of the join and forms from them a row of the returned result set

    • after this it updates the matching rows

    A multiple-table DELETE first performs join operations, then it deletes the matching rows. A multiple-table DELETE returning a result set does the following:

    • first performs join operations

    • for each row of the result of the join it calculates some expressions over the columns of the join and forms from them a row of the returned result set

    • after this it deletes the matching rows
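A sketch of what such a statement could look like; the RETURNING placement and the schema are hypothetical:

-- Join prices against a staging table, update them, and return the new values:
UPDATE prices p, staged s
   SET p.amount = s.amount
 WHERE p.item_id = s.item_id
RETURNING p.item_id, p.amount;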

Details:
Mentor: Igor Babaev

    sort out the compression library chaos

    As MariaDB is getting more storage engines and as they're getting more features, MariaDB can optionally use more and more compression libraries for various purposes. InnoDB, TokuDB, RocksDB — they all can use different sets of compression libraries. Compiling them all in would result in a lot of run-time/rpm/deb dependencies, most of which will be never used by most of the users. Not compiling them in, would result in requests to compile them in. While most users don't use all these libraries, many users use some of these libraries. A solution could be to load these libraries on request, without creating a packaging dependency. There are different ways to do it

    • hide all compression libraries behind a single unified compression API. Either develop our own or use something like Squash. This would require changing all engines to use this API

• use the same approach as in server services: create a service per compression library, where a service implementation just returns an error code for any function invocation if the corresponding library is not installed. This way, maybe, we could avoid modifying all affected storage engines.

Details:
Mentor: Sergei Golubchik

    Control over memory allocated for SP/PS

SP/PS (Stored Procedures / Prepared Statements) allocate memory until the PS cache or the SP is destroyed. There is no way to see how much memory is allocated and whether it grows with each execution (the first two executions can lead to new memory allocation, but later ones should not).

    Task minimum:

    Status variables which count the memory used/allocated for SP/PS by thread and/or for the server.

    Other ideas:

• Automatically stop allocation in the debug version after the second execution, and raise an exception on any attempt to allocate more.

• An Information Schema table listing, per thread and SP/PS, information about allocated and used memory.

Information can be collected in the MEM_ROOTs of the SP/PS: store info about the status of the mem_root before execution, then check it afterwards, so that newly allocated memory can be found. MEM_ROOT can also be changed to have a debug mode which makes it read-only and which can be switched on after the second execution.

Details: MDEV-14959
Mentor: Oleksandr Byelkin

    MariaDB ColumnStore

    Full DECIMAL support in ColumnStore

    MariaDB ColumnStore supports DECIMAL with some limitations:

    1. We do not support the full DECIMAL range that is in MariaDB

2. In several places in the code we convert the DECIMAL to DOUBLE during execution, therefore losing precision.

Implementing this will likely require the following:

    • Implementation of methods to handle MariaDB's DECIMAL format

    • Support for a longer than 8-byte numeric column type (there is an InfiniDB tree with work for this already)

    • Modification of the primitives processor for the math

    • Modification of the function expression processor to handle the new type

    • Version upgrade support for DECIMAL from the current form to the new form

Details:
Mentor: Andrew Hutchings

    Replace glibc with google's re2 for regex processing

CS as of 1.4.2 relies on glibc for regex processing. We need to replace glibc with re2 for LIKE, REGEX and other facilities that affect performance.

    1. Identify places with glibc regex functions invocations

2. Pick the invocations that significantly affect the timings of the query

3. Replace glibc regex calls with the appropriate re2 calls

Details:
Mentor: Roman

Speed up RowGroup data access methods by reducing the level of indirection

RowGroup is the unit of data sent around the CS cluster. It is basically a set of records for fixed-size data types, plus an additional storage area for binary data, e.g. strings. Right now there are lots of RowGroup methods that access RowGroup fixed-size data using an extra level of indirection. Here is an example: return ((int64_t) &data[offsets[colIndex]]); This expression uses an extra assembly instruction to calculate the effective address. We want to remove the 'offsets[colIndex]' part of this and similar expressions in the RowGroup code.

Details:
Mentor: Roman

    Parallel sorting 2nd phase

    As of 1.4.1 CS uses two-phase sorting. Here are the phases:

    • Presort partial runs of data.

• Merge the presorted partial runs produced during the 1st phase.

Here is a more detailed explanation of how sorting works as of 1.4.1. CS gets a portion of record data from the previous steps of the query execution (an RGData instance from the ring buffer of RGData-s) and produces a sorting run out of it using the existing sorting class LimitedOrderBy. If the query contains LIMIT, then we apply it at this phase; this allows to significantly reduce the data set cardinality. If the query contains LIMIT + OFFSET, then CS builds a sorted run of records that is up to LIMIT+OFFSET in size. CS does this step in parallel, dividing the whole data set into k runs, where k is governed by a session variable, infinidb/columnstore_orderby_threads. At this phase CS tries to preallocate memory in QUEUE_RESERVE_SIZE batches. CS then merges and sorts the k presorted partial runs produced by the previous phase in a single thread. If the query contains the DISTINCT keyword, CS rebuilds a hash map to preserve uniqueness.

We want to make the 2nd phase also parallel, using range partitioning of the presorted runs produced by the 1st phase. After the 1st phase finishes we know the distribution of the sorting key values, and can thus divide the key values into regions - buckets. Every 2nd-phase thread takes values from the corresponding region buckets (containing the same values region) from every 1st-phase sorted run. Then all 2nd-phase threads sort their runs in parallel. In the end we put the sorted regions, in ascending order of the key values, into the output stream.

Details:
Mentor: Roman

    Add a bitmap for NULL/empty data

    We need a bitmap to store NULL/empty values instead of in-column values for this.

Details:
Mentor: Andrew Hutchings

    Engine independent statistics for Columnstore

CS currently has very rudimentary query optimization capabilities and we want to improve the situation. We are considering using the Server's optimizer for the purpose, but the Server needs statistics, namely value distribution histograms and number of distinct values (NDV) histograms. There are different levels of complexity for the task:

    • implement standalone segment files reader that in the end populates both mysql.column_stats and mysql.table_stats using out of band mariadb client connection

    • implement ANALYZE TABLE functionality for Columnstore engine

• implement ANALYZE TABLE and histograms with equal-width bins for the value distribution histograms (similar to MDEV-12313), together with NDV histograms, to decrease I/O

We expect to have both unit and regression tests, but this is optional.

Details:
Mentor: Roman

    mcsapi needs a new read API Design

    We need an ORM-style NoSQL read API to go along with the bulk write API of mcsapi. This will likely take the form of:

    1. A reader in ExeMgr which will convert messages from mcsapi into jobs

2. Code in mcsapi to send/receive the messages. Although ExeMgr can already receive messages with an execution plan, the format is very complex and the ABI breaks easily (we often send whole C++ objects). We should look at other ORM frameworks for inspiration for the API design. This task is to do the design for this API.

Details:
Mentor: Andrew Hutchings

    Use JIT for aggregation functions

CS uses a Volcano processing approach, working on one value at a time. This is a very inefficient way for analytics workloads, which usually use lots of aggregation functions in projections, filtering or sorting. We are interested in using JIT for basic aggregation functions: sum, avg, count, min, max. The patch must compile and run a program that processes and returns the aggregation function result. We wrote this description having LLVM in mind, as it is widespread and has lots of examples in the wild. I suggest starting by looking at RowAggregation::addRowGroup() from ./utils/rowgroup/rowaggregation.cpp to get what it takes to get the avg() function result. Here is the link on how to quickly build a CS developer environment.

Details:
Mentor: Roman


    Histograms: use JSON as on-disk format

    Currently, histograms are stored as array of 1-byte bucket bounds (SINGLE_PREC_HB) or 2-byte bucket bounds (DOUBLE_PREC_HB). The table storing the histograms supports different histogram formats but limits them to 256 bytes (hist_size is tinyint).

    This prevents us from supporting other kinds of histograms. The first low-hanging fruit would be to store the histogram bucket bounds precisely (like MySQL and PostgreSQL do, for example). The idea of this MDEV is to switch to JSON as storage format for histograms.

Details: MDEV-21130
Mentor: Sergei Petrunia

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL


    Google Summer of Code 2019

    We participated in the Google Summer of Code 2019. The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. And we have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.

    Where to Start

    Please join us on Zulip and on IRC to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).

A few handy tips for any interested students who are unsure which projects to choose: Blog post from former GSoC student & mentor.

    To improve your chances of being accepted, it is a good idea to submit a pull request with a bug fix to the server.

    List of Tasks

Loaded from the MariaDB issue tracker.

    MariaDB Server: Optimizer

    Evaluate subquery predicates earlier or later depending on their SELECTIVITY

(Based on a conversation with Igor) There are a lot of subquery conditions out there that are inexpensive to evaluate and have good selectivity. If we just implement MDEV-83, we may get regressions. We need to take the subquery condition's selectivity into account. It is difficult to get a meaningful estimate for an arbitrary, correlated subquery predicate. One possible solution is to measure selectivity during execution and reattach predicates on the fly. We don't want to change the query plan all the time; one way to dynamically move items between item trees is to wrap them inside Item_func_trig_cond so we can switch them on and off.

    Details:
    Mentor:

    Add support for Indexes on Expressions

    An index on expression means something like
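The example was lost in conversion; judging from step 1 below, it presumably looked something like this sketch (the inline index-on-expression syntax is the feature being proposed, not existing MariaDB syntax):

CREATE TABLE t1 (a INT, b INT, INDEX ((a/2+b)));
SELECT * FROM t1 WHERE a/2+b = 100;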

    in this case the optimizer should be able to use an index. This task naturally splits in two steps:

    1. add expression matching into the optimizer, use it for generated columns. Like in CREATE TABLE t1 (a int, b int, v INT GENERATED ALWAYS AS (a/2+b), INDEX (v));

    2. support the syntax to create an index on expression directly, this will automatically create a hidden generated column under the hood

    original task description is visible in the history

    Details:
    Mentor:

    Histograms with equal-width bins in MariaDB

Histograms with equal-width bins are easy to construct using samples. For this it's enough to look through the given sample set, and for each value from it to figure out what bin this value can be placed in. Each bin requires only one counter. Let f be a column of a table with N rows, and n be the number of samples by which the equal-width histogram of k bins for this column is constructed. Suppose that, after looking through all sample rows, the counters created for the histogram bins contain the numbers c[1],..,c[k]. Then m[i] = c[i]/n * 100 is the percentage of the rows whose values of f are expected to be in the interval

(min(f) + ((max(f)-min(f))/k)*(i-1), min(f) + ((max(f)-min(f))/k)*i).

It means that if the sample rows have been chosen randomly, the expected number of rows with values of f from this interval can be approximated by the number m[i]/100 * N. To collect such statistics it is suggested to use the following variant of the ANALYZE TABLE command:

ANALYZE FAST TABLE tbl [ WITH n ROWS ] [SAMPLING p PERCENTS ]
   PERSISTENT FOR COLUMNS (col1 [IN RANGE r] [WITH k INTERVALS],...)

    Here:

    • 'WITH n ROWS' provides an estimate for the number of rows in the table in the case when this estimate cannot be obtained from statistical data.

    • 'SAMPLING p PERCENTS' provides the percentage of sample rows to collect statistics. If this is omitted the number is taken from the system variable samples_ratio.

• 'IN RANGE r' sets the range of equal-width bins of the histogram built for the column col1. If this is omitted and the min and max values for the column can be read from the statistical data, then the histogram is built for the range [min(col1), max(col1)]. Otherwise the range [MIN_type(col1), MAX_type(col1)] is considered. The values beyond the given range, if any, are also taken into account in two additional bins.

• 'WITH k INTERVALS' says how many bins are included in the histogram. If it is omitted, this value is taken from the system variable histogram_size.

    Details:
    Mentor:

    Add FULL OUTER JOIN to MariaDB

Add support for FULL OUTER JOIN. One of the ways to implement it is to re-write the query

    into the following union all:

    Here t1.a is some non-nullable column of t1 (e.g. the column of single column primary key).

    Details:
    Mentor:

    Recursive CTE support for UPDATE (and DELETE) statements

This is supported in MySQL-8.0 and MSSQL.

    Details:
    Mentor:

    Implement EXCEPT ALL and INTERSECT ALL operations

The SQL Standard allows the use of EXCEPT ALL and INTERSECT ALL as set operations. Currently MariaDB Server does not support them. The goal of this task is to support EXCEPT ALL and INTERSECT ALL:

    1. at syntax level - allow to use operators EXCEPT ALL and INTERSECT ALL in query expression body

    2. at execution level - implement these operations employing temporary tables (the implementation could use the idea similar to that used for the existing implementation of the INTERSECT operation).
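A sketch of the intended bag semantics (tables invented; assume t1 contains the values 1, 1, 2 and t2 contains a single 1):

SELECT x FROM t1 EXCEPT ALL SELECT x FROM t2;
-- yields 1 and 2: only one duplicate is removed, unlike plain EXCEPT
SELECT x FROM t1 INTERSECT ALL SELECT x FROM t2;
-- yields a single 1: each match is consumed once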

    Details:
    Mentor:

    MariaDB Server: others

    Implement UPDATE with result set

Add an UPDATE operation that returns a result set of the changed rows to the client:

UPDATE [LOW_PRIORITY] [IGNORE] tbl_name
    SET col_name1={expr1|DEFAULT} [, col_name2={expr2|DEFAULT}] ...
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]
RETURNING select_expr [, select_expr ...]

I'm not exactly sure how the corresponding multiple-table syntax should look, or if it is possible at all. But already having it for single-table updates would be a nice feature.

    Details:
    Mentor:

    Automatic provisioning of slave

    Idea

The purpose of this task is to create an easy-to-use facility for setting up a new MariaDB replication slave. Setting up a new slave currently involves: 1) installing MariaDB with the initial database; 2) pointing the slave to the master with CHANGE MASTER TO; 3) copying the initial data from the master to the slave; and 4) starting the slave with START SLAVE. The idea is to automate step (3), which currently needs to be done manually. The syntax could be something as simple as

LOAD DATA FROM MASTER;

    This would then connect to the master that is currently configured. It will load a snapshot of all the data on the master, and leave the slave position at the point of the snapshot, ready for START SLAVE to continue replication from that point.

    Implementation:

    The idea is to do this non-blocking on the master, in a way that works for any storage engine. It will rely on row-based replication to be used between the master and the slave. At the start of LOAD DATA FROM MASTER, the slave will enter a special provisioning mode. It will start replicating events from the master at the master's current position. The master dump thread will send binlog events to the slave as normal. But in addition, it will interleave a dump of all the data on the master contained in tables, views, or stored functions. Whenever the dump thread would normally go to sleep waiting for more data to arrive in the binlog, the dump thread will instead send another chunk of data in the binlog stream for the slave to apply. A "chunk of data" can be:

    • A CREATE OR REPLACE TABLE / VIEW / PROCEDURE / FUNCTION

    • A range of N rows (N=100, for example). Each successive chunk will do a range scan on the primary key from the end position of the last chunk.

    Sending data in small chunks avoids the need for long-lived table locks or transactions that could adversely affect master performance. The slave will connect in GTID mode. The master will send dumped chunks in a separate domain id, allowing the slave to process chunks in parallel with normal data. During the provisioning, all normal replication events from the master will arrive on the slave, and the slave will attempt to apply them locally. Some of these events will fail to apply, since the affected table or row may not yet have been loaded. In the provisioning mode, all such errors will be silently ignored. Proper locking (isolation mode, eg.) must be used on the master when fetching chunks, to ensure that updates for any row will always be applied correctly on the slave, either in a chunk, or in a later row event. In order to make the first version of this feature feasible to implement in a reasonable amount of time, it should set a number of reasonable restrictions (which could be relaxed in a later version of the feature):

    • Give up with an error if the slave is not configured for GTID mode (MASTER_USE_GTID != NO).

    • Give up with error if the slave receives any event in statement-based binlogging (so the master must be running in row-based replication mode, and no DDL must be done while the provisioning is running).

    • Give up with an error if the master has a table without primary key.

    • Secondary indexes will be enabled during the provisioning; this means that tables with large secondary indexes could be expensive to provision.

    Details:
    Mentor:

    connection encryption plugin support

As a follow-on to MDEV-4691 we would like GSSAPI encryption (in addition to authentication) support in MariaDB. I am told that the current plan is to create a plugin interface and then we can build GSSAPI encryption on top of that, so here is a ticket for that. From having written GSSAPI for the internal interface, there were a couple things I would like to see in the plugin encryption interface.

First, GSSAPI is weird in that it does authentication before encryption (TLS/SSL are the other way around, establishing an encrypted channel and then doing authentication over it). Of course support for this is needed, but more importantly, packets must be processed in a fully serialized fashion. This is because encrypted packets may be queued while one end of the connection is still finishing up processing the authentication handshake. One way to do this is registering "handle" callbacks with connection-specific state, but there are definitely others.

Additionally, for whatever conception there ends up being of authentication and encryption, it needs to be possible to share more data than just a socket between them. The same context will be used for authentication and encryption, much as an SSL context is (except of course we go from authentication to encryption and not the other way around). This ties into an issue of dependency. If authentication plugins are separate entities from encryption plugins in the final architecture, it might make sense to do mix-and-match authentication with encryption. However, there are cases - and GSSAPI is one - where doing encryption requires a certain kind of authentication (or vice versa). You can't do GSSAPI encryption without first doing GSSAPI authentication. (Whether or not it's permitted to do GSSAPI auth->encryption all over a TLS channel, for instance, is not something I'm concerned about.)

Finally, encrypted messages are larger than their non-encrypted counterparts. The transport layer should cope with this so that plugins don't have to think about reassembly, keeping in mind that there may not be a way to get the size of a message when encrypted without first encrypting it. It's unfortunately been a little while since I wrote that code, but I think those were the main things that we'll need for GSSAPI. Thanks!

    Details:
    Mentor:

    Add RETURNING to INSERT

Please add a RETURNING option to INSERT. Example from PostgreSQL:
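The original example was lost in conversion; a typical one (table and columns invented) is:

INSERT INTO users (name) VALUES ('Alice') RETURNING id, name;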

This could make it easier to write statements which work with both MariaDB and PostgreSQL. And this might improve compatibility with Oracle RDBMS.

    Details:
    Mentor:

    Aggregate Window Functions

With a few exceptions, most native aggregate functions are supported as window functions. In MDEV-7773, support for creating custom aggregate functions was added. This task proposes to extend that feature and allow custom aggregate functions to be used as window functions. An example of creating a custom aggregate function is given below:

    This functions can be used in the following query:

    After this task is complete the following must also work:

    Details:
    Mentor:

    True ALTER LOCK=NONE on slave

    Currently, no true LOCK=NONE exists on the slave. An ALTER TABLE is first committed on the master and only then replicated to the slaves. The purpose of this task is to create a true LOCK=NONE.

    Implementation Idea

    The master will write a BEGIN_DDL_EVENT in the binlog after it hits ha_prepare_inplace_alter_table. Then the master will write a QUERY_EVENT on the binlog with the actual alter query. On commit/rollback the master will write a COMMIT_DDL_EVENT/ROLLBACK_DDL_EVENT. On the slave there will be a pool of threads (its size a configurable global variable) which will apply these DDLs. On receiving a BEGIN_DDL_EVENT, the slave thread will pass the QUERY_EVENT to one of the worker threads. The worker thread will execute up to ha_inplace_alter_table. The actual commit_inplace_alter will be called by the SQL thread. If the SQL thread receives some kind of rollback event, it will signal the worker thread to stop executing the alter. If none of the worker threads are available, the event will be enqueued; if we then receive a rollback event, we simply discard the event from the queue, and if we receive a commit event, the SQL thread will process the DDL event synchronously.
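    For reference, this is the kind of statement whose slave-side handling the task would change; today the slave can only start applying it after it has committed on the master:

    -- An in-place ALTER that does not lock the table on the master
    ALTER TABLE t1 ADD COLUMN c INT, ALGORITHM=INPLACE, LOCK=NONE;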

    Details: MDEV-11675
    Mentor: Sachin Setiya

    Improve mysqltest language

    mysqltest has a lot of historical problems:

    • ad hoc parser, weird limitations

    • commands added as needed with no view over the total language structure

    • historical code issues (e.g. casts that became unnecessary 10 years ago), etc.

    A lot can be done to improve it. Ideas:

    • control structures, else in if, break and continue in while, for (or foreach) loop

    • proper expression support in let, if, etc

    • rich enough expressions to make resorting to sql unnecessary in most cases

    • remove unused and redundant commands (e.g. system vs exec, query_vertical vs vertical_results ONCE)

    • remove complex commands that do many sql statements under the hood, if they can be scripted, e.g. sync_slave_with_master

    • remove over-verbose treatment of rpl test failures

    • scoped variables

    • parameters for the source command

    • remove dead code

    Details: MDEV-12130
    Mentor: Sergei Golubchik

    Implement multiple-table UPDATE/DELETE returning a result set

    A multiple-table UPDATE first performs join operations, then it updates the matching rows. A multiple-table UPDATE returning a result set does the following:

    • first performs join operations

    • for each row of the result of the join it calculates some expressions over the columns of the join and forms from them a row of the returned result set

    • after this it updates the matching rows

    A multiple-table DELETE first performs join operations, then it deletes the matching rows. A multiple-table DELETE returning a result set does the following:

    • first performs join operations

    • for each row of the result of the join it calculates some expressions over the columns of the join and forms from them a row of the returned result set

    • after this it deletes the matching rows
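    A hypothetical example of what such a statement could look like once the task is done (the exact syntax would be settled during the task; tables and columns are invented):

    -- Multi-table UPDATE that also returns a result set
    UPDATE t1, t2
       SET t1.qty = t1.qty - t2.qty
     WHERE t1.id = t2.id
    RETURNING t1.id, t1.qty;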

    Details: MDEV-12326
    Mentor: Igor Babaev

    sort out the compression library chaos

    As MariaDB is getting more storage engines, and as they're getting more features, MariaDB can optionally use more and more compression libraries for various purposes. InnoDB, TokuDB, RocksDB — they all can use different sets of compression libraries. Compiling them all in would result in a lot of run-time/rpm/deb dependencies, most of which would never be used by most users. Not compiling them in would result in requests to compile them in. While most users don't use all of these libraries, many users use some of them. A solution could be to load these libraries on request, without creating a packaging dependency. There are different ways to do it:

    • hide all compression libraries behind a single unified compression API. Either develop our own or use something like Squash. This would require changing all engines to use this API.

    • use the same approach as in server services — create a service per compression library; a service implementation will just return an error code for any function invocation if the corresponding library is not installed. This way — maybe — we could avoid modifying all affected storage engines.
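    To illustrate the user-visible side of the problem: a setting like the following only works when the corresponding library was compiled in, and fails otherwise; with on-demand loading, the library could instead be fetched at this point (the variable is a real InnoDB one, the scenario is illustrative):

    -- Fails today if the server binary was built without lz4 support
    SET GLOBAL innodb_compression_algorithm = 'lz4';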

    Details: MDEV-12933
    Mentor: Sergei Golubchik

    Control over memory allocated for SP/PS

    SP/PS (Stored Procedures / Prepared Statements) allocate memory until the PS cache or the SP is destroyed. There is no way to see how much memory is allocated and whether it grows with each execution (the first two executions can lead to new memory allocation, but not more).

    Task minimum:

    Status variables which count the memory used/allocated for SP/PS, by thread and/or for the whole server.

    Other ideas:

    • In debug builds, automatically stop allocation after the second execution and raise an exception on any further allocation attempt.

    • An Information Schema table listing, by thread and SP/PS, information about allocated and used memory.

    Information can be collected in the MEM_ROOTs of the SP/PS: store the state of the mem_root before execution, then check it afterwards to find newly allocated memory. MEM_ROOT could also be given a debug mode that makes it read-only, to be switched on after the second execution.
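    A minimal sketch of what the task minimum could look like in practice; the variable names here are hypothetical and would be decided during the task:

    -- Hypothetical status variables, names illustrative only
    SHOW GLOBAL STATUS LIKE 'Sp_ps_memory%';
    -- e.g. Sp_ps_memory_allocated, Sp_ps_memory_used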

    Details: MDEV-14959
    Mentor: Oleksandr Byelkin

    MariaDB ColumnStore

    Full DECIMAL support in ColumnStore

    MariaDB ColumnStore supports DECIMAL with some limitations:

    1. We do not support the full DECIMAL range that is in MariaDB

    2. In several places in the code we convert the DECIMAL to DOUBLE during execution, therefore losing precision.

    Implementing this will likely require the following:

    • Implementation of methods to handle MariaDB's DECIMAL format

    • Support for a longer than 8-byte numeric column type (there is an InfiniDB tree with work for this already)

    • Modification of the primitives processor for the math

    • Modification of the function expression processor to handle the new type

    • Version upgrade support for DECIMAL from the current form to the new form
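    For context, MariaDB itself accepts DECIMAL precision up to 65 digits, far beyond what fits into an 8-byte integer, which is what makes the wider column type necessary. A table definition that exercises this (illustrative only):

    -- Wider than 18 digits: cannot be represented in 8 bytes
    CREATE TABLE wide_dec (d DECIMAL(38,10)) ENGINE=ColumnStore;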

    Details: MCOL-641
    Mentor: Andrew Hutchings

    mcsapi needs a new read API Design

    We need an ORM-style NoSQL read API to go along with the bulk write API of mcsapi. This will likely take the form of:

    1. A reader in ExeMgr which will convert messages from mcsapi into jobs

    2. Code in mcsapi to send/receive the messages. Although ExeMgr can already receive messages with an execution plan, the format is very complex and the ABI breaks easily (we often send whole C++ objects). We should look at other ORM frameworks for inspiration for the API design. This task is to produce the design for this API.

    Details: MCOL-1151
    Mentor: Andrew Hutchings

    Use JIT for aggregation functions

    CS uses a Volcano processing approach, working on one value at a time. This is a very inefficient way to handle analytics workloads, which usually use lots of aggregation functions in projections, filtering or sorting. We are interested in using JIT compilation for basic aggregation functions: sum, avg, count, min, max. The patch must compile and run a program that processes and returns the aggregation function result. We wrote this description with LLVM in mind, as it is widespread and has lots of examples in the wild. We suggest starting with RowAggregation::addRowGroup() from ./utils/rowgroup/rowaggregation.cpp to get what it takes to compute the avg() function result. Here is the link on how to quickly build a CS developer environment.

    Details: MCOL-2222
    Mentor: Roman

    Engine independent statistics for Columnstore

    CS currently has very rudimentary query optimization capabilities and we want to improve the situation. We are considering using the server's optimizer for this purpose, but the server needs statistics, namely value distribution histograms and Number of Distinct Values (NDV) histograms. There are different levels of complexity for the task:

    • implement a standalone segment-file reader that populates both mysql.column_stats and mysql.table_stats using an out-of-band MariaDB client connection

    • implement ANALYZE TABLE functionality for the ColumnStore engine

    • implement ANALYZE TABLE with equal-width bins for the value distribution histograms (similar to ) together with NDV histograms, to decrease I/O

    We would like to have both unit and regression tests, but this is optional.
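    The server-side statistics to be populated already have a well-defined home. For a regular engine they are gathered and inspected like this (illustrative):

    -- Engine-independent statistics for an existing table
    ANALYZE TABLE t1 PERSISTENT FOR ALL;
    SELECT column_name, min_value, max_value, hist_type
      FROM mysql.column_stats WHERE table_name = 't1';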

    Details: MCOL-2223
    Mentor: Roman

    Suggest a Task

    Do you have an idea of your own, not listed above? Do let us know!

    This page is licensed: CC BY-SA / Gnu FDL

    ((max(f)-min(f))/k * (i-1), (max(f)-min(f))/k * i)
    ANALYZE FAST TABLE tbl [ WITH n ROWS ] [SAMPLING p PERCENTS ]
       PERSISTENT FOR COLUMNS (col1 [IN RANGE r] [WITH k INTERVALS],...)
    SELECT t1.*, t2.* FROM t1 FULL OUTER JOIN t2 ON P(t1,t2)
    SELECT t1.*, t2.* FROM t1 LEFT OUTER JOIN t2 ON P(t1,t2) 
    UNION ALL 
    SELECT t1.*,t2.* FROM t2 LEFT OUTER JOIN t1 ON P(t1,t2) WHERE t1.a IS NULL
    CREATE TABLE tree (
           `Node` VARCHAR(3),
           `ParentNode` VARCHAR(3),
           `EmployeeID` INTEGER,
           `Depth` INTEGER,
           `Lineage` VARCHAR(16)
         );
    
         INSERT INTO tree
           (`Node`, `ParentNode`, `EmployeeID`, `Depth`, `Lineage`)
         VALUES
           ('100', NULL, '1001', 0, '/'),
           ('101', '100', '1002', NULL, NULL),
           ('102', '101', '1003', NULL, NULL),
           ('103', '102', '1004', NULL, NULL),
           ('104', '102', '1005', NULL, NULL),
           ('105', '102', '1006', NULL, NULL);
     
         WITH RECURSIVE prev AS (
         SELECT * FROM tree WHERE ParentNode IS NULL
         UNION
         SELECT t.Node,t.ParentNode,t.EmployeeID,p.Depth + 1 AS Depth, CONCAT(p.Lineage, t.ParentNode, '/')
         FROM tree t JOIN prev p ON t.ParentNode = p.Node
         )
         SELECT * FROM prev;
    
         WITH RECURSIVE prev AS (
         SELECT * FROM tree WHERE ParentNode IS NULL
         UNION
         SELECT t.Node,t.ParentNode,t.EmployeeID,p.Depth + 1 AS Depth, CONCAT(p.Lineage, t.ParentNode, '/')
         FROM prev p JOIN tree t ON t.ParentNode = p.Node
         )
         UPDATE tree t, prev p SET t.Depth=p.Depth, t.Lineage=p.Lineage WHERE t.Node=p.Node; 
    
    You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'UPDATE tree t, prev p SET t.Depth=p.Depth, t.Lineage=p.Lineage WHERE t.Node=p.No' at line 7
    SET session.server_id = @@global.server_id,
           session.gtid_domain_id=@@global.gtid_domain_id;
    CREATE TABLE mysql.column_stats (
      min_value varbinary(255) DEFAULT NULL, 
      max_value varbinary(255) DEFAULT NULL, 
      ...
      hist_size TINYINT UNSIGNED, 
      hist_type ENUM('SINGLE_PREC_HB','DOUBLE_PREC_HB'), 
      histogram varbinary(255), 
      ...
    WITH k INTERVALS says how many bins are included in the histogram. If it is omitted this value is taken from the system variable histogram_size.


    Details and mentors for the tasks listed earlier on this page, in order:

    Details: MDEV-383 / Mentor: Igor Babaev

    Details: MDEV-6017 / Mentor: Sergei Golubchik

    Details: MDEV-12313 / Mentor: Vicentiu Ciorbaru

    Details: MDEV-13648 / Mentor: Igor Babaev

    Details: MDEV-18511 / Mentor: Igor Babaev

    Details: MDEV-18844 / Mentor: Igor Babaev

    Details: MDEV-5092 / Mentor: Igor Babaev

    CREATE TABLE t1 (a INT, b INT, INDEX (a/2+b));
    ...
    SELECT * FROM t1 WHERE a/2+b=100;
    UPDATE [LOW_PRIORITY] [IGNORE] tbl_name
        SET col_name1={expr1|DEFAULT} [, col_name2={expr2|DEFAULT}] ...
        [WHERE where_condition]
        [ORDER BY ...]
        [LIMIT row_count]
    RETURNING select_expr [, select_expr ...]
    LOAD DATA FROM MASTER

    Log of MariaDB Contributions

    How to Document Contributions

    Add the following in the global comment for each contribution:

    Patch: Name, url or where we got patch
    Author:   ....
    License:  MCA or BSD
    Reviewer: ....

    For those cases where this is not done, please add to this page a short line for each push into MariaDB that includes code from contributors not employed by the MariaDB Foundation or the MariaDB Corporation. The purpose of this is to properly track that all such patches are submitted either under the MCA or BSD-new, and to ensure that the developers get credit for their work.

    Example:

    Feature/Patch name
    * Author(s)
    * Author has signed MCA on "date" | Patch was licensed under BSD

    (Please enhance the example with anything that makes sense.)

    Log of Contributions

    Instant ADD COLUMN (MDEV-11369)

    • Tencent Game DBA Team, developed by vinchen.

    UPDATE statements with the same source and target (MDEV-12874)

    • Jerome Brauge.

    Per-engine mysql.gtid_slave_pos tables (MDEV-12179)

    • Kristian Nielsen funded by Booking.com.

    The MariaDB Foundation website provides a more detailed list of contributors by release, starting from MariaDB 10.3.5.

    Log of Contributions

    New variable read_binlog_speed_limit permits restricting the speed at which the slave reads the binlog from the master (MDEV-11064)

    • Tencent Game DBA Team, developed by chouryzhou.

    Compression of events in the binary log (MDEV-11065)

    • Tencent Game DBA Team, developed by vinchen.

    No Pad collations (MDEV-9711)

    • Daniil Medvedev

    Flashback

    • Lixun Peng, Alibaba

    Implement non-recursive common table expressions (MDEV-8789). Implement recursive common table expressions (MDEV-9864). Pushdown conditions into non-mergeable views/derived tables (MDEV-9197)

    • Galina Shalygina

    Backporting Delayed replication (MDEV-7145) from MySQL 5.6

    • Kristian Nielsen funded by Booking.com

    The MariaDB Foundation website provides a more detailed list of contributors by release, starting from MariaDB 10.2.14.

    Log of Contributions

    Replication, optimizer, security, speed enhancements, bug fixing, etc

    Power8 optimization

    • Stewart Smith

    • In cooperation with IBM

    enhancements and speedups

    Reviews for , , compression, , storage engine, storage engine, , etc.

    , scrubbing, enhanced semisync, dump thread enhancements, thd_specifics plugin service

    • Google

    Table level ,

    , online alter progress monitoring

    • Antony Curtis

    • Sriram Patil

    New

    • Daniel Black

    • Daniël van Eeden

    Atomic writes, page compression, trim, multi-threaded flush for XtraDB/InnoDB

    • In cooperation with FusionIO

    The MariaDB Foundation website provides a more detailed list of contributors by release, starting from MariaDB 10.1.32.

    Also Used Code Snippets by

    Facebook

    • Defragmentation, prefix index queries optimization, lazy flushing, buffer pool list scan optimization, configurable long semaphore wait timeout

    Percona

    • ,

    Oracle

    • optimization,

    Log of Contributions

    Per thread memory counting and usage

    • Base code and idea by Lixun Peng, Taobao

    • License: BSD

    • Base code by Lixun Peng, Taobao

    • License: BSD

    • Code by Konstantin "Kostja" Osipov, mail.ru

    • License: BSD

    • Code by Olivier Bertrand

    • License: GPL

    • Code by Kentoku Shiba, Spiral Arms

    • License: GPL

    • Code by Vicentiu Ciorbaru, Google Summer of Code 2013

    • License: BSD

    • Code by Sudheera Palihakkara, Google Summer of Code 2013

    • License: BSD

    • Some patches by Pavel Ivanov, Google

    The MariaDB Foundation website provides a more detailed list of contributors by release, starting from MariaDB 10.0.35.

    Log of Contributions

    Function last_value() which returns the last value but evaluates all arguments as a side effect.

    • Original patch by Eric Herman, Booking.com

    • License: BSD

    nowatch option for mysqld_safe (allow systemd)

    • Based on code from Maarten Vanraes

    • License: BSD

    Security fixes, patches

    • Work by Honza Horak, Red Hat

    Coverity scans

    • Work by Christian Convey

    The MariaDB Foundation website provides a more detailed list of contributors by release, starting from MariaDB 5.5.60.

    Log of Contributions

    Virtual Columns

    • Andrey Zhakov (modified by Sanja and Igor)

    • Author has signed MCA

    Declaring many CHARSET objects as const.

    • Antony T Curtis (LinkedIn)

    • License: BSD

    • Authors: People at Google, Facebook and Percona. This code owes a special thanks to Mark Callaghan!

    • License: BSD

    • Fredrik Nylander from Stardoll.com

    • License: MCA

    The OQGRAPH storage engine

    • Created by Arjen Lentz, Open Query

    • License GPL

    The Sphinx storage engine

    • Created by Andrew Aksyonoff.

    • License: GPL

    Pluggable Authentication

    • RJ Silk. License: MCA

    Various bug fixes

    • Stewart Smith, Percona

    Log of Contributions (Outside of Monty Program Ab)

    Microsecond precision in process list

    • Percona Inc

    • Patch microsec_process.patch was licensed to Monty Program under BSD (new).

    Slow Query Log Extended Statistics

    • Percona Inc

    • Patch microslow.patch was licensed to Monty Program under BSD (new).

    The PBXT storage engine

    • Created by Paul McCullagh

    • License: GPL

    The FederatedX storage engine

    • All changes are made by Patrick Galbraith and Antony Curtis and are given to us under BSD-new.

    • In addition we are allowed to promote FederatedX.

    Windows enhancements and various bug fixes

    • Alex Budovski, under MCA

    Creating of MariaDB packages

    • Arjen Lentz, Open Query

    Various bug fixes

    • Stewart Smith, Percona

    Sponsored Features

    Google has sponsored:

    • Google tests GTID, parallel replication and lots more on the mailing list

    Facebook has sponsored many features, including:

    • The non-blocking client library

    • Facebook employees do frequent the mailing list

    See Also

    • SHOW CONTRIBUTORS lists all members and sponsors of the MariaDB Foundation and other sponsors.

    • SHOW AUTHORS lists the authors of MariaDB (including documentation, QA etc).

    This page is licensed: CC BY-SA / Gnu FDL

    MariaDB 11.4.1 contributors
  • Binary log writing speed was improved by moving checksum calculations out of the global binlog mutex. This is a contribution by Kristian Nielsen (MDEV-31273)

  • New system variable max_binlog_total_size enables binary log purging when the total size of all binary logs exceeds the specified threshold. The implementation is based on the patch from Percona (MDEV-31404)

  • FULL_NODUP is a new value for the binlog_row_image system variable. It essentially works like FULL, that is, all columns are included in the event, but it takes less space, because the after image omits columns that were not changed by the UPDATE statement and have the same values as in the before image. This is a contribution from Alibaba (MDEV-32589)

  • MariaDB 11.2.2 contributors
  • MariaDB 11.2.1 contributors

  • MariaDB 11.1.2 contributors
  • MariaDB 11.1.1 contributors

  • MariaDB 11.0.2 contributors
  • MariaDB 11.0.1 contributors

  • MariaDB 10.11.7 contributors
  • MariaDB 10.11.6 contributors

  • MariaDB 10.11.5 contributors

  • MariaDB 10.11.4 contributors

  • MariaDB 10.11.3 contributors

  • MariaDB 10.11.2 contributors

  • MariaDB 10.11.1 contributors

  • MariaDB 10.10.2 contributors
  • MariaDB 10.10.3 contributors

  • MariaDB 10.10.1 contributors

  • MariaDB 10.9.4 contributors
  • MariaDB 10.9.3 contributors

  • MariaDB 10.9.2 contributors

  • MariaDB 10.9.1 contributors

  • MariaDB 10.8.4 contributors
  • MariaDB 10.8.3 contributors

  • MariaDB 10.8.2 contributors

  • MariaDB 10.8.1 contributors

  • MariaDB 10.7.4 contributors
  • MariaDB 10.7.3 contributors

  • MariaDB 10.7.2 contributors

  • MariaDB 10.7.1 contributors

  • MariaDB 10.6.17 contributors
  • MariaDB 10.6.16 contributors

  • MariaDB 10.6.15 contributors

  • MariaDB 10.6.14 contributors

  • MariaDB 10.6.13 contributors

  • MariaDB 10.6.12 contributors

  • MariaDB 10.6.11 contributors

  • MariaDB 10.6.10 contributors

  • MariaDB 10.6.9 contributors

  • MariaDB 10.6.8 contributors

  • MariaDB 10.6.7 contributors

  • MariaDB 10.6.6 contributors

  • MariaDB 10.6.5 contributors

  • MariaDB 10.6.4 contributors

  • MariaDB 10.6.3 contributors

  • MariaDB 10.6.2 contributors

  • MariaDB 10.6.1 contributors

  • MariaDB 10.6.0 contributors

  • MariaDB 10.5.24 contributors
  • MariaDB 10.5.23 contributors

  • MariaDB 10.5.22 contributors

  • MariaDB 10.5.21 contributors

  • MariaDB 10.5.20 contributors

  • MariaDB 10.5.19 contributors

  • MariaDB 10.5.18 contributors

  • MariaDB 10.5.17 contributors

  • MariaDB 10.5.16 contributors

  • MariaDB 10.5.15 contributors

  • MariaDB 10.5.14 contributors

  • MariaDB 10.5.13 contributors

  • MariaDB 10.5.12 contributors

  • MariaDB 10.5.11 contributors

  • MariaDB 10.5.10 contributors

  • MariaDB 10.5.9 contributors

  • MariaDB 10.5.8 contributors

  • MariaDB 10.5.7 contributors

  • MariaDB 10.5.6 contributors

  • MariaDB 10.5.5 contributors

  • MariaDB 10.5.4 contributors

  • MariaDB 10.5.3 contributors

  • MariaDB 10.5.2 contributors

  • MariaDB 10.5.1 contributors

  • MariaDB 10.5.0 contributors

  • MariaDB 10.4.30 contributors
  • MariaDB 10.4.29 contributors

  • MariaDB 10.4.28 contributors

  • MariaDB 10.4.27 contributors

  • MariaDB 10.4.26 contributors

  • MariaDB 10.4.25 contributors

  • MariaDB 10.4.24 contributors

  • MariaDB 10.4.23 contributors

  • MariaDB 10.4.22 contributors

  • MariaDB 10.4.21 contributors

  • MariaDB 10.4.20 contributors

  • MariaDB 10.4.19 contributors

  • MariaDB 10.4.18 contributors

  • MariaDB 10.4.17 contributors

  • MariaDB 10.4.16 contributors

  • MariaDB 10.4.15 contributors

  • MariaDB 10.4.14 contributors

  • MariaDB 10.4.13 contributors

  • MariaDB 10.4.12 contributors

  • MariaDB 10.4.11 contributors

  • MariaDB 10.4.10 contributors

  • MariaDB 10.4.9 contributors

  • MariaDB 10.4.8 contributors

  • MariaDB 10.4.7 contributors

  • MariaDB 10.4.6 contributors

  • MariaDB 10.4.5 contributors

  • MariaDB 10.4.4 contributors

  • MariaDB 10.4.3 contributors

  • MariaDB 10.4.2 contributors

  • MariaDB 10.4.1 contributors

  • MariaDB 10.4.0 contributors

  • MariaDB 10.3.35 contributors
  • MariaDB 10.3.34 contributors

  • MariaDB 10.3.33 contributors

  • MariaDB 10.3.32 contributors

  • MariaDB 10.3.31 contributors

  • MariaDB 10.3.30 contributors

  • MariaDB 10.3.29 contributors

  • MariaDB 10.3.28 contributors

  • MariaDB 10.3.27 contributors

  • MariaDB 10.3.26 contributors

  • MariaDB 10.3.25 contributors

  • MariaDB 10.3.24 contributors

  • MariaDB 10.3.23 contributors

  • MariaDB 10.3.22 contributors

  • MariaDB 10.3.21 contributors

  • MariaDB 10.3.20 contributors

  • MariaDB 10.3.19 contributors

  • MariaDB 10.3.18 contributors

  • MariaDB 10.3.17 contributors

  • MariaDB 10.3.16 contributors

  • MariaDB 10.3.15 contributors

  • MariaDB 10.3.14 contributors

  • MariaDB 10.3.13 contributors

  • MariaDB 10.3.12 contributors

  • MariaDB 10.3.11 contributors

  • MariaDB 10.3.10 contributors

  • MariaDB 10.3.9 contributors

  • MariaDB 10.3.8 contributors

  • MariaDB 10.3.7 contributors

  • MariaDB 10.3.6 contributors

  • MariaDB 10.3.5 contributors

  • MariaDB 10.2.39 contributors
  • MariaDB 10.2.38 contributors

  • MariaDB 10.2.37 contributors

  • MariaDB 10.2.36 contributors

  • MariaDB 10.2.35 contributors

  • MariaDB 10.2.34 contributors

  • MariaDB 10.2.33 contributors

  • MariaDB 10.2.32 contributors

  • MariaDB 10.2.31 contributors

  • MariaDB 10.2.30 contributors

  • MariaDB 10.2.29 contributors

  • MariaDB 10.2.28 contributors

  • MariaDB 10.2.27 contributors

  • MariaDB 10.2.26 contributors

  • MariaDB 10.2.25 contributors

  • MariaDB 10.2.24 contributors

  • MariaDB 10.2.23 contributors

  • MariaDB 10.2.22 contributors

  • MariaDB 10.2.21 contributors

  • MariaDB 10.2.20 contributors

  • MariaDB 10.2.19 contributors

  • MariaDB 10.2.18 contributors

  • MariaDB 10.2.17 contributors

  • MariaDB 10.2.16 contributors

  • MariaDB 10.2.15 contributors

  • MariaDB 10.2.14 contributors

  • MariaDB 10.1.44 contributors
  • MariaDB 10.1.43 contributors

  • MariaDB 10.1.42 contributors

  • MariaDB 10.1.41 contributors

  • MariaDB 10.1.40 contributors

  • MariaDB 10.1.39 contributors

  • MariaDB 10.1.38 contributors

  • MariaDB 10.1.37 contributors

  • MariaDB 10.1.36 contributors

  • MariaDB 10.1.35 contributors

  • MariaDB 10.1.34 contributors

  • MariaDB 10.1.33 contributors

  • MariaDB 10.1.32 contributors

  • MariaDB Galera Cluster 10.0.36 contributors
  • MariaDB 10.0.36 contributors

  • MariaDB Galera Cluster 10.0.35 contributors

  • MariaDB 10.0.35 contributors

  • MariaDB 5.5.64 contributors
  • MariaDB Galera Cluster 5.5.63 contributors

  • MariaDB 5.5.63 contributors

  • MariaDB Galera Cluster 5.5.62 contributors

  • MariaDB 5.5.62 contributors

  • MariaDB Galera Cluster 5.5.61 contributors

  • MariaDB 5.5.61 contributors

  • MariaDB Galera Cluster 5.5.60 contributors

  • MariaDB 5.5.60 contributors

  • MariaDB 11.7.1 contributors
    MariaDB 11.6.2 contributors
    MariaDB 11.6.1 contributors
    MariaDB 11.5.2 contributors
    MariaDB 11.5.1 contributors
    MariaDB 11.4
    MariaDB 11.4.5 contributors
    MariaDB 11.4.4 contributors
    MariaDB 11.4.3 contributors
    MariaDB 11.4.2 contributors
    MariaDB 11.3.2 contributors
    MariaDB 11.3.1 contributors
    MariaDB 11.2.6 contributors
    MariaDB 11.2.5 contributors
    MariaDB 11.2.4 contributors
    MariaDB 11.2.3 contributors
    MariaDB 11.1.6 contributors
    MariaDB 11.1.5 contributors
    MariaDB 11.1.4 contributors
    MariaDB 11.1.3 contributors
    MariaDB 11.0.6 contributors
    MariaDB 11.0.5 contributors
    MariaDB 11.0.4 contributors
    MariaDB 11.0.3 contributors
    MariaDB 10.11
    MariaDB 10.11.11 contributors
    MariaDB 10.11.10 contributors
    MariaDB 10.11.9 contributors
    MariaDB 10.11.8 contributors
    MariaDB 10.10.7 contributors
    MariaDB 10.10.6 contributors
    MariaDB 10.10.5 contributors
    MariaDB 10.10.4 contributors
    MariaDB 10.9.8 contributors
    MariaDB 10.9.7 contributors
    MariaDB 10.9.6 contributors
    MariaDB 10.9.5 contributors
    MariaDB 10.8.8 contributors
    MariaDB 10.8.7 contributors
    MariaDB 10.8.6 contributors
    MariaDB 10.8.5 contributors
    MariaDB 10.7.8 contributors
    MariaDB 10.7.7 contributors
    MariaDB 10.7.6 contributors
    MariaDB 10.7.5 contributors
    MariaDB 10.6
    MariaDB 10.6.21 contributors
    MariaDB 10.6.20 contributors
    MariaDB 10.6.19 contributors
    MariaDB 10.6.18 contributors
    MariaDB 10.5.28 contributors
    MariaDB 10.5.27 contributors
    MariaDB 10.5.26 contributors
    MariaDB 10.5.25 contributors
    MariaDB 10.4
    MariaDB 10.4.34 contributors
    MariaDB 10.4.33 contributors
    MariaDB 10.4.32 contributors
    MariaDB 10.4.31 contributors
    MariaDB 10.3.39 contributors
    MariaDB 10.3.38 contributors
    MariaDB 10.3.37 contributors
    MariaDB 10.3.36 contributors
    MariaDB 10.2.43 contributors
    MariaDB 10.2.42 contributors
    MariaDB 10.2.41 contributors
    MariaDB 10.2.40 contributors
    MariaDB 10.1.48 contributors
    MariaDB 10.1.47 contributors
    MariaDB 10.1.46 contributors
    MariaDB 10.1.45 contributors
    MariaDB Galera Cluster 10.0.38 contributors
    MariaDB 10.0.38 contributors
    MariaDB Galera Cluster 10.0.37 contributors
    MariaDB 10.0.37 contributors
    MariaDB 5.5.68 contributors
    MariaDB 5.5.67 contributors
    MariaDB 5.5.66 contributors
    MariaDB 5.5.65 contributors