Thread Groups in the Unix Implementation of the Thread Pool

This article does not apply to the thread pool implementation on Windows. On Windows, MariaDB uses a native thread pool created with the CreateThreadpool API, which has its own methods to distribute threads between CPUs.

On Unix, the thread pool implementation uses objects called thread groups to divide up client connections into many independent sets of threads. The thread_pool_size system variable defines the number of thread groups on a system. Generally speaking, the goal of the thread group implementation is to have one running thread on each CPU on the system at a time. Therefore, the default value of the thread_pool_size system variable is auto-sized to the number of CPUs on the system.

When the thread_pool_size system variable is set at system startup, its maximum value is 100000. However, it is not a good idea to set it that high. When it is set dynamically, its maximum value is either 128 or the value that was set at system startup, whichever is higher. It can be changed dynamically with SET GLOBAL. For example:

SET GLOBAL thread_pool_size=32;

It can also be set in a server option group in an option file prior to starting up the server. For example:

[mariadb]
..
thread_handling=pool-of-threads
thread_pool_size=32

If you do not want MariaDB to use all CPUs on the system for some reason, then you can set it to a lower value than the number of CPUs. For example, this would make sense if the MariaDB Server process is limited to certain CPUs with the taskset utility on Linux.
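When the server is pinned to a subset of CPUs, one way to pick a starting value is to count the CPUs a process is actually allowed to run on. The following is a minimal Python sketch under stated assumptions, not part of MariaDB: the helper name usable_cpu_count is hypothetical, and os.sched_getaffinity is only available on some Unix platforms (notably Linux), so the sketch falls back to os.cpu_count() elsewhere.

```python
import os


def usable_cpu_count(pid=0):
    """Return the number of CPUs the given process may run on.

    pid=0 means the calling process. os.sched_getaffinity exists only on
    some Unix platforms, so fall back to the total CPU count elsewhere.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(pid))
    return os.cpu_count() or 1


if __name__ == "__main__":
    # Print a candidate option-file line for the current process's CPU set.
    print(f"thread_pool_size={usable_cpu_count()}")
```

Run under the same taskset restriction as the server, the script reports the size of that restricted CPU set rather than the total number of CPUs on the machine.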

If you set the value to the number of CPUs and if you find that the CPUs are still underutilized, then try increasing the value.

The thread_pool_size system variable tends to have the most visible performance effect. It is roughly equivalent to the number of threads that can run at the same time. In this case, run means use CPU, rather than sleep or wait. If a client connection needs to sleep or wait for some reason, then it wakes up another client connection in the thread group before it does so.

One reason that CPU underutilization may occur in rare cases is that the thread pool is not always informed when a thread is going to wait. For example, some waits, such as a page fault or a miss in the OS buffer cache, cannot be detected by MariaDB. Prior to MariaDB 10.0, network I/O related waits could also be missed.

Types of Threads in Thread Groups

Thread groups have two different kinds of threads: a listener thread and worker threads.

A thread group's worker threads actually perform work on behalf of client connections. A thread group can have many worker threads, but usually, only one will be actively running at a time. This is not always the case. For example, the thread group can become oversubscribed if the thread pool's timer thread detects that the thread group is stalled. This is explained in more detail in the sections below.

A thread group's listener thread listens for I/O events and distributes work to the worker threads. If it detects that there is work to be done, then it can wake up a sleeping worker thread in the thread group, if any exist. If the listener thread is the only thread in the thread group, then it can also create a new worker thread.

Distributing Client Connections Between Thread Groups

When a new client connection is created, its thread group is determined using the following calculation:

thread_group_id = connection_id % thread_pool_size

The connection_id value in the above calculation is the same monotonically increasing number that you can use to identify connections in SHOW PROCESSLIST output or the information_schema.PROCESSLIST table. In general, this should result in fairly even distribution of connections among thread groups.
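To illustrate why the modulo calculation spreads connections fairly evenly, the following Python sketch assigns a run of consecutive connection IDs to thread groups and counts how many land in each. The function names and the sample values (100 connections, 8 thread groups) are assumptions chosen for illustration.

```python
from collections import Counter


def thread_group_id(connection_id, thread_pool_size):
    """Map a connection to a thread group with the modulo rule above."""
    return connection_id % thread_pool_size


def distribution(connection_ids, thread_pool_size):
    """Count how many connections land in each thread group."""
    return Counter(thread_group_id(c, thread_pool_size) for c in connection_ids)


if __name__ == "__main__":
    # 100 consecutive connection IDs spread over 8 thread groups.
    counts = distribution(range(1, 101), 8)
    for group in sorted(counts):
        print(group, counts[group])
```

With consecutive IDs, the group sizes differ by at most one. On a live server, the IDs of the connections that are still open are usually not consecutive, because other connections have disconnected in between, so the distribution can be less even than in this sketch.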

Thread Group Stalls

The thread pool has a feature that allows it to detect if a client connection is executing a long-running query that may be monopolizing its thread group. If a client connection were to monopolize its thread group, then that could prevent other client connections in the thread group from running their queries. In other words, the thread group would appear to be stalled.

This stall detection feature is implemented by creating a timer thread that periodically checks if any of the thread groups are stalled. There is only a single timer thread for the entire thread pool. The thread_pool_stall_limit system variable defines the number of milliseconds between each stall check performed by the timer thread. The default value is 500. It can be changed dynamically with SET GLOBAL. For example:

SET GLOBAL thread_pool_stall_limit=300;

It can also be set in a server option group in an option file prior to starting up the server. For example:

[mariadb]
..
thread_handling=pool-of-threads
thread_pool_size=32
thread_pool_stall_limit=300

The timer thread considers a thread group to be stalled if both of the following are true:

  • There are client connections queued to run in the thread group.
  • No client connections have been allowed to be dequeued to run since the last stall check.

This indicates that the one or more client connections currently using the active worker threads may be monopolizing the thread group, and preventing the queued client connections from performing work.
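The stall conditions above can be expressed as a small predicate. The following Python sketch is a simplified model and not the server's actual code: the ThreadGroupState class and its field names are hypothetical, and resetting the counter after each check is an assumption about how "since the last stall check" might be tracked.

```python
from dataclasses import dataclass


@dataclass
class ThreadGroupState:
    """Hypothetical snapshot of one thread group, for illustration only."""
    queued_connections: int = 0    # connections waiting to run
    dequeued_since_check: int = 0  # connections dequeued since the last check


def is_stalled(group):
    """Apply the two stall conditions: work is queued, nothing was dequeued."""
    return group.queued_connections > 0 and group.dequeued_since_check == 0


def stall_check(group):
    """One timer tick: report the stall state, then reset the interval counter."""
    stalled = is_stalled(group)
    group.dequeued_since_check = 0
    return stalled
```

In this model, a group with an empty queue is never reported as stalled, no matter how long its current query runs, because no other connection is waiting behind it.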

The thread_pool_stall_limit system variable essentially defines the limit for what a "fast query" is. If a query takes longer than thread_pool_stall_limit, then the thread pool is likely to think that it is too slow, and it will either wake up a sleeping worker thread or create a new worker thread to let another client connection in the thread group run a query in parallel.

In general, changing the value of the thread_pool_stall_limit system variable has the following effect:

  • Setting it to higher values can help avoid starting too many parallel threads if you expect a lot of client connections to execute long-running queries.
  • Setting it to lower values can help prevent deadlocks.

Thread Group Stalls and Additional Worker Thread Creation

When the timer thread detects that a thread group is stalled, it wakes up a sleeping worker thread in the thread group, if one is available. If there isn't one, then it creates a new worker thread in the thread group. This temporarily allows several client connections in the thread group to run in parallel.

The timer thread will create a new worker thread if all of the following are true:

  • The timer thread thinks that the thread group is stalled.
  • There are no sleeping worker threads in the thread group.
  • The entire thread pool has fewer than thread_pool_max_threads threads.
  • A worker thread has not been created for the thread group within the throttling interval.

The throttling interval depends on the number of threads that are already in the thread group. See the following table:

Number of Threads in Thread Group    Throttling Interval (milliseconds)
0-3                                  0
4-7                                  50
8-15                                 100
16-65536                             200

As you can see from the large throttling intervals, the thread pool was not designed to create a large number of threads in each thread group in a short period of time.
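The table and the four creation conditions can be combined into one check. The following Python sketch is an illustration under stated assumptions, not the server's implementation: the function and parameter names are hypothetical, the interval values are returned in the same unit as the table, and treating an elapsed time equal to the interval as sufficient is an assumption.

```python
def throttling_interval(thread_count):
    """Return the throttling interval from the table, in the table's unit."""
    if thread_count < 4:
        return 0
    if thread_count < 8:
        return 50
    if thread_count < 16:
        return 100
    return 200


def may_create_worker(stalled, sleeping_workers, total_threads,
                      thread_pool_max_threads, group_thread_count,
                      elapsed_since_last_creation):
    """Check the four conditions under which the timer thread creates a worker.

    elapsed_since_last_creation uses the same unit as throttling_interval().
    """
    return (stalled
            and sleeping_workers == 0
            and total_threads < thread_pool_max_threads
            and elapsed_since_last_creation
                >= throttling_interval(group_thread_count))
```

For example, a stalled group that already has five threads and no sleeping workers would not get another worker until at least 50 units have passed since the last one was created, while a group with three threads is not throttled at all.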

Thread Group Oversubscription

If the timer thread were to detect a stall in a thread group, then it would either wake up a sleeping worker thread or create a new worker thread in that thread group. At that point, the thread group would have multiple active worker threads. In other words, the thread group would be oversubscribed.

You might expect that the thread pool would shut down one of the worker threads when the stalled client connection finished what it was doing, so that the thread group would only have one active worker thread again. However, this does not always happen. Once a thread group is oversubscribed, the thread_pool_oversubscribe system variable defines the upper limit for when worker threads start shutting down after they finish work for client connections. The default value is 3. It can be changed dynamically with SET GLOBAL. For example:

SET GLOBAL thread_pool_oversubscribe=10;

It can also be set in a server option group in an option file prior to starting up the server. For example:

[mariadb]
..
thread_handling=pool-of-threads
thread_pool_size=32
thread_pool_stall_limit=300
thread_pool_oversubscribe=10

To clarify, the thread_pool_oversubscribe system variable does not play any part in the creation of new worker threads. The thread_pool_oversubscribe system variable is only used to determine how many threads should remain active in a thread group, once a thread group is already oversubscribed due to stalls.

In general, the default value of 3 should be adequate for most users. Most users should not need to change the value of the thread_pool_oversubscribe system variable.
