Problem that thread pools solve
The task of a scalable server software (and DBMS is an example of such software) is to maintain top performance with increasing number of clients. MySQL traditionally assigned a thread for every client connection, and as the number of concurrent users grows this model would show performance drops. Many active threads appear to be a performance killer, because increasing number of threads leads to extensive context switching and bad locality for CPU caches, and increased contention for hot locks. So the ideal solution that would help to reduce context switching is maintaining lower number of threads than number of client, but this number should not be too low either, since we also want to utilize CPUs to their fullest - ideally, there is a single active thread for each CPU on the machine.
What is new in MariaDB threadpool in 5.5
MariaDB has had option to run server threads in a pool since version 5.2. The main drawback of the previous solution was that this pool was static, i.e it had a fixed number of threads. Static thread pools can have their merits, for a limited cases where callbacks executed by pool threads never block and do not depend on each other (imagine something like an echo server). But DBMS clients are more complicated, they depend on each other’s completion, they block on different locks and IO, thus it is very hard to impossible to predict how many threads would be ideal of even sufficient to prevent deadlocks. 5.5 implements a dynamic/adaptive pool that itself takes care of creating new threads in the times of high demand and retiring threads if they have nothing to do. This is a complete reimplementation of the legacy pool-of-threads scheduler, with following goals:
- Make the pool dynamic (grow and shrink whenever required)
- Minimize the overhead required to maintain threadpool itself
- Make best use of underlying OS capabilities (use native OS threadpool if provided, use best IO multiplexing method otherwise)
There are currently two different low-level implementation depending on OS – one for Windows which utilizes native threadpool, and one generic for Unix-like systems. Therefore some tuning parameters will differ from Windows to Unix.
When to use threadpool
Threadpools are most efficient in situations where queries are relatively short and the load is CPU bound (OLTP workloads). If workload is not CPU bound, you might still want to limit the number of threads to at save memory for the database memory buffers.
When threadpool is less efficient
There are special cases where threadpool is likely to be less efficient.
- very bursty workload (long periods of inactivity mixed with short periods of high activity by many users), and also you cannot tolerate delays coming from thread creation throttling . Even in this situation, performance can be improved by tweaking the way threads are retired e.g thread_pool_idle_timeout on Unix, or thread_pool_min_threads on Windows.
- many concurrent long, non-yielding queries. Non-yielding here means that thread never waits while executing query, or does not indicate waits to threadpool. Such workloads are not likely to occur in practice , yet if they occur, then such queries will delay execution of other queries, and the delay is fixed only with stall detection/preemption mechanism (s. description thread_pool_stall_limit)
- you rely on the fact that simple queries always finish quickly, no matter how loaded you database server is. With loaded threadpool, queries might be queued for later execution, such that even a simple SELECT 1, might take more time than with thread-per-connection.
Using threadpool scheduler
On Unix, add
thread_handling=pool-of-threads to the my.cnf config file.
On Windows, the default for thread_handling is already preset to pool-of-threads, so nothing needs to be done here.
On older versions of Windows, such as XP and 2003, pool-of-threads is not implemented, and server will silently switch to using legacy thread-per-connection method)
Legacy scheduler is set with thread_handling= thread-per-connection in the config file.
Threadpool server variables
Generally there should be no need to tweak the parameters, since the goal was to provide good performance out-of-the box. We encourage submitting bug reports, if defaults do not appear to work well in your specific situation. Still, the parameters are not hardcoded and we expose as much as underlying implementation allows, so you can tweak them as you see fit. All server variables documented below are dynamic,i.e can be changed on the running server.
Optimizing server variables on Windows
The native threadpool implementation allows setting minimum and maximum number of threads in the pool. Thus following variables are exposed:
- thread_pool_min_threads - minimum number of threads in the pool. Default is 1. This applicable in a special case of very “bursty” workloads. Imagine having longer periods of inactivity after periods of high activity. While threadpool is idle, Windows would decide to retire pool threads (based on experimentation, this seems to happen after thread had been idle for 1 minute). Next time high load comes, it could take some milliseconds or seconds until threadpool size stabilizes again to optimal value. To avoid thread retirement, one could set the parameter to a higher value.
- thread_pool_max_threads – maximum number of threads in the pool. Threads are not created when this value is reached. Default is 500. This parameter is can be used to prevent creation of new threads if the pool can have short periods where many or all clients are blocked (for example, with “FLUSH TABLES WITH READ LOCK”, high contention on row locks or similar). Pool does create new threads if blocking situation occur (after some throttling interval), but sometimes you’ll want to cap the number of threads, if you’re familiar with the application, e.g to save memory. If your application constantly pegs at 500 threads, it might be a strong indicator for high contention in the application, and threadpool does not help much.
Optimizing server variables on Unix
Following variables are exposed by underlying implementation
- thread_pool_size – number of thread groups. Default is number of processors. This is the parameter with the most visible performance effect. It is roughly equivalent to number of threads that can run at the same time (where run means use CPU, rather than sleep or wait). The implementation partitions all clients into groups, with the goal of having one runnable thread per group. One reason to change to a lower value it could be running the server on dedicated processors (e.g. with taskset utility on Linux) Increase the value if you find that despite your workload is CPU-bound, CPUs are still underutilized.
CPU underutilization would should not happen in ideal world, but in this world it does. Pool is usually informed when a thread is going to wait, however some waits like page fault or miss in OS buffer cache cannot be detected, detection of certain waits (for example network io related ones) is currently not implemented (but planned for 5.6)
- thread_pool_stall_limit – number of milliseconds before a running thread is considered stalled. Default is 500. Threadpool will wake up or create another thread if this limit is reached. This is the preemption mechanism that prevents long-running query from monopolizing the pool, and temporarily allowing several queries to run in parallel.
- thread_pool_max_threads – maximum number of threads in the pool. Threads are not created when this value is reached. Default is 500. The variable has the same meaning as on Windows.
- thread_pool_idle_timeout- number of seconds before an idle worker thread exits. Default is 60 If there is currently no work to do, how long should an idle thread wait before exiting?
- thread_pool_oversubscribe – internal parameter. Change at your own risk. Default is 3. Here is a approximate description of the behavior : there is a tradeoff between letting more than 1 runnable thread per CPU versus putting thread into sleep and awake almost instantly after it. The higher the value for this parameter, the more threads could run at the same time. Lower value can lead to more sleeps and wakeups.
Monitoring pool activity
Currently there are two status variables exposed to monitor pool activity
- threadpool_threads - Number of threads in the pool
- threadpool_idle_threads - Number of inactive threads in the pool. Only meaningful on Unixes, not tracked on Windows. Thread is becoming idle if either - there is no work to do wait for a new work), or it blocks due to disk io, row or table lock etc.
Troubleshooting blocking situations
Even with thread_pool_max_threads set high (recall the default for it is 500), if global locks are involved, it is possible to block the entire pool. Imagine a situation where one client performs FLUSH TABLES WITH READ LOCK and pauses. If 500 other clients start a write operation, the maximum number of threads allowed in the pool is reached and then it is impossible to UNLOCK TABLES anymore. One way to solve the problem would be to set thread_pool_max_threads to an very high value, but this might be less than ideal, and negatively affect performance.
To workaround such situation, MariaDB permits you to use a dedicated admin connection. To allow dedicated admin connection, set extra_port server variable to the TCP port for this connection (this must be different from usual port), and connect using this port as superuser. Once connected, you can either increase thread_pool_max_threads, or kill the offending connection - in the example above that would be the connection that holds global lock (such connection would be in the 'sleep' state)
MariaDB threadpool vs Oracle MySQL Enterprise Threadpool
Commercial editions of MySQL since 5.5 include Oracle MySQL Enterprise threadpool implemented as a plugin, which delivers similar functionality. Official documentation of the feature can be found in the Reference Manual and the detailed discussion about design of the feature in Mikael Ronstrom's blog. Here is the summary of similarities and differences, based on above materials.
- On Unix, both MariaDB and Oracle MySQL Enterprise Threadpool will partition client connections into groups thread_pool_size parameter thus has the same meaning in for both MySQL and MariaDB.
- Also, both implementations use similar schema checking for thread stalls, and both have the same parameter name for thread_pool_stall_limit (though in MariaDB it is measured using millisecond units, not 10ms units like in Oracle MySQL).
- Windows implementation is completely different - MariaDB's is using native Windows threadpooling, while Oracle's implementation convenience function WSAPoll() (provided for convenience to port Unix application). As a consequence of relying on WSAPoll(), Oracle's implementation will not work with named pipes and shared memory connections.
- MariaDB is using the most efficient IO multiplexing facilities for all of the OS Windows (IO completion port is used internally by the native threadpool), Linux (epoll), Solaris (event ports),FreeBSD and OSX (kevent). Oracle uses optimized IO multiplexing only on Linux, with epoll, and uses poll() otherwise.
- Unlike Oracle MySQL Enterprise Threadpool, MariaDB's threadpool is not trying minimize number of concurrent transactions.
- Unlike Oracle MySQL Enterprise Threadpool, MariaDB's one is builtin, not a plugin.
Low-level implementation details are documented in the WL#246