Conversion of Big IN Predicates Into Subqueries
Starting from MariaDB 10.3, the optimizer converts certain big IN predicates into IN subqueries.
That is, an IN predicate in the form
column [NOT] IN (const1, const2, .... )
is converted into an equivalent IN-subquery:
column [NOT] IN (select ... from temporary_table)
which opens new opportunities for the query optimizer.
The conversion happens if the following conditions are met:
- the IN list has more than 1000 elements (One can control it through the in_predicate_conversion_threshold parameter).
- the [NOT] IN condition is at the top level of the WHERE/ON clause.
Controlling the Optimization
- The optimization is on by default. MariaDB 10.3.18 (and debug builds prior to that) introduced the in_predicate_conversion_threshold variable. Set to
0
to disable the optimization.
Benefits of the Optimization
If column
is a key-prefix, MariaDB optimizer will process the condition
column [NOT] IN (const1, const2, .... )
by trying to construct a range access. If the list is large, the analysis may take a lot of memory and CPU time. The problem gets worse when column
is a part of a multi-column index and the query has conditions on other parts of the index.
Conversion of IN predicates into subqueries bypass the range analysis, which means the query optimization phase will use less CPU and memory.
Possible disadvantages of the conversion are are:
- The optimization may convert 'IN LIST elements' key accesses to a table scan (if there is no other usable index for the table)
- The estimates for the number of rows matching the
IN (...)
are less precise.