Conversion of Big IN Predicates Into Subqueries

Starting from MariaDB 10.3, the optimizer converts certain big IN predicates into IN subqueries.

That is, an IN predicate in the form

column [NOT] IN (const1, const2, ...)

is converted into an equivalent IN-subquery:

column [NOT] IN (select ... from temporary_table)

which opens new opportunities for the query optimizer.
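
As a rough illustration of the rewrite, the two queries below are intended to be equivalent. The table t1, its column a, and the alias tvc are hypothetical names chosen for this sketch. The temporary table the optimizer actually builds is internal and is not named by the user; here a table value constructor (available from MariaDB 10.3) stands in for it.

```sql
-- Original form: a constant list (shortened here; the real case has
-- more than 1000 elements).
SELECT * FROM t1 WHERE a IN (1, 2, 3, 4, 5);

-- Conceptually equivalent subquery form. The derived table "tvc" stands in
-- for the internal temporary table; its name is an assumption of this sketch.
SELECT * FROM t1
WHERE a IN (SELECT * FROM (VALUES (1), (2), (3), (4), (5)) AS tvc);
```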

The conversion happens if the following conditions are met:

  • the IN list has more than 1000 elements (this threshold can be changed with the in_predicate_conversion_threshold system variable).
  • the [NOT] IN condition is at the top level of the WHERE/ON clause.

Controlling the Optimization
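
A minimal sketch of how the threshold might be inspected and changed for the current session. It assumes a server version in which in_predicate_conversion_threshold is exposed as a settable system variable. A value of 0 is understood to disable the conversion altogether; treat that as an assumption to verify against the system variable documentation for your version.

```sql
-- Show the current threshold (1000 by default, as stated above).
SHOW VARIABLES LIKE 'in_predicate_conversion_threshold';

-- Raise the threshold so that only much longer lists are converted
-- (affects the current session only).
SET SESSION in_predicate_conversion_threshold = 5000;

-- Assumption: 0 turns the conversion off entirely.
SET SESSION in_predicate_conversion_threshold = 0;
```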

Benefits of the Optimization

If column is a key prefix, the MariaDB optimizer processes the condition

column [NOT] IN (const1, const2, ...)

by trying to construct a range access. If the list is large, the analysis may take a lot of memory and CPU time. The problem gets worse when column is a part of a multi-column index and the query has conditions on other parts of the index.
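
To see why the cost grows, consider a sketch with a hypothetical table t2 and a two-column index. When both index parts are constrained by lists, range analysis generally has to consider combinations of the values, so the number of candidate intervals grows with the product of the list lengths.

```sql
-- Hypothetical table with a two-column index.
CREATE TABLE t2 (
  a INT,
  b INT,
  payload VARCHAR(32),
  KEY idx_a_b (a, b)
);

-- 3 values for a and 2 values for b: range analysis may consider up to
-- 3 * 2 = 6 intervals on idx_a_b, e.g. (a=1,b=10), (a=1,b=20), (a=2,b=10), ...
SELECT * FROM t2 WHERE a IN (1, 2, 3) AND b IN (10, 20);
```

With more than 1000 values in the list on a, the same multiplication makes the analysis correspondingly more expensive.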

Converting IN predicates into subqueries bypasses the range analysis, so the query optimization phase uses less CPU and memory.

Possible disadvantages of the conversion are:

  • The optimization may replace one key access per IN-list element with a table scan (if there is no other usable index for the table).
  • The estimates for the number of rows matching the IN (...) are less precise.
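
One way to judge whether these disadvantages matter for a given query is to compare its plans with the conversion disabled and enabled. The sketch below uses a hypothetical table t3 and a deliberately low threshold so that a short list qualifies. It assumes that 0 disables the conversion and that a list longer than the threshold triggers it. No output is shown, since the plan depends on the data and the server version.

```sql
-- Hypothetical table; "a" is indexed so the IN list can use key accesses.
CREATE TABLE t3 (a INT, b INT, KEY idx_a (a));

-- Plan with the conversion off (assumption: 0 disables it).
SET SESSION in_predicate_conversion_threshold = 0;
EXPLAIN SELECT * FROM t3 WHERE a IN (1, 2, 3, 4, 5, 6, 7, 8);

-- Plan with a low threshold, so this short list qualifies for conversion.
SET SESSION in_predicate_conversion_threshold = 4;
EXPLAIN SELECT * FROM t3 WHERE a IN (1, 2, 3, 4, 5, 6, 7, 8);

-- Reset the session value to the global setting.
SET SESSION in_predicate_conversion_threshold = DEFAULT;
```

Comparing the access type and the rows estimate for t3 in the two plans indicates whether the conversion changed the access method or the row estimate for that query.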

See Also

https://jira.mariadb.org/browse/MDEV-12176
