non-semi-join-subquery-optimizations

You are viewing an old version of this article. View the current version here.

!!! WORK IN PROGRESS !!!

Certain kinds of IN-subqueries canot be flattened into semi-joins. These subqueries can be both correlated or non-correlated. In order to provide consistent performance in all cases, MariaDB provides several alternative strategies for these types of subqueries. Whenever several strategies are possible, the optimizer chooses the optimal one based on cost estimates.

The two primary strategies are materialization (also called outside-in materialization), and in-to-exists tranformation. Materialization is applicable only for non-correlated subqueries, while in-to-exist can be used both for correlated and non-correlated subqueries.

Applicability

An IN subquery cannot be flattened into a semi-join in the following cases:

  • Subquery in a disjunction (OR)

Query pattern:

SELECT ... FROM ... WHERE expr IN (SELECT ... ) OR expr

Example:

SELECT Name FROM Country
WHERE (Code IN (select Country from City where City.Population > 100000) OR
       Name LIKE 'L%') AND
      surfacearea > 1000000;
  • Negated subquery predicate (NOT IN)
  • Subqueries with GROUP BY, HAVING, and aggregate functions
  • Subquery in the SELECT or HAVING clause
SELECT ... WHERE expr IN (SELECT ... ) OR ...
  • Subquery with a UNION

Materialization for non-correlated IN-subqueries

NULL-aware efficient execution

Performance discussion

  • Example improvement over MySQL 5.x
  • General performance guidelines

Limitations

Optimizer control

In certain cases it may be necessary to override the choice of the optimizer. Typically this is needed for benchmarking or testing purposes, or to mimic the behavior of an older version of the server, or if the optimizer made a poor choice.

All the above strategies can be controlled via the optimizer_switch system variable.

  • materialization=on/off

In some very special cases, even if materialization was forced, it may still be impossible to

partial_match_rowid_merge=on/off

In addition to this switch, the system variable rowid_merge_buff_size controls the maximum memory available to the Rowid-merge algorithm. By default it is: TODO.

partial_match_table_scan=on/off

  • in_to_exists=on/off

Example: To tell the server that all non-semijoin ...

<<code>>
<</code>>

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.