optimizer_join_limit_pref_ratio optimization

You are viewing an old version of this article. View the current version here.

Basics

This optimization makes the join optimizer to specifically consider a join order that can short-cut its execution based on ORDER BY ... LIMIT n clause. For small values of n, using such join order can cause speedups.

The optimization is not enabled by default. One needs to set optimizer_join_limit_pref_ratio to enable it. The default value of 0 means disable, suggested value when enabling is 100.

Detailed description

Problem setting

MariaDB optimizer picks the join order without taking into account the possibility to short-cut join execution due to ORDER BY ... LIMIT.

For example, consider a query looking at latest 10 orders together with customers who made them:

select *
from
  customer,order
where
  customer.name=order.customer_name
order by
  order.date desc
limit 10

The two possible plans are: customer->orders:

+------+-------------+----------+------+---------------+---------------+---------+---------------+------+----------------------------------------------+
| id   | select_type | table    | type | possible_keys | key           | key_len | ref           | rows | Extra                                        |
+------+-------------+----------+------+---------------+---------------+---------+---------------+------+----------------------------------------------+
|    1 | SIMPLE      | customer | ALL  | name          | NULL          | NULL    | NULL          | 9623 | Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | orders   | ref  | customer_name | customer_name | 103     | customer.name | 1    |                                              |
+------+-------------+----------+------+---------------+---------------+---------+---------------+------+----------------------------------------------+

and orders->customer:

+------+-------------+----------+-------+---------------+------------+---------+----------------------+------+-------------+
| id   | select_type | table    | type  | possible_keys | key        | key_len | ref                  | rows | Extra       |
+------+-------------+----------+-------+---------------+------------+---------+----------------------+------+-------------+
|    1 | SIMPLE      | orders   | index | customer_name | order_date | 4       | NULL                 | 10   | Using where |
|    1 | SIMPLE      | customer | ref   | name          | name       | 103     | orders.customer_name | 1    |             |
+------+-------------+----------+-------+---------------+------------+---------+----------------------+------+-------------+

orders->customers is probably much more efficient as the optimizer can stop computing the join as soon as it has found 10 last orders.

Does MariaDB optimizer take this into account when choosing which join order to use? No. (It does take "Using temporary" into account which makes customer->orders plan more expensive but that increase may or may not make customer->orders more expensive)

Inherent risks in plans using short-cutting

Plans that rely on short-cutting suffer from the problem of unknown selectivity. Let's modify the query to search for orders shipped by airplane:

select *
from
  customer,order
where
  customer.name=order.customer_name 
  and order.shipping_method='Airplane'
order by
  order.date desc
limit 10

Suppose, we know that 50% of orders are shipped by air. Assuming no correlation between date and shipping method, we will need to scan 20 orders before we find 10 that are shipped by air. But if there is a correlation, we may need to scan up to 50% of all orders before we find 10 orders that are shipped by air.

Moreover, queries may have conditions whose selectivity is not known:

<<sql>> order.shipping_method='%Airplane%' <<sql>>

if this condition filters out a lot of rows, it will increase the number of records one needs to enumerate before producing #LIMIT rows.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.